Training: 2021-03-17 15:16:57,543-rank_id: 0
Training: 2021-03-17 15:17:21,595-softmax weight init successfully!
Training: 2021-03-17 15:17:21,595-softmax weight mom init successfully!
Training: 2021-03-17 15:17:21,597-Total Step is: 333821
Training: 2021-03-17 15:18:04,342-Reducer buckets have been rebuilt in this iteration.
Training: 2021-03-17 15:18:30,549-Speed 4379.39 samples/sec Loss 47.1340 Epoch: 0 Global Step: 100 Fp16 Grad Scale: 256 Required: 28 hours
Training: 2021-03-17 15:18:46,867-Speed 3137.81 samples/sec Loss 45.3713 Epoch: 0 Global Step: 150 Fp16 Grad Scale: 256 Required: 29 hours
Training: 2021-03-17 15:18:57,842-Speed 4665.57 samples/sec Loss 44.2397 Epoch: 0 Global Step: 200 Fp16 Grad Scale: 512 Required: 27 hours
Training: 2021-03-17 15:19:09,009-Speed 4585.04 samples/sec Loss 42.8912 Epoch: 0 Global Step: 250 Fp16 Grad Scale: 512 Required: 26 hours
Training: 2021-03-17 15:19:20,213-Speed 4570.41 samples/sec Loss 41.4452 Epoch: 0 Global Step: 300 Fp16 Grad Scale: 1024 Required: 25 hours
Training: 2021-03-17 15:19:30,903-Speed 4789.60 samples/sec Loss 40.5895 Epoch: 0 Global Step: 350 Fp16 Grad Scale: 1024 Required: 24 hours
Training: 2021-03-17 15:19:42,620-Speed 4369.80 samples/sec Loss 39.7134 Epoch: 0 Global Step: 400 Fp16 Grad Scale: 2048 Required: 24 hours
Training: 2021-03-17 15:19:57,680-Speed 3399.88 samples/sec Loss 39.2283 Epoch: 0 Global Step: 450 Fp16 Grad Scale: 2048 Required: 24 hours
Training: 2021-03-17 15:20:09,343-Speed 4390.16 samples/sec Loss 38.8731 Epoch: 0 Global Step: 500 Fp16 Grad Scale: 4096 Required: 24 hours
Training: 2021-03-17 15:20:20,121-Speed 4750.92 samples/sec Loss 38.4467 Epoch: 0 Global Step: 550 Fp16 Grad Scale: 4096 Required: 24 hours
Training: 2021-03-17 15:20:31,142-Speed 4645.77 samples/sec Loss 38.0929 Epoch: 0 Global Step: 600 Fp16 Grad Scale: 8192 Required: 23 hours
Training: 2021-03-17 15:20:42,976-Speed 4326.64 samples/sec Loss 37.7707 Epoch: 0 Global Step: 650 Fp16 Grad Scale: 8192 Required: 23 hours
Training: 2021-03-17 15:20:54,025-Speed 4634.15 samples/sec Loss 37.4871 Epoch: 0 Global Step: 700 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:21:04,833-Speed 4737.94 samples/sec Loss 37.1898 Epoch: 0 Global Step: 750 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:21:15,988-Speed 4590.09 samples/sec Loss 36.8894 Epoch: 0 Global Step: 800 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:21:26,816-Speed 4729.03 samples/sec Loss 36.6130 Epoch: 0 Global Step: 850 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:21:37,199-Speed 4931.09 samples/sec Loss 36.3715 Epoch: 0 Global Step: 900 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:21:48,078-Speed 4706.89 samples/sec Loss 36.0795 Epoch: 0 Global Step: 950 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:21:58,764-Speed 4791.55 samples/sec Loss 35.7719 Epoch: 0 Global Step: 1000 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:22:09,790-Speed 4643.78 samples/sec Loss 35.5271 Epoch: 0 Global Step: 1050 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:22:20,723-Speed 4683.05 samples/sec Loss 35.2154 Epoch: 0 Global Step: 1100 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:22:31,934-Speed 4567.39 samples/sec Loss 34.9557 Epoch: 0 Global Step: 1150 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:22:43,030-Speed 4614.67 samples/sec Loss 34.6598 Epoch: 0 Global Step: 1200 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:22:53,638-Speed 4826.89 samples/sec Loss 34.3489 Epoch: 0 Global Step: 1250 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:23:04,252-Speed 4823.96 samples/sec Loss 34.0533 Epoch: 0 Global Step: 1300 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:23:14,999-Speed 4764.26 samples/sec Loss 33.7734 Epoch: 0 Global Step: 1350 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:23:25,615-Speed 4822.98 samples/sec Loss 33.4092 Epoch: 0 Global Step: 1400 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:23:36,204-Speed 4835.90 samples/sec Loss 33.0992 Epoch: 0 Global Step: 1450 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:23:46,983-Speed 4750.01 samples/sec Loss 32.7286 Epoch: 0 Global Step: 1500 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:23:57,605-Speed 4820.29 samples/sec Loss 32.4543 Epoch: 0 Global Step: 1550 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:24:08,250-Speed 4810.08 samples/sec Loss 32.0712 Epoch: 0 Global Step: 1600 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:24:19,002-Speed 4762.55 samples/sec Loss 31.7612 Epoch: 0 Global Step: 1650 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:24:29,959-Speed 4672.99 samples/sec Loss 31.4016 Epoch: 0 Global Step: 1700 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:24:41,478-Speed 4445.02 samples/sec Loss 31.0789 Epoch: 0 Global Step: 1750 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:24:52,179-Speed 4784.87 samples/sec Loss 30.7014 Epoch: 0 Global Step: 1800 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:25:04,585-Speed 4127.16 samples/sec Loss 30.3852 Epoch: 0 Global Step: 1850 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:25:17,233-Speed 4048.24 samples/sec Loss 30.0432 Epoch: 0 Global Step: 1900 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:25:28,928-Speed 4378.16 samples/sec Loss 29.6926 Epoch: 0 Global Step: 1950 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:25:40,241-Speed 4526.04 samples/sec Loss 29.4025 Epoch: 0 Global Step: 2000 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-17 15:26:08,749-[lfw][2000]XNorm: 22.603132
Training: 2021-03-17 15:26:08,749-[lfw][2000]Accuracy-Flip: 0.95533+-0.01095
Training: 2021-03-17 15:26:08,749-[lfw][2000]Accuracy-Highest: 0.95533
Training: 2021-03-17 15:26:39,977-[cfp_fp][2000]XNorm: 20.525527
Training: 2021-03-17 15:26:39,978-[cfp_fp][2000]Accuracy-Flip: 0.74529+-0.01809
Training: 2021-03-17 15:26:39,978-[cfp_fp][2000]Accuracy-Highest: 0.74529
Training: 2021-03-17 15:27:04,965-[agedb_30][2000]XNorm: 21.807501
Training: 2021-03-17 15:27:04,965-[agedb_30][2000]Accuracy-Flip: 0.77817+-0.01417
Training: 2021-03-17 15:27:04,965-[agedb_30][2000]Accuracy-Highest: 0.77817
Training: 2021-03-17 15:27:15,672-Speed 536.52 samples/sec Loss 29.0489 Epoch: 0 Global Step: 2050 Fp16 Grad Scale: 16384 Required: 25 hours
Training: 2021-03-17 15:27:27,113-Speed 4475.56 samples/sec Loss 28.7339 Epoch: 0 Global Step: 2100 Fp16 Grad Scale: 16384 Required: 25 hours
Training: 2021-03-17 15:27:38,028-Speed 4690.74 samples/sec Loss 28.4732 Epoch: 0 Global Step: 2150 Fp16 Grad Scale: 16384 Required: 25 hours
Training: 2021-03-17 15:27:48,889-Speed 4714.36 samples/sec Loss 28.1081 Epoch: 0 Global Step: 2200 Fp16 Grad Scale: 16384 Required: 25 hours
Training: 2021-03-17 15:28:00,225-Speed 4517.06 samples/sec Loss 27.7247 Epoch: 0 Global Step: 2250 Fp16 Grad Scale: 16384 Required: 25 hours
Training: 2021-03-17 15:28:11,021-Speed 4742.82 samples/sec Loss 27.4340 Epoch: 0 Global Step: 2300 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:28:21,943-Speed 4687.80 samples/sec Loss 27.1067 Epoch: 0 Global Step: 2350 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:28:32,753-Speed 4736.80 samples/sec Loss 26.7538 Epoch: 0 Global Step: 2400 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:28:43,598-Speed 4721.15 samples/sec Loss 26.4600 Epoch: 0 Global Step: 2450 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:28:54,232-Speed 4815.00 samples/sec Loss 26.1376 Epoch: 0 Global Step: 2500 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:29:05,019-Speed 4746.40 samples/sec Loss 25.8663 Epoch: 0 Global Step: 2550 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:29:15,681-Speed 4802.38 samples/sec Loss 25.6413 Epoch: 0 Global Step: 2600 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:29:26,163-Speed 4884.99 samples/sec Loss 25.1535 Epoch: 0 Global Step: 2650 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:29:36,882-Speed 4776.98 samples/sec Loss 24.9731 Epoch: 0 Global Step: 2700 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:29:47,745-Speed 4713.62 samples/sec Loss 24.7384 Epoch: 0 Global Step: 2750 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:29:58,496-Speed 4762.72 samples/sec Loss 24.4597 Epoch: 0 Global Step: 2800 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:30:09,402-Speed 4694.97 samples/sec Loss 24.1453 Epoch: 0 Global Step: 2850 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:30:20,055-Speed 4806.24 samples/sec Loss 23.8806 Epoch: 0 Global Step: 2900 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:30:30,861-Speed 4738.61 samples/sec Loss 23.4866 Epoch: 0 Global Step: 2950 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:30:41,694-Speed 4726.64 samples/sec Loss 23.1344 Epoch: 0 Global Step: 3000 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:30:52,388-Speed 4787.84 samples/sec Loss 22.9995 Epoch: 0 Global Step: 3050 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:31:03,366-Speed 4664.17 samples/sec Loss 22.6985 Epoch: 0 Global Step: 3100 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:31:14,176-Speed 4736.82 samples/sec Loss 22.4667 Epoch: 0 Global Step: 3150 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:31:24,979-Speed 4739.56 samples/sec Loss 22.2211 Epoch: 0 Global Step: 3200 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:31:35,692-Speed 4779.46 samples/sec Loss 22.0672 Epoch: 0 Global Step: 3250 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:31:46,471-Speed 4750.24 samples/sec Loss 21.7582 Epoch: 0 Global Step: 3300 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:31:57,444-Speed 4666.36 samples/sec Loss 21.4974 Epoch: 0 Global Step: 3350 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:32:08,368-Speed 4687.08 samples/sec Loss 21.2834 Epoch: 0 Global Step: 3400 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:32:19,280-Speed 4692.43 samples/sec Loss 20.9427 Epoch: 0 Global Step: 3450 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:32:30,201-Speed 4688.48 samples/sec Loss 20.7806 Epoch: 0 Global Step: 3500 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:32:40,896-Speed 4787.43 samples/sec Loss 20.6119 Epoch: 0 Global Step: 3550 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:32:51,826-Speed 4684.42 samples/sec Loss 20.3092 Epoch: 0 Global Step: 3600 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:33:02,547-Speed 4776.11 samples/sec Loss 20.1465 Epoch: 0 Global Step: 3650 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:33:13,347-Speed 4741.23 samples/sec Loss 19.9878 Epoch: 0 Global Step: 3700 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:33:24,890-Speed 4435.70 samples/sec Loss 19.8396 Epoch: 0 Global Step: 3750 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:33:35,613-Speed 4774.97 samples/sec Loss 19.6907 Epoch: 0 Global Step: 3800 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:33:46,408-Speed 4743.15 samples/sec Loss 19.4906 Epoch: 0 Global Step: 3850 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:33:57,026-Speed 4821.98 samples/sec Loss 19.2116 Epoch: 0 Global Step: 3900 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:34:09,152-Speed 4222.93 samples/sec Loss 18.9715 Epoch: 0 Global Step: 3950 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:34:21,598-Speed 4113.84 samples/sec Loss 18.8109 Epoch: 0 Global Step: 4000 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-17 15:34:45,535-[lfw][4000]XNorm: 22.507231
Training: 2021-03-17 15:34:45,535-[lfw][4000]Accuracy-Flip: 0.98583+-0.00430
Training: 2021-03-17 15:34:45,535-[lfw][4000]Accuracy-Highest: 0.98583
Training: 2021-03-17 15:35:13,174-[cfp_fp][4000]XNorm: 18.912080
Training: 2021-03-17 15:35:13,174-[cfp_fp][4000]Accuracy-Flip: 0.85271+-0.00851
Training: 2021-03-17 15:35:13,174-[cfp_fp][4000]Accuracy-Highest: 0.85271
Training: 2021-03-17 15:35:36,895-[agedb_30][4000]XNorm: 21.235884
Training: 2021-03-17 15:35:36,896-[agedb_30][4000]Accuracy-Flip: 0.88633+-0.01318
Training: 2021-03-17 15:35:36,896-[agedb_30][4000]Accuracy-Highest: 0.88633
Training: 2021-03-17 15:35:47,362-Speed 596.99 samples/sec Loss 18.6730 Epoch: 0 Global Step: 4050 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:35:59,056-Speed 4378.68 samples/sec Loss 18.3391 Epoch: 0 Global Step: 4100 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:36:10,763-Speed 4373.65 samples/sec Loss 18.2300 Epoch: 0 Global Step: 4150 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:36:21,723-Speed 4671.63 samples/sec Loss 18.1609 Epoch: 0 Global Step: 4200 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:36:32,547-Speed 4730.50 samples/sec Loss 17.9474 Epoch: 0 Global Step: 4250 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:36:43,549-Speed 4653.69 samples/sec Loss 17.8718 Epoch: 0 Global Step: 4300 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:36:54,911-Speed 4506.51 samples/sec Loss 17.6046 Epoch: 0 Global Step: 4350 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:37:05,569-Speed 4804.18 samples/sec Loss 17.4693 Epoch: 0 Global Step: 4400 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:37:16,222-Speed 4806.21 samples/sec Loss 17.3624 Epoch: 0 Global Step: 4450 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:37:27,264-Speed 4637.44 samples/sec Loss 17.2558 Epoch: 0 Global Step: 4500 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:37:37,981-Speed 4777.76 samples/sec Loss 17.0482 Epoch: 0 Global Step: 4550 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:37:48,846-Speed 4712.28 samples/sec Loss 16.7577 Epoch: 0 Global Step: 4600 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:37:59,808-Speed 4671.32 samples/sec Loss 16.6648 Epoch: 0 Global Step: 4650 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:38:10,647-Speed 4723.76 samples/sec Loss 16.5774 Epoch: 0 Global Step: 4700 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:38:21,614-Speed 4668.73 samples/sec Loss 16.5285 Epoch: 0 Global Step: 4750 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:38:32,651-Speed 4639.31 samples/sec Loss 16.3283 Epoch: 0 Global Step: 4800 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:38:43,720-Speed 4625.71 samples/sec Loss 16.1802 Epoch: 0 Global Step: 4850 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:38:54,749-Speed 4642.74 samples/sec Loss 16.0884 Epoch: 0 Global Step: 4900 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:39:05,731-Speed 4662.25 samples/sec Loss 15.9854 Epoch: 0 Global Step: 4950 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:39:16,969-Speed 4556.19 samples/sec Loss 15.8365 Epoch: 0 Global Step: 5000 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:39:27,947-Speed 4664.05 samples/sec Loss 15.7278 Epoch: 0 Global Step: 5050 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:39:38,940-Speed 4657.50 samples/sec Loss 15.6174 Epoch: 0 Global Step: 5100 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:39:50,026-Speed 4618.84 samples/sec Loss 15.4652 Epoch: 0 Global Step: 5150 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:40:01,094-Speed 4626.04 samples/sec Loss 15.4110 Epoch: 0 Global Step: 5200 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:40:12,534-Speed 4475.76 samples/sec Loss 15.2914 Epoch: 0 Global Step: 5250 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:40:23,581-Speed 4635.17 samples/sec Loss 15.1736 Epoch: 0 Global Step: 5300 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:40:34,711-Speed 4600.55 samples/sec Loss 14.9783 Epoch: 0 Global Step: 5350 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:40:45,450-Speed 4768.04 samples/sec Loss 14.8470 Epoch: 0 Global Step: 5400 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:40:56,545-Speed 4614.85 samples/sec Loss 14.8004 Epoch: 0 Global Step: 5450 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:41:07,191-Speed 4809.57 samples/sec Loss 14.7599 Epoch: 0 Global Step: 5500 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:41:18,152-Speed 4671.06 samples/sec Loss 14.6666 Epoch: 0 Global Step: 5550 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:41:29,296-Speed 4594.91 samples/sec Loss 14.5029 Epoch: 0 Global Step: 5600 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:41:40,299-Speed 4653.27 samples/sec Loss 14.3765 Epoch: 0 Global Step: 5650 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:41:51,175-Speed 4707.92 samples/sec Loss 14.3501 Epoch: 0 Global Step: 5700 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:42:02,015-Speed 4723.48 samples/sec Loss 14.1645 Epoch: 0 Global Step: 5750 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:42:12,851-Speed 4725.10 samples/sec Loss 14.1984 Epoch: 0 Global Step: 5800 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:42:23,647-Speed 4743.04 samples/sec Loss 14.0694 Epoch: 0 Global Step: 5850 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:42:34,535-Speed 4702.78 samples/sec Loss 14.0714 Epoch: 0 Global Step: 5900 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:42:45,371-Speed 4725.39 samples/sec Loss 13.8419 Epoch: 0 Global Step: 5950 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:42:56,261-Speed 4701.66 samples/sec Loss 13.7710 Epoch: 0 Global Step: 6000 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:43:20,408-[lfw][6000]XNorm: 23.084155
Training: 2021-03-17 15:43:20,408-[lfw][6000]Accuracy-Flip: 0.99017+-0.00320
Training: 2021-03-17 15:43:20,408-[lfw][6000]Accuracy-Highest: 0.99017
Training: 2021-03-17 15:43:47,956-[cfp_fp][6000]XNorm: 19.691647
Training: 2021-03-17 15:43:47,957-[cfp_fp][6000]Accuracy-Flip: 0.88157+-0.01351
Training: 2021-03-17 15:43:47,957-[cfp_fp][6000]Accuracy-Highest: 0.88157
Training: 2021-03-17 15:44:11,611-[agedb_30][6000]XNorm: 22.714400
Training: 2021-03-17 15:44:11,612-[agedb_30][6000]Accuracy-Flip: 0.90533+-0.01310
Training: 2021-03-17 15:44:11,612-[agedb_30][6000]Accuracy-Highest: 0.90533
Training: 2021-03-17 15:44:23,356-Speed 587.87 samples/sec Loss 13.7176 Epoch: 0 Global Step: 6050 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:44:34,475-Speed 4604.71 samples/sec Loss 13.5903 Epoch: 0 Global Step: 6100 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:44:46,083-Speed 4411.31 samples/sec Loss 13.5684 Epoch: 0 Global Step: 6150 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:44:57,070-Speed 4660.20 samples/sec Loss 13.4647 Epoch: 0 Global Step: 6200 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:45:07,939-Speed 4711.08 samples/sec Loss 13.4140 Epoch: 0 Global Step: 6250 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:45:19,829-Speed 4306.24 samples/sec Loss 13.3144 Epoch: 0 Global Step: 6300 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:45:31,684-Speed 4318.89 samples/sec Loss 13.2084 Epoch: 0 Global Step: 6350 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:45:42,761-Speed 4622.43 samples/sec Loss 13.1953 Epoch: 0 Global Step: 6400 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:45:55,525-Speed 4011.46 samples/sec Loss 13.1069 Epoch: 0 Global Step: 6450 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:46:06,223-Speed 4786.08 samples/sec Loss 13.0814 Epoch: 0 Global Step: 6500 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:46:17,806-Speed 4420.58 samples/sec Loss 12.9237 Epoch: 0 Global Step: 6550 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:46:28,648-Speed 4722.37 samples/sec Loss 12.8928 Epoch: 0 Global Step: 6600 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:46:40,174-Speed 4442.47 samples/sec Loss 12.7937 Epoch: 0 Global Step: 6650 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:46:51,039-Speed 4712.50 samples/sec Loss 12.7780 Epoch: 0 Global Step: 6700 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:47:02,350-Speed 4526.76 samples/sec Loss 12.6723 Epoch: 0 Global Step: 6750 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:47:13,345-Speed 4657.08 samples/sec Loss 12.5731 Epoch: 0 Global Step: 6800 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:47:24,355-Speed 4650.22 samples/sec Loss 12.5650 Epoch: 0 Global Step: 6850 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:47:35,113-Speed 4759.54 samples/sec Loss 12.5087 Epoch: 0 Global Step: 6900 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:47:46,097-Speed 4661.70 samples/sec Loss 12.3974 Epoch: 0 Global Step: 6950 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:47:56,720-Speed 4819.52 samples/sec Loss 12.4545 Epoch: 0 Global Step: 7000 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:48:07,628-Speed 4694.30 samples/sec Loss 12.3604 Epoch: 0 Global Step: 7050 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:48:18,515-Speed 4703.08 samples/sec Loss 12.3084 Epoch: 0 Global Step: 7100 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:48:29,492-Speed 4664.44 samples/sec Loss 12.2642 Epoch: 0 Global Step: 7150 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:48:40,762-Speed 4543.32 samples/sec Loss 12.1181 Epoch: 0 Global Step: 7200 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:48:51,577-Speed 4734.43 samples/sec Loss 12.1053 Epoch: 0 Global Step: 7250 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:49:02,527-Speed 4675.86 samples/sec Loss 12.0488 Epoch: 0 Global Step: 7300 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:49:13,784-Speed 4548.60 samples/sec Loss 12.0514 Epoch: 0 Global Step: 7350 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:49:25,033-Speed 4551.45 samples/sec Loss 11.9405 Epoch: 0 Global Step: 7400 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:49:35,932-Speed 4697.97 samples/sec Loss 11.8598 Epoch: 0 Global Step: 7450 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:49:46,889-Speed 4673.13 samples/sec Loss 11.8498 Epoch: 0 Global Step: 7500 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:49:58,005-Speed 4606.12 samples/sec Loss 11.7889 Epoch: 0 Global Step: 7550 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:50:09,192-Speed 4576.71 samples/sec Loss 11.7884 Epoch: 0 Global Step: 7600 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:50:20,266-Speed 4624.07 samples/sec Loss 11.6004 Epoch: 0 Global Step: 7650 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:50:31,318-Speed 4632.99 samples/sec Loss 11.6701 Epoch: 0 Global Step: 7700 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:50:42,127-Speed 4736.85 samples/sec Loss 11.5994 Epoch: 0 Global Step: 7750 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:50:52,980-Speed 4718.04 samples/sec Loss 11.5540 Epoch: 0 Global Step: 7800 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:51:03,758-Speed 4750.80 samples/sec Loss 11.4812 Epoch: 0 Global Step: 7850 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:51:14,832-Speed 4623.49 samples/sec Loss 11.4817 Epoch: 0 Global Step: 7900 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:51:26,041-Speed 4567.94 samples/sec Loss 11.3409 Epoch: 0 Global Step: 7950 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:51:37,258-Speed 4564.75 samples/sec Loss 11.3003 Epoch: 0 Global Step: 8000 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:52:01,049-[lfw][8000]XNorm: 22.373793
Training: 2021-03-17 15:52:01,049-[lfw][8000]Accuracy-Flip: 0.99233+-0.00351
Training: 2021-03-17 15:52:01,049-[lfw][8000]Accuracy-Highest: 0.99233
Training: 2021-03-17 15:52:28,579-[cfp_fp][8000]XNorm: 19.814623
Training: 2021-03-17 15:52:28,579-[cfp_fp][8000]Accuracy-Flip: 0.87100+-0.01162
Training: 2021-03-17 15:52:28,579-[cfp_fp][8000]Accuracy-Highest: 0.88157
Training: 2021-03-17 15:52:52,364-[agedb_30][8000]XNorm: 21.387989
Training: 2021-03-17 15:52:52,364-[agedb_30][8000]Accuracy-Flip: 0.91867+-0.01901
Training: 2021-03-17 15:52:52,364-[agedb_30][8000]Accuracy-Highest: 0.91867
Training: 2021-03-17 15:53:02,921-Speed 597.69 samples/sec Loss 11.2335 Epoch: 0 Global Step: 8050 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:53:13,622-Speed 4784.92 samples/sec Loss 11.3211 Epoch: 0 Global Step: 8100 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:53:24,826-Speed 4570.03 samples/sec Loss 11.2500 Epoch: 0 Global Step: 8150 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:53:35,929-Speed 4611.64 samples/sec Loss 11.1611 Epoch: 0 Global Step: 8200 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:53:47,000-Speed 4625.22 samples/sec Loss 11.2188 Epoch: 0 Global Step: 8250 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:53:57,952-Speed 4674.97 samples/sec Loss 11.1126 Epoch: 0 Global Step: 8300 Fp16 Grad Scale: 16384 Required: 24 hours
Training: 2021-03-17 15:54:09,083-Speed 4600.00 samples/sec Loss 10.9889 Epoch: 0 Global Step: 8350 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:54:20,568-Speed 4458.07 samples/sec Loss 11.0046 Epoch: 0 Global Step: 8400 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:54:31,239-Speed 4798.45 samples/sec Loss 10.9309 Epoch: 0 Global Step: 8450 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:54:42,417-Speed 4580.33 samples/sec Loss 10.9668 Epoch: 0 Global Step: 8500 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:54:53,861-Speed 4474.15 samples/sec Loss 10.8705 Epoch: 0 Global Step: 8550 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:55:04,908-Speed 4635.30 samples/sec Loss 10.8666 Epoch: 0 Global Step: 8600 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:55:16,047-Speed 4596.74 samples/sec Loss 10.8510 Epoch: 0 Global Step: 8650 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:55:27,537-Speed 4456.28 samples/sec Loss 10.7847 Epoch: 0 Global Step: 8700 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:55:38,752-Speed 4565.49 samples/sec Loss 10.7080 Epoch: 0 Global Step: 8750 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:55:49,674-Speed 4688.21 samples/sec Loss 10.6928 Epoch: 0 Global Step: 8800 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:56:01,519-Speed 4322.75 samples/sec Loss 10.6618 Epoch: 0 Global Step: 8850 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:56:13,522-Speed 4265.64 samples/sec Loss 10.5684 Epoch: 0 Global Step: 8900 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:56:26,002-Speed 4102.70 samples/sec Loss 10.6605 Epoch: 0 Global Step: 8950 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:56:37,665-Speed 4390.08 samples/sec Loss 10.5652 Epoch: 0 Global Step: 9000 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:56:48,292-Speed 4818.35 samples/sec Loss 10.5298 Epoch: 0 Global Step: 9050 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:56:59,321-Speed 4642.34 samples/sec Loss 10.5805 Epoch: 0 Global Step: 9100 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:57:10,576-Speed 4549.49 samples/sec Loss 10.4131 Epoch: 0 Global Step: 9150 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:57:21,514-Speed 4681.02 samples/sec Loss 10.5054 Epoch: 0 Global Step: 9200 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:57:32,498-Speed 4661.64 samples/sec Loss 10.4518 Epoch: 0 Global Step: 9250 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:57:43,440-Speed 4679.17 samples/sec Loss 10.4352 Epoch: 0 Global Step: 9300 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:57:54,115-Speed 4796.62 samples/sec Loss 10.3725 Epoch: 0 Global Step: 9350 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:58:04,829-Speed 4778.87 samples/sec Loss 10.3713 Epoch: 0 Global Step: 9400 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:58:15,767-Speed 4681.34 samples/sec Loss 10.2146 Epoch: 0 Global Step: 9450 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:58:26,856-Speed 4617.05 samples/sec Loss 10.2588 Epoch: 0 Global Step: 9500 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:58:37,830-Speed 4666.16 samples/sec Loss 10.2553 Epoch: 0 Global Step: 9550 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:58:48,585-Speed 4760.60 samples/sec Loss 10.2435 Epoch: 0 Global Step: 9600 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:58:59,561-Speed 4665.11 samples/sec Loss 10.1804 Epoch: 0 Global Step: 9650 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:59:10,410-Speed 4719.25 samples/sec Loss 10.1496 Epoch: 0 Global Step: 9700 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:59:21,176-Speed 4756.28 samples/sec Loss 10.1734 Epoch: 0 Global Step: 9750 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:59:32,265-Speed 4617.30 samples/sec Loss 10.1364 Epoch: 0 Global Step: 9800 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:59:43,067-Speed 4739.85 samples/sec Loss 10.0752 Epoch: 0 Global Step: 9850 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 15:59:53,863-Speed 4742.90 samples/sec Loss 10.0801 Epoch: 0 Global Step: 9900 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:00:04,897-Speed 4640.22 samples/sec Loss 10.0256 Epoch: 0 Global Step: 9950 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:00:16,016-Speed 4605.21 samples/sec Loss 9.9806 Epoch: 0 Global Step: 10000 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:00:39,857-[lfw][10000]XNorm: 22.481834
Training: 2021-03-17 16:00:39,857-[lfw][10000]Accuracy-Flip: 0.99383+-0.00373
Training: 2021-03-17 16:00:39,857-[lfw][10000]Accuracy-Highest: 0.99383
Training: 2021-03-17 16:01:07,443-[cfp_fp][10000]XNorm: 18.891577
Training: 2021-03-17 16:01:07,443-[cfp_fp][10000]Accuracy-Flip: 0.90029+-0.02018
Training: 2021-03-17 16:01:07,443-[cfp_fp][10000]Accuracy-Highest: 0.90029
Training: 2021-03-17 16:01:31,194-[agedb_30][10000]XNorm: 21.472603
Training: 2021-03-17 16:01:31,194-[agedb_30][10000]Accuracy-Flip: 0.93150+-0.01097
Training: 2021-03-17 16:01:31,194-[agedb_30][10000]Accuracy-Highest: 0.93150
Training: 2021-03-17 16:01:42,175-Speed 594.25 samples/sec Loss 9.9579 Epoch: 0 Global Step: 10050 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:01:53,157-Speed 4662.50 samples/sec Loss 9.9875 Epoch: 0 Global Step: 10100 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:02:03,776-Speed 4821.79 samples/sec Loss 9.9825 Epoch: 0 Global Step: 10150 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:02:14,768-Speed 4658.51 samples/sec Loss 9.8782 Epoch: 0 Global Step: 10200 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:02:25,623-Speed 4716.96 samples/sec Loss 9.9008 Epoch: 0 Global Step: 10250 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:02:36,331-Speed 4782.02 samples/sec Loss 9.9049 Epoch: 0 Global Step: 10300 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:02:47,169-Speed 4724.00 samples/sec Loss 9.8162 Epoch: 0 Global Step: 10350 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:02:57,949-Speed 4750.06 samples/sec Loss 9.8137 Epoch: 0 Global Step: 10400 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:03:08,970-Speed 4645.88 samples/sec Loss 9.7910 Epoch: 0 Global Step: 10450 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:03:20,078-Speed 4609.54 samples/sec Loss 9.7577 Epoch: 0 Global Step: 10500 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:03:31,147-Speed 4625.76 samples/sec Loss 9.7231 Epoch: 0 Global Step: 10550 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:03:42,175-Speed 4642.91 samples/sec Loss 9.7750 Epoch: 0 Global Step: 10600 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:03:53,113-Speed 4681.17 samples/sec Loss 9.6658 Epoch: 0 Global Step: 10650 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:04:04,104-Speed 4658.55 samples/sec Loss 9.6588 Epoch: 0 Global Step: 10700 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:04:15,160-Speed 4631.12 samples/sec Loss 9.6508 Epoch: 0 Global Step: 10750 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:04:26,154-Speed 4657.68 samples/sec Loss 9.6308 Epoch: 0 Global Step: 10800 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:04:36,939-Speed 4747.33 samples/sec Loss 9.6190 Epoch: 0 Global Step: 10850 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:04:48,481-Speed 4436.11 samples/sec Loss 9.5897 Epoch: 0 Global Step: 10900 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:05:00,268-Speed 4344.09 samples/sec Loss 9.5925 Epoch: 0 Global Step: 10950 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:05:11,023-Speed 4761.00 samples/sec Loss 9.5693 Epoch: 0 Global Step: 11000 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:05:22,131-Speed 4609.32 samples/sec Loss 9.6020 Epoch: 0 Global Step: 11050 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:05:32,882-Speed 4762.79 samples/sec Loss 9.4862 Epoch: 0 Global Step: 11100 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:05:43,765-Speed 4704.45 samples/sec Loss 9.5030 Epoch: 0 Global Step: 11150 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:05:54,601-Speed 4725.61 samples/sec Loss 9.5106 Epoch: 0 Global Step: 11200 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:06:05,511-Speed 4693.04 samples/sec Loss 9.4488 Epoch: 0 Global Step: 11250 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:06:17,386-Speed 4311.64 samples/sec Loss 9.4216 Epoch: 0 Global Step: 11300 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:06:27,907-Speed 4866.90 samples/sec Loss 9.4604 Epoch: 0 Global Step: 11350 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:06:39,586-Speed 4384.37 samples/sec Loss 9.4820 Epoch: 0 Global Step: 11400 Fp16 Grad Scale: 16384 Required: 23 hours
Training: 2021-03-17 16:06:53,371-Speed 3714.30 samples/sec Loss 9.3972 Epoch: 0 Global Step: 11450
Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:07:05,131-Speed 4353.97 samples/sec Loss 9.4270 Epoch: 0 Global Step: 11500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:07:16,296-Speed 4585.80 samples/sec Loss 9.3684 Epoch: 0 Global Step: 11550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:07:27,500-Speed 4570.05 samples/sec Loss 9.3467 Epoch: 0 Global Step: 11600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:07:38,562-Speed 4628.70 samples/sec Loss 9.3141 Epoch: 0 Global Step: 11650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:07:49,868-Speed 4528.52 samples/sec Loss 9.3314 Epoch: 0 Global Step: 11700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:08:00,916-Speed 4634.48 samples/sec Loss 9.3399 Epoch: 0 Global Step: 11750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:08:11,551-Speed 4814.38 samples/sec Loss 9.3382 Epoch: 0 Global Step: 11800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:08:22,325-Speed 4752.71 samples/sec Loss 9.2843 Epoch: 0 Global Step: 11850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:08:33,260-Speed 4682.31 samples/sec Loss 9.2572 Epoch: 0 Global Step: 11900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:08:43,904-Speed 4810.80 samples/sec Loss 9.2394 Epoch: 0 Global Step: 11950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:08:54,869-Speed 4669.37 samples/sec Loss 9.2054 Epoch: 0 Global Step: 12000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:09:18,616-[lfw][12000]XNorm: 23.302502 Training: 2021-03-17 16:09:18,616-[lfw][12000]Accuracy-Flip: 0.99267+-0.00396 Training: 2021-03-17 16:09:18,616-[lfw][12000]Accuracy-Highest: 0.99383 Training: 2021-03-17 16:09:46,196-[cfp_fp][12000]XNorm: 19.164209 Training: 2021-03-17 16:09:46,196-[cfp_fp][12000]Accuracy-Flip: 0.91286+-0.01421 Training: 2021-03-17 
16:09:46,196-[cfp_fp][12000]Accuracy-Highest: 0.91286 Training: 2021-03-17 16:10:09,963-[agedb_30][12000]XNorm: 22.511469 Training: 2021-03-17 16:10:09,963-[agedb_30][12000]Accuracy-Flip: 0.92950+-0.01232 Training: 2021-03-17 16:10:09,963-[agedb_30][12000]Accuracy-Highest: 0.93150 Training: 2021-03-17 16:10:21,077-Speed 593.92 samples/sec Loss 9.2078 Epoch: 0 Global Step: 12050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:10:31,909-Speed 4727.07 samples/sec Loss 9.2050 Epoch: 0 Global Step: 12100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:10:42,940-Speed 4641.68 samples/sec Loss 9.1621 Epoch: 0 Global Step: 12150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:10:54,092-Speed 4591.41 samples/sec Loss 9.1782 Epoch: 0 Global Step: 12200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:11:05,055-Speed 4670.52 samples/sec Loss 9.2017 Epoch: 0 Global Step: 12250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:11:16,073-Speed 4647.34 samples/sec Loss 9.1402 Epoch: 0 Global Step: 12300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:11:27,092-Speed 4646.74 samples/sec Loss 9.1414 Epoch: 0 Global Step: 12350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:11:37,937-Speed 4721.13 samples/sec Loss 9.0458 Epoch: 0 Global Step: 12400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:11:48,921-Speed 4661.52 samples/sec Loss 9.0839 Epoch: 0 Global Step: 12450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:12:00,015-Speed 4615.26 samples/sec Loss 9.1059 Epoch: 0 Global Step: 12500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:12:11,157-Speed 4595.45 samples/sec Loss 9.0450 Epoch: 0 Global Step: 12550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:12:22,237-Speed 4620.96 samples/sec Loss 9.0434 Epoch: 0 Global Step: 12600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 
2021-03-17 16:12:33,661-Speed 4482.16 samples/sec Loss 9.0664 Epoch: 0 Global Step: 12650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:12:44,905-Speed 4553.49 samples/sec Loss 9.0437 Epoch: 0 Global Step: 12700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:12:55,987-Speed 4620.49 samples/sec Loss 9.0552 Epoch: 0 Global Step: 12750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:13:06,937-Speed 4676.06 samples/sec Loss 9.0650 Epoch: 0 Global Step: 12800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:13:17,792-Speed 4716.81 samples/sec Loss 9.0145 Epoch: 0 Global Step: 12850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:13:28,640-Speed 4719.77 samples/sec Loss 8.9920 Epoch: 0 Global Step: 12900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:13:39,641-Speed 4654.49 samples/sec Loss 8.9912 Epoch: 0 Global Step: 12950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:13:50,968-Speed 4520.13 samples/sec Loss 8.9832 Epoch: 0 Global Step: 13000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:14:02,225-Speed 4548.52 samples/sec Loss 8.9110 Epoch: 0 Global Step: 13050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:14:13,077-Speed 4718.22 samples/sec Loss 8.8714 Epoch: 0 Global Step: 13100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:14:23,975-Speed 4698.20 samples/sec Loss 8.9045 Epoch: 0 Global Step: 13150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:14:35,172-Speed 4573.22 samples/sec Loss 8.8447 Epoch: 0 Global Step: 13200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:14:46,210-Speed 4638.67 samples/sec Loss 8.9094 Epoch: 0 Global Step: 13250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:14:57,471-Speed 4546.51 samples/sec Loss 8.9118 Epoch: 0 Global Step: 13300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 
16:15:08,435-Speed 4670.09 samples/sec Loss 8.8802 Epoch: 0 Global Step: 13350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:15:20,270-Speed 4326.44 samples/sec Loss 8.8278 Epoch: 0 Global Step: 13400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:15:31,554-Speed 4537.82 samples/sec Loss 8.7992 Epoch: 0 Global Step: 13450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:15:43,396-Speed 4323.58 samples/sec Loss 8.7745 Epoch: 0 Global Step: 13500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:15:54,390-Speed 4657.32 samples/sec Loss 8.7959 Epoch: 0 Global Step: 13550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:16:05,397-Speed 4651.99 samples/sec Loss 8.8871 Epoch: 0 Global Step: 13600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:16:16,404-Speed 4651.93 samples/sec Loss 8.7806 Epoch: 0 Global Step: 13650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:16:27,062-Speed 4803.90 samples/sec Loss 8.8242 Epoch: 0 Global Step: 13700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:16:37,868-Speed 4738.73 samples/sec Loss 8.8232 Epoch: 0 Global Step: 13750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:16:48,477-Speed 4825.98 samples/sec Loss 8.8044 Epoch: 0 Global Step: 13800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:17:00,074-Speed 4415.28 samples/sec Loss 8.7764 Epoch: 0 Global Step: 13850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:17:12,463-Speed 4133.02 samples/sec Loss 8.8252 Epoch: 0 Global Step: 13900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:17:23,317-Speed 4717.02 samples/sec Loss 8.7433 Epoch: 0 Global Step: 13950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:17:35,882-Speed 4075.07 samples/sec Loss 8.7002 Epoch: 0 Global Step: 14000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 
16:17:59,644-[lfw][14000]XNorm: 22.246747 Training: 2021-03-17 16:17:59,645-[lfw][14000]Accuracy-Flip: 0.99150+-0.00383 Training: 2021-03-17 16:17:59,645-[lfw][14000]Accuracy-Highest: 0.99383 Training: 2021-03-17 16:18:27,209-[cfp_fp][14000]XNorm: 18.951088 Training: 2021-03-17 16:18:27,209-[cfp_fp][14000]Accuracy-Flip: 0.89271+-0.01752 Training: 2021-03-17 16:18:27,209-[cfp_fp][14000]Accuracy-Highest: 0.91286 Training: 2021-03-17 16:18:50,950-[agedb_30][14000]XNorm: 21.297579 Training: 2021-03-17 16:18:50,951-[agedb_30][14000]Accuracy-Flip: 0.93600+-0.01259 Training: 2021-03-17 16:18:50,951-[agedb_30][14000]Accuracy-Highest: 0.93600 Training: 2021-03-17 16:19:01,897-Speed 595.25 samples/sec Loss 8.7265 Epoch: 0 Global Step: 14050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:19:13,606-Speed 4372.85 samples/sec Loss 8.6951 Epoch: 0 Global Step: 14100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:19:24,627-Speed 4645.84 samples/sec Loss 8.7447 Epoch: 0 Global Step: 14150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:19:35,419-Speed 4744.13 samples/sec Loss 8.7236 Epoch: 0 Global Step: 14200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:19:46,297-Speed 4707.39 samples/sec Loss 8.6873 Epoch: 0 Global Step: 14250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:19:57,036-Speed 4767.89 samples/sec Loss 8.6490 Epoch: 0 Global Step: 14300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:20:08,097-Speed 4628.91 samples/sec Loss 8.6433 Epoch: 0 Global Step: 14350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:20:19,247-Speed 4591.98 samples/sec Loss 8.6853 Epoch: 0 Global Step: 14400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:20:30,125-Speed 4707.08 samples/sec Loss 8.6489 Epoch: 0 Global Step: 14450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:20:41,195-Speed 4625.24 samples/sec Loss 8.7070 Epoch: 0 
Global Step: 14500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:20:52,543-Speed 4512.26 samples/sec Loss 8.6684 Epoch: 0 Global Step: 14550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:21:03,933-Speed 4495.27 samples/sec Loss 8.6281 Epoch: 0 Global Step: 14600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:21:15,168-Speed 4557.36 samples/sec Loss 8.6164 Epoch: 0 Global Step: 14650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:21:26,063-Speed 4699.52 samples/sec Loss 8.6064 Epoch: 0 Global Step: 14700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:21:37,091-Speed 4642.95 samples/sec Loss 8.6537 Epoch: 0 Global Step: 14750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:21:47,999-Speed 4693.87 samples/sec Loss 8.6003 Epoch: 0 Global Step: 14800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:21:59,189-Speed 4575.87 samples/sec Loss 8.6355 Epoch: 0 Global Step: 14850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:22:10,363-Speed 4582.32 samples/sec Loss 8.6235 Epoch: 0 Global Step: 14900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:22:21,235-Speed 4709.58 samples/sec Loss 8.5413 Epoch: 0 Global Step: 14950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:22:32,177-Speed 4679.29 samples/sec Loss 8.5857 Epoch: 0 Global Step: 15000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:22:43,075-Speed 4698.46 samples/sec Loss 8.6051 Epoch: 0 Global Step: 15050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:22:53,845-Speed 4754.40 samples/sec Loss 8.5352 Epoch: 0 Global Step: 15100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:23:04,895-Speed 4633.65 samples/sec Loss 8.5154 Epoch: 0 Global Step: 15150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:23:15,883-Speed 4659.69 samples/sec Loss 8.4758 Epoch: 0 Global 
Step: 15200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:23:26,958-Speed 4623.38 samples/sec Loss 8.5034 Epoch: 0 Global Step: 15250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:23:38,117-Speed 4588.60 samples/sec Loss 8.4873 Epoch: 0 Global Step: 15300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:23:49,026-Speed 4693.74 samples/sec Loss 8.5624 Epoch: 0 Global Step: 15350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:23:59,853-Speed 4728.79 samples/sec Loss 8.4852 Epoch: 0 Global Step: 15400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:24:10,952-Speed 4613.62 samples/sec Loss 8.5061 Epoch: 0 Global Step: 15450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:24:22,015-Speed 4627.99 samples/sec Loss 8.4495 Epoch: 0 Global Step: 15500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:24:33,109-Speed 4615.37 samples/sec Loss 8.4337 Epoch: 0 Global Step: 15550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:24:44,339-Speed 4559.60 samples/sec Loss 8.4917 Epoch: 0 Global Step: 15600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:24:55,412-Speed 4624.16 samples/sec Loss 8.4779 Epoch: 0 Global Step: 15650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:25:06,501-Speed 4617.43 samples/sec Loss 8.4837 Epoch: 0 Global Step: 15700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:25:17,369-Speed 4711.42 samples/sec Loss 8.4298 Epoch: 0 Global Step: 15750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:25:28,127-Speed 4759.47 samples/sec Loss 8.3939 Epoch: 0 Global Step: 15800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:25:39,074-Speed 4677.29 samples/sec Loss 8.3861 Epoch: 0 Global Step: 15850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:25:50,001-Speed 4685.79 samples/sec Loss 8.4153 Epoch: 0 Global Step: 15900 
Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:26:01,737-Speed 4362.97 samples/sec Loss 8.4148 Epoch: 0 Global Step: 15950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:26:12,552-Speed 4734.10 samples/sec Loss 8.4049 Epoch: 0 Global Step: 16000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:26:36,533-[lfw][16000]XNorm: 23.635812 Training: 2021-03-17 16:26:36,533-[lfw][16000]Accuracy-Flip: 0.99333+-0.00307 Training: 2021-03-17 16:26:36,533-[lfw][16000]Accuracy-Highest: 0.99383 Training: 2021-03-17 16:27:04,278-[cfp_fp][16000]XNorm: 19.527871 Training: 2021-03-17 16:27:04,279-[cfp_fp][16000]Accuracy-Flip: 0.91557+-0.01113 Training: 2021-03-17 16:27:04,279-[cfp_fp][16000]Accuracy-Highest: 0.91557 Training: 2021-03-17 16:27:28,139-[agedb_30][16000]XNorm: 22.970531 Training: 2021-03-17 16:27:28,139-[agedb_30][16000]Accuracy-Flip: 0.94083+-0.00923 Training: 2021-03-17 16:27:28,139-[agedb_30][16000]Accuracy-Highest: 0.94083 Training: 2021-03-17 16:27:39,883-Speed 586.28 samples/sec Loss 8.3556 Epoch: 0 Global Step: 16050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:27:50,759-Speed 4707.69 samples/sec Loss 8.4109 Epoch: 0 Global Step: 16100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:28:01,829-Speed 4625.70 samples/sec Loss 8.3979 Epoch: 0 Global Step: 16150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:28:13,109-Speed 4539.21 samples/sec Loss 8.3773 Epoch: 0 Global Step: 16200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:28:24,365-Speed 4549.04 samples/sec Loss 8.3753 Epoch: 0 Global Step: 16250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:28:35,431-Speed 4626.64 samples/sec Loss 8.3823 Epoch: 0 Global Step: 16300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:28:48,042-Speed 4060.15 samples/sec Loss 8.3306 Epoch: 0 Global Step: 16350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 
2021-03-17 16:28:58,987-Speed 4678.45 samples/sec Loss 8.2955 Epoch: 0 Global Step: 16400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:29:10,327-Speed 4514.99 samples/sec Loss 8.3399 Epoch: 0 Global Step: 16450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:29:21,147-Speed 4732.52 samples/sec Loss 8.3470 Epoch: 0 Global Step: 16500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:29:32,264-Speed 4605.80 samples/sec Loss 8.3900 Epoch: 0 Global Step: 16550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:29:44,901-Speed 4051.75 samples/sec Loss 8.2809 Epoch: 0 Global Step: 16600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:29:56,733-Speed 4327.18 samples/sec Loss 8.3160 Epoch: 0 Global Step: 16650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:30:13,440-Speed 3064.74 samples/sec Loss 8.2023 Epoch: 1 Global Step: 16700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:30:24,551-Speed 4608.55 samples/sec Loss 7.5734 Epoch: 1 Global Step: 16750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:30:35,437-Speed 4703.56 samples/sec Loss 7.5912 Epoch: 1 Global Step: 16800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:30:46,319-Speed 4705.52 samples/sec Loss 7.5521 Epoch: 1 Global Step: 16850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:30:57,392-Speed 4624.05 samples/sec Loss 7.6316 Epoch: 1 Global Step: 16900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:31:08,033-Speed 4812.21 samples/sec Loss 7.6638 Epoch: 1 Global Step: 16950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:31:18,744-Speed 4780.09 samples/sec Loss 7.6513 Epoch: 1 Global Step: 17000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:31:29,247-Speed 4875.23 samples/sec Loss 7.6214 Epoch: 1 Global Step: 17050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 
16:31:39,967-Speed 4776.17 samples/sec Loss 7.6577 Epoch: 1 Global Step: 17100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:31:50,792-Speed 4730.20 samples/sec Loss 7.7273 Epoch: 1 Global Step: 17150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:32:01,528-Speed 4769.11 samples/sec Loss 7.6973 Epoch: 1 Global Step: 17200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:32:12,225-Speed 4786.80 samples/sec Loss 7.6420 Epoch: 1 Global Step: 17250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:32:22,877-Speed 4806.76 samples/sec Loss 7.7064 Epoch: 1 Global Step: 17300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:32:33,798-Speed 4688.51 samples/sec Loss 7.7826 Epoch: 1 Global Step: 17350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:32:44,514-Speed 4778.13 samples/sec Loss 7.7313 Epoch: 1 Global Step: 17400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:32:55,365-Speed 4718.61 samples/sec Loss 7.7129 Epoch: 1 Global Step: 17450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:33:05,896-Speed 4862.17 samples/sec Loss 7.7806 Epoch: 1 Global Step: 17500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:33:16,536-Speed 4812.42 samples/sec Loss 7.7647 Epoch: 1 Global Step: 17550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:33:27,222-Speed 4791.50 samples/sec Loss 7.7472 Epoch: 1 Global Step: 17600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:33:37,840-Speed 4822.73 samples/sec Loss 7.7509 Epoch: 1 Global Step: 17650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:33:48,753-Speed 4691.82 samples/sec Loss 7.7453 Epoch: 1 Global Step: 17700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:33:59,485-Speed 4771.25 samples/sec Loss 7.7411 Epoch: 1 Global Step: 17750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 
16:34:10,183-Speed 4786.02 samples/sec Loss 7.7829 Epoch: 1 Global Step: 17800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:34:20,621-Speed 4905.44 samples/sec Loss 7.8283 Epoch: 1 Global Step: 17850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:34:31,531-Speed 4693.29 samples/sec Loss 7.7835 Epoch: 1 Global Step: 17900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:34:42,186-Speed 4805.30 samples/sec Loss 7.7508 Epoch: 1 Global Step: 17950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:34:52,759-Speed 4843.16 samples/sec Loss 7.7691 Epoch: 1 Global Step: 18000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:35:17,269-[lfw][18000]XNorm: 24.026878 Training: 2021-03-17 16:35:17,269-[lfw][18000]Accuracy-Flip: 0.99483+-0.00369 Training: 2021-03-17 16:35:17,270-[lfw][18000]Accuracy-Highest: 0.99483 Training: 2021-03-17 16:35:44,976-[cfp_fp][18000]XNorm: 20.217565 Training: 2021-03-17 16:35:44,976-[cfp_fp][18000]Accuracy-Flip: 0.92529+-0.01656 Training: 2021-03-17 16:35:44,977-[cfp_fp][18000]Accuracy-Highest: 0.92529 Training: 2021-03-17 16:36:08,726-[agedb_30][18000]XNorm: 23.261873 Training: 2021-03-17 16:36:08,727-[agedb_30][18000]Accuracy-Flip: 0.94333+-0.01140 Training: 2021-03-17 16:36:08,727-[agedb_30][18000]Accuracy-Highest: 0.94333 Training: 2021-03-17 16:36:19,359-Speed 591.23 samples/sec Loss 7.8376 Epoch: 1 Global Step: 18050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:36:30,039-Speed 4794.33 samples/sec Loss 7.8131 Epoch: 1 Global Step: 18100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:36:40,665-Speed 4818.81 samples/sec Loss 7.8436 Epoch: 1 Global Step: 18150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:36:51,368-Speed 4783.75 samples/sec Loss 7.7512 Epoch: 1 Global Step: 18200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:37:02,157-Speed 4746.10 samples/sec Loss 7.8080 Epoch: 1 
Global Step: 18250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:37:12,829-Speed 4797.89 samples/sec Loss 7.8093 Epoch: 1 Global Step: 18300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:37:23,556-Speed 4773.31 samples/sec Loss 7.8081 Epoch: 1 Global Step: 18350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:37:34,488-Speed 4683.73 samples/sec Loss 7.8300 Epoch: 1 Global Step: 18400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:37:45,159-Speed 4798.56 samples/sec Loss 7.7952 Epoch: 1 Global Step: 18450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:37:56,787-Speed 4403.29 samples/sec Loss 7.8363 Epoch: 1 Global Step: 18500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:38:07,448-Speed 4803.14 samples/sec Loss 7.8220 Epoch: 1 Global Step: 18550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:38:17,910-Speed 4894.19 samples/sec Loss 7.8775 Epoch: 1 Global Step: 18600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:38:28,352-Speed 4903.46 samples/sec Loss 7.8384 Epoch: 1 Global Step: 18650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:38:39,959-Speed 4411.13 samples/sec Loss 7.8298 Epoch: 1 Global Step: 18700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:38:51,025-Speed 4627.36 samples/sec Loss 7.7931 Epoch: 1 Global Step: 18750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:39:02,025-Speed 4655.02 samples/sec Loss 7.8357 Epoch: 1 Global Step: 18800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:39:12,827-Speed 4740.38 samples/sec Loss 7.8007 Epoch: 1 Global Step: 18850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:39:24,367-Speed 4436.95 samples/sec Loss 7.8983 Epoch: 1 Global Step: 18900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:39:35,118-Speed 4762.73 samples/sec Loss 7.8377 Epoch: 1 Global 
Step: 18950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:39:47,029-Speed 4298.98 samples/sec Loss 7.8149 Epoch: 1 Global Step: 19000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:39:57,513-Speed 4884.12 samples/sec Loss 7.8553 Epoch: 1 Global Step: 19050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:40:07,879-Speed 4939.55 samples/sec Loss 7.8403 Epoch: 1 Global Step: 19100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:40:18,816-Speed 4681.40 samples/sec Loss 7.8310 Epoch: 1 Global Step: 19150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:40:29,503-Speed 4791.27 samples/sec Loss 7.8754 Epoch: 1 Global Step: 19200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:40:40,892-Speed 4495.77 samples/sec Loss 7.8915 Epoch: 1 Global Step: 19250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:40:54,341-Speed 3807.23 samples/sec Loss 7.8634 Epoch: 1 Global Step: 19300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:41:04,704-Speed 4940.79 samples/sec Loss 7.8294 Epoch: 1 Global Step: 19350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:41:15,398-Speed 4788.03 samples/sec Loss 7.8587 Epoch: 1 Global Step: 19400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:41:26,107-Speed 4781.29 samples/sec Loss 7.7842 Epoch: 1 Global Step: 19450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:41:36,579-Speed 4889.85 samples/sec Loss 7.8369 Epoch: 1 Global Step: 19500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:41:47,258-Speed 4794.68 samples/sec Loss 7.8733 Epoch: 1 Global Step: 19550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:41:58,121-Speed 4713.46 samples/sec Loss 7.7990 Epoch: 1 Global Step: 19600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:42:08,730-Speed 4826.36 samples/sec Loss 7.8423 Epoch: 1 Global Step: 19650 
Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:42:19,510-Speed 4749.83 samples/sec Loss 7.8740 Epoch: 1 Global Step: 19700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:42:29,999-Speed 4881.61 samples/sec Loss 7.8482 Epoch: 1 Global Step: 19750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:42:40,574-Speed 4841.98 samples/sec Loss 7.8653 Epoch: 1 Global Step: 19800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:42:51,095-Speed 4866.28 samples/sec Loss 7.8432 Epoch: 1 Global Step: 19850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:43:01,915-Speed 4732.25 samples/sec Loss 7.8393 Epoch: 1 Global Step: 19900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:43:12,724-Speed 4737.46 samples/sec Loss 7.8127 Epoch: 1 Global Step: 19950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:43:23,367-Speed 4810.90 samples/sec Loss 7.8182 Epoch: 1 Global Step: 20000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:43:47,735-[lfw][20000]XNorm: 22.516334 Training: 2021-03-17 16:43:47,735-[lfw][20000]Accuracy-Flip: 0.99267+-0.00367 Training: 2021-03-17 16:43:47,735-[lfw][20000]Accuracy-Highest: 0.99483 Training: 2021-03-17 16:44:15,266-[cfp_fp][20000]XNorm: 18.939619 Training: 2021-03-17 16:44:15,266-[cfp_fp][20000]Accuracy-Flip: 0.91357+-0.01353 Training: 2021-03-17 16:44:15,266-[cfp_fp][20000]Accuracy-Highest: 0.92529 Training: 2021-03-17 16:44:39,139-[agedb_30][20000]XNorm: 21.918054 Training: 2021-03-17 16:44:39,139-[agedb_30][20000]Accuracy-Flip: 0.93783+-0.01261 Training: 2021-03-17 16:44:39,139-[agedb_30][20000]Accuracy-Highest: 0.94333 Training: 2021-03-17 16:44:49,769-Speed 592.58 samples/sec Loss 7.8603 Epoch: 1 Global Step: 20050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:45:00,389-Speed 4821.63 samples/sec Loss 7.7863 Epoch: 1 Global Step: 20100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 
2021-03-17 16:45:11,155-Speed 4755.65 samples/sec Loss 7.8560 Epoch: 1 Global Step: 20150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:45:21,867-Speed 4780.06 samples/sec Loss 7.8843 Epoch: 1 Global Step: 20200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:45:32,579-Speed 4780.17 samples/sec Loss 7.8579 Epoch: 1 Global Step: 20250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:45:43,319-Speed 4767.33 samples/sec Loss 7.8412 Epoch: 1 Global Step: 20300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:45:53,793-Speed 4888.40 samples/sec Loss 7.8275 Epoch: 1 Global Step: 20350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:46:04,491-Speed 4786.20 samples/sec Loss 7.8561 Epoch: 1 Global Step: 20400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:46:15,287-Speed 4742.75 samples/sec Loss 7.8465 Epoch: 1 Global Step: 20450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:46:25,804-Speed 4868.73 samples/sec Loss 7.9192 Epoch: 1 Global Step: 20500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:46:36,622-Speed 4733.25 samples/sec Loss 7.8405 Epoch: 1 Global Step: 20550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:46:47,538-Speed 4690.56 samples/sec Loss 7.8817 Epoch: 1 Global Step: 20600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-17 16:46:58,507-Speed 4668.24 samples/sec Loss 7.8827 Epoch: 1 Global Step: 20650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:47:09,175-Speed 4799.49 samples/sec Loss 7.8085 Epoch: 1 Global Step: 20700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:47:19,974-Speed 4741.38 samples/sec Loss 7.8437 Epoch: 1 Global Step: 20750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:47:30,770-Speed 4742.66 samples/sec Loss 7.8624 Epoch: 1 Global Step: 20800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 
16:47:41,541-Speed 4754.11 samples/sec Loss 7.8596 Epoch: 1 Global Step: 20850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:47:52,210-Speed 4799.13 samples/sec Loss 7.8480 Epoch: 1 Global Step: 20900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:48:02,850-Speed 4812.26 samples/sec Loss 7.8361 Epoch: 1 Global Step: 20950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:48:13,629-Speed 4750.69 samples/sec Loss 7.8284 Epoch: 1 Global Step: 21000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:48:24,313-Speed 4792.21 samples/sec Loss 7.8401 Epoch: 1 Global Step: 21050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:48:35,075-Speed 4757.60 samples/sec Loss 7.7322 Epoch: 1 Global Step: 21100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:48:45,783-Speed 4781.91 samples/sec Loss 7.8058 Epoch: 1 Global Step: 21150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:48:56,385-Speed 4829.36 samples/sec Loss 7.8018 Epoch: 1 Global Step: 21200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:49:07,949-Speed 4427.85 samples/sec Loss 7.7882 Epoch: 1 Global Step: 21250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:49:19,433-Speed 4458.87 samples/sec Loss 7.8588 Epoch: 1 Global Step: 21300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:49:30,196-Speed 4757.17 samples/sec Loss 7.8276 Epoch: 1 Global Step: 21350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:49:40,750-Speed 4851.23 samples/sec Loss 7.7940 Epoch: 1 Global Step: 21400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:49:51,419-Speed 4799.37 samples/sec Loss 7.7606 Epoch: 1 Global Step: 21450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:50:02,103-Speed 4792.57 samples/sec Loss 7.8350 Epoch: 1 Global Step: 21500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 
16:50:13,730-Speed 4403.68 samples/sec Loss 7.7774 Epoch: 1 Global Step: 21550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:50:24,516-Speed 4747.35 samples/sec Loss 7.7531 Epoch: 1 Global Step: 21600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:50:35,778-Speed 4546.58 samples/sec Loss 7.8433 Epoch: 1 Global Step: 21650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:50:46,173-Speed 4925.52 samples/sec Loss 7.7921 Epoch: 1 Global Step: 21700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:50:56,988-Speed 4734.73 samples/sec Loss 7.7879 Epoch: 1 Global Step: 21750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:51:07,499-Speed 4871.28 samples/sec Loss 7.7901 Epoch: 1 Global Step: 21800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:51:18,047-Speed 4854.44 samples/sec Loss 7.7832 Epoch: 1 Global Step: 21850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:51:28,625-Speed 4840.43 samples/sec Loss 7.8213 Epoch: 1 Global Step: 21900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:51:39,297-Speed 4798.01 samples/sec Loss 7.8537 Epoch: 1 Global Step: 21950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:51:51,513-Speed 4191.48 samples/sec Loss 7.7880 Epoch: 1 Global Step: 22000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:52:15,761-[lfw][22000]XNorm: 24.016285 Training: 2021-03-17 16:52:15,762-[lfw][22000]Accuracy-Flip: 0.99617+-0.00299 Training: 2021-03-17 16:52:15,762-[lfw][22000]Accuracy-Highest: 0.99617 Training: 2021-03-17 16:52:43,351-[cfp_fp][22000]XNorm: 19.903059 Training: 2021-03-17 16:52:43,351-[cfp_fp][22000]Accuracy-Flip: 0.91800+-0.01419 Training: 2021-03-17 16:52:43,352-[cfp_fp][22000]Accuracy-Highest: 0.92529 Training: 2021-03-17 16:53:07,134-[agedb_30][22000]XNorm: 22.930588 Training: 2021-03-17 16:53:07,134-[agedb_30][22000]Accuracy-Flip: 0.94700+-0.01414 Training: 
2021-03-17 16:53:07,135-[agedb_30][22000]Accuracy-Highest: 0.94700 Training: 2021-03-17 16:53:19,544-Speed 581.62 samples/sec Loss 7.8182 Epoch: 1 Global Step: 22050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:53:30,499-Speed 4673.91 samples/sec Loss 7.8387 Epoch: 1 Global Step: 22100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:53:41,291-Speed 4744.74 samples/sec Loss 7.8170 Epoch: 1 Global Step: 22150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:53:51,884-Speed 4833.44 samples/sec Loss 7.7845 Epoch: 1 Global Step: 22200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:54:03,025-Speed 4596.25 samples/sec Loss 7.7596 Epoch: 1 Global Step: 22250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:54:13,792-Speed 4755.34 samples/sec Loss 7.7488 Epoch: 1 Global Step: 22300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:54:24,641-Speed 4719.50 samples/sec Loss 7.7413 Epoch: 1 Global Step: 22350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:54:35,329-Speed 4790.76 samples/sec Loss 7.7739 Epoch: 1 Global Step: 22400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:54:46,320-Speed 4658.54 samples/sec Loss 7.7392 Epoch: 1 Global Step: 22450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:54:56,916-Speed 4832.27 samples/sec Loss 7.8073 Epoch: 1 Global Step: 22500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:55:07,723-Speed 4738.35 samples/sec Loss 7.7930 Epoch: 1 Global Step: 22550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:55:18,454-Speed 4771.45 samples/sec Loss 7.7699 Epoch: 1 Global Step: 22600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:55:29,187-Speed 4770.57 samples/sec Loss 7.7905 Epoch: 1 Global Step: 22650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:55:39,988-Speed 4740.66 samples/sec Loss 7.7555 Epoch: 1 Global 
Step: 22700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:55:50,842-Speed 4717.30 samples/sec Loss 7.7291 Epoch: 1 Global Step: 22750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:56:01,498-Speed 4805.28 samples/sec Loss 7.6571 Epoch: 1 Global Step: 22800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:56:12,293-Speed 4743.30 samples/sec Loss 7.7843 Epoch: 1 Global Step: 22850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:56:23,121-Speed 4728.90 samples/sec Loss 7.7213 Epoch: 1 Global Step: 22900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:56:34,051-Speed 4684.71 samples/sec Loss 7.7878 Epoch: 1 Global Step: 22950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:56:44,650-Speed 4830.65 samples/sec Loss 7.7024 Epoch: 1 Global Step: 23000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:56:55,362-Speed 4780.01 samples/sec Loss 7.8066 Epoch: 1 Global Step: 23050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:57:05,760-Speed 4924.49 samples/sec Loss 7.7472 Epoch: 1 Global Step: 23100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:57:16,619-Speed 4715.13 samples/sec Loss 7.7513 Epoch: 1 Global Step: 23150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:57:27,289-Speed 4798.85 samples/sec Loss 7.7540 Epoch: 1 Global Step: 23200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:57:38,036-Speed 4764.31 samples/sec Loss 7.7082 Epoch: 1 Global Step: 23250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:57:48,566-Speed 4862.53 samples/sec Loss 7.7171 Epoch: 1 Global Step: 23300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:57:59,150-Speed 4838.10 samples/sec Loss 7.7492 Epoch: 1 Global Step: 23350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:58:09,821-Speed 4798.41 samples/sec Loss 7.7886 Epoch: 1 Global Step: 23400 
Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:58:20,549-Speed 4772.88 samples/sec Loss 7.7298 Epoch: 1 Global Step: 23450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:58:31,391-Speed 4722.72 samples/sec Loss 7.7269 Epoch: 1 Global Step: 23500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:58:42,107-Speed 4777.76 samples/sec Loss 7.7229 Epoch: 1 Global Step: 23550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:58:52,835-Speed 4772.99 samples/sec Loss 7.7644 Epoch: 1 Global Step: 23600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:59:03,521-Speed 4791.49 samples/sec Loss 7.7349 Epoch: 1 Global Step: 23650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:59:14,275-Speed 4761.58 samples/sec Loss 7.6190 Epoch: 1 Global Step: 23700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:59:24,992-Speed 4777.90 samples/sec Loss 7.7124 Epoch: 1 Global Step: 23750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:59:35,640-Speed 4808.37 samples/sec Loss 7.7327 Epoch: 1 Global Step: 23800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:59:46,202-Speed 4847.87 samples/sec Loss 7.7132 Epoch: 1 Global Step: 23850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 16:59:56,659-Speed 4896.84 samples/sec Loss 7.7433 Epoch: 1 Global Step: 23900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:00:07,157-Speed 4877.51 samples/sec Loss 7.7123 Epoch: 1 Global Step: 23950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:00:18,515-Speed 4507.83 samples/sec Loss 7.7026 Epoch: 1 Global Step: 24000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:00:42,681-[lfw][24000]XNorm: 22.387755 Training: 2021-03-17 17:00:42,681-[lfw][24000]Accuracy-Flip: 0.99500+-0.00298 Training: 2021-03-17 17:00:42,681-[lfw][24000]Accuracy-Highest: 0.99617 Training: 2021-03-17 
17:01:10,065-[cfp_fp][24000]XNorm: 18.553972 Training: 2021-03-17 17:01:10,065-[cfp_fp][24000]Accuracy-Flip: 0.92300+-0.01481 Training: 2021-03-17 17:01:10,065-[cfp_fp][24000]Accuracy-Highest: 0.92529 Training: 2021-03-17 17:01:33,717-[agedb_30][24000]XNorm: 21.707369 Training: 2021-03-17 17:01:33,718-[agedb_30][24000]Accuracy-Flip: 0.94783+-0.00983 Training: 2021-03-17 17:01:33,718-[agedb_30][24000]Accuracy-Highest: 0.94783 Training: 2021-03-17 17:01:44,873-Speed 592.89 samples/sec Loss 7.6770 Epoch: 1 Global Step: 24050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:01:55,252-Speed 4933.40 samples/sec Loss 7.7206 Epoch: 1 Global Step: 24100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:02:06,698-Speed 4473.49 samples/sec Loss 7.7268 Epoch: 1 Global Step: 24150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:02:17,204-Speed 4873.49 samples/sec Loss 7.6451 Epoch: 1 Global Step: 24200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:02:27,661-Speed 4896.92 samples/sec Loss 7.6199 Epoch: 1 Global Step: 24250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:02:38,134-Speed 4888.71 samples/sec Loss 7.6957 Epoch: 1 Global Step: 24300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:02:49,657-Speed 4443.64 samples/sec Loss 7.6903 Epoch: 1 Global Step: 24350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:03:00,354-Speed 4786.60 samples/sec Loss 7.6778 Epoch: 1 Global Step: 24400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:03:11,075-Speed 4775.87 samples/sec Loss 7.6948 Epoch: 1 Global Step: 24450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:03:21,813-Speed 4768.32 samples/sec Loss 7.6684 Epoch: 1 Global Step: 24500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:03:32,442-Speed 4817.30 samples/sec Loss 7.6885 Epoch: 1 Global Step: 24550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 
2021-03-17 17:03:43,190-Speed 4764.33 samples/sec Loss 7.6967 Epoch: 1 Global Step: 24600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:03:54,687-Speed 4453.67 samples/sec Loss 7.6741 Epoch: 1 Global Step: 24650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:04:06,146-Speed 4468.17 samples/sec Loss 7.7209 Epoch: 1 Global Step: 24700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:04:18,239-Speed 4234.14 samples/sec Loss 7.6844 Epoch: 1 Global Step: 24750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:04:28,919-Speed 4794.22 samples/sec Loss 7.6665 Epoch: 1 Global Step: 24800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:04:39,673-Speed 4761.41 samples/sec Loss 7.7009 Epoch: 1 Global Step: 24850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:04:50,465-Speed 4744.73 samples/sec Loss 7.6470 Epoch: 1 Global Step: 24900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:05:01,346-Speed 4705.78 samples/sec Loss 7.6899 Epoch: 1 Global Step: 24950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:05:11,923-Speed 4840.82 samples/sec Loss 7.7131 Epoch: 1 Global Step: 25000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:05:22,661-Speed 4768.27 samples/sec Loss 7.6789 Epoch: 1 Global Step: 25050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:05:33,004-Speed 4950.66 samples/sec Loss 7.6363 Epoch: 1 Global Step: 25100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:05:43,743-Speed 4767.90 samples/sec Loss 7.5988 Epoch: 1 Global Step: 25150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:05:54,223-Speed 4885.81 samples/sec Loss 7.6325 Epoch: 1 Global Step: 25200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:06:04,679-Speed 4896.87 samples/sec Loss 7.6443 Epoch: 1 Global Step: 25250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 
17:06:15,202-Speed 4866.03 samples/sec Loss 7.6554 Epoch: 1 Global Step: 25300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:06:25,882-Speed 4794.28 samples/sec Loss 7.6357 Epoch: 1 Global Step: 25350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:06:36,200-Speed 4962.48 samples/sec Loss 7.6400 Epoch: 1 Global Step: 25400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:06:46,894-Speed 4788.17 samples/sec Loss 7.7077 Epoch: 1 Global Step: 25450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:06:57,644-Speed 4763.12 samples/sec Loss 7.7157 Epoch: 1 Global Step: 25500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:07:08,378-Speed 4770.13 samples/sec Loss 7.6574 Epoch: 1 Global Step: 25550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:07:18,900-Speed 4866.24 samples/sec Loss 7.6533 Epoch: 1 Global Step: 25600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:07:29,459-Speed 4849.48 samples/sec Loss 7.6754 Epoch: 1 Global Step: 25650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:07:40,211-Speed 4762.10 samples/sec Loss 7.6465 Epoch: 1 Global Step: 25700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:07:50,764-Speed 4852.22 samples/sec Loss 7.6474 Epoch: 1 Global Step: 25750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:08:01,269-Speed 4874.21 samples/sec Loss 7.6271 Epoch: 1 Global Step: 25800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:08:12,294-Speed 4644.20 samples/sec Loss 7.6251 Epoch: 1 Global Step: 25850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:08:22,795-Speed 4876.17 samples/sec Loss 7.6558 Epoch: 1 Global Step: 25900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:08:33,442-Speed 4808.99 samples/sec Loss 7.6382 Epoch: 1 Global Step: 25950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 
17:08:44,095-Speed 4806.71 samples/sec Loss 7.6165 Epoch: 1 Global Step: 26000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:09:08,497-[lfw][26000]XNorm: 24.095266 Training: 2021-03-17 17:09:08,498-[lfw][26000]Accuracy-Flip: 0.99450+-0.00325 Training: 2021-03-17 17:09:08,498-[lfw][26000]Accuracy-Highest: 0.99617 Training: 2021-03-17 17:09:36,052-[cfp_fp][26000]XNorm: 19.731058 Training: 2021-03-17 17:09:36,053-[cfp_fp][26000]Accuracy-Flip: 0.91629+-0.01556 Training: 2021-03-17 17:09:36,053-[cfp_fp][26000]Accuracy-Highest: 0.92529 Training: 2021-03-17 17:09:59,850-[agedb_30][26000]XNorm: 23.190867 Training: 2021-03-17 17:09:59,850-[agedb_30][26000]Accuracy-Flip: 0.94100+-0.01430 Training: 2021-03-17 17:09:59,850-[agedb_30][26000]Accuracy-Highest: 0.94783 Training: 2021-03-17 17:10:10,369-Speed 593.46 samples/sec Loss 7.5938 Epoch: 1 Global Step: 26050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:10:21,058-Speed 4790.56 samples/sec Loss 7.6343 Epoch: 1 Global Step: 26100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:10:31,749-Speed 4789.17 samples/sec Loss 7.6325 Epoch: 1 Global Step: 26150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:10:42,412-Speed 4802.16 samples/sec Loss 7.6642 Epoch: 1 Global Step: 26200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:10:52,940-Speed 4863.49 samples/sec Loss 7.5985 Epoch: 1 Global Step: 26250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:11:03,297-Speed 4943.71 samples/sec Loss 7.6329 Epoch: 1 Global Step: 26300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:11:13,792-Speed 4879.12 samples/sec Loss 7.5759 Epoch: 1 Global Step: 26350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:11:24,346-Speed 4851.38 samples/sec Loss 7.5704 Epoch: 1 Global Step: 26400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:11:35,092-Speed 4765.07 samples/sec Loss 7.5685 Epoch: 1 
Global Step: 26450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:11:45,866-Speed 4752.47 samples/sec Loss 7.6632 Epoch: 1 Global Step: 26500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:11:56,555-Speed 4790.35 samples/sec Loss 7.6014 Epoch: 1 Global Step: 26550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:12:07,246-Speed 4789.15 samples/sec Loss 7.6489 Epoch: 1 Global Step: 26600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:12:17,854-Speed 4826.89 samples/sec Loss 7.6219 Epoch: 1 Global Step: 26650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:12:29,258-Speed 4490.02 samples/sec Loss 7.6228 Epoch: 1 Global Step: 26700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:12:40,779-Speed 4444.27 samples/sec Loss 7.5707 Epoch: 1 Global Step: 26750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:12:51,304-Speed 4864.65 samples/sec Loss 7.6484 Epoch: 1 Global Step: 26800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:13:02,656-Speed 4510.42 samples/sec Loss 7.6308 Epoch: 1 Global Step: 26850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:13:13,337-Speed 4794.19 samples/sec Loss 7.6323 Epoch: 1 Global Step: 26900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:13:24,228-Speed 4701.38 samples/sec Loss 7.5804 Epoch: 1 Global Step: 26950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:13:34,862-Speed 4814.80 samples/sec Loss 7.6393 Epoch: 1 Global Step: 27000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:13:46,282-Speed 4483.58 samples/sec Loss 7.5830 Epoch: 1 Global Step: 27050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:13:57,050-Speed 4755.24 samples/sec Loss 7.5959 Epoch: 1 Global Step: 27100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:14:07,938-Speed 4702.73 samples/sec Loss 7.5862 Epoch: 1 Global 
Step: 27150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:14:18,734-Speed 4742.81 samples/sec Loss 7.6033 Epoch: 1 Global Step: 27200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:14:29,425-Speed 4789.46 samples/sec Loss 7.5900 Epoch: 1 Global Step: 27250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:14:39,983-Speed 4849.44 samples/sec Loss 7.6072 Epoch: 1 Global Step: 27300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:14:51,678-Speed 4378.12 samples/sec Loss 7.6026 Epoch: 1 Global Step: 27350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:15:02,861-Speed 4579.19 samples/sec Loss 7.5866 Epoch: 1 Global Step: 27400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:15:14,193-Speed 4518.47 samples/sec Loss 7.5795 Epoch: 1 Global Step: 27450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:15:24,708-Speed 4869.22 samples/sec Loss 7.6025 Epoch: 1 Global Step: 27500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:15:36,035-Speed 4520.80 samples/sec Loss 7.6477 Epoch: 1 Global Step: 27550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:15:46,545-Speed 4871.50 samples/sec Loss 7.6342 Epoch: 1 Global Step: 27600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:15:57,502-Speed 4673.23 samples/sec Loss 7.5484 Epoch: 1 Global Step: 27650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:16:08,023-Speed 4866.73 samples/sec Loss 7.5616 Epoch: 1 Global Step: 27700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:16:18,560-Speed 4859.48 samples/sec Loss 7.5196 Epoch: 1 Global Step: 27750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:16:29,464-Speed 4695.54 samples/sec Loss 7.5926 Epoch: 1 Global Step: 27800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:16:40,090-Speed 4819.05 samples/sec Loss 7.5568 Epoch: 1 Global Step: 27850 
Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:16:50,748-Speed 4804.24 samples/sec Loss 7.5825 Epoch: 1 Global Step: 27900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:17:01,358-Speed 4825.76 samples/sec Loss 7.5461 Epoch: 1 Global Step: 27950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:17:12,186-Speed 4728.94 samples/sec Loss 7.6039 Epoch: 1 Global Step: 28000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:17:36,679-[lfw][28000]XNorm: 22.742904 Training: 2021-03-17 17:17:36,679-[lfw][28000]Accuracy-Flip: 0.99450+-0.00289 Training: 2021-03-17 17:17:36,680-[lfw][28000]Accuracy-Highest: 0.99617 Training: 2021-03-17 17:18:04,216-[cfp_fp][28000]XNorm: 19.011305 Training: 2021-03-17 17:18:04,217-[cfp_fp][28000]Accuracy-Flip: 0.92100+-0.01234 Training: 2021-03-17 17:18:04,217-[cfp_fp][28000]Accuracy-Highest: 0.92529 Training: 2021-03-17 17:18:28,004-[agedb_30][28000]XNorm: 21.760292 Training: 2021-03-17 17:18:28,004-[agedb_30][28000]Accuracy-Flip: 0.94200+-0.01067 Training: 2021-03-17 17:18:28,004-[agedb_30][28000]Accuracy-Highest: 0.94783 Training: 2021-03-17 17:18:38,513-Speed 593.10 samples/sec Loss 7.5481 Epoch: 1 Global Step: 28050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:18:49,227-Speed 4779.20 samples/sec Loss 7.5541 Epoch: 1 Global Step: 28100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:18:59,765-Speed 4858.57 samples/sec Loss 7.5561 Epoch: 1 Global Step: 28150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:19:10,535-Speed 4754.55 samples/sec Loss 7.5299 Epoch: 1 Global Step: 28200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:19:20,913-Speed 4933.97 samples/sec Loss 7.5730 Epoch: 1 Global Step: 28250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:19:31,723-Speed 4736.57 samples/sec Loss 7.5592 Epoch: 1 Global Step: 28300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 
2021-03-17 17:19:42,378-Speed 4805.71 samples/sec Loss 7.5758 Epoch: 1 Global Step: 28350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:19:52,970-Speed 4833.90 samples/sec Loss 7.5429 Epoch: 1 Global Step: 28400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:20:03,420-Speed 4899.84 samples/sec Loss 7.5261 Epoch: 1 Global Step: 28450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:20:14,283-Speed 4713.64 samples/sec Loss 7.5210 Epoch: 1 Global Step: 28500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:20:24,917-Speed 4815.12 samples/sec Loss 7.5474 Epoch: 1 Global Step: 28550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:20:35,785-Speed 4711.05 samples/sec Loss 7.5011 Epoch: 1 Global Step: 28600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:20:46,245-Speed 4895.13 samples/sec Loss 7.5825 Epoch: 1 Global Step: 28650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:20:57,066-Speed 4732.08 samples/sec Loss 7.5309 Epoch: 1 Global Step: 28700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:21:07,864-Speed 4741.94 samples/sec Loss 7.5229 Epoch: 1 Global Step: 28750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:21:18,485-Speed 4820.59 samples/sec Loss 7.5312 Epoch: 1 Global Step: 28800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:21:28,877-Speed 4927.27 samples/sec Loss 7.4455 Epoch: 1 Global Step: 28850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:21:39,453-Speed 4841.67 samples/sec Loss 7.5634 Epoch: 1 Global Step: 28900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:21:50,030-Speed 4840.95 samples/sec Loss 7.5436 Epoch: 1 Global Step: 28950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:22:00,876-Speed 4721.06 samples/sec Loss 7.5169 Epoch: 1 Global Step: 29000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 
17:22:11,624-Speed 4764.05 samples/sec Loss 7.5661 Epoch: 1 Global Step: 29050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:22:22,213-Speed 4835.64 samples/sec Loss 7.5241 Epoch: 1 Global Step: 29100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:22:32,574-Speed 4941.72 samples/sec Loss 7.4817 Epoch: 1 Global Step: 29150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:22:43,288-Speed 4778.93 samples/sec Loss 7.5097 Epoch: 1 Global Step: 29200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:22:53,809-Speed 4867.09 samples/sec Loss 7.5312 Epoch: 1 Global Step: 29250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:23:04,230-Speed 4913.41 samples/sec Loss 7.5626 Epoch: 1 Global Step: 29300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:23:14,699-Speed 4890.92 samples/sec Loss 7.5539 Epoch: 1 Global Step: 29350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:23:25,329-Speed 4817.11 samples/sec Loss 7.5511 Epoch: 1 Global Step: 29400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:23:36,718-Speed 4495.46 samples/sec Loss 7.4671 Epoch: 1 Global Step: 29450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:23:47,224-Speed 4874.07 samples/sec Loss 7.5615 Epoch: 1 Global Step: 29500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:23:59,708-Speed 4101.47 samples/sec Loss 7.4952 Epoch: 1 Global Step: 29550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:24:10,482-Speed 4752.24 samples/sec Loss 7.5097 Epoch: 1 Global Step: 29600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:24:21,183-Speed 4784.85 samples/sec Loss 7.5283 Epoch: 1 Global Step: 29650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:24:31,828-Speed 4809.99 samples/sec Loss 7.4828 Epoch: 1 Global Step: 29700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 
17:24:43,415-Speed 4419.04 samples/sec Loss 7.4992 Epoch: 1 Global Step: 29750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:24:54,002-Speed 4836.28 samples/sec Loss 7.4755 Epoch: 1 Global Step: 29800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:25:04,431-Speed 4909.54 samples/sec Loss 7.4952 Epoch: 1 Global Step: 29850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:25:15,144-Speed 4779.55 samples/sec Loss 7.5044 Epoch: 1 Global Step: 29900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:25:25,862-Speed 4777.74 samples/sec Loss 7.5128 Epoch: 1 Global Step: 29950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:25:36,570-Speed 4781.76 samples/sec Loss 7.5471 Epoch: 1 Global Step: 30000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:26:01,115-[lfw][30000]XNorm: 23.161613 Training: 2021-03-17 17:26:01,115-[lfw][30000]Accuracy-Flip: 0.99467+-0.00287 Training: 2021-03-17 17:26:01,116-[lfw][30000]Accuracy-Highest: 0.99617 Training: 2021-03-17 17:26:28,612-[cfp_fp][30000]XNorm: 19.380397 Training: 2021-03-17 17:26:28,613-[cfp_fp][30000]Accuracy-Flip: 0.92771+-0.00939 Training: 2021-03-17 17:26:28,613-[cfp_fp][30000]Accuracy-Highest: 0.92771 Training: 2021-03-17 17:26:52,372-[agedb_30][30000]XNorm: 22.526557 Training: 2021-03-17 17:26:52,372-[agedb_30][30000]Accuracy-Flip: 0.94783+-0.00785 Training: 2021-03-17 17:26:52,372-[agedb_30][30000]Accuracy-Highest: 0.94783 Training: 2021-03-17 17:27:03,600-Speed 588.30 samples/sec Loss 7.5216 Epoch: 1 Global Step: 30050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:27:14,512-Speed 4692.15 samples/sec Loss 7.4948 Epoch: 1 Global Step: 30100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:27:26,780-Speed 4173.73 samples/sec Loss 7.5213 Epoch: 1 Global Step: 30150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:27:37,138-Speed 4943.60 samples/sec Loss 7.4567 Epoch: 1 
Global Step: 30200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:27:48,909-Speed 4349.92 samples/sec Loss 7.4927 Epoch: 1 Global Step: 30250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:27:59,682-Speed 4752.88 samples/sec Loss 7.5252 Epoch: 1 Global Step: 30300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:28:10,448-Speed 4756.22 samples/sec Loss 7.4819 Epoch: 1 Global Step: 30350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:28:21,338-Speed 4701.74 samples/sec Loss 7.4364 Epoch: 1 Global Step: 30400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:28:31,907-Speed 4845.11 samples/sec Loss 7.5106 Epoch: 1 Global Step: 30450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:28:42,431-Speed 4865.06 samples/sec Loss 7.5053 Epoch: 1 Global Step: 30500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:28:52,919-Speed 4882.07 samples/sec Loss 7.4334 Epoch: 1 Global Step: 30550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:29:03,696-Speed 4751.32 samples/sec Loss 7.4813 Epoch: 1 Global Step: 30600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:29:14,603-Speed 4694.34 samples/sec Loss 7.4543 Epoch: 1 Global Step: 30650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:29:25,267-Speed 4801.46 samples/sec Loss 7.4744 Epoch: 1 Global Step: 30700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:29:35,789-Speed 4866.37 samples/sec Loss 7.4880 Epoch: 1 Global Step: 30750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:29:46,457-Speed 4799.65 samples/sec Loss 7.5497 Epoch: 1 Global Step: 30800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:29:57,137-Speed 4794.55 samples/sec Loss 7.4901 Epoch: 1 Global Step: 30850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:30:08,037-Speed 4697.51 samples/sec Loss 7.5080 Epoch: 1 Global 
Step: 30900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:30:18,822-Speed 4747.40 samples/sec Loss 7.4855 Epoch: 1 Global Step: 30950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:30:29,393-Speed 4844.09 samples/sec Loss 7.4454 Epoch: 1 Global Step: 31000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:30:40,405-Speed 4649.70 samples/sec Loss 7.4720 Epoch: 1 Global Step: 31050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:30:51,150-Speed 4765.29 samples/sec Loss 7.4434 Epoch: 1 Global Step: 31100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:31:01,865-Speed 4778.78 samples/sec Loss 7.4011 Epoch: 1 Global Step: 31150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:31:12,675-Speed 4736.55 samples/sec Loss 7.4540 Epoch: 1 Global Step: 31200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:31:23,579-Speed 4695.99 samples/sec Loss 7.4660 Epoch: 1 Global Step: 31250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:31:34,303-Speed 4774.34 samples/sec Loss 7.4870 Epoch: 1 Global Step: 31300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:31:44,988-Speed 4792.34 samples/sec Loss 7.4616 Epoch: 1 Global Step: 31350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:31:55,664-Speed 4796.02 samples/sec Loss 7.4549 Epoch: 1 Global Step: 31400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:32:06,435-Speed 4753.63 samples/sec Loss 7.4795 Epoch: 1 Global Step: 31450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:32:17,365-Speed 4684.48 samples/sec Loss 7.4618 Epoch: 1 Global Step: 31500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:32:27,933-Speed 4845.18 samples/sec Loss 7.4487 Epoch: 1 Global Step: 31550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:32:38,434-Speed 4876.27 samples/sec Loss 7.4585 Epoch: 1 Global Step: 31600 
Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:32:49,047-Speed 4824.45 samples/sec Loss 7.4497 Epoch: 1 Global Step: 31650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:32:59,504-Speed 4896.61 samples/sec Loss 7.4177 Epoch: 1 Global Step: 31700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:33:10,250-Speed 4764.89 samples/sec Loss 7.4369 Epoch: 1 Global Step: 31750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:33:20,965-Speed 4778.60 samples/sec Loss 7.4576 Epoch: 1 Global Step: 31800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:33:31,642-Speed 4795.63 samples/sec Loss 7.4356 Epoch: 1 Global Step: 31850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:33:42,311-Speed 4799.30 samples/sec Loss 7.4866 Epoch: 1 Global Step: 31900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:33:52,750-Speed 4904.74 samples/sec Loss 7.4712 Epoch: 1 Global Step: 31950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:34:03,450-Speed 4785.58 samples/sec Loss 7.4137 Epoch: 1 Global Step: 32000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:34:27,784-[lfw][32000]XNorm: 22.158111 Training: 2021-03-17 17:34:27,784-[lfw][32000]Accuracy-Flip: 0.99433+-0.00359 Training: 2021-03-17 17:34:27,784-[lfw][32000]Accuracy-Highest: 0.99617 Training: 2021-03-17 17:34:55,324-[cfp_fp][32000]XNorm: 17.893760 Training: 2021-03-17 17:34:55,325-[cfp_fp][32000]Accuracy-Flip: 0.92686+-0.01164 Training: 2021-03-17 17:34:55,325-[cfp_fp][32000]Accuracy-Highest: 0.92771 Training: 2021-03-17 17:35:18,994-[agedb_30][32000]XNorm: 21.557018 Training: 2021-03-17 17:35:18,994-[agedb_30][32000]Accuracy-Flip: 0.94467+-0.01040 Training: 2021-03-17 17:35:18,994-[agedb_30][32000]Accuracy-Highest: 0.94783 Training: 2021-03-17 17:35:29,651-Speed 593.96 samples/sec Loss 7.4788 Epoch: 1 Global Step: 32050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 
2021-03-17 17:35:40,509-Speed 4715.49 samples/sec Loss 7.4749 Epoch: 1 Global Step: 32100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:35:51,231-Speed 4775.80 samples/sec Loss 7.4679 Epoch: 1 Global Step: 32150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:36:01,878-Speed 4809.20 samples/sec Loss 7.3997 Epoch: 1 Global Step: 32200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:36:14,304-Speed 4120.45 samples/sec Loss 7.4544 Epoch: 1 Global Step: 32250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:36:25,824-Speed 4444.90 samples/sec Loss 7.4148 Epoch: 1 Global Step: 32300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:36:36,469-Speed 4810.05 samples/sec Loss 7.4016 Epoch: 1 Global Step: 32350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:36:47,161-Speed 4789.10 samples/sec Loss 7.4968 Epoch: 1 Global Step: 32400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:36:57,936-Speed 4752.08 samples/sec Loss 7.4510 Epoch: 1 Global Step: 32450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:37:09,228-Speed 4534.27 samples/sec Loss 7.4624 Epoch: 1 Global Step: 32500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-17 17:37:20,142-Speed 4691.78 samples/sec Loss 7.4603 Epoch: 1 Global Step: 32550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:37:30,698-Speed 4850.29 samples/sec Loss 7.3795 Epoch: 1 Global Step: 32600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:37:41,545-Speed 4720.90 samples/sec Loss 7.3834 Epoch: 1 Global Step: 32650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:37:52,304-Speed 4759.00 samples/sec Loss 7.4369 Epoch: 1 Global Step: 32700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:38:02,840-Speed 4859.60 samples/sec Loss 7.4523 Epoch: 1 Global Step: 32750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 
17:38:14,601-Speed 4353.82 samples/sec Loss 7.3612 Epoch: 1 Global Step: 32800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:38:25,353-Speed 4762.04 samples/sec Loss 7.3956 Epoch: 1 Global Step: 32850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:38:37,695-Speed 4148.82 samples/sec Loss 7.3900 Epoch: 1 Global Step: 32900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:38:48,488-Speed 4743.87 samples/sec Loss 7.3798 Epoch: 1 Global Step: 32950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:38:59,243-Speed 4761.25 samples/sec Loss 7.4111 Epoch: 1 Global Step: 33000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:39:10,905-Speed 4390.28 samples/sec Loss 7.4005 Epoch: 1 Global Step: 33050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:39:21,769-Speed 4713.00 samples/sec Loss 7.3334 Epoch: 1 Global Step: 33100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:39:32,527-Speed 4759.90 samples/sec Loss 7.3996 Epoch: 1 Global Step: 33150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:39:43,354-Speed 4729.15 samples/sec Loss 7.4717 Epoch: 1 Global Step: 33200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:39:54,419-Speed 4627.60 samples/sec Loss 7.4686 Epoch: 1 Global Step: 33250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:40:05,022-Speed 4829.16 samples/sec Loss 7.4364 Epoch: 1 Global Step: 33300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:40:15,830-Speed 4737.56 samples/sec Loss 7.4115 Epoch: 1 Global Step: 33350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:40:39,078-Speed 2202.35 samples/sec Loss 7.1167 Epoch: 2 Global Step: 33400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:40:49,866-Speed 4746.53 samples/sec Loss 6.6722 Epoch: 2 Global Step: 33450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 
17:41:00,539-Speed 4797.85 samples/sec Loss 6.7347 Epoch: 2 Global Step: 33500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:41:11,559-Speed 4646.24 samples/sec Loss 6.7586 Epoch: 2 Global Step: 33550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:41:22,477-Speed 4689.77 samples/sec Loss 6.7634 Epoch: 2 Global Step: 33600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:41:33,340-Speed 4713.90 samples/sec Loss 6.7953 Epoch: 2 Global Step: 33650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:41:44,240-Speed 4697.58 samples/sec Loss 6.7456 Epoch: 2 Global Step: 33700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:41:54,930-Speed 4789.61 samples/sec Loss 6.7751 Epoch: 2 Global Step: 33750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:42:05,739-Speed 4737.27 samples/sec Loss 6.7856 Epoch: 2 Global Step: 33800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:42:16,387-Speed 4809.00 samples/sec Loss 6.8453 Epoch: 2 Global Step: 33850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:42:26,994-Speed 4826.99 samples/sec Loss 6.8570 Epoch: 2 Global Step: 33900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:42:37,638-Speed 4810.76 samples/sec Loss 6.9007 Epoch: 2 Global Step: 33950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:42:48,136-Speed 4877.57 samples/sec Loss 6.8758 Epoch: 2 Global Step: 34000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:43:12,405-[lfw][34000]XNorm: 23.791254 Training: 2021-03-17 17:43:12,405-[lfw][34000]Accuracy-Flip: 0.99500+-0.00325 Training: 2021-03-17 17:43:12,405-[lfw][34000]Accuracy-Highest: 0.99617 Training: 2021-03-17 17:43:39,867-[cfp_fp][34000]XNorm: 20.198287 Training: 2021-03-17 17:43:39,867-[cfp_fp][34000]Accuracy-Flip: 0.92600+-0.01220 Training: 2021-03-17 17:43:39,867-[cfp_fp][34000]Accuracy-Highest: 0.92771 Training: 2021-03-17 
17:44:03,569-[agedb_30][34000]XNorm: 23.134279 Training: 2021-03-17 17:44:03,569-[agedb_30][34000]Accuracy-Flip: 0.95317+-0.01338 Training: 2021-03-17 17:44:03,569-[agedb_30][34000]Accuracy-Highest: 0.95317 Training: 2021-03-17 17:44:14,109-Speed 595.54 samples/sec Loss 6.9182 Epoch: 2 Global Step: 34050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:44:24,660-Speed 4853.08 samples/sec Loss 6.9054 Epoch: 2 Global Step: 34100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:44:35,356-Speed 4787.33 samples/sec Loss 6.8441 Epoch: 2 Global Step: 34150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:44:46,144-Speed 4746.31 samples/sec Loss 6.9411 Epoch: 2 Global Step: 34200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:44:56,919-Speed 4751.65 samples/sec Loss 6.9472 Epoch: 2 Global Step: 34250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:45:07,477-Speed 4849.78 samples/sec Loss 6.9349 Epoch: 2 Global Step: 34300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:45:18,104-Speed 4818.12 samples/sec Loss 6.9227 Epoch: 2 Global Step: 34350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:45:28,887-Speed 4748.56 samples/sec Loss 6.9479 Epoch: 2 Global Step: 34400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:45:39,980-Speed 4615.92 samples/sec Loss 7.0373 Epoch: 2 Global Step: 34450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:45:50,483-Speed 4874.95 samples/sec Loss 7.0422 Epoch: 2 Global Step: 34500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:46:01,295-Speed 4735.87 samples/sec Loss 7.0193 Epoch: 2 Global Step: 34550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:46:12,112-Speed 4733.48 samples/sec Loss 7.0110 Epoch: 2 Global Step: 34600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:46:23,070-Speed 4672.74 samples/sec Loss 7.0386 Epoch: 2 Global 
Step: 34650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:46:33,825-Speed 4760.70 samples/sec Loss 7.0050 Epoch: 2 Global Step: 34700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:46:44,731-Speed 4694.74 samples/sec Loss 7.0814 Epoch: 2 Global Step: 34750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:46:55,702-Speed 4667.46 samples/sec Loss 7.0647 Epoch: 2 Global Step: 34800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:47:06,659-Speed 4673.06 samples/sec Loss 7.0549 Epoch: 2 Global Step: 34850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:47:18,494-Speed 4326.32 samples/sec Loss 7.0604 Epoch: 2 Global Step: 34900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:47:29,112-Speed 4822.28 samples/sec Loss 6.9829 Epoch: 2 Global Step: 34950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:47:39,816-Speed 4783.78 samples/sec Loss 7.0830 Epoch: 2 Global Step: 35000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:47:51,317-Speed 4451.86 samples/sec Loss 7.0458 Epoch: 2 Global Step: 35050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:48:02,936-Speed 4407.03 samples/sec Loss 7.0878 Epoch: 2 Global Step: 35100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:48:13,575-Speed 4812.56 samples/sec Loss 7.1145 Epoch: 2 Global Step: 35150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:48:24,129-Speed 4851.67 samples/sec Loss 7.0565 Epoch: 2 Global Step: 35200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:48:35,850-Speed 4368.38 samples/sec Loss 7.0143 Epoch: 2 Global Step: 35250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:48:46,481-Speed 4816.71 samples/sec Loss 7.0750 Epoch: 2 Global Step: 35300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:48:57,076-Speed 4832.52 samples/sec Loss 7.0931 Epoch: 2 Global Step: 35350 
Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:49:07,880-Speed 4739.33 samples/sec Loss 7.1439 Epoch: 2 Global Step: 35400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:49:18,655-Speed 4751.92 samples/sec Loss 7.1191 Epoch: 2 Global Step: 35450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:49:29,482-Speed 4729.39 samples/sec Loss 7.1620 Epoch: 2 Global Step: 35500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:49:40,095-Speed 4824.20 samples/sec Loss 7.1927 Epoch: 2 Global Step: 35550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:49:50,766-Speed 4798.60 samples/sec Loss 7.1473 Epoch: 2 Global Step: 35600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:50:02,422-Speed 4392.96 samples/sec Loss 7.1331 Epoch: 2 Global Step: 35650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:50:13,331-Speed 4693.42 samples/sec Loss 7.1641 Epoch: 2 Global Step: 35700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:50:25,436-Speed 4229.83 samples/sec Loss 7.1059 Epoch: 2 Global Step: 35750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:50:36,245-Speed 4737.28 samples/sec Loss 7.1092 Epoch: 2 Global Step: 35800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:50:47,036-Speed 4744.68 samples/sec Loss 7.1297 Epoch: 2 Global Step: 35850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:50:58,074-Speed 4638.96 samples/sec Loss 7.1733 Epoch: 2 Global Step: 35900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:51:08,938-Speed 4713.01 samples/sec Loss 7.1494 Epoch: 2 Global Step: 35950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:51:19,647-Speed 4781.49 samples/sec Loss 7.2065 Epoch: 2 Global Step: 36000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:51:43,619-[lfw][36000]XNorm: 22.275454 Training: 2021-03-17 
17:51:43,620-[lfw][36000]Accuracy-Flip: 0.99367+-0.00306 Training: 2021-03-17 17:51:43,620-[lfw][36000]Accuracy-Highest: 0.99617 Training: 2021-03-17 17:52:11,503-[cfp_fp][36000]XNorm: 18.357355 Training: 2021-03-17 17:52:11,504-[cfp_fp][36000]Accuracy-Flip: 0.91800+-0.01123 Training: 2021-03-17 17:52:11,504-[cfp_fp][36000]Accuracy-Highest: 0.92771 Training: 2021-03-17 17:52:35,315-[agedb_30][36000]XNorm: 21.200215 Training: 2021-03-17 17:52:35,315-[agedb_30][36000]Accuracy-Flip: 0.94517+-0.01037 Training: 2021-03-17 17:52:35,315-[agedb_30][36000]Accuracy-Highest: 0.95317 Training: 2021-03-17 17:52:45,638-Speed 595.41 samples/sec Loss 7.1656 Epoch: 2 Global Step: 36050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:52:56,520-Speed 4705.35 samples/sec Loss 7.1606 Epoch: 2 Global Step: 36100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:53:07,211-Speed 4789.63 samples/sec Loss 7.1602 Epoch: 2 Global Step: 36150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:53:17,822-Speed 4825.21 samples/sec Loss 7.2205 Epoch: 2 Global Step: 36200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:53:28,682-Speed 4714.94 samples/sec Loss 7.2033 Epoch: 2 Global Step: 36250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:53:39,414-Speed 4771.12 samples/sec Loss 7.2546 Epoch: 2 Global Step: 36300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:53:50,238-Speed 4730.33 samples/sec Loss 7.1840 Epoch: 2 Global Step: 36350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:54:00,841-Speed 4829.44 samples/sec Loss 7.1711 Epoch: 2 Global Step: 36400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:54:11,581-Speed 4767.35 samples/sec Loss 7.1784 Epoch: 2 Global Step: 36450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:54:22,267-Speed 4791.75 samples/sec Loss 7.2042 Epoch: 2 Global Step: 36500 Fp16 Grad Scale: 16384 Required: 21 hours 
Training: 2021-03-17 17:54:32,723-Speed 4897.29 samples/sec Loss 7.1259 Epoch: 2 Global Step: 36550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:54:43,461-Speed 4768.18 samples/sec Loss 7.1910 Epoch: 2 Global Step: 36600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:54:54,085-Speed 4819.80 samples/sec Loss 7.2680 Epoch: 2 Global Step: 36650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:55:04,717-Speed 4815.84 samples/sec Loss 7.2478 Epoch: 2 Global Step: 36700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:55:15,565-Speed 4720.25 samples/sec Loss 7.2736 Epoch: 2 Global Step: 36750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:55:26,406-Speed 4722.96 samples/sec Loss 7.1961 Epoch: 2 Global Step: 36800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:55:37,205-Speed 4741.84 samples/sec Loss 7.1979 Epoch: 2 Global Step: 36850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:55:48,049-Speed 4721.60 samples/sec Loss 7.2285 Epoch: 2 Global Step: 36900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:55:58,974-Speed 4686.96 samples/sec Loss 7.2552 Epoch: 2 Global Step: 36950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:56:09,795-Speed 4731.83 samples/sec Loss 7.1741 Epoch: 2 Global Step: 37000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:56:20,761-Speed 4668.95 samples/sec Loss 7.1942 Epoch: 2 Global Step: 37050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:56:31,541-Speed 4750.02 samples/sec Loss 7.2224 Epoch: 2 Global Step: 37100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:56:42,219-Speed 4795.07 samples/sec Loss 7.2203 Epoch: 2 Global Step: 37150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:56:52,837-Speed 4822.17 samples/sec Loss 7.2919 Epoch: 2 Global Step: 37200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 
2021-03-17 17:57:03,688-Speed 4718.74 samples/sec Loss 7.2028 Epoch: 2 Global Step: 37250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:57:14,220-Speed 4862.06 samples/sec Loss 7.2411 Epoch: 2 Global Step: 37300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:57:24,778-Speed 4849.37 samples/sec Loss 7.2019 Epoch: 2 Global Step: 37350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:57:35,919-Speed 4595.77 samples/sec Loss 7.2466 Epoch: 2 Global Step: 37400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:57:46,641-Speed 4775.87 samples/sec Loss 7.2470 Epoch: 2 Global Step: 37450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:57:58,442-Speed 4338.85 samples/sec Loss 7.3091 Epoch: 2 Global Step: 37500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:58:09,155-Speed 4779.50 samples/sec Loss 7.2339 Epoch: 2 Global Step: 37550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:58:19,901-Speed 4764.70 samples/sec Loss 7.2013 Epoch: 2 Global Step: 37600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:58:30,424-Speed 4865.70 samples/sec Loss 7.2120 Epoch: 2 Global Step: 37650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:58:41,170-Speed 4764.87 samples/sec Loss 7.2276 Epoch: 2 Global Step: 37700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:58:52,327-Speed 4589.42 samples/sec Loss 7.2215 Epoch: 2 Global Step: 37750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:59:02,967-Speed 4812.48 samples/sec Loss 7.2824 Epoch: 2 Global Step: 37800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:59:14,700-Speed 4363.78 samples/sec Loss 7.1565 Epoch: 2 Global Step: 37850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:59:26,339-Speed 4399.34 samples/sec Loss 7.2802 Epoch: 2 Global Step: 37900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 
17:59:36,798-Speed 4896.00 samples/sec Loss 7.2350 Epoch: 2 Global Step: 37950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 17:59:48,039-Speed 4555.00 samples/sec Loss 7.2301 Epoch: 2 Global Step: 38000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:00:11,921-[lfw][38000]XNorm: 23.363289 Training: 2021-03-17 18:00:11,921-[lfw][38000]Accuracy-Flip: 0.99467+-0.00314 Training: 2021-03-17 18:00:11,921-[lfw][38000]Accuracy-Highest: 0.99617 Training: 2021-03-17 18:00:39,566-[cfp_fp][38000]XNorm: 19.490556 Training: 2021-03-17 18:00:39,566-[cfp_fp][38000]Accuracy-Flip: 0.91486+-0.01390 Training: 2021-03-17 18:00:39,566-[cfp_fp][38000]Accuracy-Highest: 0.92771 Training: 2021-03-17 18:01:03,466-[agedb_30][38000]XNorm: 22.987992 Training: 2021-03-17 18:01:03,466-[agedb_30][38000]Accuracy-Flip: 0.95450+-0.01036 Training: 2021-03-17 18:01:03,467-[agedb_30][38000]Accuracy-Highest: 0.95450 Training: 2021-03-17 18:01:13,967-Speed 595.85 samples/sec Loss 7.2021 Epoch: 2 Global Step: 38050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:01:24,663-Speed 4787.06 samples/sec Loss 7.2680 Epoch: 2 Global Step: 38100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:01:35,475-Speed 4735.54 samples/sec Loss 7.2270 Epoch: 2 Global Step: 38150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:01:46,290-Speed 4734.72 samples/sec Loss 7.2761 Epoch: 2 Global Step: 38200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:01:57,927-Speed 4399.86 samples/sec Loss 7.2345 Epoch: 2 Global Step: 38250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:02:08,711-Speed 4748.11 samples/sec Loss 7.2287 Epoch: 2 Global Step: 38300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:02:19,710-Speed 4655.23 samples/sec Loss 7.2684 Epoch: 2 Global Step: 38350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:02:31,371-Speed 4391.13 samples/sec Loss 7.2941 Epoch: 2 
Global Step: 38400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:02:41,994-Speed 4820.09 samples/sec Loss 7.2575 Epoch: 2 Global Step: 38450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:02:53,591-Speed 4415.26 samples/sec Loss 7.2430 Epoch: 2 Global Step: 38500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:03:04,993-Speed 4490.57 samples/sec Loss 7.2288 Epoch: 2 Global Step: 38550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:03:15,786-Speed 4744.13 samples/sec Loss 7.2538 Epoch: 2 Global Step: 38600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:03:26,650-Speed 4713.05 samples/sec Loss 7.2372 Epoch: 2 Global Step: 38650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:03:37,346-Speed 4787.12 samples/sec Loss 7.3095 Epoch: 2 Global Step: 38700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:03:48,028-Speed 4793.20 samples/sec Loss 7.3057 Epoch: 2 Global Step: 38750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:03:58,763-Speed 4769.70 samples/sec Loss 7.2523 Epoch: 2 Global Step: 38800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:04:09,562-Speed 4741.71 samples/sec Loss 7.2519 Epoch: 2 Global Step: 38850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:04:20,311-Speed 4763.38 samples/sec Loss 7.2052 Epoch: 2 Global Step: 38900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:04:31,259-Speed 4676.81 samples/sec Loss 7.1807 Epoch: 2 Global Step: 38950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:04:42,050-Speed 4745.07 samples/sec Loss 7.2884 Epoch: 2 Global Step: 39000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:04:52,824-Speed 4752.23 samples/sec Loss 7.2769 Epoch: 2 Global Step: 39050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:05:03,721-Speed 4699.11 samples/sec Loss 7.2872 Epoch: 2 Global 
Step: 39100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:05:14,586-Speed 4712.52 samples/sec Loss 7.2061 Epoch: 2 Global Step: 39150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:05:25,276-Speed 4790.09 samples/sec Loss 7.2533 Epoch: 2 Global Step: 39200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:05:36,018-Speed 4766.46 samples/sec Loss 7.3093 Epoch: 2 Global Step: 39250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:05:46,725-Speed 4782.43 samples/sec Loss 7.2837 Epoch: 2 Global Step: 39300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:05:57,275-Speed 4853.47 samples/sec Loss 7.3055 Epoch: 2 Global Step: 39350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:06:07,682-Speed 4919.95 samples/sec Loss 7.2369 Epoch: 2 Global Step: 39400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:06:18,173-Speed 4880.77 samples/sec Loss 7.2790 Epoch: 2 Global Step: 39450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:06:28,699-Speed 4864.15 samples/sec Loss 7.2441 Epoch: 2 Global Step: 39500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:06:39,323-Speed 4819.56 samples/sec Loss 7.2299 Epoch: 2 Global Step: 39550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:06:49,962-Speed 4812.95 samples/sec Loss 7.2167 Epoch: 2 Global Step: 39600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:07:00,448-Speed 4883.21 samples/sec Loss 7.2708 Epoch: 2 Global Step: 39650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:07:11,022-Speed 4842.29 samples/sec Loss 7.1977 Epoch: 2 Global Step: 39700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:07:21,478-Speed 4897.11 samples/sec Loss 7.2544 Epoch: 2 Global Step: 39750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:07:32,326-Speed 4719.80 samples/sec Loss 7.2078 Epoch: 2 Global Step: 39800 
Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:07:42,934-Speed 4826.77 samples/sec Loss 7.2542 Epoch: 2 Global Step: 39850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:07:53,481-Speed 4854.79 samples/sec Loss 7.2942 Epoch: 2 Global Step: 39900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:08:04,144-Speed 4801.76 samples/sec Loss 7.3001 Epoch: 2 Global Step: 39950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:08:14,791-Speed 4809.64 samples/sec Loss 7.2631 Epoch: 2 Global Step: 40000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:08:39,236-[lfw][40000]XNorm: 23.456249 Training: 2021-03-17 18:08:39,237-[lfw][40000]Accuracy-Flip: 0.99467+-0.00287 Training: 2021-03-17 18:08:39,237-[lfw][40000]Accuracy-Highest: 0.99617 Training: 2021-03-17 18:09:06,818-[cfp_fp][40000]XNorm: 19.420148 Training: 2021-03-17 18:09:06,819-[cfp_fp][40000]Accuracy-Flip: 0.91986+-0.01307 Training: 2021-03-17 18:09:06,819-[cfp_fp][40000]Accuracy-Highest: 0.92771 Training: 2021-03-17 18:09:30,624-[agedb_30][40000]XNorm: 22.205159 Training: 2021-03-17 18:09:30,624-[agedb_30][40000]Accuracy-Flip: 0.95200+-0.00906 Training: 2021-03-17 18:09:30,624-[agedb_30][40000]Accuracy-Highest: 0.95450 Training: 2021-03-17 18:09:41,242-Speed 592.25 samples/sec Loss 7.2727 Epoch: 2 Global Step: 40050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:09:51,879-Speed 4813.48 samples/sec Loss 7.2910 Epoch: 2 Global Step: 40100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:10:02,517-Speed 4813.18 samples/sec Loss 7.2235 Epoch: 2 Global Step: 40150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:10:13,091-Speed 4842.64 samples/sec Loss 7.2358 Epoch: 2 Global Step: 40200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:10:24,511-Speed 4483.34 samples/sec Loss 7.2560 Epoch: 2 Global Step: 40250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 
2021-03-17 18:10:35,702-Speed 4575.59 samples/sec Loss 7.2119 Epoch: 2 Global Step: 40300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:10:46,341-Speed 4812.59 samples/sec Loss 7.2496 Epoch: 2 Global Step: 40350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:10:57,120-Speed 4750.28 samples/sec Loss 7.2638 Epoch: 2 Global Step: 40400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:11:08,035-Speed 4691.06 samples/sec Loss 7.2605 Epoch: 2 Global Step: 40450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:11:19,544-Speed 4448.80 samples/sec Loss 7.2936 Epoch: 2 Global Step: 40500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:11:30,255-Speed 4780.52 samples/sec Loss 7.2628 Epoch: 2 Global Step: 40550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:11:41,231-Speed 4665.17 samples/sec Loss 7.2699 Epoch: 2 Global Step: 40600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:11:52,843-Speed 4409.21 samples/sec Loss 7.3077 Epoch: 2 Global Step: 40650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:12:04,498-Speed 4393.52 samples/sec Loss 7.3001 Epoch: 2 Global Step: 40700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:12:15,255-Speed 4759.77 samples/sec Loss 7.3083 Epoch: 2 Global Step: 40750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:12:26,178-Speed 4687.66 samples/sec Loss 7.2048 Epoch: 2 Global Step: 40800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:12:36,693-Speed 4869.84 samples/sec Loss 7.2818 Epoch: 2 Global Step: 40850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:12:47,488-Speed 4743.43 samples/sec Loss 7.2555 Epoch: 2 Global Step: 40900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:12:58,408-Speed 4688.83 samples/sec Loss 7.2796 Epoch: 2 Global Step: 40950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 
18:13:09,958-Speed 4432.91 samples/sec Loss 7.2621 Epoch: 2 Global Step: 41000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:13:20,571-Speed 4824.53 samples/sec Loss 7.2768 Epoch: 2 Global Step: 41050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:13:31,029-Speed 4896.38 samples/sec Loss 7.2356 Epoch: 2 Global Step: 41100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:13:41,896-Speed 4711.83 samples/sec Loss 7.2574 Epoch: 2 Global Step: 41150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:13:52,623-Speed 4773.13 samples/sec Loss 7.2824 Epoch: 2 Global Step: 41200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:14:05,895-Speed 3857.93 samples/sec Loss 7.2613 Epoch: 2 Global Step: 41250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:14:16,531-Speed 4814.06 samples/sec Loss 7.2945 Epoch: 2 Global Step: 41300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:14:27,141-Speed 4825.92 samples/sec Loss 7.2615 Epoch: 2 Global Step: 41350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:14:37,797-Speed 4805.08 samples/sec Loss 7.2723 Epoch: 2 Global Step: 41400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:14:48,566-Speed 4754.96 samples/sec Loss 7.2398 Epoch: 2 Global Step: 41450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:14:59,252-Speed 4791.66 samples/sec Loss 7.2490 Epoch: 2 Global Step: 41500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:15:10,025-Speed 4752.74 samples/sec Loss 7.2566 Epoch: 2 Global Step: 41550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:15:20,641-Speed 4823.24 samples/sec Loss 7.2918 Epoch: 2 Global Step: 41600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:15:31,408-Speed 4755.70 samples/sec Loss 7.2340 Epoch: 2 Global Step: 41650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 
18:15:42,124-Speed 4778.35 samples/sec Loss 7.2794 Epoch: 2 Global Step: 41700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:15:52,753-Speed 4817.07 samples/sec Loss 7.2722 Epoch: 2 Global Step: 41750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:16:03,208-Speed 4897.66 samples/sec Loss 7.2484 Epoch: 2 Global Step: 41800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:16:14,113-Speed 4695.50 samples/sec Loss 7.2304 Epoch: 2 Global Step: 41850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:16:25,108-Speed 4656.82 samples/sec Loss 7.3315 Epoch: 2 Global Step: 41900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:16:35,981-Speed 4709.15 samples/sec Loss 7.3106 Epoch: 2 Global Step: 41950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:16:46,667-Speed 4791.54 samples/sec Loss 7.2440 Epoch: 2 Global Step: 42000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:17:10,907-[lfw][42000]XNorm: 22.200451 Training: 2021-03-17 18:17:10,907-[lfw][42000]Accuracy-Flip: 0.99517+-0.00252 Training: 2021-03-17 18:17:10,907-[lfw][42000]Accuracy-Highest: 0.99617 Training: 2021-03-17 18:17:38,473-[cfp_fp][42000]XNorm: 18.058868 Training: 2021-03-17 18:17:38,473-[cfp_fp][42000]Accuracy-Flip: 0.91286+-0.01385 Training: 2021-03-17 18:17:38,473-[cfp_fp][42000]Accuracy-Highest: 0.92771 Training: 2021-03-17 18:18:02,266-[agedb_30][42000]XNorm: 21.188542 Training: 2021-03-17 18:18:02,266-[agedb_30][42000]Accuracy-Flip: 0.94683+-0.01045 Training: 2021-03-17 18:18:02,266-[agedb_30][42000]Accuracy-Highest: 0.95450 Training: 2021-03-17 18:18:13,013-Speed 592.97 samples/sec Loss 7.2530 Epoch: 2 Global Step: 42050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:18:23,618-Speed 4828.13 samples/sec Loss 7.2833 Epoch: 2 Global Step: 42100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:18:34,316-Speed 4786.47 samples/sec Loss 7.3145 Epoch: 2 
Global Step: 42150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:18:45,128-Speed 4735.71 samples/sec Loss 7.2465 Epoch: 2 Global Step: 42200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:18:55,752-Speed 4819.32 samples/sec Loss 7.2147 Epoch: 2 Global Step: 42250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:19:06,271-Speed 4867.79 samples/sec Loss 7.2543 Epoch: 2 Global Step: 42300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:19:16,775-Speed 4874.65 samples/sec Loss 7.2234 Epoch: 2 Global Step: 42350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:19:27,564-Speed 4745.89 samples/sec Loss 7.3027 Epoch: 2 Global Step: 42400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:19:38,097-Speed 4861.24 samples/sec Loss 7.2291 Epoch: 2 Global Step: 42450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:19:48,813-Speed 4778.49 samples/sec Loss 7.2072 Epoch: 2 Global Step: 42500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:19:59,671-Speed 4715.72 samples/sec Loss 7.2232 Epoch: 2 Global Step: 42550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:20:10,195-Speed 4865.21 samples/sec Loss 7.2576 Epoch: 2 Global Step: 42600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:20:20,910-Speed 4778.63 samples/sec Loss 7.2958 Epoch: 2 Global Step: 42650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:20:31,666-Speed 4760.74 samples/sec Loss 7.2618 Epoch: 2 Global Step: 42700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:20:42,408-Speed 4766.58 samples/sec Loss 7.2028 Epoch: 2 Global Step: 42750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:20:53,372-Speed 4670.07 samples/sec Loss 7.2282 Epoch: 2 Global Step: 42800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:21:04,121-Speed 4763.61 samples/sec Loss 7.2374 Epoch: 2 Global 
Step: 42850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:21:15,614-Speed 4455.53 samples/sec Loss 7.2372 Epoch: 2 Global Step: 42900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:21:26,269-Speed 4805.37 samples/sec Loss 7.2343 Epoch: 2 Global Step: 42950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:21:37,045-Speed 4751.70 samples/sec Loss 7.2710 Epoch: 2 Global Step: 43000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:21:47,641-Speed 4832.42 samples/sec Loss 7.2636 Epoch: 2 Global Step: 43050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:21:58,691-Speed 4633.65 samples/sec Loss 7.2234 Epoch: 2 Global Step: 43100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:22:09,757-Speed 4627.11 samples/sec Loss 7.2552 Epoch: 2 Global Step: 43150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:22:20,745-Speed 4659.55 samples/sec Loss 7.2162 Epoch: 2 Global Step: 43200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:22:32,210-Speed 4466.04 samples/sec Loss 7.2830 Epoch: 2 Global Step: 43250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:22:42,858-Speed 4808.66 samples/sec Loss 7.1873 Epoch: 2 Global Step: 43300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:22:53,628-Speed 4754.34 samples/sec Loss 7.2774 Epoch: 2 Global Step: 43350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:23:05,185-Speed 4430.66 samples/sec Loss 7.2929 Epoch: 2 Global Step: 43400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:23:16,551-Speed 4504.91 samples/sec Loss 7.2248 Epoch: 2 Global Step: 43450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:23:27,132-Speed 4839.26 samples/sec Loss 7.2514 Epoch: 2 Global Step: 43500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:23:37,562-Speed 4909.23 samples/sec Loss 7.2760 Epoch: 2 Global Step: 43550 
Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:23:48,118-Speed 4850.81 samples/sec Loss 7.2335 Epoch: 2 Global Step: 43600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:23:58,713-Speed 4832.93 samples/sec Loss 7.2935 Epoch: 2 Global Step: 43650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:24:09,335-Speed 4820.20 samples/sec Loss 7.3165 Epoch: 2 Global Step: 43700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:24:20,898-Speed 4428.26 samples/sec Loss 7.2403 Epoch: 2 Global Step: 43750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:24:31,722-Speed 4730.70 samples/sec Loss 7.2527 Epoch: 2 Global Step: 43800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:24:42,344-Speed 4820.19 samples/sec Loss 7.2157 Epoch: 2 Global Step: 43850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:24:53,010-Speed 4800.95 samples/sec Loss 7.2605 Epoch: 2 Global Step: 43900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:25:03,573-Speed 4847.70 samples/sec Loss 7.2595 Epoch: 2 Global Step: 43950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:25:15,357-Speed 4344.84 samples/sec Loss 7.1827 Epoch: 2 Global Step: 44000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:25:39,215-[lfw][44000]XNorm: 22.208169 Training: 2021-03-17 18:25:39,215-[lfw][44000]Accuracy-Flip: 0.99467+-0.00340 Training: 2021-03-17 18:25:39,215-[lfw][44000]Accuracy-Highest: 0.99617 Training: 2021-03-17 18:26:06,646-[cfp_fp][44000]XNorm: 18.239233 Training: 2021-03-17 18:26:06,646-[cfp_fp][44000]Accuracy-Flip: 0.92843+-0.01397 Training: 2021-03-17 18:26:06,647-[cfp_fp][44000]Accuracy-Highest: 0.92843 Training: 2021-03-17 18:26:30,355-[agedb_30][44000]XNorm: 21.428413 Training: 2021-03-17 18:26:30,355-[agedb_30][44000]Accuracy-Flip: 0.95050+-0.01080 Training: 2021-03-17 18:26:30,355-[agedb_30][44000]Accuracy-Highest: 0.95450 Training: 
2021-03-17 18:26:42,394-Speed 588.26 samples/sec Loss 7.2380 Epoch: 2 Global Step: 44050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:26:53,473-Speed 4621.47 samples/sec Loss 7.2442 Epoch: 2 Global Step: 44100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:27:04,279-Speed 4738.36 samples/sec Loss 7.2706 Epoch: 2 Global Step: 44150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:27:14,954-Speed 4796.79 samples/sec Loss 7.1895 Epoch: 2 Global Step: 44200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:27:25,602-Speed 4808.53 samples/sec Loss 7.2243 Epoch: 2 Global Step: 44250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:27:36,336-Speed 4770.55 samples/sec Loss 7.2646 Epoch: 2 Global Step: 44300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:27:47,177-Speed 4722.97 samples/sec Loss 7.2938 Epoch: 2 Global Step: 44350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:27:57,798-Speed 4820.85 samples/sec Loss 7.2279 Epoch: 2 Global Step: 44400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:28:08,507-Speed 4781.47 samples/sec Loss 7.2540 Epoch: 2 Global Step: 44450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:28:19,278-Speed 4753.81 samples/sec Loss 7.2127 Epoch: 2 Global Step: 44500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:28:29,913-Speed 4814.37 samples/sec Loss 7.2218 Epoch: 2 Global Step: 44550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:28:40,542-Speed 4817.32 samples/sec Loss 7.2229 Epoch: 2 Global Step: 44600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:28:51,316-Speed 4752.43 samples/sec Loss 7.2273 Epoch: 2 Global Step: 44650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:29:02,292-Speed 4665.27 samples/sec Loss 7.2541 Epoch: 2 Global Step: 44700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 
18:29:12,858-Speed 4846.08 samples/sec Loss 7.2194 Epoch: 2 Global Step: 44750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:29:23,422-Speed 4846.68 samples/sec Loss 7.1804 Epoch: 2 Global Step: 44800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:29:34,038-Speed 4823.33 samples/sec Loss 7.2271 Epoch: 2 Global Step: 44850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:29:44,877-Speed 4724.27 samples/sec Loss 7.1918 Epoch: 2 Global Step: 44900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:29:55,481-Speed 4828.53 samples/sec Loss 7.1998 Epoch: 2 Global Step: 44950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:30:06,333-Speed 4718.27 samples/sec Loss 7.2884 Epoch: 2 Global Step: 45000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:30:17,017-Speed 4792.52 samples/sec Loss 7.2679 Epoch: 2 Global Step: 45050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:30:27,847-Speed 4727.83 samples/sec Loss 7.1784 Epoch: 2 Global Step: 45100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:30:38,425-Speed 4840.57 samples/sec Loss 7.2088 Epoch: 2 Global Step: 45150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:30:49,167-Speed 4766.63 samples/sec Loss 7.2549 Epoch: 2 Global Step: 45200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:30:59,652-Speed 4883.72 samples/sec Loss 7.2469 Epoch: 2 Global Step: 45250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:31:10,109-Speed 4896.39 samples/sec Loss 7.1687 Epoch: 2 Global Step: 45300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:31:20,791-Speed 4793.53 samples/sec Loss 7.2360 Epoch: 2 Global Step: 45350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:31:31,551-Speed 4758.51 samples/sec Loss 7.2497 Epoch: 2 Global Step: 45400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 
18:31:42,027-Speed 4887.49 samples/sec Loss 7.2361 Epoch: 2 Global Step: 45450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:31:52,771-Speed 4765.98 samples/sec Loss 7.2055 Epoch: 2 Global Step: 45500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:32:03,474-Speed 4784.21 samples/sec Loss 7.2453 Epoch: 2 Global Step: 45550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:32:14,314-Speed 4723.48 samples/sec Loss 7.2313 Epoch: 2 Global Step: 45600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:32:25,864-Speed 4433.07 samples/sec Loss 7.2198 Epoch: 2 Global Step: 45650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:32:36,616-Speed 4762.48 samples/sec Loss 7.2470 Epoch: 2 Global Step: 45700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:32:47,285-Speed 4799.23 samples/sec Loss 7.1761 Epoch: 2 Global Step: 45750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:32:58,094-Speed 4736.77 samples/sec Loss 7.1674 Epoch: 2 Global Step: 45800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:33:08,972-Speed 4707.09 samples/sec Loss 7.1871 Epoch: 2 Global Step: 45850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:33:19,771-Speed 4741.57 samples/sec Loss 7.2363 Epoch: 2 Global Step: 45900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:33:30,389-Speed 4822.36 samples/sec Loss 7.2880 Epoch: 2 Global Step: 45950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:33:41,040-Speed 4807.58 samples/sec Loss 7.2625 Epoch: 2 Global Step: 46000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:34:05,685-[lfw][46000]XNorm: 23.589080 Training: 2021-03-17 18:34:05,685-[lfw][46000]Accuracy-Flip: 0.99583+-0.00300 Training: 2021-03-17 18:34:05,685-[lfw][46000]Accuracy-Highest: 0.99617 Training: 2021-03-17 18:34:33,350-[cfp_fp][46000]XNorm: 19.488645 Training: 2021-03-17 
18:34:33,351-[cfp_fp][46000]Accuracy-Flip: 0.92943+-0.01563 Training: 2021-03-17 18:34:33,351-[cfp_fp][46000]Accuracy-Highest: 0.92943 Training: 2021-03-17 18:34:57,238-[agedb_30][46000]XNorm: 22.747841 Training: 2021-03-17 18:34:57,238-[agedb_30][46000]Accuracy-Flip: 0.93883+-0.00995 Training: 2021-03-17 18:34:57,238-[agedb_30][46000]Accuracy-Highest: 0.95450 Training: 2021-03-17 18:35:07,861-Speed 589.72 samples/sec Loss 7.1731 Epoch: 2 Global Step: 46050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:35:19,381-Speed 4444.40 samples/sec Loss 7.1572 Epoch: 2 Global Step: 46100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:35:30,838-Speed 4469.16 samples/sec Loss 7.1549 Epoch: 2 Global Step: 46150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:35:41,547-Speed 4781.61 samples/sec Loss 7.2639 Epoch: 2 Global Step: 46200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:35:52,254-Speed 4782.24 samples/sec Loss 7.2450 Epoch: 2 Global Step: 46250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-17 18:36:03,978-Speed 4367.46 samples/sec Loss 7.2331 Epoch: 2 Global Step: 46300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:36:14,769-Speed 4744.93 samples/sec Loss 7.1792 Epoch: 2 Global Step: 46350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:36:25,391-Speed 4820.46 samples/sec Loss 7.1666 Epoch: 2 Global Step: 46400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:36:36,151-Speed 4758.49 samples/sec Loss 7.2216 Epoch: 2 Global Step: 46450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:36:47,656-Speed 4450.60 samples/sec Loss 7.2034 Epoch: 2 Global Step: 46500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:36:58,349-Speed 4788.18 samples/sec Loss 7.2143 Epoch: 2 Global Step: 46550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:37:09,150-Speed 4740.60 samples/sec Loss 7.1930 
Epoch: 2 Global Step: 46600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:37:19,906-Speed 4760.71 samples/sec Loss 7.2127 Epoch: 2 Global Step: 46650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:37:30,781-Speed 4708.37 samples/sec Loss 7.1817 Epoch: 2 Global Step: 46700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:37:42,310-Speed 4441.14 samples/sec Loss 7.1717 Epoch: 2 Global Step: 46750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:37:53,772-Speed 4466.95 samples/sec Loss 7.2753 Epoch: 2 Global Step: 46800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:38:04,984-Speed 4566.87 samples/sec Loss 7.2142 Epoch: 2 Global Step: 46850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:38:15,721-Speed 4768.82 samples/sec Loss 7.1535 Epoch: 2 Global Step: 46900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:38:26,366-Speed 4810.15 samples/sec Loss 7.1931 Epoch: 2 Global Step: 46950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:38:37,070-Speed 4783.62 samples/sec Loss 7.2210 Epoch: 2 Global Step: 47000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:38:47,897-Speed 4729.50 samples/sec Loss 7.2630 Epoch: 2 Global Step: 47050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:38:58,526-Speed 4817.55 samples/sec Loss 7.2168 Epoch: 2 Global Step: 47100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:39:09,200-Speed 4796.96 samples/sec Loss 7.1771 Epoch: 2 Global Step: 47150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:39:19,855-Speed 4805.63 samples/sec Loss 7.1979 Epoch: 2 Global Step: 47200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:39:30,662-Speed 4737.60 samples/sec Loss 7.2681 Epoch: 2 Global Step: 47250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:39:41,285-Speed 4820.12 samples/sec Loss 7.2414 Epoch: 2 
Global Step: 47300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:39:51,957-Speed 4797.87 samples/sec Loss 7.2593 Epoch: 2 Global Step: 47350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:40:02,853-Speed 4699.25 samples/sec Loss 7.2389 Epoch: 2 Global Step: 47400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:40:13,506-Speed 4806.41 samples/sec Loss 7.2065 Epoch: 2 Global Step: 47450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:40:24,216-Speed 4780.67 samples/sec Loss 7.1887 Epoch: 2 Global Step: 47500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:40:35,003-Speed 4746.74 samples/sec Loss 7.1790 Epoch: 2 Global Step: 47550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:40:45,524-Speed 4866.81 samples/sec Loss 7.2049 Epoch: 2 Global Step: 47600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:40:56,108-Speed 4838.01 samples/sec Loss 7.2342 Epoch: 2 Global Step: 47650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:41:06,732-Speed 4819.33 samples/sec Loss 7.2150 Epoch: 2 Global Step: 47700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:41:17,378-Speed 4809.90 samples/sec Loss 7.2431 Epoch: 2 Global Step: 47750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:41:28,065-Speed 4791.07 samples/sec Loss 7.2527 Epoch: 2 Global Step: 47800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:41:38,728-Speed 4802.05 samples/sec Loss 7.2213 Epoch: 2 Global Step: 47850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:41:49,529-Speed 4740.27 samples/sec Loss 7.2034 Epoch: 2 Global Step: 47900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:42:00,294-Speed 4756.76 samples/sec Loss 7.2327 Epoch: 2 Global Step: 47950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:42:10,948-Speed 4806.10 samples/sec Loss 7.1264 Epoch: 2 Global 
Step: 48000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:42:35,342-[lfw][48000]XNorm: 22.611620 Training: 2021-03-17 18:42:35,342-[lfw][48000]Accuracy-Flip: 0.99467+-0.00314 Training: 2021-03-17 18:42:35,342-[lfw][48000]Accuracy-Highest: 0.99617 Training: 2021-03-17 18:43:02,798-[cfp_fp][48000]XNorm: 18.370053 Training: 2021-03-17 18:43:02,798-[cfp_fp][48000]Accuracy-Flip: 0.92029+-0.01468 Training: 2021-03-17 18:43:02,798-[cfp_fp][48000]Accuracy-Highest: 0.92943 Training: 2021-03-17 18:43:26,748-[agedb_30][48000]XNorm: 21.908684 Training: 2021-03-17 18:43:26,748-[agedb_30][48000]Accuracy-Flip: 0.94883+-0.00928 Training: 2021-03-17 18:43:26,748-[agedb_30][48000]Accuracy-Highest: 0.95450 Training: 2021-03-17 18:43:37,228-Speed 593.42 samples/sec Loss 7.1589 Epoch: 2 Global Step: 48050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:43:47,922-Speed 4787.84 samples/sec Loss 7.1784 Epoch: 2 Global Step: 48100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:43:58,797-Speed 4708.26 samples/sec Loss 7.1674 Epoch: 2 Global Step: 48150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:44:09,525-Speed 4773.13 samples/sec Loss 7.2596 Epoch: 2 Global Step: 48200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:44:20,217-Speed 4788.87 samples/sec Loss 7.1688 Epoch: 2 Global Step: 48250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:44:31,218-Speed 4654.57 samples/sec Loss 7.2552 Epoch: 2 Global Step: 48300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:44:42,182-Speed 4669.99 samples/sec Loss 7.1870 Epoch: 2 Global Step: 48350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:44:53,984-Speed 4338.77 samples/sec Loss 7.1800 Epoch: 2 Global Step: 48400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:45:04,657-Speed 4797.22 samples/sec Loss 7.1921 Epoch: 2 Global Step: 48450 Fp16 Grad Scale: 16384 Required: 20 hours 
Training: 2021-03-17 18:45:15,421-Speed 4757.13 samples/sec Loss 7.2226 Epoch: 2 Global Step: 48500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:45:26,293-Speed 4709.70 samples/sec Loss 7.2033 Epoch: 2 Global Step: 48550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:45:37,023-Speed 4771.79 samples/sec Loss 7.1828 Epoch: 2 Global Step: 48600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:45:47,670-Speed 4808.99 samples/sec Loss 7.2497 Epoch: 2 Global Step: 48650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:45:58,422-Speed 4762.18 samples/sec Loss 7.1550 Epoch: 2 Global Step: 48700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:46:09,051-Speed 4817.67 samples/sec Loss 7.1851 Epoch: 2 Global Step: 48750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:46:20,669-Speed 4407.16 samples/sec Loss 7.1266 Epoch: 2 Global Step: 48800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:46:31,477-Speed 4737.50 samples/sec Loss 7.1944 Epoch: 2 Global Step: 48850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:46:43,032-Speed 4431.13 samples/sec Loss 7.1703 Epoch: 2 Global Step: 48900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:46:53,835-Speed 4739.90 samples/sec Loss 7.1975 Epoch: 2 Global Step: 48950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:47:04,573-Speed 4768.40 samples/sec Loss 7.1816 Epoch: 2 Global Step: 49000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:47:15,182-Speed 4826.62 samples/sec Loss 7.1754 Epoch: 2 Global Step: 49050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:47:26,100-Speed 4689.65 samples/sec Loss 7.1845 Epoch: 2 Global Step: 49100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:47:37,777-Speed 4385.28 samples/sec Loss 7.2102 Epoch: 2 Global Step: 49150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 
2021-03-17 18:47:48,334-Speed 4850.04 samples/sec Loss 7.1651 Epoch: 2 Global Step: 49200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:47:58,873-Speed 4858.22 samples/sec Loss 7.1489 Epoch: 2 Global Step: 49250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:48:10,408-Speed 4439.11 samples/sec Loss 7.2263 Epoch: 2 Global Step: 49300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:48:21,169-Speed 4758.12 samples/sec Loss 7.1890 Epoch: 2 Global Step: 49350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:48:31,837-Speed 4799.38 samples/sec Loss 7.2242 Epoch: 2 Global Step: 49400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:48:42,568-Speed 4771.64 samples/sec Loss 7.2313 Epoch: 2 Global Step: 49450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:48:53,295-Speed 4773.30 samples/sec Loss 7.1629 Epoch: 2 Global Step: 49500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:49:04,648-Speed 4509.96 samples/sec Loss 7.2047 Epoch: 2 Global Step: 49550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:49:15,833-Speed 4578.01 samples/sec Loss 7.1908 Epoch: 2 Global Step: 49600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:49:27,195-Speed 4506.60 samples/sec Loss 7.2088 Epoch: 2 Global Step: 49650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:49:38,205-Speed 4650.44 samples/sec Loss 7.2214 Epoch: 2 Global Step: 49700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:49:48,859-Speed 4805.76 samples/sec Loss 7.2097 Epoch: 2 Global Step: 49750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:49:59,597-Speed 4768.52 samples/sec Loss 7.2455 Epoch: 2 Global Step: 49800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:50:10,237-Speed 4812.43 samples/sec Loss 7.1573 Epoch: 2 Global Step: 49850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 
18:50:21,283-Speed 4635.42 samples/sec Loss 7.1334 Epoch: 2 Global Step: 49900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:50:31,890-Speed 4827.24 samples/sec Loss 7.1844 Epoch: 2 Global Step: 49950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:50:42,491-Speed 4830.01 samples/sec Loss 7.2049 Epoch: 2 Global Step: 50000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:51:06,409-[lfw][50000]XNorm: 23.175113 Training: 2021-03-17 18:51:06,409-[lfw][50000]Accuracy-Flip: 0.99533+-0.00356 Training: 2021-03-17 18:51:06,409-[lfw][50000]Accuracy-Highest: 0.99617 Training: 2021-03-17 18:51:34,161-[cfp_fp][50000]XNorm: 19.753257 Training: 2021-03-17 18:51:34,162-[cfp_fp][50000]Accuracy-Flip: 0.93086+-0.00933 Training: 2021-03-17 18:51:34,162-[cfp_fp][50000]Accuracy-Highest: 0.93086 Training: 2021-03-17 18:51:58,105-[agedb_30][50000]XNorm: 22.428776 Training: 2021-03-17 18:51:58,105-[agedb_30][50000]Accuracy-Flip: 0.94850+-0.01081 Training: 2021-03-17 18:51:58,106-[agedb_30][50000]Accuracy-Highest: 0.95450 Training: 2021-03-17 18:52:08,824-Speed 593.06 samples/sec Loss 7.2311 Epoch: 2 Global Step: 50050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:52:32,678-Speed 2146.43 samples/sec Loss 6.7677 Epoch: 3 Global Step: 50100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:52:44,217-Speed 4437.49 samples/sec Loss 6.4385 Epoch: 3 Global Step: 50150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:52:55,669-Speed 4471.25 samples/sec Loss 6.4889 Epoch: 3 Global Step: 50200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:53:07,159-Speed 4456.43 samples/sec Loss 6.5000 Epoch: 3 Global Step: 50250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:53:18,470-Speed 4526.87 samples/sec Loss 6.4858 Epoch: 3 Global Step: 50300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:53:29,569-Speed 4613.36 samples/sec Loss 6.5040 Epoch: 3 
Global Step: 50350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:53:40,739-Speed 4584.22 samples/sec Loss 6.5972 Epoch: 3 Global Step: 50400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:53:52,261-Speed 4443.98 samples/sec Loss 6.5885 Epoch: 3 Global Step: 50450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:54:03,404-Speed 4594.98 samples/sec Loss 6.5887 Epoch: 3 Global Step: 50500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:54:14,550-Speed 4593.82 samples/sec Loss 6.6206 Epoch: 3 Global Step: 50550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:54:25,564-Speed 4649.26 samples/sec Loss 6.6562 Epoch: 3 Global Step: 50600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:54:36,881-Speed 4524.51 samples/sec Loss 6.6682 Epoch: 3 Global Step: 50650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:54:48,205-Speed 4521.74 samples/sec Loss 6.6025 Epoch: 3 Global Step: 50700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:54:59,459-Speed 4549.60 samples/sec Loss 6.6596 Epoch: 3 Global Step: 50750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:55:10,621-Speed 4587.52 samples/sec Loss 6.6567 Epoch: 3 Global Step: 50800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:55:21,781-Speed 4587.80 samples/sec Loss 6.7126 Epoch: 3 Global Step: 50850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:55:33,001-Speed 4563.72 samples/sec Loss 6.7100 Epoch: 3 Global Step: 50900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:55:44,260-Speed 4547.91 samples/sec Loss 6.7146 Epoch: 3 Global Step: 50950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:55:55,639-Speed 4499.92 samples/sec Loss 6.7651 Epoch: 3 Global Step: 51000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:56:08,005-Speed 4140.48 samples/sec Loss 6.7222 Epoch: 3 Global 
Step: 51050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:56:19,325-Speed 4523.45 samples/sec Loss 6.7563 Epoch: 3 Global Step: 51100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:56:30,595-Speed 4543.39 samples/sec Loss 6.7999 Epoch: 3 Global Step: 51150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:56:41,934-Speed 4515.81 samples/sec Loss 6.7813 Epoch: 3 Global Step: 51200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:56:53,255-Speed 4522.53 samples/sec Loss 6.8069 Epoch: 3 Global Step: 51250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:57:04,307-Speed 4633.09 samples/sec Loss 6.7674 Epoch: 3 Global Step: 51300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:57:15,419-Speed 4607.73 samples/sec Loss 6.7818 Epoch: 3 Global Step: 51350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:57:26,559-Speed 4596.36 samples/sec Loss 6.8242 Epoch: 3 Global Step: 51400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:57:37,851-Speed 4534.75 samples/sec Loss 6.8211 Epoch: 3 Global Step: 51450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:57:48,835-Speed 4661.65 samples/sec Loss 6.8756 Epoch: 3 Global Step: 51500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:58:00,472-Speed 4399.95 samples/sec Loss 6.8003 Epoch: 3 Global Step: 51550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:58:11,509-Speed 4639.09 samples/sec Loss 6.8605 Epoch: 3 Global Step: 51600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:58:23,585-Speed 4240.20 samples/sec Loss 6.8717 Epoch: 3 Global Step: 51650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:58:34,574-Speed 4659.48 samples/sec Loss 6.8348 Epoch: 3 Global Step: 51700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:58:45,936-Speed 4506.65 samples/sec Loss 6.8300 Epoch: 3 Global Step: 51750 
Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:58:57,293-Speed 4508.56 samples/sec Loss 6.8436 Epoch: 3 Global Step: 51800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:59:08,500-Speed 4568.59 samples/sec Loss 6.9132 Epoch: 3 Global Step: 51850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:59:19,619-Speed 4605.09 samples/sec Loss 6.9044 Epoch: 3 Global Step: 51900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:59:30,763-Speed 4594.58 samples/sec Loss 6.8962 Epoch: 3 Global Step: 51950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 18:59:42,903-Speed 4217.92 samples/sec Loss 6.8164 Epoch: 3 Global Step: 52000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:00:07,460-[lfw][52000]XNorm: 22.704713 Training: 2021-03-17 19:00:07,460-[lfw][52000]Accuracy-Flip: 0.99450+-0.00289 Training: 2021-03-17 19:00:07,462-[lfw][52000]Accuracy-Highest: 0.99617 Training: 2021-03-17 19:00:34,959-[cfp_fp][52000]XNorm: 18.485281 Training: 2021-03-17 19:00:34,959-[cfp_fp][52000]Accuracy-Flip: 0.91514+-0.01112 Training: 2021-03-17 19:00:34,959-[cfp_fp][52000]Accuracy-Highest: 0.93086 Training: 2021-03-17 19:00:58,735-[agedb_30][52000]XNorm: 21.813977 Training: 2021-03-17 19:00:58,735-[agedb_30][52000]Accuracy-Flip: 0.94933+-0.00995 Training: 2021-03-17 19:00:58,735-[agedb_30][52000]Accuracy-Highest: 0.95450 Training: 2021-03-17 19:01:10,546-Speed 584.19 samples/sec Loss 6.9038 Epoch: 3 Global Step: 52050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:01:21,467-Speed 4688.66 samples/sec Loss 6.9321 Epoch: 3 Global Step: 52100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:01:32,671-Speed 4569.77 samples/sec Loss 6.8834 Epoch: 3 Global Step: 52150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:01:43,608-Speed 4682.02 samples/sec Loss 6.9308 Epoch: 3 Global Step: 52200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 
2021-03-17 19:01:55,304-Speed 4377.64 samples/sec Loss 6.8753 Epoch: 3 Global Step: 52250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:02:06,876-Speed 4424.96 samples/sec Loss 6.9013 Epoch: 3 Global Step: 52300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:02:18,726-Speed 4321.10 samples/sec Loss 6.9152 Epoch: 3 Global Step: 52350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:02:29,793-Speed 4626.43 samples/sec Loss 6.8888 Epoch: 3 Global Step: 52400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:02:40,843-Speed 4633.76 samples/sec Loss 6.8825 Epoch: 3 Global Step: 52450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:02:51,787-Speed 4678.59 samples/sec Loss 6.8934 Epoch: 3 Global Step: 52500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:03:02,694-Speed 4694.50 samples/sec Loss 6.8996 Epoch: 3 Global Step: 52550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:03:13,692-Speed 4656.05 samples/sec Loss 6.9139 Epoch: 3 Global Step: 52600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:03:24,701-Speed 4650.76 samples/sec Loss 6.9809 Epoch: 3 Global Step: 52650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:03:35,629-Speed 4685.41 samples/sec Loss 6.9658 Epoch: 3 Global Step: 52700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:03:46,871-Speed 4554.89 samples/sec Loss 6.9615 Epoch: 3 Global Step: 52750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:03:57,641-Speed 4754.40 samples/sec Loss 6.9458 Epoch: 3 Global Step: 52800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:04:08,612-Speed 4667.07 samples/sec Loss 6.9919 Epoch: 3 Global Step: 52850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:04:19,677-Speed 4627.34 samples/sec Loss 6.9261 Epoch: 3 Global Step: 52900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 
19:04:30,887-Speed 4567.67 samples/sec Loss 6.9857 Epoch: 3 Global Step: 52950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:04:42,088-Speed 4571.60 samples/sec Loss 6.9370 Epoch: 3 Global Step: 53000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:04:53,136-Speed 4634.54 samples/sec Loss 6.9691 Epoch: 3 Global Step: 53050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:05:04,225-Speed 4617.50 samples/sec Loss 6.9314 Epoch: 3 Global Step: 53100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:05:15,376-Speed 4591.60 samples/sec Loss 6.9820 Epoch: 3 Global Step: 53150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:05:26,696-Speed 4523.65 samples/sec Loss 6.9896 Epoch: 3 Global Step: 53200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:05:37,405-Speed 4781.17 samples/sec Loss 6.9776 Epoch: 3 Global Step: 53250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:05:48,397-Speed 4658.41 samples/sec Loss 7.0029 Epoch: 3 Global Step: 53300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:05:59,316-Speed 4689.35 samples/sec Loss 7.0414 Epoch: 3 Global Step: 53350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:06:10,277-Speed 4671.64 samples/sec Loss 7.0443 Epoch: 3 Global Step: 53400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:06:21,317-Speed 4637.91 samples/sec Loss 7.0274 Epoch: 3 Global Step: 53450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:06:32,116-Speed 4741.54 samples/sec Loss 7.0331 Epoch: 3 Global Step: 53500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:06:43,010-Speed 4699.92 samples/sec Loss 7.0504 Epoch: 3 Global Step: 53550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:06:53,996-Speed 4660.89 samples/sec Loss 6.9835 Epoch: 3 Global Step: 53600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 
19:07:06,272-Speed 4171.01 samples/sec Loss 7.0451 Epoch: 3 Global Step: 53650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:07:17,018-Speed 4764.58 samples/sec Loss 6.9716 Epoch: 3 Global Step: 53700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:07:27,991-Speed 4666.45 samples/sec Loss 6.9864 Epoch: 3 Global Step: 53750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:07:38,884-Speed 4700.45 samples/sec Loss 6.9031 Epoch: 3 Global Step: 53800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:07:50,086-Speed 4570.91 samples/sec Loss 6.9977 Epoch: 3 Global Step: 53850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:08:01,097-Speed 4650.61 samples/sec Loss 6.9933 Epoch: 3 Global Step: 53900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:08:12,006-Speed 4693.34 samples/sec Loss 7.0029 Epoch: 3 Global Step: 53950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:08:23,159-Speed 4591.07 samples/sec Loss 7.0164 Epoch: 3 Global Step: 54000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:08:47,648-[lfw][54000]XNorm: 22.731956 Training: 2021-03-17 19:08:47,649-[lfw][54000]Accuracy-Flip: 0.99533+-0.00348 Training: 2021-03-17 19:08:47,649-[lfw][54000]Accuracy-Highest: 0.99617 Training: 2021-03-17 19:09:15,564-[cfp_fp][54000]XNorm: 18.409866 Training: 2021-03-17 19:09:15,565-[cfp_fp][54000]Accuracy-Flip: 0.91986+-0.01199 Training: 2021-03-17 19:09:15,565-[cfp_fp][54000]Accuracy-Highest: 0.93086 Training: 2021-03-17 19:09:39,696-[agedb_30][54000]XNorm: 21.403002 Training: 2021-03-17 19:09:39,696-[agedb_30][54000]Accuracy-Flip: 0.95033+-0.01140 Training: 2021-03-17 19:09:39,696-[agedb_30][54000]Accuracy-Highest: 0.95450 Training: 2021-03-17 19:09:50,393-Speed 586.93 samples/sec Loss 7.0364 Epoch: 3 Global Step: 54050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:10:01,473-Speed 4621.55 samples/sec Loss 7.0218 Epoch: 3 
Global Step: 54100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:10:12,219-Speed 4765.11 samples/sec Loss 7.0478 Epoch: 3 Global Step: 54150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:10:23,108-Speed 4702.08 samples/sec Loss 7.0028 Epoch: 3 Global Step: 54200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:10:34,420-Speed 4526.64 samples/sec Loss 7.0300 Epoch: 3 Global Step: 54250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:10:45,084-Speed 4801.12 samples/sec Loss 6.9859 Epoch: 3 Global Step: 54300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:10:57,908-Speed 3992.68 samples/sec Loss 7.0202 Epoch: 3 Global Step: 54350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:11:09,135-Speed 4560.73 samples/sec Loss 6.9977 Epoch: 3 Global Step: 54400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:11:20,127-Speed 4658.52 samples/sec Loss 7.0354 Epoch: 3 Global Step: 54450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:11:31,010-Speed 4704.67 samples/sec Loss 7.0469 Epoch: 3 Global Step: 54500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:11:42,111-Speed 4612.84 samples/sec Loss 7.0200 Epoch: 3 Global Step: 54550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:11:53,045-Speed 4683.07 samples/sec Loss 7.0841 Epoch: 3 Global Step: 54600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:12:03,822-Speed 4751.01 samples/sec Loss 7.0465 Epoch: 3 Global Step: 54650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:12:14,946-Speed 4603.16 samples/sec Loss 7.0278 Epoch: 3 Global Step: 54700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:12:26,774-Speed 4328.99 samples/sec Loss 7.0524 Epoch: 3 Global Step: 54750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:12:37,906-Speed 4599.85 samples/sec Loss 7.0109 Epoch: 3 Global 
Step: 54800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:12:49,967-Speed 4245.46 samples/sec Loss 6.9786 Epoch: 3 Global Step: 54850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:13:00,896-Speed 4684.95 samples/sec Loss 7.0272 Epoch: 3 Global Step: 54900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:13:13,525-Speed 4054.64 samples/sec Loss 7.0052 Epoch: 3 Global Step: 54950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:13:24,604-Speed 4621.36 samples/sec Loss 7.0817 Epoch: 3 Global Step: 55000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:13:36,592-Speed 4271.29 samples/sec Loss 7.0087 Epoch: 3 Global Step: 55050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:13:47,550-Speed 4672.79 samples/sec Loss 7.0928 Epoch: 3 Global Step: 55100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:13:58,459-Speed 4693.54 samples/sec Loss 7.0390 Epoch: 3 Global Step: 55150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:14:09,545-Speed 4618.98 samples/sec Loss 7.0163 Epoch: 3 Global Step: 55200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:14:20,347-Speed 4740.03 samples/sec Loss 7.1011 Epoch: 3 Global Step: 55250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:14:31,132-Speed 4747.63 samples/sec Loss 7.0696 Epoch: 3 Global Step: 55300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:14:42,002-Speed 4710.51 samples/sec Loss 7.1482 Epoch: 3 Global Step: 55350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:14:52,859-Speed 4716.18 samples/sec Loss 7.0382 Epoch: 3 Global Step: 55400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:15:03,829-Speed 4667.76 samples/sec Loss 7.0857 Epoch: 3 Global Step: 55450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:15:14,911-Speed 4620.41 samples/sec Loss 7.0712 Epoch: 3 Global Step: 55500 
Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:15:25,610-Speed 4785.90 samples/sec Loss 7.0660 Epoch: 3 Global Step: 55550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:15:36,622-Speed 4649.75 samples/sec Loss 7.0659 Epoch: 3 Global Step: 55600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:15:47,758-Speed 4598.02 samples/sec Loss 7.0740 Epoch: 3 Global Step: 55650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:15:58,758-Speed 4654.59 samples/sec Loss 6.9956 Epoch: 3 Global Step: 55700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:16:09,641-Speed 4705.05 samples/sec Loss 7.0686 Epoch: 3 Global Step: 55750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:16:20,569-Speed 4685.60 samples/sec Loss 7.0645 Epoch: 3 Global Step: 55800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:16:32,041-Speed 4463.22 samples/sec Loss 7.0385 Epoch: 3 Global Step: 55850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:16:43,153-Speed 4607.89 samples/sec Loss 7.0971 Epoch: 3 Global Step: 55900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:16:53,983-Speed 4727.93 samples/sec Loss 7.0424 Epoch: 3 Global Step: 55950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:17:04,944-Speed 4671.24 samples/sec Loss 7.0992 Epoch: 3 Global Step: 56000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:17:29,587-[lfw][56000]XNorm: 22.468015 Training: 2021-03-17 19:17:29,587-[lfw][56000]Accuracy-Flip: 0.99450+-0.00435 Training: 2021-03-17 19:17:29,587-[lfw][56000]Accuracy-Highest: 0.99617 Training: 2021-03-17 19:17:57,321-[cfp_fp][56000]XNorm: 18.685604 Training: 2021-03-17 19:17:57,321-[cfp_fp][56000]Accuracy-Flip: 0.91429+-0.01183 Training: 2021-03-17 19:17:57,321-[cfp_fp][56000]Accuracy-Highest: 0.93086 Training: 2021-03-17 19:18:21,438-[agedb_30][56000]XNorm: 21.892714 Training: 2021-03-17 
19:18:21,438-[agedb_30][56000]Accuracy-Flip: 0.94933+-0.01031 Training: 2021-03-17 19:18:21,438-[agedb_30][56000]Accuracy-Highest: 0.95450 Training: 2021-03-17 19:18:32,525-Speed 584.61 samples/sec Loss 7.0844 Epoch: 3 Global Step: 56050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:18:43,797-Speed 4542.43 samples/sec Loss 7.0941 Epoch: 3 Global Step: 56100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:18:54,963-Speed 4585.49 samples/sec Loss 7.0811 Epoch: 3 Global Step: 56150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:19:06,029-Speed 4627.50 samples/sec Loss 7.0257 Epoch: 3 Global Step: 56200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:19:16,897-Speed 4711.08 samples/sec Loss 7.1084 Epoch: 3 Global Step: 56250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:19:27,770-Speed 4709.32 samples/sec Loss 7.0321 Epoch: 3 Global Step: 56300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:19:38,891-Speed 4604.42 samples/sec Loss 7.0683 Epoch: 3 Global Step: 56350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:19:50,166-Speed 4541.08 samples/sec Loss 7.0642 Epoch: 3 Global Step: 56400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:20:01,004-Speed 4724.78 samples/sec Loss 7.0778 Epoch: 3 Global Step: 56450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:20:13,530-Speed 4087.58 samples/sec Loss 7.0813 Epoch: 3 Global Step: 56500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:20:24,485-Speed 4674.24 samples/sec Loss 7.0756 Epoch: 3 Global Step: 56550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:20:35,485-Speed 4654.57 samples/sec Loss 7.1060 Epoch: 3 Global Step: 56600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:20:46,586-Speed 4612.77 samples/sec Loss 7.1023 Epoch: 3 Global Step: 56650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 
2021-03-17 19:20:57,441-Speed 4716.85 samples/sec Loss 7.0454 Epoch: 3 Global Step: 56700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:21:08,197-Speed 4760.71 samples/sec Loss 7.0344 Epoch: 3 Global Step: 56750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:21:19,304-Speed 4609.84 samples/sec Loss 7.1250 Epoch: 3 Global Step: 56800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:21:30,364-Speed 4629.41 samples/sec Loss 7.0971 Epoch: 3 Global Step: 56850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:21:41,347-Speed 4662.11 samples/sec Loss 7.1450 Epoch: 3 Global Step: 56900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:21:52,584-Speed 4556.81 samples/sec Loss 7.0869 Epoch: 3 Global Step: 56950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:22:03,839-Speed 4549.55 samples/sec Loss 7.0700 Epoch: 3 Global Step: 57000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:22:14,732-Speed 4700.65 samples/sec Loss 7.0839 Epoch: 3 Global Step: 57050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:22:26,538-Speed 4337.03 samples/sec Loss 7.1008 Epoch: 3 Global Step: 57100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:22:38,192-Speed 4393.39 samples/sec Loss 7.0959 Epoch: 3 Global Step: 57150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:22:49,077-Speed 4704.08 samples/sec Loss 7.0931 Epoch: 3 Global Step: 57200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:23:00,020-Speed 4679.16 samples/sec Loss 7.0811 Epoch: 3 Global Step: 57250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:23:11,073-Speed 4632.21 samples/sec Loss 7.0660 Epoch: 3 Global Step: 57300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:23:21,989-Speed 4690.59 samples/sec Loss 7.1508 Epoch: 3 Global Step: 57350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 
19:23:33,079-Speed 4617.29 samples/sec Loss 7.0785 Epoch: 3 Global Step: 57400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:23:44,149-Speed 4625.23 samples/sec Loss 7.0572 Epoch: 3 Global Step: 57450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:23:55,068-Speed 4689.47 samples/sec Loss 7.1029 Epoch: 3 Global Step: 57500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:24:06,943-Speed 4311.67 samples/sec Loss 7.0661 Epoch: 3 Global Step: 57550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:24:18,821-Speed 4311.02 samples/sec Loss 7.1224 Epoch: 3 Global Step: 57600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:24:30,471-Speed 4395.16 samples/sec Loss 7.0698 Epoch: 3 Global Step: 57650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:24:42,305-Speed 4326.75 samples/sec Loss 7.0930 Epoch: 3 Global Step: 57700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:24:53,987-Speed 4383.19 samples/sec Loss 7.1036 Epoch: 3 Global Step: 57750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:25:04,806-Speed 4732.68 samples/sec Loss 7.0712 Epoch: 3 Global Step: 57800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:25:15,816-Speed 4650.83 samples/sec Loss 7.0836 Epoch: 3 Global Step: 57850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:25:27,139-Speed 4521.77 samples/sec Loss 7.0810 Epoch: 3 Global Step: 57900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:25:38,118-Speed 4664.10 samples/sec Loss 7.0455 Epoch: 3 Global Step: 57950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:25:49,091-Speed 4666.16 samples/sec Loss 7.0181 Epoch: 3 Global Step: 58000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:26:13,067-[lfw][58000]XNorm: 21.955866 Training: 2021-03-17 19:26:13,068-[lfw][58000]Accuracy-Flip: 0.99567+-0.00200 Training: 2021-03-17 
19:26:13,068-[lfw][58000]Accuracy-Highest: 0.99617 Training: 2021-03-17 19:26:40,652-[cfp_fp][58000]XNorm: 18.326392 Training: 2021-03-17 19:26:40,653-[cfp_fp][58000]Accuracy-Flip: 0.92000+-0.01340 Training: 2021-03-17 19:26:40,653-[cfp_fp][58000]Accuracy-Highest: 0.93086 Training: 2021-03-17 19:27:04,517-[agedb_30][58000]XNorm: 21.134455 Training: 2021-03-17 19:27:04,517-[agedb_30][58000]Accuracy-Flip: 0.94450+-0.01293 Training: 2021-03-17 19:27:04,517-[agedb_30][58000]Accuracy-Highest: 0.95450 Training: 2021-03-17 19:27:15,486-Speed 592.63 samples/sec Loss 7.0678 Epoch: 3 Global Step: 58050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:27:26,260-Speed 4752.20 samples/sec Loss 7.1854 Epoch: 3 Global Step: 58100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:27:37,508-Speed 4552.41 samples/sec Loss 7.0946 Epoch: 3 Global Step: 58150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:27:48,525-Speed 4647.68 samples/sec Loss 7.0743 Epoch: 3 Global Step: 58200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:27:59,514-Speed 4659.60 samples/sec Loss 7.1076 Epoch: 3 Global Step: 58250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:28:10,648-Speed 4598.49 samples/sec Loss 7.0455 Epoch: 3 Global Step: 58300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:28:21,746-Speed 4613.81 samples/sec Loss 7.1343 Epoch: 3 Global Step: 58350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:28:32,982-Speed 4557.11 samples/sec Loss 7.0976 Epoch: 3 Global Step: 58400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:28:43,782-Speed 4741.16 samples/sec Loss 7.1331 Epoch: 3 Global Step: 58450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:28:54,689-Speed 4694.53 samples/sec Loss 7.0414 Epoch: 3 Global Step: 58500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:29:05,667-Speed 4664.44 samples/sec Loss 7.0714 Epoch: 
3 Global Step: 58550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:29:16,651-Speed 4661.39 samples/sec Loss 7.1146 Epoch: 3 Global Step: 58600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:29:27,364-Speed 4779.58 samples/sec Loss 7.1147 Epoch: 3 Global Step: 58650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:29:38,124-Speed 4758.71 samples/sec Loss 7.1344 Epoch: 3 Global Step: 58700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:29:49,288-Speed 4586.22 samples/sec Loss 7.0814 Epoch: 3 Global Step: 58750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:30:00,552-Speed 4545.92 samples/sec Loss 7.1016 Epoch: 3 Global Step: 58800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:30:11,863-Speed 4526.56 samples/sec Loss 7.0371 Epoch: 3 Global Step: 58850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:30:22,967-Speed 4611.31 samples/sec Loss 7.0691 Epoch: 3 Global Step: 58900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:30:33,953-Speed 4660.79 samples/sec Loss 7.1005 Epoch: 3 Global Step: 58950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:30:44,907-Speed 4674.46 samples/sec Loss 7.0910 Epoch: 3 Global Step: 59000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:30:55,930-Speed 4645.14 samples/sec Loss 7.0845 Epoch: 3 Global Step: 59050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:31:06,908-Speed 4664.30 samples/sec Loss 7.1339 Epoch: 3 Global Step: 59100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:31:17,951-Speed 4636.60 samples/sec Loss 7.0620 Epoch: 3 Global Step: 59150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:31:29,008-Speed 4631.00 samples/sec Loss 7.0930 Epoch: 3 Global Step: 59200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:31:41,072-Speed 4244.10 samples/sec Loss 7.1104 Epoch: 3 Global 
Step: 59250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:31:51,912-Speed 4723.58 samples/sec Loss 7.1330 Epoch: 3 Global Step: 59300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:32:02,889-Speed 4664.46 samples/sec Loss 7.1154 Epoch: 3 Global Step: 59350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:32:13,624-Speed 4770.07 samples/sec Loss 7.0840 Epoch: 3 Global Step: 59400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:32:24,281-Speed 4804.90 samples/sec Loss 7.0988 Epoch: 3 Global Step: 59450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:32:35,138-Speed 4716.34 samples/sec Loss 7.0844 Epoch: 3 Global Step: 59500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:32:45,949-Speed 4736.04 samples/sec Loss 7.1462 Epoch: 3 Global Step: 59550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:32:56,863-Speed 4691.45 samples/sec Loss 7.0963 Epoch: 3 Global Step: 59600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:33:07,981-Speed 4605.48 samples/sec Loss 7.0737 Epoch: 3 Global Step: 59650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:33:19,008-Speed 4643.73 samples/sec Loss 7.0917 Epoch: 3 Global Step: 59700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:33:29,940-Speed 4683.75 samples/sec Loss 7.0456 Epoch: 3 Global Step: 59750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:33:41,017-Speed 4622.56 samples/sec Loss 7.1383 Epoch: 3 Global Step: 59800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:33:51,741-Speed 4774.55 samples/sec Loss 7.0922 Epoch: 3 Global Step: 59850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:34:04,220-Speed 4103.20 samples/sec Loss 7.0698 Epoch: 3 Global Step: 59900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:34:15,253-Speed 4640.71 samples/sec Loss 7.1102 Epoch: 3 Global Step: 59950 
Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:34:26,067-Speed 4735.06 samples/sec Loss 7.1264 Epoch: 3 Global Step: 60000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:34:50,281-[lfw][60000]XNorm: 21.288025 Training: 2021-03-17 19:34:50,281-[lfw][60000]Accuracy-Flip: 0.99467+-0.00323 Training: 2021-03-17 19:34:50,281-[lfw][60000]Accuracy-Highest: 0.99617 Training: 2021-03-17 19:35:17,803-[cfp_fp][60000]XNorm: 17.389225 Training: 2021-03-17 19:35:17,803-[cfp_fp][60000]Accuracy-Flip: 0.92871+-0.01137 Training: 2021-03-17 19:35:17,803-[cfp_fp][60000]Accuracy-Highest: 0.93086 Training: 2021-03-17 19:35:42,013-[agedb_30][60000]XNorm: 20.349388 Training: 2021-03-17 19:35:42,013-[agedb_30][60000]Accuracy-Flip: 0.95217+-0.01003 Training: 2021-03-17 19:35:42,013-[agedb_30][60000]Accuracy-Highest: 0.95450 Training: 2021-03-17 19:35:53,160-Speed 587.89 samples/sec Loss 7.0607 Epoch: 3 Global Step: 60050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:36:04,344-Speed 4578.09 samples/sec Loss 7.0726 Epoch: 3 Global Step: 60100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:36:15,644-Speed 4531.20 samples/sec Loss 7.0278 Epoch: 3 Global Step: 60150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:36:26,634-Speed 4658.97 samples/sec Loss 7.0982 Epoch: 3 Global Step: 60200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:36:37,487-Speed 4717.94 samples/sec Loss 7.1076 Epoch: 3 Global Step: 60250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:36:49,040-Speed 4432.26 samples/sec Loss 7.0508 Epoch: 3 Global Step: 60300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:37:01,699-Speed 4044.82 samples/sec Loss 7.0343 Epoch: 3 Global Step: 60350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:37:13,335-Speed 4400.58 samples/sec Loss 7.0338 Epoch: 3 Global Step: 60400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 
2021-03-17 19:37:25,202-Speed 4314.59 samples/sec Loss 7.0635 Epoch: 3 Global Step: 60450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:37:35,921-Speed 4777.40 samples/sec Loss 7.0839 Epoch: 3 Global Step: 60500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:37:46,761-Speed 4723.26 samples/sec Loss 7.0420 Epoch: 3 Global Step: 60550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:37:57,574-Speed 4735.68 samples/sec Loss 7.1251 Epoch: 3 Global Step: 60600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:38:08,370-Speed 4742.69 samples/sec Loss 7.0618 Epoch: 3 Global Step: 60650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:38:19,454-Speed 4619.56 samples/sec Loss 7.0757 Epoch: 3 Global Step: 60700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:38:30,613-Speed 4588.63 samples/sec Loss 7.0922 Epoch: 3 Global Step: 60750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:38:42,028-Speed 4485.57 samples/sec Loss 7.0705 Epoch: 3 Global Step: 60800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:38:53,005-Speed 4664.46 samples/sec Loss 7.0916 Epoch: 3 Global Step: 60850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-17 19:39:04,079-Speed 4623.98 samples/sec Loss 7.0493 Epoch: 3 Global Step: 60900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:39:15,056-Speed 4664.31 samples/sec Loss 7.0804 Epoch: 3 Global Step: 60950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:39:25,798-Speed 4766.90 samples/sec Loss 7.1345 Epoch: 3 Global Step: 61000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:39:36,834-Speed 4639.55 samples/sec Loss 7.1070 Epoch: 3 Global Step: 61050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:39:47,816-Speed 4662.57 samples/sec Loss 7.1171 Epoch: 3 Global Step: 61100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 
19:39:58,888-Speed 4624.26 samples/sec Loss 7.0672 Epoch: 3 Global Step: 61150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:40:10,007-Speed 4605.07 samples/sec Loss 7.0962 Epoch: 3 Global Step: 61200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:40:20,680-Speed 4797.55 samples/sec Loss 7.0775 Epoch: 3 Global Step: 61250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:40:31,628-Speed 4677.25 samples/sec Loss 6.9813 Epoch: 3 Global Step: 61300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:40:42,675-Speed 4634.95 samples/sec Loss 7.0927 Epoch: 3 Global Step: 61350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:40:53,972-Speed 4532.62 samples/sec Loss 7.1084 Epoch: 3 Global Step: 61400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:41:04,800-Speed 4728.56 samples/sec Loss 7.1052 Epoch: 3 Global Step: 61450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:41:15,807-Speed 4652.07 samples/sec Loss 7.0434 Epoch: 3 Global Step: 61500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:41:27,054-Speed 4552.60 samples/sec Loss 7.0944 Epoch: 3 Global Step: 61550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:41:38,102-Speed 4634.33 samples/sec Loss 7.0640 Epoch: 3 Global Step: 61600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:41:49,250-Speed 4593.19 samples/sec Loss 7.1114 Epoch: 3 Global Step: 61650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:42:00,272-Speed 4645.63 samples/sec Loss 7.1054 Epoch: 3 Global Step: 61700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:42:11,373-Speed 4612.56 samples/sec Loss 7.1219 Epoch: 3 Global Step: 61750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:42:22,576-Speed 4570.56 samples/sec Loss 7.1270 Epoch: 3 Global Step: 61800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 
19:42:33,523-Speed 4677.47 samples/sec Loss 7.0789 Epoch: 3 Global Step: 61850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:42:44,661-Speed 4596.90 samples/sec Loss 7.1023 Epoch: 3 Global Step: 61900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:42:55,393-Speed 4771.32 samples/sec Loss 7.0517 Epoch: 3 Global Step: 61950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:43:06,264-Speed 4710.04 samples/sec Loss 7.0321 Epoch: 3 Global Step: 62000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:43:30,757-[lfw][62000]XNorm: 24.116111 Training: 2021-03-17 19:43:30,758-[lfw][62000]Accuracy-Flip: 0.99583+-0.00291 Training: 2021-03-17 19:43:30,758-[lfw][62000]Accuracy-Highest: 0.99617 Training: 2021-03-17 19:43:58,505-[cfp_fp][62000]XNorm: 20.285276 Training: 2021-03-17 19:43:58,505-[cfp_fp][62000]Accuracy-Flip: 0.91214+-0.01252 Training: 2021-03-17 19:43:58,505-[cfp_fp][62000]Accuracy-Highest: 0.93086 Training: 2021-03-17 19:44:22,434-[agedb_30][62000]XNorm: 22.808564 Training: 2021-03-17 19:44:22,435-[agedb_30][62000]Accuracy-Flip: 0.95033+-0.00948 Training: 2021-03-17 19:44:22,435-[agedb_30][62000]Accuracy-Highest: 0.95450 Training: 2021-03-17 19:44:33,586-Speed 586.34 samples/sec Loss 7.0314 Epoch: 3 Global Step: 62050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:44:45,476-Speed 4306.21 samples/sec Loss 7.0826 Epoch: 3 Global Step: 62100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:44:56,635-Speed 4588.49 samples/sec Loss 7.0813 Epoch: 3 Global Step: 62150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:45:08,024-Speed 4495.79 samples/sec Loss 7.0562 Epoch: 3 Global Step: 62200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:45:19,148-Speed 4603.20 samples/sec Loss 7.1270 Epoch: 3 Global Step: 62250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:45:30,104-Speed 4673.67 samples/sec Loss 7.1116 Epoch: 3 
Global Step: 62300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:45:41,111-Speed 4651.95 samples/sec Loss 7.0744 Epoch: 3 Global Step: 62350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:45:51,865-Speed 4761.42 samples/sec Loss 7.0910 Epoch: 3 Global Step: 62400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:46:03,113-Speed 4552.06 samples/sec Loss 7.0506 Epoch: 3 Global Step: 62450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:46:14,077-Speed 4670.15 samples/sec Loss 7.0926 Epoch: 3 Global Step: 62500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:46:25,094-Speed 4647.75 samples/sec Loss 7.0844 Epoch: 3 Global Step: 62550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:46:36,201-Speed 4610.08 samples/sec Loss 7.0280 Epoch: 3 Global Step: 62600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:46:47,268-Speed 4626.40 samples/sec Loss 7.0981 Epoch: 3 Global Step: 62650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:46:59,150-Speed 4309.45 samples/sec Loss 7.1064 Epoch: 3 Global Step: 62700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:47:11,021-Speed 4313.19 samples/sec Loss 7.0522 Epoch: 3 Global Step: 62750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:47:22,145-Speed 4603.08 samples/sec Loss 7.1255 Epoch: 3 Global Step: 62800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:47:32,999-Speed 4717.13 samples/sec Loss 7.1305 Epoch: 3 Global Step: 62850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:47:44,061-Speed 4629.17 samples/sec Loss 7.0317 Epoch: 3 Global Step: 62900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:47:54,944-Speed 4704.63 samples/sec Loss 6.9872 Epoch: 3 Global Step: 62950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:48:06,938-Speed 4269.04 samples/sec Loss 7.0460 Epoch: 3 Global 
Step: 63000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:48:18,486-Speed 4433.98 samples/sec Loss 7.0686 Epoch: 3 Global Step: 63050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:48:29,693-Speed 4569.16 samples/sec Loss 7.0697 Epoch: 3 Global Step: 63100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:48:42,296-Speed 4062.90 samples/sec Loss 7.0848 Epoch: 3 Global Step: 63150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:48:53,228-Speed 4683.79 samples/sec Loss 7.0779 Epoch: 3 Global Step: 63200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:49:04,859-Speed 4402.14 samples/sec Loss 7.0316 Epoch: 3 Global Step: 63250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:49:15,843-Speed 4661.62 samples/sec Loss 7.0147 Epoch: 3 Global Step: 63300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:49:26,726-Speed 4705.14 samples/sec Loss 7.1123 Epoch: 3 Global Step: 63350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:49:37,857-Speed 4600.02 samples/sec Loss 7.0803 Epoch: 3 Global Step: 63400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:49:48,895-Speed 4638.77 samples/sec Loss 7.0923 Epoch: 3 Global Step: 63450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:50:00,001-Speed 4610.27 samples/sec Loss 7.0409 Epoch: 3 Global Step: 63500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:50:11,393-Speed 4494.85 samples/sec Loss 7.0548 Epoch: 3 Global Step: 63550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:50:22,208-Speed 4734.33 samples/sec Loss 7.1351 Epoch: 3 Global Step: 63600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:50:33,365-Speed 4589.28 samples/sec Loss 7.0994 Epoch: 3 Global Step: 63650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:50:44,569-Speed 4569.96 samples/sec Loss 7.0729 Epoch: 3 Global Step: 63700 
Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:50:55,633-Speed 4628.01 samples/sec Loss 7.0581 Epoch: 3 Global Step: 63750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:51:06,471-Speed 4724.59 samples/sec Loss 7.0707 Epoch: 3 Global Step: 63800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:51:17,235-Speed 4756.87 samples/sec Loss 7.0977 Epoch: 3 Global Step: 63850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:51:28,189-Speed 4674.31 samples/sec Loss 7.0554 Epoch: 3 Global Step: 63900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:51:39,090-Speed 4697.11 samples/sec Loss 7.0112 Epoch: 3 Global Step: 63950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:51:49,947-Speed 4716.03 samples/sec Loss 7.0379 Epoch: 3 Global Step: 64000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:52:14,150-[lfw][64000]XNorm: 23.228415 Training: 2021-03-17 19:52:14,150-[lfw][64000]Accuracy-Flip: 0.99467+-0.00348 Training: 2021-03-17 19:52:14,150-[lfw][64000]Accuracy-Highest: 0.99617 Training: 2021-03-17 19:52:41,387-[cfp_fp][64000]XNorm: 19.158894 Training: 2021-03-17 19:52:41,387-[cfp_fp][64000]Accuracy-Flip: 0.92400+-0.01022 Training: 2021-03-17 19:52:41,387-[cfp_fp][64000]Accuracy-Highest: 0.93086 Training: 2021-03-17 19:53:05,292-[agedb_30][64000]XNorm: 22.500153 Training: 2021-03-17 19:53:05,292-[agedb_30][64000]Accuracy-Flip: 0.95467+-0.00826 Training: 2021-03-17 19:53:05,292-[agedb_30][64000]Accuracy-Highest: 0.95467 Training: 2021-03-17 19:53:16,233-Speed 593.38 samples/sec Loss 7.0403 Epoch: 3 Global Step: 64050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:53:27,265-Speed 4641.48 samples/sec Loss 7.0547 Epoch: 3 Global Step: 64100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:53:38,368-Speed 4611.57 samples/sec Loss 7.0666 Epoch: 3 Global Step: 64150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 
2021-03-17 19:53:49,491-Speed 4603.52 samples/sec Loss 7.0495 Epoch: 3 Global Step: 64200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:54:00,491-Speed 4655.06 samples/sec Loss 6.9721 Epoch: 3 Global Step: 64250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:54:11,516-Speed 4644.30 samples/sec Loss 7.0883 Epoch: 3 Global Step: 64300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:54:22,357-Speed 4723.52 samples/sec Loss 7.0586 Epoch: 3 Global Step: 64350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:54:33,361-Speed 4653.06 samples/sec Loss 7.0724 Epoch: 3 Global Step: 64400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:54:44,317-Speed 4673.35 samples/sec Loss 7.0799 Epoch: 3 Global Step: 64450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:54:54,954-Speed 4813.81 samples/sec Loss 7.0699 Epoch: 3 Global Step: 64500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:55:05,905-Speed 4675.84 samples/sec Loss 7.0947 Epoch: 3 Global Step: 64550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:55:16,891-Speed 4660.54 samples/sec Loss 7.0877 Epoch: 3 Global Step: 64600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:55:27,901-Speed 4650.47 samples/sec Loss 7.0804 Epoch: 3 Global Step: 64650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:55:38,921-Speed 4646.75 samples/sec Loss 7.1024 Epoch: 3 Global Step: 64700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:55:49,688-Speed 4755.62 samples/sec Loss 7.1011 Epoch: 3 Global Step: 64750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:56:01,733-Speed 4250.76 samples/sec Loss 7.1114 Epoch: 3 Global Step: 64800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:56:12,760-Speed 4643.84 samples/sec Loss 7.1123 Epoch: 3 Global Step: 64850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 
19:56:23,985-Speed 4561.52 samples/sec Loss 7.0529 Epoch: 3 Global Step: 64900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:56:34,952-Speed 4668.90 samples/sec Loss 7.0331 Epoch: 3 Global Step: 64950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:56:46,058-Speed 4610.04 samples/sec Loss 7.0991 Epoch: 3 Global Step: 65000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:56:57,062-Speed 4653.14 samples/sec Loss 7.0745 Epoch: 3 Global Step: 65050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:57:08,084-Speed 4645.68 samples/sec Loss 6.9701 Epoch: 3 Global Step: 65100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:57:19,245-Speed 4587.91 samples/sec Loss 7.0450 Epoch: 3 Global Step: 65150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:57:30,067-Speed 4731.05 samples/sec Loss 7.0457 Epoch: 3 Global Step: 65200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:57:40,957-Speed 4702.30 samples/sec Loss 7.0572 Epoch: 3 Global Step: 65250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:57:51,724-Speed 4755.55 samples/sec Loss 7.0585 Epoch: 3 Global Step: 65300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:58:02,617-Speed 4700.37 samples/sec Loss 7.0945 Epoch: 3 Global Step: 65350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:58:13,472-Speed 4717.12 samples/sec Loss 7.0220 Epoch: 3 Global Step: 65400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:58:25,471-Speed 4267.21 samples/sec Loss 7.0475 Epoch: 3 Global Step: 65450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:58:36,341-Speed 4710.52 samples/sec Loss 7.1140 Epoch: 3 Global Step: 65500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:58:47,571-Speed 4559.45 samples/sec Loss 7.0374 Epoch: 3 Global Step: 65550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 
19:58:59,345-Speed 4348.84 samples/sec Loss 7.0244 Epoch: 3 Global Step: 65600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:59:10,129-Speed 4748.00 samples/sec Loss 7.0722 Epoch: 3 Global Step: 65650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:59:20,974-Speed 4721.58 samples/sec Loss 7.1216 Epoch: 3 Global Step: 65700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:59:33,465-Speed 4099.15 samples/sec Loss 7.1620 Epoch: 3 Global Step: 65750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:59:44,979-Speed 4447.33 samples/sec Loss 7.0244 Epoch: 3 Global Step: 65800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 19:59:55,697-Speed 4777.30 samples/sec Loss 7.0513 Epoch: 3 Global Step: 65850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:00:06,514-Speed 4733.35 samples/sec Loss 7.0689 Epoch: 3 Global Step: 65900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:00:18,898-Speed 4134.67 samples/sec Loss 7.0602 Epoch: 3 Global Step: 65950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:00:30,064-Speed 4585.53 samples/sec Loss 7.0390 Epoch: 3 Global Step: 66000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:00:54,345-[lfw][66000]XNorm: 22.637630 Training: 2021-03-17 20:00:54,346-[lfw][66000]Accuracy-Flip: 0.99567+-0.00260 Training: 2021-03-17 20:00:54,346-[lfw][66000]Accuracy-Highest: 0.99617 Training: 2021-03-17 20:01:21,960-[cfp_fp][66000]XNorm: 18.749096 Training: 2021-03-17 20:01:21,960-[cfp_fp][66000]Accuracy-Flip: 0.90971+-0.01230 Training: 2021-03-17 20:01:21,960-[cfp_fp][66000]Accuracy-Highest: 0.93086 Training: 2021-03-17 20:01:45,716-[agedb_30][66000]XNorm: 21.613135 Training: 2021-03-17 20:01:45,716-[agedb_30][66000]Accuracy-Flip: 0.95367+-0.01110 Training: 2021-03-17 20:01:45,716-[agedb_30][66000]Accuracy-Highest: 0.95467 Training: 2021-03-17 20:01:56,824-Speed 590.14 samples/sec Loss 7.0432 Epoch: 3 
Global Step: 66050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:02:07,696-Speed 4709.47 samples/sec Loss 7.1072 Epoch: 3 Global Step: 66100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:02:18,869-Speed 4582.70 samples/sec Loss 7.0586 Epoch: 3 Global Step: 66150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:02:29,913-Speed 4636.53 samples/sec Loss 6.9977 Epoch: 3 Global Step: 66200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:02:40,704-Speed 4744.98 samples/sec Loss 7.0427 Epoch: 3 Global Step: 66250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:02:52,015-Speed 4526.57 samples/sec Loss 6.9986 Epoch: 3 Global Step: 66300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:03:03,301-Speed 4537.10 samples/sec Loss 7.0405 Epoch: 3 Global Step: 66350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:03:14,302-Speed 4654.35 samples/sec Loss 7.0595 Epoch: 3 Global Step: 66400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:03:25,429-Speed 4601.80 samples/sec Loss 7.0537 Epoch: 3 Global Step: 66450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:03:36,415-Speed 4660.44 samples/sec Loss 7.0680 Epoch: 3 Global Step: 66500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:03:47,335-Speed 4689.22 samples/sec Loss 7.0636 Epoch: 3 Global Step: 66550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:03:58,495-Speed 4588.10 samples/sec Loss 7.0432 Epoch: 3 Global Step: 66600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:04:09,494-Speed 4655.32 samples/sec Loss 7.0492 Epoch: 3 Global Step: 66650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:04:20,516-Speed 4645.61 samples/sec Loss 6.9941 Epoch: 3 Global Step: 66700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:04:31,799-Speed 4537.95 samples/sec Loss 7.0652 Epoch: 3 Global 
Step: 66750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:04:55,667-Speed 2145.23 samples/sec Loss 6.5863 Epoch: 4 Global Step: 66800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:05:06,808-Speed 4595.98 samples/sec Loss 6.3633 Epoch: 4 Global Step: 66850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:05:17,734-Speed 4686.58 samples/sec Loss 6.3872 Epoch: 4 Global Step: 66900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:05:28,371-Speed 4813.63 samples/sec Loss 6.3765 Epoch: 4 Global Step: 66950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:05:39,285-Speed 4691.62 samples/sec Loss 6.3946 Epoch: 4 Global Step: 67000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:05:50,342-Speed 4631.03 samples/sec Loss 6.4280 Epoch: 4 Global Step: 67050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:06:01,287-Speed 4678.33 samples/sec Loss 6.4318 Epoch: 4 Global Step: 67100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:06:12,251-Speed 4669.89 samples/sec Loss 6.4467 Epoch: 4 Global Step: 67150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:06:23,253-Speed 4654.12 samples/sec Loss 6.5000 Epoch: 4 Global Step: 67200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:06:34,031-Speed 4750.81 samples/sec Loss 6.5300 Epoch: 4 Global Step: 67250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:06:44,846-Speed 4734.41 samples/sec Loss 6.4949 Epoch: 4 Global Step: 67300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:06:55,687-Speed 4723.20 samples/sec Loss 6.5207 Epoch: 4 Global Step: 67350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:07:06,687-Speed 4654.52 samples/sec Loss 6.4967 Epoch: 4 Global Step: 67400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:07:17,558-Speed 4710.11 samples/sec Loss 6.5451 Epoch: 4 Global Step: 67450 
Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:07:28,450-Speed 4701.09 samples/sec Loss 6.5469 Epoch: 4 Global Step: 67500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:07:40,670-Speed 4189.90 samples/sec Loss 6.5463 Epoch: 4 Global Step: 67550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:07:51,724-Speed 4632.01 samples/sec Loss 6.5941 Epoch: 4 Global Step: 67600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:08:02,594-Speed 4710.85 samples/sec Loss 6.5930 Epoch: 4 Global Step: 67650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:08:13,405-Speed 4736.05 samples/sec Loss 6.6023 Epoch: 4 Global Step: 67700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:08:24,351-Speed 4677.64 samples/sec Loss 6.6052 Epoch: 4 Global Step: 67750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:08:35,215-Speed 4713.27 samples/sec Loss 6.6224 Epoch: 4 Global Step: 67800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:08:46,037-Speed 4731.35 samples/sec Loss 6.6195 Epoch: 4 Global Step: 67850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:08:56,932-Speed 4699.37 samples/sec Loss 6.6693 Epoch: 4 Global Step: 67900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:09:07,816-Speed 4704.34 samples/sec Loss 6.5952 Epoch: 4 Global Step: 67950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:09:18,736-Speed 4689.31 samples/sec Loss 6.6569 Epoch: 4 Global Step: 68000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:09:43,205-[lfw][68000]XNorm: 23.623714 Training: 2021-03-17 20:09:43,206-[lfw][68000]Accuracy-Flip: 0.99667+-0.00289 Training: 2021-03-17 20:09:43,206-[lfw][68000]Accuracy-Highest: 0.99667 Training: 2021-03-17 20:10:10,914-[cfp_fp][68000]XNorm: 19.500993 Training: 2021-03-17 20:10:10,914-[cfp_fp][68000]Accuracy-Flip: 0.93229+-0.00902 Training: 2021-03-17 
20:10:10,914-[cfp_fp][68000]Accuracy-Highest: 0.93229 Training: 2021-03-17 20:10:34,867-[agedb_30][68000]XNorm: 22.785664 Training: 2021-03-17 20:10:34,867-[agedb_30][68000]Accuracy-Flip: 0.94450+-0.01011 Training: 2021-03-17 20:10:34,867-[agedb_30][68000]Accuracy-Highest: 0.95467 Training: 2021-03-17 20:10:45,722-Speed 588.61 samples/sec Loss 6.7158 Epoch: 4 Global Step: 68050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:10:56,645-Speed 4687.40 samples/sec Loss 6.7360 Epoch: 4 Global Step: 68100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:11:07,644-Speed 4655.13 samples/sec Loss 6.7039 Epoch: 4 Global Step: 68150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:11:19,193-Speed 4433.63 samples/sec Loss 6.6975 Epoch: 4 Global Step: 68200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:11:29,809-Speed 4823.35 samples/sec Loss 6.6798 Epoch: 4 Global Step: 68250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:11:40,679-Speed 4710.71 samples/sec Loss 6.6993 Epoch: 4 Global Step: 68300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:11:51,976-Speed 4532.48 samples/sec Loss 6.6959 Epoch: 4 Global Step: 68350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:12:02,906-Speed 4684.65 samples/sec Loss 6.7473 Epoch: 4 Global Step: 68400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:12:14,367-Speed 4467.56 samples/sec Loss 6.7194 Epoch: 4 Global Step: 68450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:12:25,225-Speed 4715.79 samples/sec Loss 6.7778 Epoch: 4 Global Step: 68500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:12:36,867-Speed 4398.03 samples/sec Loss 6.7510 Epoch: 4 Global Step: 68550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:12:48,493-Speed 4403.92 samples/sec Loss 6.7157 Epoch: 4 Global Step: 68600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 
2021-03-17 20:13:00,196-Speed 4375.21 samples/sec Loss 6.7163 Epoch: 4 Global Step: 68650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:13:11,707-Speed 4448.53 samples/sec Loss 6.8049 Epoch: 4 Global Step: 68700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:13:22,352-Speed 4810.09 samples/sec Loss 6.7475 Epoch: 4 Global Step: 68750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:13:33,139-Speed 4746.58 samples/sec Loss 6.8021 Epoch: 4 Global Step: 68800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:13:44,140-Speed 4654.23 samples/sec Loss 6.8039 Epoch: 4 Global Step: 68850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:13:54,886-Speed 4765.01 samples/sec Loss 6.7585 Epoch: 4 Global Step: 68900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:14:05,509-Speed 4819.97 samples/sec Loss 6.8091 Epoch: 4 Global Step: 68950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:14:16,219-Speed 4780.84 samples/sec Loss 6.8378 Epoch: 4 Global Step: 69000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:14:27,268-Speed 4634.28 samples/sec Loss 6.7719 Epoch: 4 Global Step: 69050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:14:38,297-Speed 4642.30 samples/sec Loss 6.8104 Epoch: 4 Global Step: 69100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:14:49,227-Speed 4684.91 samples/sec Loss 6.8232 Epoch: 4 Global Step: 69150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:15:00,004-Speed 4751.14 samples/sec Loss 6.7833 Epoch: 4 Global Step: 69200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:15:10,950-Speed 4677.39 samples/sec Loss 6.8082 Epoch: 4 Global Step: 69250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:15:21,863-Speed 4692.08 samples/sec Loss 6.8358 Epoch: 4 Global Step: 69300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 
20:15:32,766-Speed 4696.55 samples/sec Loss 6.8442 Epoch: 4 Global Step: 69350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:15:43,667-Speed 4696.91 samples/sec Loss 6.8407 Epoch: 4 Global Step: 69400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:15:54,514-Speed 4720.43 samples/sec Loss 6.8300 Epoch: 4 Global Step: 69450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:16:05,254-Speed 4767.54 samples/sec Loss 6.8521 Epoch: 4 Global Step: 69500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:16:15,867-Speed 4824.40 samples/sec Loss 6.8386 Epoch: 4 Global Step: 69550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:16:26,640-Speed 4753.23 samples/sec Loss 6.7908 Epoch: 4 Global Step: 69600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:16:37,592-Speed 4674.97 samples/sec Loss 6.8797 Epoch: 4 Global Step: 69650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:16:48,301-Speed 4781.75 samples/sec Loss 6.8578 Epoch: 4 Global Step: 69700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:16:59,128-Speed 4729.33 samples/sec Loss 6.9034 Epoch: 4 Global Step: 69750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:17:10,260-Speed 4599.44 samples/sec Loss 6.9460 Epoch: 4 Global Step: 69800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:17:21,059-Speed 4741.40 samples/sec Loss 6.8760 Epoch: 4 Global Step: 69850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:17:31,768-Speed 4781.46 samples/sec Loss 6.9195 Epoch: 4 Global Step: 69900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:17:42,600-Speed 4726.94 samples/sec Loss 6.8705 Epoch: 4 Global Step: 69950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:17:53,562-Speed 4670.94 samples/sec Loss 6.8432 Epoch: 4 Global Step: 70000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 
20:18:17,503-[lfw][70000]XNorm: 23.308796 Training: 2021-03-17 20:18:17,503-[lfw][70000]Accuracy-Flip: 0.99450+-0.00289 Training: 2021-03-17 20:18:17,503-[lfw][70000]Accuracy-Highest: 0.99667 Training: 2021-03-17 20:18:45,253-[cfp_fp][70000]XNorm: 19.372478 Training: 2021-03-17 20:18:45,254-[cfp_fp][70000]Accuracy-Flip: 0.92500+-0.01229 Training: 2021-03-17 20:18:45,254-[cfp_fp][70000]Accuracy-Highest: 0.93229 Training: 2021-03-17 20:19:09,020-[agedb_30][70000]XNorm: 22.300525 Training: 2021-03-17 20:19:09,021-[agedb_30][70000]Accuracy-Flip: 0.95350+-0.00920 Training: 2021-03-17 20:19:09,021-[agedb_30][70000]Accuracy-Highest: 0.95467 Training: 2021-03-17 20:19:19,665-Speed 594.64 samples/sec Loss 6.9006 Epoch: 4 Global Step: 70050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:19:30,435-Speed 4754.10 samples/sec Loss 6.8887 Epoch: 4 Global Step: 70100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:19:41,087-Speed 4806.64 samples/sec Loss 6.8830 Epoch: 4 Global Step: 70150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:19:52,050-Speed 4670.68 samples/sec Loss 6.8540 Epoch: 4 Global Step: 70200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:20:04,007-Speed 4282.23 samples/sec Loss 6.9696 Epoch: 4 Global Step: 70250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:20:14,882-Speed 4708.42 samples/sec Loss 6.8459 Epoch: 4 Global Step: 70300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:20:25,602-Speed 4776.15 samples/sec Loss 6.8951 Epoch: 4 Global Step: 70350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:20:36,499-Speed 4698.88 samples/sec Loss 6.8424 Epoch: 4 Global Step: 70400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:20:47,460-Speed 4671.56 samples/sec Loss 6.8662 Epoch: 4 Global Step: 70450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:20:57,896-Speed 4906.12 samples/sec Loss 6.8960 Epoch: 4 
Global Step: 70500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:21:08,630-Speed 4770.36 samples/sec Loss 6.9166 Epoch: 4 Global Step: 70550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:21:19,338-Speed 4781.92 samples/sec Loss 6.9270 Epoch: 4 Global Step: 70600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:21:30,247-Speed 4693.74 samples/sec Loss 6.8910 Epoch: 4 Global Step: 70650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:21:40,964-Speed 4777.93 samples/sec Loss 6.9100 Epoch: 4 Global Step: 70700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:21:51,908-Speed 4678.43 samples/sec Loss 6.9381 Epoch: 4 Global Step: 70750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:22:02,486-Speed 4840.82 samples/sec Loss 6.9294 Epoch: 4 Global Step: 70800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:22:13,370-Speed 4704.38 samples/sec Loss 6.9382 Epoch: 4 Global Step: 70850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:22:24,790-Speed 4483.50 samples/sec Loss 6.9476 Epoch: 4 Global Step: 70900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:22:35,336-Speed 4855.06 samples/sec Loss 6.9519 Epoch: 4 Global Step: 70950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:22:45,850-Speed 4870.10 samples/sec Loss 6.9300 Epoch: 4 Global Step: 71000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:22:56,603-Speed 4761.78 samples/sec Loss 6.9190 Epoch: 4 Global Step: 71050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:23:08,129-Speed 4442.43 samples/sec Loss 6.9327 Epoch: 4 Global Step: 71100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:23:18,786-Speed 4804.41 samples/sec Loss 6.8986 Epoch: 4 Global Step: 71150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:23:29,626-Speed 4723.43 samples/sec Loss 6.9459 Epoch: 4 Global 
Step: 71200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:23:40,401-Speed 4752.04 samples/sec Loss 6.9005 Epoch: 4 Global Step: 71250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:23:50,977-Speed 4841.95 samples/sec Loss 6.9324 Epoch: 4 Global Step: 71300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:24:01,749-Speed 4753.29 samples/sec Loss 6.8989 Epoch: 4 Global Step: 71350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:24:12,220-Speed 4889.83 samples/sec Loss 6.8845 Epoch: 4 Global Step: 71400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:24:23,460-Speed 4555.78 samples/sec Loss 6.9175 Epoch: 4 Global Step: 71450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:24:35,027-Speed 4426.72 samples/sec Loss 6.9134 Epoch: 4 Global Step: 71500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:24:46,589-Speed 4428.40 samples/sec Loss 6.9394 Epoch: 4 Global Step: 71550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:24:57,190-Speed 4830.28 samples/sec Loss 6.9470 Epoch: 4 Global Step: 71600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:25:07,810-Speed 4821.40 samples/sec Loss 6.9833 Epoch: 4 Global Step: 71650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:25:18,519-Speed 4781.15 samples/sec Loss 6.9790 Epoch: 4 Global Step: 71700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:25:29,252-Speed 4770.60 samples/sec Loss 6.9341 Epoch: 4 Global Step: 71750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:25:39,919-Speed 4800.31 samples/sec Loss 6.9121 Epoch: 4 Global Step: 71800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:25:50,836-Speed 4689.89 samples/sec Loss 6.9326 Epoch: 4 Global Step: 71850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:26:01,425-Speed 4835.79 samples/sec Loss 6.9528 Epoch: 4 Global Step: 71900 
Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:26:12,272-Speed 4720.42 samples/sec Loss 7.0419 Epoch: 4 Global Step: 71950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:26:23,231-Speed 4672.40 samples/sec Loss 7.0003 Epoch: 4 Global Step: 72000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:26:47,392-[lfw][72000]XNorm: 22.976534 Training: 2021-03-17 20:26:47,392-[lfw][72000]Accuracy-Flip: 0.99433+-0.00343 Training: 2021-03-17 20:26:47,392-[lfw][72000]Accuracy-Highest: 0.99667 Training: 2021-03-17 20:27:14,862-[cfp_fp][72000]XNorm: 18.926596 Training: 2021-03-17 20:27:14,862-[cfp_fp][72000]Accuracy-Flip: 0.91671+-0.01505 Training: 2021-03-17 20:27:14,863-[cfp_fp][72000]Accuracy-Highest: 0.93229 Training: 2021-03-17 20:27:38,741-[agedb_30][72000]XNorm: 21.778032 Training: 2021-03-17 20:27:38,741-[agedb_30][72000]Accuracy-Flip: 0.94883+-0.01065 Training: 2021-03-17 20:27:38,741-[agedb_30][72000]Accuracy-Highest: 0.95467 Training: 2021-03-17 20:27:49,247-Speed 595.24 samples/sec Loss 6.9734 Epoch: 4 Global Step: 72050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:27:59,950-Speed 4784.10 samples/sec Loss 6.9886 Epoch: 4 Global Step: 72100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:28:10,602-Speed 4806.95 samples/sec Loss 7.0041 Epoch: 4 Global Step: 72150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:28:21,192-Speed 4834.71 samples/sec Loss 6.9810 Epoch: 4 Global Step: 72200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:28:31,964-Speed 4753.33 samples/sec Loss 6.9634 Epoch: 4 Global Step: 72250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:28:43,067-Speed 4611.64 samples/sec Loss 6.9827 Epoch: 4 Global Step: 72300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:28:53,794-Speed 4773.17 samples/sec Loss 6.9288 Epoch: 4 Global Step: 72350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 
2021-03-17 20:29:04,610-Speed 4734.17 samples/sec Loss 6.9800 Epoch: 4 Global Step: 72400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:29:15,358-Speed 4764.07 samples/sec Loss 6.9652 Epoch: 4 Global Step: 72450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:29:25,888-Speed 4862.85 samples/sec Loss 6.9802 Epoch: 4 Global Step: 72500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:29:36,545-Speed 4804.69 samples/sec Loss 6.8776 Epoch: 4 Global Step: 72550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:29:47,408-Speed 4713.64 samples/sec Loss 6.9554 Epoch: 4 Global Step: 72600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:29:58,050-Speed 4811.03 samples/sec Loss 6.9865 Epoch: 4 Global Step: 72650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:30:09,121-Speed 4625.09 samples/sec Loss 6.9142 Epoch: 4 Global Step: 72700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:30:19,816-Speed 4787.71 samples/sec Loss 6.9901 Epoch: 4 Global Step: 72750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:30:30,391-Speed 4841.65 samples/sec Loss 6.9323 Epoch: 4 Global Step: 72800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:30:41,025-Speed 4815.10 samples/sec Loss 7.0129 Epoch: 4 Global Step: 72850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:30:51,769-Speed 4765.55 samples/sec Loss 6.9805 Epoch: 4 Global Step: 72900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:31:03,557-Speed 4343.76 samples/sec Loss 6.9762 Epoch: 4 Global Step: 72950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:31:14,279-Speed 4775.72 samples/sec Loss 7.0245 Epoch: 4 Global Step: 73000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:31:25,118-Speed 4723.93 samples/sec Loss 6.9142 Epoch: 4 Global Step: 73050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 
20:31:35,988-Speed 4710.53 samples/sec Loss 6.9163 Epoch: 4 Global Step: 73100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:31:46,702-Speed 4779.16 samples/sec Loss 6.9481 Epoch: 4 Global Step: 73150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:31:57,426-Speed 4774.30 samples/sec Loss 7.0135 Epoch: 4 Global Step: 73200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:32:08,267-Speed 4723.38 samples/sec Loss 6.9881 Epoch: 4 Global Step: 73250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:32:18,989-Speed 4775.52 samples/sec Loss 6.9449 Epoch: 4 Global Step: 73300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:32:29,759-Speed 4754.12 samples/sec Loss 7.0035 Epoch: 4 Global Step: 73350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:32:40,396-Speed 4813.41 samples/sec Loss 6.9550 Epoch: 4 Global Step: 73400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:32:50,955-Speed 4849.41 samples/sec Loss 6.9294 Epoch: 4 Global Step: 73450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:33:01,592-Speed 4813.79 samples/sec Loss 7.0118 Epoch: 4 Global Step: 73500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:33:12,356-Speed 4756.94 samples/sec Loss 6.9711 Epoch: 4 Global Step: 73550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:33:23,964-Speed 4410.92 samples/sec Loss 6.9814 Epoch: 4 Global Step: 73600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:33:34,714-Speed 4763.11 samples/sec Loss 7.0195 Epoch: 4 Global Step: 73650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:33:45,364-Speed 4807.89 samples/sec Loss 6.9568 Epoch: 4 Global Step: 73700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:33:55,938-Speed 4842.33 samples/sec Loss 6.9741 Epoch: 4 Global Step: 73750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 
20:34:06,560-Speed 4820.53 samples/sec Loss 6.9802 Epoch: 4 Global Step: 73800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:34:17,338-Speed 4750.79 samples/sec Loss 6.9494 Epoch: 4 Global Step: 73850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:34:28,756-Speed 4484.12 samples/sec Loss 6.9752 Epoch: 4 Global Step: 73900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:34:39,484-Speed 4773.20 samples/sec Loss 6.9792 Epoch: 4 Global Step: 73950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:34:50,404-Speed 4688.87 samples/sec Loss 6.9518 Epoch: 4 Global Step: 74000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:35:14,692-[lfw][74000]XNorm: 22.184951 Training: 2021-03-17 20:35:14,692-[lfw][74000]Accuracy-Flip: 0.99533+-0.00364 Training: 2021-03-17 20:35:14,692-[lfw][74000]Accuracy-Highest: 0.99667 Training: 2021-03-17 20:35:42,364-[cfp_fp][74000]XNorm: 18.215297 Training: 2021-03-17 20:35:42,364-[cfp_fp][74000]Accuracy-Flip: 0.91486+-0.01311 Training: 2021-03-17 20:35:42,365-[cfp_fp][74000]Accuracy-Highest: 0.93229 Training: 2021-03-17 20:36:06,090-[agedb_30][74000]XNorm: 21.516900 Training: 2021-03-17 20:36:06,090-[agedb_30][74000]Accuracy-Flip: 0.95400+-0.01265 Training: 2021-03-17 20:36:06,090-[agedb_30][74000]Accuracy-Highest: 0.95467 Training: 2021-03-17 20:36:16,736-Speed 593.06 samples/sec Loss 6.9840 Epoch: 4 Global Step: 74050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:36:27,336-Speed 4830.74 samples/sec Loss 6.9832 Epoch: 4 Global Step: 74100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:36:38,065-Speed 4772.09 samples/sec Loss 7.0389 Epoch: 4 Global Step: 74150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:36:49,491-Speed 4481.28 samples/sec Loss 6.9680 Epoch: 4 Global Step: 74200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:37:01,020-Speed 4441.56 samples/sec Loss 7.0482 Epoch: 4 
Global Step: 74250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:37:12,481-Speed 4467.45 samples/sec Loss 6.9674 Epoch: 4 Global Step: 74300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:37:23,256-Speed 4751.89 samples/sec Loss 6.9990 Epoch: 4 Global Step: 74350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:37:34,440-Speed 4578.14 samples/sec Loss 7.0369 Epoch: 4 Global Step: 74400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:37:46,012-Speed 4424.98 samples/sec Loss 6.9617 Epoch: 4 Global Step: 74450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:37:56,654-Speed 4811.16 samples/sec Loss 7.0402 Epoch: 4 Global Step: 74500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:38:07,167-Speed 4870.96 samples/sec Loss 6.9747 Epoch: 4 Global Step: 74550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:38:17,990-Speed 4730.83 samples/sec Loss 6.9843 Epoch: 4 Global Step: 74600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:38:28,545-Speed 4851.19 samples/sec Loss 7.0046 Epoch: 4 Global Step: 74650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:38:39,136-Speed 4834.54 samples/sec Loss 7.0185 Epoch: 4 Global Step: 74700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:38:49,836-Speed 4785.64 samples/sec Loss 6.9527 Epoch: 4 Global Step: 74750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:39:00,540-Speed 4783.33 samples/sec Loss 6.9812 Epoch: 4 Global Step: 74800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:39:11,137-Speed 4832.19 samples/sec Loss 6.9975 Epoch: 4 Global Step: 74850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:39:21,811-Speed 4796.77 samples/sec Loss 7.0166 Epoch: 4 Global Step: 74900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:39:32,439-Speed 4817.78 samples/sec Loss 7.0339 Epoch: 4 Global 
Step: 74950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:39:43,264-Speed 4729.91 samples/sec Loss 6.9701 Epoch: 4 Global Step: 75000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-17 20:39:53,964-Speed 4785.38 samples/sec Loss 7.0072 Epoch: 4 Global Step: 75050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:40:04,942-Speed 4664.24 samples/sec Loss 6.9652 Epoch: 4 Global Step: 75100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:40:15,947-Speed 4652.79 samples/sec Loss 7.0256 Epoch: 4 Global Step: 75150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:40:26,693-Speed 4764.76 samples/sec Loss 7.0272 Epoch: 4 Global Step: 75200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:40:37,205-Speed 4870.95 samples/sec Loss 7.0045 Epoch: 4 Global Step: 75250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:40:47,954-Speed 4763.36 samples/sec Loss 7.0239 Epoch: 4 Global Step: 75300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:40:58,643-Speed 4790.61 samples/sec Loss 7.0160 Epoch: 4 Global Step: 75350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:41:09,597-Speed 4674.10 samples/sec Loss 7.0120 Epoch: 4 Global Step: 75400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:41:20,438-Speed 4723.20 samples/sec Loss 7.0489 Epoch: 4 Global Step: 75450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:41:31,088-Speed 4808.09 samples/sec Loss 7.0151 Epoch: 4 Global Step: 75500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:41:41,622-Speed 4860.63 samples/sec Loss 6.9716 Epoch: 4 Global Step: 75550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:41:52,172-Speed 4853.76 samples/sec Loss 7.0285 Epoch: 4 Global Step: 75600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:42:03,168-Speed 4656.59 samples/sec Loss 6.9710 Epoch: 4 Global Step: 75650 
Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:42:13,824-Speed 4804.94 samples/sec Loss 7.0255 Epoch: 4 Global Step: 75700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:42:25,463-Speed 4399.35 samples/sec Loss 6.9510 Epoch: 4 Global Step: 75750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:42:36,416-Speed 4674.82 samples/sec Loss 6.9868 Epoch: 4 Global Step: 75800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:42:47,236-Speed 4732.17 samples/sec Loss 6.9820 Epoch: 4 Global Step: 75850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:42:57,967-Speed 4771.58 samples/sec Loss 6.9816 Epoch: 4 Global Step: 75900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:43:08,726-Speed 4758.88 samples/sec Loss 6.9431 Epoch: 4 Global Step: 75950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:43:19,242-Speed 4869.28 samples/sec Loss 6.9867 Epoch: 4 Global Step: 76000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:43:43,651-[lfw][76000]XNorm: 24.747098 Training: 2021-03-17 20:43:43,651-[lfw][76000]Accuracy-Flip: 0.99533+-0.00364 Training: 2021-03-17 20:43:43,651-[lfw][76000]Accuracy-Highest: 0.99667 Training: 2021-03-17 20:44:11,380-[cfp_fp][76000]XNorm: 20.542319 Training: 2021-03-17 20:44:11,381-[cfp_fp][76000]Accuracy-Flip: 0.92386+-0.01539 Training: 2021-03-17 20:44:11,381-[cfp_fp][76000]Accuracy-Highest: 0.93229 Training: 2021-03-17 20:44:35,103-[agedb_30][76000]XNorm: 23.666465 Training: 2021-03-17 20:44:35,103-[agedb_30][76000]Accuracy-Flip: 0.95050+-0.01128 Training: 2021-03-17 20:44:35,103-[agedb_30][76000]Accuracy-Highest: 0.95467 Training: 2021-03-17 20:44:45,621-Speed 592.74 samples/sec Loss 7.0011 Epoch: 4 Global Step: 76050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:44:56,422-Speed 4740.78 samples/sec Loss 7.0198 Epoch: 4 Global Step: 76100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 
2021-03-17 20:45:07,242-Speed 4732.20 samples/sec Loss 7.0247 Epoch: 4 Global Step: 76150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:45:18,006-Speed 4756.75 samples/sec Loss 6.9904 Epoch: 4 Global Step: 76200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:45:28,713-Speed 4782.57 samples/sec Loss 6.9838 Epoch: 4 Global Step: 76250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:45:39,370-Speed 4804.87 samples/sec Loss 6.9813 Epoch: 4 Global Step: 76300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:45:50,275-Speed 4695.48 samples/sec Loss 6.9260 Epoch: 4 Global Step: 76350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:46:01,882-Speed 4411.32 samples/sec Loss 6.9199 Epoch: 4 Global Step: 76400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:46:12,546-Speed 4801.44 samples/sec Loss 6.9334 Epoch: 4 Global Step: 76450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:46:23,221-Speed 4796.60 samples/sec Loss 7.0931 Epoch: 4 Global Step: 76500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:46:33,996-Speed 4751.97 samples/sec Loss 7.0097 Epoch: 4 Global Step: 76550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:46:44,893-Speed 4698.79 samples/sec Loss 6.9959 Epoch: 4 Global Step: 76600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:46:55,391-Speed 4877.60 samples/sec Loss 6.9907 Epoch: 4 Global Step: 76650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:47:06,029-Speed 4813.38 samples/sec Loss 6.9491 Epoch: 4 Global Step: 76700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:47:17,414-Speed 4497.43 samples/sec Loss 7.0203 Epoch: 4 Global Step: 76750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:47:27,999-Speed 4837.05 samples/sec Loss 6.9644 Epoch: 4 Global Step: 76800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 
20:47:38,738-Speed 4768.22 samples/sec Loss 6.9953 Epoch: 4 Global Step: 76850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:47:49,584-Speed 4720.69 samples/sec Loss 7.0288 Epoch: 4 Global Step: 76900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:48:00,920-Speed 4516.97 samples/sec Loss 6.9972 Epoch: 4 Global Step: 76950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:48:11,669-Speed 4763.32 samples/sec Loss 7.0551 Epoch: 4 Global Step: 77000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:48:22,322-Speed 4806.76 samples/sec Loss 7.0292 Epoch: 4 Global Step: 77050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:48:33,631-Speed 4527.38 samples/sec Loss 7.0067 Epoch: 4 Global Step: 77100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:48:44,463-Speed 4726.92 samples/sec Loss 6.9831 Epoch: 4 Global Step: 77150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:48:56,254-Speed 4342.50 samples/sec Loss 7.0278 Epoch: 4 Global Step: 77200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:49:07,453-Speed 4572.33 samples/sec Loss 7.0302 Epoch: 4 Global Step: 77250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:49:18,879-Speed 4480.99 samples/sec Loss 7.0157 Epoch: 4 Global Step: 77300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:49:29,365-Speed 4883.18 samples/sec Loss 7.0170 Epoch: 4 Global Step: 77350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:49:40,032-Speed 4800.32 samples/sec Loss 6.9524 Epoch: 4 Global Step: 77400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:49:50,692-Speed 4803.15 samples/sec Loss 6.9777 Epoch: 4 Global Step: 77450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:50:01,585-Speed 4700.51 samples/sec Loss 7.0412 Epoch: 4 Global Step: 77500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 
20:50:12,469-Speed 4704.59 samples/sec Loss 7.0660 Epoch: 4 Global Step: 77550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:50:23,182-Speed 4779.27 samples/sec Loss 7.0144 Epoch: 4 Global Step: 77600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:50:33,806-Speed 4819.71 samples/sec Loss 6.9683 Epoch: 4 Global Step: 77650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:50:44,408-Speed 4829.70 samples/sec Loss 6.9939 Epoch: 4 Global Step: 77700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:50:55,194-Speed 4746.91 samples/sec Loss 6.9634 Epoch: 4 Global Step: 77750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:51:06,061-Speed 4711.90 samples/sec Loss 6.9910 Epoch: 4 Global Step: 77800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:51:17,159-Speed 4613.66 samples/sec Loss 6.9906 Epoch: 4 Global Step: 77850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:51:28,054-Speed 4699.62 samples/sec Loss 6.9407 Epoch: 4 Global Step: 77900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:51:38,945-Speed 4701.58 samples/sec Loss 6.9822 Epoch: 4 Global Step: 77950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:51:49,840-Speed 4699.52 samples/sec Loss 6.9896 Epoch: 4 Global Step: 78000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:52:13,935-[lfw][78000]XNorm: 24.070755 Training: 2021-03-17 20:52:13,935-[lfw][78000]Accuracy-Flip: 0.99400+-0.00367 Training: 2021-03-17 20:52:13,935-[lfw][78000]Accuracy-Highest: 0.99667 Training: 2021-03-17 20:52:41,363-[cfp_fp][78000]XNorm: 19.803214 Training: 2021-03-17 20:52:41,363-[cfp_fp][78000]Accuracy-Flip: 0.92071+-0.01011 Training: 2021-03-17 20:52:41,363-[cfp_fp][78000]Accuracy-Highest: 0.93229 Training: 2021-03-17 20:53:05,051-[agedb_30][78000]XNorm: 22.829606 Training: 2021-03-17 20:53:05,052-[agedb_30][78000]Accuracy-Flip: 0.95417+-0.00772 Training: 
2021-03-17 20:53:05,052-[agedb_30][78000]Accuracy-Highest: 0.95467 Training: 2021-03-17 20:53:15,592-Speed 597.08 samples/sec Loss 7.0197 Epoch: 4 Global Step: 78050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:53:26,853-Speed 4546.95 samples/sec Loss 6.9021 Epoch: 4 Global Step: 78100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:53:37,510-Speed 4804.96 samples/sec Loss 7.1404 Epoch: 4 Global Step: 78150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:53:48,229-Speed 4776.69 samples/sec Loss 7.0484 Epoch: 4 Global Step: 78200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:53:58,887-Speed 4804.44 samples/sec Loss 6.9759 Epoch: 4 Global Step: 78250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:54:09,729-Speed 4722.51 samples/sec Loss 6.9992 Epoch: 4 Global Step: 78300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:54:20,297-Speed 4845.16 samples/sec Loss 6.9552 Epoch: 4 Global Step: 78350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:54:31,116-Speed 4732.75 samples/sec Loss 6.9795 Epoch: 4 Global Step: 78400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:54:41,697-Speed 4839.31 samples/sec Loss 6.9836 Epoch: 4 Global Step: 78450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:54:53,052-Speed 4509.28 samples/sec Loss 6.9948 Epoch: 4 Global Step: 78500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:55:03,915-Speed 4713.63 samples/sec Loss 7.0056 Epoch: 4 Global Step: 78550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:55:14,379-Speed 4893.11 samples/sec Loss 7.0370 Epoch: 4 Global Step: 78600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:55:24,955-Speed 4841.62 samples/sec Loss 7.0191 Epoch: 4 Global Step: 78650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:55:35,671-Speed 4778.01 samples/sec Loss 6.9835 Epoch: 4 Global 
Step: 78700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:55:46,542-Speed 4710.00 samples/sec Loss 6.9873 Epoch: 4 Global Step: 78750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:55:57,563-Speed 4646.47 samples/sec Loss 6.9800 Epoch: 4 Global Step: 78800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:56:08,284-Speed 4775.93 samples/sec Loss 7.0038 Epoch: 4 Global Step: 78850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:56:19,063-Speed 4750.41 samples/sec Loss 7.0403 Epoch: 4 Global Step: 78900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:56:29,793-Speed 4771.64 samples/sec Loss 6.9989 Epoch: 4 Global Step: 78950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:56:40,532-Speed 4767.97 samples/sec Loss 7.0300 Epoch: 4 Global Step: 79000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:56:51,229-Speed 4786.61 samples/sec Loss 6.9959 Epoch: 4 Global Step: 79050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:57:02,611-Speed 4498.57 samples/sec Loss 7.0137 Epoch: 4 Global Step: 79100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:57:13,407-Speed 4743.04 samples/sec Loss 7.0115 Epoch: 4 Global Step: 79150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:57:24,112-Speed 4782.87 samples/sec Loss 6.9574 Epoch: 4 Global Step: 79200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:57:34,741-Speed 4817.44 samples/sec Loss 6.9592 Epoch: 4 Global Step: 79250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:57:45,698-Speed 4672.89 samples/sec Loss 6.9942 Epoch: 4 Global Step: 79300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:57:56,289-Speed 4834.90 samples/sec Loss 7.0101 Epoch: 4 Global Step: 79350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:58:07,135-Speed 4720.96 samples/sec Loss 6.9736 Epoch: 4 Global Step: 79400 
Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:58:17,883-Speed 4764.03 samples/sec Loss 7.0132 Epoch: 4 Global Step: 79450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:58:28,786-Speed 4696.09 samples/sec Loss 6.9984 Epoch: 4 Global Step: 79500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:58:39,382-Speed 4832.57 samples/sec Loss 7.0023 Epoch: 4 Global Step: 79550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:58:50,009-Speed 4817.88 samples/sec Loss 6.9422 Epoch: 4 Global Step: 79600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:59:01,435-Speed 4481.36 samples/sec Loss 7.0307 Epoch: 4 Global Step: 79650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:59:12,214-Speed 4750.45 samples/sec Loss 7.0101 Epoch: 4 Global Step: 79700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:59:23,723-Speed 4448.61 samples/sec Loss 7.0493 Epoch: 4 Global Step: 79750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:59:34,375-Speed 4807.07 samples/sec Loss 7.0040 Epoch: 4 Global Step: 79800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:59:45,019-Speed 4810.87 samples/sec Loss 6.9427 Epoch: 4 Global Step: 79850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 20:59:55,941-Speed 4687.96 samples/sec Loss 7.0069 Epoch: 4 Global Step: 79900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:00:07,303-Speed 4506.21 samples/sec Loss 7.0478 Epoch: 4 Global Step: 79950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:00:18,714-Speed 4487.14 samples/sec Loss 7.0151 Epoch: 4 Global Step: 80000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:00:43,212-[lfw][80000]XNorm: 21.637347 Training: 2021-03-17 21:00:43,212-[lfw][80000]Accuracy-Flip: 0.99600+-0.00260 Training: 2021-03-17 21:00:43,212-[lfw][80000]Accuracy-Highest: 0.99667 Training: 2021-03-17 
21:01:11,287-[cfp_fp][80000]XNorm: 17.642356 Training: 2021-03-17 21:01:11,288-[cfp_fp][80000]Accuracy-Flip: 0.91400+-0.01602 Training: 2021-03-17 21:01:11,288-[cfp_fp][80000]Accuracy-Highest: 0.93229 Training: 2021-03-17 21:01:35,179-[agedb_30][80000]XNorm: 21.014011 Training: 2021-03-17 21:01:35,180-[agedb_30][80000]Accuracy-Flip: 0.95133+-0.00849 Training: 2021-03-17 21:01:35,180-[agedb_30][80000]Accuracy-Highest: 0.95467 Training: 2021-03-17 21:01:45,910-Speed 587.19 samples/sec Loss 6.9890 Epoch: 4 Global Step: 80050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:01:58,362-Speed 4111.86 samples/sec Loss 6.9913 Epoch: 4 Global Step: 80100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:02:09,011-Speed 4808.24 samples/sec Loss 6.9386 Epoch: 4 Global Step: 80150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:02:19,639-Speed 4817.88 samples/sec Loss 6.9594 Epoch: 4 Global Step: 80200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:02:30,509-Speed 4710.54 samples/sec Loss 6.9828 Epoch: 4 Global Step: 80250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:02:41,171-Speed 4802.31 samples/sec Loss 6.9578 Epoch: 4 Global Step: 80300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:02:51,826-Speed 4805.52 samples/sec Loss 6.9803 Epoch: 4 Global Step: 80350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:03:02,559-Speed 4770.74 samples/sec Loss 6.9804 Epoch: 4 Global Step: 80400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:03:13,420-Speed 4714.27 samples/sec Loss 6.9694 Epoch: 4 Global Step: 80450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:03:24,327-Speed 4694.36 samples/sec Loss 7.0148 Epoch: 4 Global Step: 80500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:03:34,997-Speed 4799.18 samples/sec Loss 6.9505 Epoch: 4 Global Step: 80550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 
2021-03-17 21:03:45,826-Speed 4727.92 samples/sec Loss 6.9921 Epoch: 4 Global Step: 80600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:03:56,819-Speed 4657.76 samples/sec Loss 6.9213 Epoch: 4 Global Step: 80650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:04:07,512-Speed 4788.85 samples/sec Loss 6.9378 Epoch: 4 Global Step: 80700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:04:18,075-Speed 4847.06 samples/sec Loss 7.0049 Epoch: 4 Global Step: 80750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:04:28,699-Speed 4819.79 samples/sec Loss 6.9495 Epoch: 4 Global Step: 80800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:04:39,504-Speed 4738.67 samples/sec Loss 6.9844 Epoch: 4 Global Step: 80850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:04:50,376-Speed 4709.86 samples/sec Loss 7.0301 Epoch: 4 Global Step: 80900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:05:00,976-Speed 4830.36 samples/sec Loss 6.9729 Epoch: 4 Global Step: 80950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:05:11,670-Speed 4787.87 samples/sec Loss 6.9881 Epoch: 4 Global Step: 81000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:05:22,428-Speed 4759.78 samples/sec Loss 6.9748 Epoch: 4 Global Step: 81050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:05:33,138-Speed 4780.85 samples/sec Loss 6.9527 Epoch: 4 Global Step: 81100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:05:43,842-Speed 4783.44 samples/sec Loss 6.9667 Epoch: 4 Global Step: 81150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:05:54,313-Speed 4889.95 samples/sec Loss 7.0264 Epoch: 4 Global Step: 81200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:06:05,794-Speed 4459.69 samples/sec Loss 6.9739 Epoch: 4 Global Step: 81250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 
21:06:16,785-Speed 4658.84 samples/sec Loss 7.0071 Epoch: 4 Global Step: 81300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:06:27,660-Speed 4708.11 samples/sec Loss 6.9965 Epoch: 4 Global Step: 81350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:06:38,194-Speed 4860.91 samples/sec Loss 6.9693 Epoch: 4 Global Step: 81400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:06:49,029-Speed 4725.72 samples/sec Loss 6.9604 Epoch: 4 Global Step: 81450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:06:59,754-Speed 4773.92 samples/sec Loss 6.9713 Epoch: 4 Global Step: 81500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:07:10,503-Speed 4763.43 samples/sec Loss 7.0198 Epoch: 4 Global Step: 81550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:07:21,150-Speed 4809.44 samples/sec Loss 6.9836 Epoch: 4 Global Step: 81600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:07:32,168-Speed 4647.01 samples/sec Loss 6.9710 Epoch: 4 Global Step: 81650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:07:43,017-Speed 4719.82 samples/sec Loss 7.0398 Epoch: 4 Global Step: 81700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:07:53,660-Speed 4810.94 samples/sec Loss 6.9426 Epoch: 4 Global Step: 81750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:08:04,359-Speed 4785.91 samples/sec Loss 6.9913 Epoch: 4 Global Step: 81800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:08:15,079-Speed 4776.55 samples/sec Loss 7.0232 Epoch: 4 Global Step: 81850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:08:26,511-Speed 4479.05 samples/sec Loss 6.9437 Epoch: 4 Global Step: 81900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:08:37,189-Speed 4795.09 samples/sec Loss 7.0438 Epoch: 4 Global Step: 81950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 
21:08:48,016-Speed 4729.33 samples/sec Loss 7.0041 Epoch: 4 Global Step: 82000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:09:12,202-[lfw][82000]XNorm: 22.631901 Training: 2021-03-17 21:09:12,202-[lfw][82000]Accuracy-Flip: 0.99500+-0.00333 Training: 2021-03-17 21:09:12,202-[lfw][82000]Accuracy-Highest: 0.99667 Training: 2021-03-17 21:09:39,946-[cfp_fp][82000]XNorm: 18.698521 Training: 2021-03-17 21:09:39,946-[cfp_fp][82000]Accuracy-Flip: 0.91800+-0.01112 Training: 2021-03-17 21:09:39,946-[cfp_fp][82000]Accuracy-Highest: 0.93229 Training: 2021-03-17 21:10:03,854-[agedb_30][82000]XNorm: 22.305104 Training: 2021-03-17 21:10:03,854-[agedb_30][82000]Accuracy-Flip: 0.95167+-0.01193 Training: 2021-03-17 21:10:03,854-[agedb_30][82000]Accuracy-Highest: 0.95467 Training: 2021-03-17 21:10:14,214-Speed 593.98 samples/sec Loss 6.9685 Epoch: 4 Global Step: 82050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:10:24,897-Speed 4793.07 samples/sec Loss 7.0057 Epoch: 4 Global Step: 82100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:10:35,534-Speed 4813.59 samples/sec Loss 7.0261 Epoch: 4 Global Step: 82150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:10:46,493-Speed 4672.30 samples/sec Loss 6.9712 Epoch: 4 Global Step: 82200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:10:57,009-Speed 4868.71 samples/sec Loss 6.9991 Epoch: 4 Global Step: 82250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:11:07,370-Speed 4942.25 samples/sec Loss 6.9958 Epoch: 4 Global Step: 82300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:11:18,130-Speed 4758.48 samples/sec Loss 6.9455 Epoch: 4 Global Step: 82350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:11:28,852-Speed 4775.36 samples/sec Loss 6.9591 Epoch: 4 Global Step: 82400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:11:39,632-Speed 4749.70 samples/sec Loss 6.9659 Epoch: 4 
Global Step: 82450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:11:51,148-Speed 4446.43 samples/sec Loss 6.9982 Epoch: 4 Global Step: 82500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:12:01,767-Speed 4821.80 samples/sec Loss 6.9910 Epoch: 4 Global Step: 82550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:12:13,510-Speed 4360.10 samples/sec Loss 6.9653 Epoch: 4 Global Step: 82600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:12:24,328-Speed 4733.20 samples/sec Loss 6.9810 Epoch: 4 Global Step: 82650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:12:35,132-Speed 4739.31 samples/sec Loss 6.9028 Epoch: 4 Global Step: 82700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:12:45,993-Speed 4714.69 samples/sec Loss 7.0257 Epoch: 4 Global Step: 82750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:12:56,631-Speed 4813.17 samples/sec Loss 6.9893 Epoch: 4 Global Step: 82800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:13:08,845-Speed 4192.15 samples/sec Loss 6.9998 Epoch: 4 Global Step: 82850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:13:19,495-Speed 4807.65 samples/sec Loss 6.9645 Epoch: 4 Global Step: 82900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:13:31,299-Speed 4337.93 samples/sec Loss 6.9319 Epoch: 4 Global Step: 82950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:13:42,567-Speed 4543.76 samples/sec Loss 7.0524 Epoch: 4 Global Step: 83000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:13:53,352-Speed 4747.87 samples/sec Loss 6.9617 Epoch: 4 Global Step: 83050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:14:04,098-Speed 4765.13 samples/sec Loss 7.0462 Epoch: 4 Global Step: 83100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:14:14,681-Speed 4837.92 samples/sec Loss 6.9517 Epoch: 4 Global 
Step: 83150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:14:25,453-Speed 4753.22 samples/sec Loss 6.9957 Epoch: 4 Global Step: 83200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:14:36,124-Speed 4798.64 samples/sec Loss 6.9507 Epoch: 4 Global Step: 83250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:14:46,897-Speed 4752.57 samples/sec Loss 6.9469 Epoch: 4 Global Step: 83300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:14:57,759-Speed 4713.90 samples/sec Loss 6.9160 Epoch: 4 Global Step: 83350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:15:08,445-Speed 4791.85 samples/sec Loss 6.9834 Epoch: 4 Global Step: 83400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:15:20,249-Speed 4337.82 samples/sec Loss 6.9776 Epoch: 4 Global Step: 83450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:15:42,648-Speed 2285.81 samples/sec Loss 6.4140 Epoch: 5 Global Step: 83500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:15:53,155-Speed 4873.65 samples/sec Loss 6.2889 Epoch: 5 Global Step: 83550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:16:04,112-Speed 4672.93 samples/sec Loss 6.2802 Epoch: 5 Global Step: 83600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:16:14,745-Speed 4815.91 samples/sec Loss 6.3136 Epoch: 5 Global Step: 83650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:16:25,211-Speed 4892.23 samples/sec Loss 6.2975 Epoch: 5 Global Step: 83700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:16:36,030-Speed 4732.44 samples/sec Loss 6.2884 Epoch: 5 Global Step: 83750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:16:46,894-Speed 4713.55 samples/sec Loss 6.3851 Epoch: 5 Global Step: 83800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:16:57,662-Speed 4755.19 samples/sec Loss 6.3889 Epoch: 5 Global Step: 83850 
Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:17:08,343-Speed 4793.89 samples/sec Loss 6.3845 Epoch: 5 Global Step: 83900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:17:20,075-Speed 4364.55 samples/sec Loss 6.4100 Epoch: 5 Global Step: 83950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:17:30,766-Speed 4789.32 samples/sec Loss 6.4381 Epoch: 5 Global Step: 84000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:17:55,286-[lfw][84000]XNorm: 20.084703 Training: 2021-03-17 21:17:55,286-[lfw][84000]Accuracy-Flip: 0.99567+-0.00213 Training: 2021-03-17 21:17:55,286-[lfw][84000]Accuracy-Highest: 0.99667 Training: 2021-03-17 21:18:22,835-[cfp_fp][84000]XNorm: 16.852868 Training: 2021-03-17 21:18:22,836-[cfp_fp][84000]Accuracy-Flip: 0.92557+-0.00929 Training: 2021-03-17 21:18:22,836-[cfp_fp][84000]Accuracy-Highest: 0.93229 Training: 2021-03-17 21:18:46,698-[agedb_30][84000]XNorm: 19.213286 Training: 2021-03-17 21:18:46,699-[agedb_30][84000]Accuracy-Flip: 0.95367+-0.01113 Training: 2021-03-17 21:18:46,699-[agedb_30][84000]Accuracy-Highest: 0.95467 Training: 2021-03-17 21:18:57,520-Speed 590.18 samples/sec Loss 6.4987 Epoch: 5 Global Step: 84050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:19:08,454-Speed 4683.06 samples/sec Loss 6.5250 Epoch: 5 Global Step: 84100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:19:19,219-Speed 4756.11 samples/sec Loss 6.5334 Epoch: 5 Global Step: 84150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:19:29,940-Speed 4776.29 samples/sec Loss 6.4757 Epoch: 5 Global Step: 84200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:19:40,719-Speed 4750.08 samples/sec Loss 6.5195 Epoch: 5 Global Step: 84250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:19:51,298-Speed 4840.43 samples/sec Loss 6.5451 Epoch: 5 Global Step: 84300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 
2021-03-17 21:20:02,214-Speed 4690.74 samples/sec Loss 6.5292 Epoch: 5 Global Step: 84350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:20:12,873-Speed 4803.88 samples/sec Loss 6.5703 Epoch: 5 Global Step: 84400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:20:23,689-Speed 4733.96 samples/sec Loss 6.5778 Epoch: 5 Global Step: 84450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:20:34,598-Speed 4693.80 samples/sec Loss 6.5571 Epoch: 5 Global Step: 84500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:20:45,144-Speed 4854.93 samples/sec Loss 6.5785 Epoch: 5 Global Step: 84550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:20:55,786-Speed 4811.32 samples/sec Loss 6.6295 Epoch: 5 Global Step: 84600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:21:07,531-Speed 4359.63 samples/sec Loss 6.6059 Epoch: 5 Global Step: 84650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:21:18,290-Speed 4759.38 samples/sec Loss 6.5355 Epoch: 5 Global Step: 84700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:21:28,958-Speed 4799.78 samples/sec Loss 6.6428 Epoch: 5 Global Step: 84750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:21:39,928-Speed 4667.76 samples/sec Loss 6.6314 Epoch: 5 Global Step: 84800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:21:50,693-Speed 4756.42 samples/sec Loss 6.6270 Epoch: 5 Global Step: 84850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:22:01,314-Speed 4820.96 samples/sec Loss 6.7027 Epoch: 5 Global Step: 84900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:22:12,244-Speed 4684.43 samples/sec Loss 6.6834 Epoch: 5 Global Step: 84950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:22:23,001-Speed 4760.14 samples/sec Loss 6.7013 Epoch: 5 Global Step: 85000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 
21:22:33,741-Speed 4767.19 samples/sec Loss 6.6654 Epoch: 5 Global Step: 85050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:22:44,387-Speed 4809.82 samples/sec Loss 6.6787 Epoch: 5 Global Step: 85100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:22:54,896-Speed 4872.38 samples/sec Loss 6.7529 Epoch: 5 Global Step: 85150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:23:05,568-Speed 4797.82 samples/sec Loss 6.6798 Epoch: 5 Global Step: 85200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:23:16,449-Speed 4705.83 samples/sec Loss 6.6928 Epoch: 5 Global Step: 85250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:23:26,975-Speed 4864.45 samples/sec Loss 6.7218 Epoch: 5 Global Step: 85300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:23:38,475-Speed 4452.13 samples/sec Loss 6.6990 Epoch: 5 Global Step: 85350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:23:49,925-Speed 4472.01 samples/sec Loss 6.6755 Epoch: 5 Global Step: 85400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:24:00,534-Speed 4826.31 samples/sec Loss 6.6950 Epoch: 5 Global Step: 85450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:24:11,179-Speed 4810.22 samples/sec Loss 6.6943 Epoch: 5 Global Step: 85500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:24:22,104-Speed 4686.59 samples/sec Loss 6.7077 Epoch: 5 Global Step: 85550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:24:32,629-Speed 4864.91 samples/sec Loss 6.7111 Epoch: 5 Global Step: 85600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:24:44,293-Speed 4389.92 samples/sec Loss 6.6781 Epoch: 5 Global Step: 85650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:24:54,988-Speed 4787.61 samples/sec Loss 6.7311 Epoch: 5 Global Step: 85700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 
21:25:05,794-Speed 4738.43 samples/sec Loss 6.7619 Epoch: 5 Global Step: 85750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:25:17,214-Speed 4483.66 samples/sec Loss 6.6901 Epoch: 5 Global Step: 85800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:25:28,067-Speed 4718.00 samples/sec Loss 6.7717 Epoch: 5 Global Step: 85850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:25:39,953-Speed 4307.64 samples/sec Loss 6.7200 Epoch: 5 Global Step: 85900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:25:50,847-Speed 4700.35 samples/sec Loss 6.7097 Epoch: 5 Global Step: 85950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:26:01,798-Speed 4675.43 samples/sec Loss 6.6818 Epoch: 5 Global Step: 86000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:26:26,215-[lfw][86000]XNorm: 21.864197 Training: 2021-03-17 21:26:26,215-[lfw][86000]Accuracy-Flip: 0.99517+-0.00345 Training: 2021-03-17 21:26:26,215-[lfw][86000]Accuracy-Highest: 0.99667 Training: 2021-03-17 21:26:53,885-[cfp_fp][86000]XNorm: 18.088712 Training: 2021-03-17 21:26:53,886-[cfp_fp][86000]Accuracy-Flip: 0.92614+-0.00973 Training: 2021-03-17 21:26:53,886-[cfp_fp][86000]Accuracy-Highest: 0.93229 Training: 2021-03-17 21:27:17,847-[agedb_30][86000]XNorm: 21.257106 Training: 2021-03-17 21:27:17,848-[agedb_30][86000]Accuracy-Flip: 0.95550+-0.00966 Training: 2021-03-17 21:27:17,848-[agedb_30][86000]Accuracy-Highest: 0.95550 Training: 2021-03-17 21:27:28,510-Speed 590.47 samples/sec Loss 6.7895 Epoch: 5 Global Step: 86050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:27:39,326-Speed 4734.37 samples/sec Loss 6.7388 Epoch: 5 Global Step: 86100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:27:49,879-Speed 4852.10 samples/sec Loss 6.7408 Epoch: 5 Global Step: 86150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:28:00,612-Speed 4770.51 samples/sec Loss 6.7668 Epoch: 5 
Global Step: 86200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:28:11,330-Speed 4777.19 samples/sec Loss 6.7594 Epoch: 5 Global Step: 86250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:28:21,908-Speed 4840.95 samples/sec Loss 6.7521 Epoch: 5 Global Step: 86300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:28:32,668-Speed 4758.43 samples/sec Loss 6.8365 Epoch: 5 Global Step: 86350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:28:43,438-Speed 4754.11 samples/sec Loss 6.7872 Epoch: 5 Global Step: 86400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:28:54,068-Speed 4816.74 samples/sec Loss 6.7644 Epoch: 5 Global Step: 86450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:29:04,939-Speed 4710.15 samples/sec Loss 6.8197 Epoch: 5 Global Step: 86500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:29:15,553-Speed 4824.07 samples/sec Loss 6.7629 Epoch: 5 Global Step: 86550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:29:26,301-Speed 4764.19 samples/sec Loss 6.7883 Epoch: 5 Global Step: 86600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:29:37,974-Speed 4386.31 samples/sec Loss 6.8233 Epoch: 5 Global Step: 86650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:29:48,738-Speed 4756.95 samples/sec Loss 6.7983 Epoch: 5 Global Step: 86700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:29:59,401-Speed 4802.00 samples/sec Loss 6.7225 Epoch: 5 Global Step: 86750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:30:10,089-Speed 4790.62 samples/sec Loss 6.7757 Epoch: 5 Global Step: 86800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:30:20,893-Speed 4738.93 samples/sec Loss 6.8647 Epoch: 5 Global Step: 86850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:30:31,884-Speed 4658.77 samples/sec Loss 6.7958 Epoch: 5 Global 
Step: 86900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:30:42,778-Speed 4700.27 samples/sec Loss 6.8215 Epoch: 5 Global Step: 86950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:30:53,302-Speed 4865.29 samples/sec Loss 6.8821 Epoch: 5 Global Step: 87000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:31:03,953-Speed 4807.37 samples/sec Loss 6.8230 Epoch: 5 Global Step: 87050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:31:14,517-Speed 4847.12 samples/sec Loss 6.8533 Epoch: 5 Global Step: 87100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:31:25,162-Speed 4810.18 samples/sec Loss 6.8080 Epoch: 5 Global Step: 87150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:31:35,783-Speed 4820.79 samples/sec Loss 6.8539 Epoch: 5 Global Step: 87200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:31:46,458-Speed 4796.55 samples/sec Loss 6.7892 Epoch: 5 Global Step: 87250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:31:57,119-Speed 4802.84 samples/sec Loss 6.8488 Epoch: 5 Global Step: 87300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:32:07,875-Speed 4760.22 samples/sec Loss 6.8140 Epoch: 5 Global Step: 87350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:32:18,715-Speed 4723.75 samples/sec Loss 6.8095 Epoch: 5 Global Step: 87400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:32:29,403-Speed 4790.70 samples/sec Loss 6.9375 Epoch: 5 Global Step: 87450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:32:40,411-Speed 4651.24 samples/sec Loss 6.8348 Epoch: 5 Global Step: 87500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:32:51,769-Speed 4508.38 samples/sec Loss 6.8252 Epoch: 5 Global Step: 87550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:33:02,680-Speed 4692.43 samples/sec Loss 6.7774 Epoch: 5 Global Step: 87600 
Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:33:13,479-Speed 4741.65 samples/sec Loss 6.8789 Epoch: 5 Global Step: 87650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:33:24,149-Speed 4798.68 samples/sec Loss 6.8340 Epoch: 5 Global Step: 87700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:33:34,776-Speed 4818.36 samples/sec Loss 6.8794 Epoch: 5 Global Step: 87750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:33:45,589-Speed 4735.24 samples/sec Loss 6.9062 Epoch: 5 Global Step: 87800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:33:56,185-Speed 4832.62 samples/sec Loss 6.8286 Epoch: 5 Global Step: 87850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:34:06,887-Speed 4784.54 samples/sec Loss 6.8958 Epoch: 5 Global Step: 87900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:34:17,631-Speed 4765.38 samples/sec Loss 6.8301 Epoch: 5 Global Step: 87950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:34:28,274-Speed 4811.21 samples/sec Loss 6.8606 Epoch: 5 Global Step: 88000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:34:52,800-[lfw][88000]XNorm: 21.454407 Training: 2021-03-17 21:34:52,800-[lfw][88000]Accuracy-Flip: 0.99467+-0.00287 Training: 2021-03-17 21:34:52,800-[lfw][88000]Accuracy-Highest: 0.99667 Training: 2021-03-17 21:35:20,255-[cfp_fp][88000]XNorm: 17.293316 Training: 2021-03-17 21:35:20,256-[cfp_fp][88000]Accuracy-Flip: 0.91443+-0.01644 Training: 2021-03-17 21:35:20,256-[cfp_fp][88000]Accuracy-Highest: 0.93229 Training: 2021-03-17 21:35:44,253-[agedb_30][88000]XNorm: 20.788357 Training: 2021-03-17 21:35:44,253-[agedb_30][88000]Accuracy-Flip: 0.94367+-0.00951 Training: 2021-03-17 21:35:44,253-[agedb_30][88000]Accuracy-Highest: 0.95550 Training: 2021-03-17 21:35:54,849-Speed 591.40 samples/sec Loss 6.8810 Epoch: 5 Global Step: 88050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 
2021-03-17 21:36:05,582-Speed 4770.75 samples/sec Loss 6.9690 Epoch: 5 Global Step: 88100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:36:16,538-Speed 4673.60 samples/sec Loss 6.9477 Epoch: 5 Global Step: 88150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:36:27,770-Speed 4558.72 samples/sec Loss 6.9399 Epoch: 5 Global Step: 88200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:36:39,181-Speed 4487.07 samples/sec Loss 6.8352 Epoch: 5 Global Step: 88250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:36:49,826-Speed 4810.14 samples/sec Loss 6.8735 Epoch: 5 Global Step: 88300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:37:00,653-Speed 4729.27 samples/sec Loss 6.9121 Epoch: 5 Global Step: 88350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:37:11,268-Speed 4823.47 samples/sec Loss 6.8171 Epoch: 5 Global Step: 88400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:37:21,835-Speed 4845.73 samples/sec Loss 6.8850 Epoch: 5 Global Step: 88450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:37:34,114-Speed 4170.02 samples/sec Loss 6.8926 Epoch: 5 Global Step: 88500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:37:44,871-Speed 4759.60 samples/sec Loss 6.9128 Epoch: 5 Global Step: 88550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:37:55,519-Speed 4808.98 samples/sec Loss 6.8888 Epoch: 5 Global Step: 88600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:38:07,044-Speed 4442.85 samples/sec Loss 6.9513 Epoch: 5 Global Step: 88650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:38:18,305-Speed 4546.98 samples/sec Loss 6.9324 Epoch: 5 Global Step: 88700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:38:29,016-Speed 4780.27 samples/sec Loss 6.8520 Epoch: 5 Global Step: 88750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 
21:38:39,709-Speed 4788.34 samples/sec Loss 6.8972 Epoch: 5 Global Step: 88800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:38:50,409-Speed 4785.28 samples/sec Loss 6.8749 Epoch: 5 Global Step: 88850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-17 21:39:01,189-Speed 4750.04 samples/sec Loss 6.8713 Epoch: 5 Global Step: 88900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:39:11,952-Speed 4757.37 samples/sec Loss 6.9668 Epoch: 5 Global Step: 88950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:39:22,515-Speed 4847.34 samples/sec Loss 6.9339 Epoch: 5 Global Step: 89000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:39:33,364-Speed 4719.85 samples/sec Loss 6.8959 Epoch: 5 Global Step: 89050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:39:44,349-Speed 4661.47 samples/sec Loss 6.8778 Epoch: 5 Global Step: 89100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:39:55,221-Speed 4709.63 samples/sec Loss 6.9250 Epoch: 5 Global Step: 89150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:40:06,096-Speed 4708.25 samples/sec Loss 6.9191 Epoch: 5 Global Step: 89200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:40:16,713-Speed 4822.64 samples/sec Loss 6.9044 Epoch: 5 Global Step: 89250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:40:27,090-Speed 4934.46 samples/sec Loss 6.8841 Epoch: 5 Global Step: 89300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:40:38,742-Speed 4394.34 samples/sec Loss 6.9160 Epoch: 5 Global Step: 89350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:40:49,312-Speed 4844.24 samples/sec Loss 6.9105 Epoch: 5 Global Step: 89400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:41:00,026-Speed 4779.10 samples/sec Loss 6.9131 Epoch: 5 Global Step: 89450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 
21:41:10,756-Speed 4772.00 samples/sec Loss 6.8978 Epoch: 5 Global Step: 89500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:41:21,593-Speed 4724.45 samples/sec Loss 6.9106 Epoch: 5 Global Step: 89550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:41:32,337-Speed 4766.14 samples/sec Loss 6.9134 Epoch: 5 Global Step: 89600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:41:42,906-Speed 4844.86 samples/sec Loss 6.9221 Epoch: 5 Global Step: 89650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:41:53,648-Speed 4766.59 samples/sec Loss 6.8594 Epoch: 5 Global Step: 89700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:42:04,401-Speed 4761.96 samples/sec Loss 6.9069 Epoch: 5 Global Step: 89750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:42:15,077-Speed 4795.98 samples/sec Loss 6.8356 Epoch: 5 Global Step: 89800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:42:25,842-Speed 4756.32 samples/sec Loss 6.9339 Epoch: 5 Global Step: 89850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:42:36,413-Speed 4843.89 samples/sec Loss 6.9389 Epoch: 5 Global Step: 89900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:42:47,186-Speed 4752.67 samples/sec Loss 6.9007 Epoch: 5 Global Step: 89950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:42:58,098-Speed 4692.29 samples/sec Loss 6.8032 Epoch: 5 Global Step: 90000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:43:22,163-[lfw][90000]XNorm: 23.271176 Training: 2021-03-17 21:43:22,163-[lfw][90000]Accuracy-Flip: 0.99517+-0.00263 Training: 2021-03-17 21:43:22,163-[lfw][90000]Accuracy-Highest: 0.99667 Training: 2021-03-17 21:43:49,748-[cfp_fp][90000]XNorm: 18.722134 Training: 2021-03-17 21:43:49,749-[cfp_fp][90000]Accuracy-Flip: 0.91486+-0.01140 Training: 2021-03-17 21:43:49,749-[cfp_fp][90000]Accuracy-Highest: 0.93229 Training: 2021-03-17 
21:44:13,486-[agedb_30][90000]XNorm: 22.197914 Training: 2021-03-17 21:44:13,487-[agedb_30][90000]Accuracy-Flip: 0.94900+-0.01086 Training: 2021-03-17 21:44:13,487-[agedb_30][90000]Accuracy-Highest: 0.95550 Training: 2021-03-17 21:44:23,688-Speed 598.21 samples/sec Loss 6.9452 Epoch: 5 Global Step: 90050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:44:34,551-Speed 4713.70 samples/sec Loss 6.9225 Epoch: 5 Global Step: 90100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:44:45,298-Speed 4764.62 samples/sec Loss 6.8660 Epoch: 5 Global Step: 90150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:44:55,984-Speed 4791.21 samples/sec Loss 6.8798 Epoch: 5 Global Step: 90200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:45:06,924-Speed 4680.59 samples/sec Loss 6.9362 Epoch: 5 Global Step: 90250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:45:17,673-Speed 4763.71 samples/sec Loss 6.9578 Epoch: 5 Global Step: 90300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:45:29,294-Speed 4405.81 samples/sec Loss 6.9175 Epoch: 5 Global Step: 90350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:45:40,157-Speed 4713.52 samples/sec Loss 6.9412 Epoch: 5 Global Step: 90400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:45:50,754-Speed 4832.11 samples/sec Loss 6.9153 Epoch: 5 Global Step: 90450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:46:01,607-Speed 4717.46 samples/sec Loss 6.9395 Epoch: 5 Global Step: 90500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:46:12,592-Speed 4661.26 samples/sec Loss 6.9244 Epoch: 5 Global Step: 90550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:46:23,347-Speed 4760.89 samples/sec Loss 6.8577 Epoch: 5 Global Step: 90600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:46:34,306-Speed 4672.39 samples/sec Loss 6.8896 Epoch: 5 Global 
Step: 90650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:46:45,049-Speed 4766.06 samples/sec Loss 6.9201 Epoch: 5 Global Step: 90700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:46:55,850-Speed 4740.35 samples/sec Loss 6.9301 Epoch: 5 Global Step: 90750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:47:06,451-Speed 4830.06 samples/sec Loss 6.9194 Epoch: 5 Global Step: 90800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:47:17,344-Speed 4700.76 samples/sec Loss 6.8112 Epoch: 5 Global Step: 90850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:47:28,281-Speed 4681.41 samples/sec Loss 6.8803 Epoch: 5 Global Step: 90900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:47:39,052-Speed 4753.82 samples/sec Loss 6.8821 Epoch: 5 Global Step: 90950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:47:50,607-Speed 4431.25 samples/sec Loss 6.9346 Epoch: 5 Global Step: 91000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:48:01,120-Speed 4870.50 samples/sec Loss 6.9159 Epoch: 5 Global Step: 91050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:48:11,946-Speed 4729.59 samples/sec Loss 6.9246 Epoch: 5 Global Step: 91100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:48:23,395-Speed 4472.47 samples/sec Loss 6.9117 Epoch: 5 Global Step: 91150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:48:34,065-Speed 4798.75 samples/sec Loss 6.9139 Epoch: 5 Global Step: 91200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:48:44,615-Speed 4853.27 samples/sec Loss 6.8858 Epoch: 5 Global Step: 91250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:48:55,471-Speed 4716.34 samples/sec Loss 6.9168 Epoch: 5 Global Step: 91300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:49:07,584-Speed 4227.30 samples/sec Loss 6.9402 Epoch: 5 Global Step: 91350 
Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:49:18,366-Speed 4748.89 samples/sec Loss 6.9244 Epoch: 5 Global Step: 91400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:49:29,108-Speed 4766.35 samples/sec Loss 6.9451 Epoch: 5 Global Step: 91450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:49:40,762-Speed 4393.76 samples/sec Loss 6.9714 Epoch: 5 Global Step: 91500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:49:52,220-Speed 4468.54 samples/sec Loss 7.0059 Epoch: 5 Global Step: 91550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:50:02,912-Speed 4789.18 samples/sec Loss 6.9272 Epoch: 5 Global Step: 91600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:50:13,829-Speed 4690.21 samples/sec Loss 6.9809 Epoch: 5 Global Step: 91650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:50:24,863-Speed 4640.61 samples/sec Loss 6.9401 Epoch: 5 Global Step: 91700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:50:35,485-Speed 4820.36 samples/sec Loss 6.9753 Epoch: 5 Global Step: 91750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:50:46,262-Speed 4751.07 samples/sec Loss 6.8997 Epoch: 5 Global Step: 91800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:50:56,909-Speed 4809.41 samples/sec Loss 6.8899 Epoch: 5 Global Step: 91850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:51:07,655-Speed 4764.90 samples/sec Loss 6.8836 Epoch: 5 Global Step: 91900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:51:18,447-Speed 4744.45 samples/sec Loss 6.9305 Epoch: 5 Global Step: 91950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:51:29,325-Speed 4707.15 samples/sec Loss 6.9079 Epoch: 5 Global Step: 92000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:51:53,767-[lfw][92000]XNorm: 23.029648 Training: 2021-03-17 
21:51:53,767-[lfw][92000]Accuracy-Flip: 0.99533+-0.00267 Training: 2021-03-17 21:51:53,767-[lfw][92000]Accuracy-Highest: 0.99667 Training: 2021-03-17 21:52:21,321-[cfp_fp][92000]XNorm: 18.976315 Training: 2021-03-17 21:52:21,322-[cfp_fp][92000]Accuracy-Flip: 0.92629+-0.01484 Training: 2021-03-17 21:52:21,322-[cfp_fp][92000]Accuracy-Highest: 0.93229 Training: 2021-03-17 21:52:45,151-[agedb_30][92000]XNorm: 21.739714 Training: 2021-03-17 21:52:45,151-[agedb_30][92000]Accuracy-Flip: 0.95467+-0.00918 Training: 2021-03-17 21:52:45,151-[agedb_30][92000]Accuracy-Highest: 0.95550 Training: 2021-03-17 21:52:56,476-Speed 587.49 samples/sec Loss 6.9615 Epoch: 5 Global Step: 92050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:53:07,046-Speed 4844.15 samples/sec Loss 6.9704 Epoch: 5 Global Step: 92100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:53:17,772-Speed 4773.58 samples/sec Loss 6.8717 Epoch: 5 Global Step: 92150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:53:28,488-Speed 4778.28 samples/sec Loss 6.9520 Epoch: 5 Global Step: 92200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:53:39,483-Speed 4656.96 samples/sec Loss 6.9546 Epoch: 5 Global Step: 92250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:53:50,248-Speed 4756.64 samples/sec Loss 6.9536 Epoch: 5 Global Step: 92300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:54:00,955-Speed 4782.03 samples/sec Loss 6.9081 Epoch: 5 Global Step: 92350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:54:11,411-Speed 4897.04 samples/sec Loss 6.9300 Epoch: 5 Global Step: 92400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:54:22,049-Speed 4813.22 samples/sec Loss 6.9631 Epoch: 5 Global Step: 92450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:54:32,774-Speed 4774.21 samples/sec Loss 6.9959 Epoch: 5 Global Step: 92500 Fp16 Grad Scale: 16384 Required: 17 hours 
Training: 2021-03-17 21:54:43,617-Speed 4721.96 samples/sec Loss 6.9088 Epoch: 5 Global Step: 92550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:54:54,313-Speed 4787.04 samples/sec Loss 6.8862 Epoch: 5 Global Step: 92600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:55:05,191-Speed 4707.12 samples/sec Loss 6.9422 Epoch: 5 Global Step: 92650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:55:15,655-Speed 4893.39 samples/sec Loss 6.9307 Epoch: 5 Global Step: 92700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:55:26,160-Speed 4874.12 samples/sec Loss 6.9493 Epoch: 5 Global Step: 92750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:55:36,649-Speed 4881.68 samples/sec Loss 6.9385 Epoch: 5 Global Step: 92800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:55:47,353-Speed 4783.63 samples/sec Loss 6.9555 Epoch: 5 Global Step: 92850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:55:58,026-Speed 4797.51 samples/sec Loss 6.8712 Epoch: 5 Global Step: 92900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:56:08,875-Speed 4719.72 samples/sec Loss 6.9322 Epoch: 5 Global Step: 92950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:56:19,432-Speed 4850.06 samples/sec Loss 6.9451 Epoch: 5 Global Step: 93000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:56:30,037-Speed 4827.90 samples/sec Loss 6.9711 Epoch: 5 Global Step: 93050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:56:40,716-Speed 4794.72 samples/sec Loss 6.8611 Epoch: 5 Global Step: 93100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:56:51,400-Speed 4792.70 samples/sec Loss 6.9930 Epoch: 5 Global Step: 93150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:57:02,950-Speed 4433.19 samples/sec Loss 6.9763 Epoch: 5 Global Step: 93200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 
2021-03-17 21:57:13,662-Speed 4779.95 samples/sec Loss 6.9656 Epoch: 5 Global Step: 93250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:57:24,292-Speed 4817.19 samples/sec Loss 6.9457 Epoch: 5 Global Step: 93300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:57:35,198-Speed 4694.52 samples/sec Loss 6.9307 Epoch: 5 Global Step: 93350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:57:46,047-Speed 4719.92 samples/sec Loss 6.9632 Epoch: 5 Global Step: 93400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:57:56,676-Speed 4817.24 samples/sec Loss 6.9268 Epoch: 5 Global Step: 93450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:58:07,521-Speed 4721.22 samples/sec Loss 6.9071 Epoch: 5 Global Step: 93500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:58:18,153-Speed 4816.06 samples/sec Loss 6.9452 Epoch: 5 Global Step: 93550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:58:28,651-Speed 4877.40 samples/sec Loss 6.9568 Epoch: 5 Global Step: 93600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:58:39,271-Speed 4821.48 samples/sec Loss 6.9518 Epoch: 5 Global Step: 93650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:58:49,806-Speed 4860.34 samples/sec Loss 6.8726 Epoch: 5 Global Step: 93700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:59:00,341-Speed 4859.95 samples/sec Loss 6.9801 Epoch: 5 Global Step: 93750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:59:11,669-Speed 4520.07 samples/sec Loss 6.9650 Epoch: 5 Global Step: 93800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:59:22,559-Speed 4702.18 samples/sec Loss 6.9206 Epoch: 5 Global Step: 93850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:59:33,252-Speed 4788.32 samples/sec Loss 6.9571 Epoch: 5 Global Step: 93900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 
21:59:43,900-Speed 4808.88 samples/sec Loss 6.9830 Epoch: 5 Global Step: 93950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 21:59:55,308-Speed 4488.23 samples/sec Loss 6.9664 Epoch: 5 Global Step: 94000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:00:19,556-[lfw][94000]XNorm: 23.746644 Training: 2021-03-17 22:00:19,556-[lfw][94000]Accuracy-Flip: 0.99583+-0.00352 Training: 2021-03-17 22:00:19,556-[lfw][94000]Accuracy-Highest: 0.99667 Training: 2021-03-17 22:00:47,094-[cfp_fp][94000]XNorm: 19.392075 Training: 2021-03-17 22:00:47,094-[cfp_fp][94000]Accuracy-Flip: 0.91886+-0.01475 Training: 2021-03-17 22:00:47,094-[cfp_fp][94000]Accuracy-Highest: 0.93229 Training: 2021-03-17 22:01:10,877-[agedb_30][94000]XNorm: 22.438579 Training: 2021-03-17 22:01:10,878-[agedb_30][94000]Accuracy-Flip: 0.94950+-0.00806 Training: 2021-03-17 22:01:10,878-[agedb_30][94000]Accuracy-Highest: 0.95550 Training: 2021-03-17 22:01:21,561-Speed 593.60 samples/sec Loss 6.9333 Epoch: 5 Global Step: 94050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:01:32,373-Speed 4736.08 samples/sec Loss 6.8980 Epoch: 5 Global Step: 94100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:01:43,046-Speed 4797.41 samples/sec Loss 6.9331 Epoch: 5 Global Step: 94150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:01:53,711-Speed 4800.81 samples/sec Loss 7.0225 Epoch: 5 Global Step: 94200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:02:05,828-Speed 4225.81 samples/sec Loss 6.9318 Epoch: 5 Global Step: 94250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:02:16,365-Speed 4859.32 samples/sec Loss 6.9720 Epoch: 5 Global Step: 94300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:02:27,308-Speed 4679.13 samples/sec Loss 6.9415 Epoch: 5 Global Step: 94350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:02:38,763-Speed 4469.83 samples/sec Loss 6.9642 Epoch: 5 
Global Step: 94400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:02:50,353-Speed 4417.99 samples/sec Loss 6.9666 Epoch: 5 Global Step: 94450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:03:00,778-Speed 4911.59 samples/sec Loss 6.9775 Epoch: 5 Global Step: 94500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:03:11,557-Speed 4749.97 samples/sec Loss 6.9486 Epoch: 5 Global Step: 94550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:03:22,354-Speed 4742.60 samples/sec Loss 6.9278 Epoch: 5 Global Step: 94600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:03:33,102-Speed 4763.56 samples/sec Loss 6.9147 Epoch: 5 Global Step: 94650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:03:43,961-Speed 4715.43 samples/sec Loss 6.9514 Epoch: 5 Global Step: 94700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:03:54,641-Speed 4794.52 samples/sec Loss 6.9053 Epoch: 5 Global Step: 94750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:04:05,370-Speed 4772.42 samples/sec Loss 6.9070 Epoch: 5 Global Step: 94800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:04:16,235-Speed 4712.70 samples/sec Loss 6.9234 Epoch: 5 Global Step: 94850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:04:27,899-Speed 4389.97 samples/sec Loss 6.8827 Epoch: 5 Global Step: 94900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:04:38,544-Speed 4809.73 samples/sec Loss 6.9284 Epoch: 5 Global Step: 94950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:04:49,381-Speed 4724.91 samples/sec Loss 6.8552 Epoch: 5 Global Step: 95000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:05:00,173-Speed 4744.52 samples/sec Loss 6.9350 Epoch: 5 Global Step: 95050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:05:10,857-Speed 4792.46 samples/sec Loss 6.9763 Epoch: 5 Global 
Step: 95100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:05:21,494-Speed 4813.85 samples/sec Loss 6.9642 Epoch: 5 Global Step: 95150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:05:32,227-Speed 4770.52 samples/sec Loss 6.8823 Epoch: 5 Global Step: 95200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:05:42,911-Speed 4792.63 samples/sec Loss 6.9161 Epoch: 5 Global Step: 95250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:05:53,499-Speed 4835.81 samples/sec Loss 6.9111 Epoch: 5 Global Step: 95300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:06:04,247-Speed 4763.97 samples/sec Loss 6.9489 Epoch: 5 Global Step: 95350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:06:14,820-Speed 4842.62 samples/sec Loss 6.9652 Epoch: 5 Global Step: 95400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:06:25,604-Speed 4747.96 samples/sec Loss 6.8980 Epoch: 5 Global Step: 95450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:06:36,427-Speed 4730.94 samples/sec Loss 6.9610 Epoch: 5 Global Step: 95500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:06:47,054-Speed 4818.47 samples/sec Loss 6.9432 Epoch: 5 Global Step: 95550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:06:57,836-Speed 4748.86 samples/sec Loss 6.9048 Epoch: 5 Global Step: 95600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:07:08,473-Speed 4814.36 samples/sec Loss 6.9879 Epoch: 5 Global Step: 95650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:07:19,152-Speed 4794.87 samples/sec Loss 6.9418 Epoch: 5 Global Step: 95700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:07:29,796-Speed 4810.65 samples/sec Loss 6.9160 Epoch: 5 Global Step: 95750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:07:40,474-Speed 4795.17 samples/sec Loss 6.9170 Epoch: 5 Global Step: 95800 
Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:07:51,505-Speed 4642.01 samples/sec Loss 6.8712 Epoch: 5 Global Step: 95850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:08:01,959-Speed 4897.80 samples/sec Loss 6.9314 Epoch: 5 Global Step: 95900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:08:13,681-Speed 4368.23 samples/sec Loss 6.9352 Epoch: 5 Global Step: 95950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:08:24,250-Speed 4844.68 samples/sec Loss 6.9283 Epoch: 5 Global Step: 96000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:08:48,487-[lfw][96000]XNorm: 23.493993 Training: 2021-03-17 22:08:48,488-[lfw][96000]Accuracy-Flip: 0.99433+-0.00343 Training: 2021-03-17 22:08:48,488-[lfw][96000]Accuracy-Highest: 0.99667 Training: 2021-03-17 22:09:16,132-[cfp_fp][96000]XNorm: 19.383787 Training: 2021-03-17 22:09:16,132-[cfp_fp][96000]Accuracy-Flip: 0.92371+-0.01201 Training: 2021-03-17 22:09:16,132-[cfp_fp][96000]Accuracy-Highest: 0.93229 Training: 2021-03-17 22:09:40,050-[agedb_30][96000]XNorm: 22.407450 Training: 2021-03-17 22:09:40,050-[agedb_30][96000]Accuracy-Flip: 0.94683+-0.01122 Training: 2021-03-17 22:09:40,050-[agedb_30][96000]Accuracy-Highest: 0.95550 Training: 2021-03-17 22:09:50,513-Speed 593.54 samples/sec Loss 6.9700 Epoch: 5 Global Step: 96050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:10:00,928-Speed 4916.07 samples/sec Loss 6.9855 Epoch: 5 Global Step: 96100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:10:11,587-Speed 4803.89 samples/sec Loss 6.9276 Epoch: 5 Global Step: 96150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:10:22,457-Speed 4710.47 samples/sec Loss 6.9227 Epoch: 5 Global Step: 96200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:10:33,111-Speed 4805.99 samples/sec Loss 6.9435 Epoch: 5 Global Step: 96250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 
2021-03-17 22:10:44,007-Speed 4699.40 samples/sec Loss 6.9532 Epoch: 5 Global Step: 96300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:10:54,851-Speed 4721.81 samples/sec Loss 6.9073 Epoch: 5 Global Step: 96350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:11:05,563-Speed 4779.85 samples/sec Loss 6.9369 Epoch: 5 Global Step: 96400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:11:16,081-Speed 4868.23 samples/sec Loss 6.8972 Epoch: 5 Global Step: 96450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:11:26,800-Speed 4776.70 samples/sec Loss 6.9247 Epoch: 5 Global Step: 96500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:11:38,219-Speed 4483.96 samples/sec Loss 6.9901 Epoch: 5 Global Step: 96550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:11:48,803-Speed 4837.83 samples/sec Loss 6.9126 Epoch: 5 Global Step: 96600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:11:59,387-Speed 4837.96 samples/sec Loss 6.9560 Epoch: 5 Global Step: 96650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:12:10,218-Speed 4727.23 samples/sec Loss 6.9544 Epoch: 5 Global Step: 96700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:12:20,862-Speed 4810.87 samples/sec Loss 6.9238 Epoch: 5 Global Step: 96750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:12:31,650-Speed 4745.95 samples/sec Loss 6.9222 Epoch: 5 Global Step: 96800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:12:43,169-Speed 4445.32 samples/sec Loss 6.9044 Epoch: 5 Global Step: 96850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:12:53,804-Speed 4814.31 samples/sec Loss 6.9873 Epoch: 5 Global Step: 96900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:13:04,643-Speed 4724.35 samples/sec Loss 6.9323 Epoch: 5 Global Step: 96950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 
22:13:16,288-Speed 4396.75 samples/sec Loss 6.9481 Epoch: 5 Global Step: 97000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:13:26,916-Speed 4817.81 samples/sec Loss 6.9658 Epoch: 5 Global Step: 97050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:13:38,388-Speed 4463.46 samples/sec Loss 6.8892 Epoch: 5 Global Step: 97100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:13:48,962-Speed 4842.19 samples/sec Loss 6.9654 Epoch: 5 Global Step: 97150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:14:00,308-Speed 4512.85 samples/sec Loss 6.9863 Epoch: 5 Global Step: 97200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:14:10,946-Speed 4813.14 samples/sec Loss 6.9239 Epoch: 5 Global Step: 97250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:14:22,319-Speed 4502.18 samples/sec Loss 6.9233 Epoch: 5 Global Step: 97300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:14:32,917-Speed 4831.54 samples/sec Loss 6.8723 Epoch: 5 Global Step: 97350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:14:43,641-Speed 4774.54 samples/sec Loss 6.8872 Epoch: 5 Global Step: 97400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:14:54,279-Speed 4813.40 samples/sec Loss 6.8995 Epoch: 5 Global Step: 97450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:15:04,893-Speed 4824.40 samples/sec Loss 6.9506 Epoch: 5 Global Step: 97500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:15:15,676-Speed 4748.64 samples/sec Loss 6.9227 Epoch: 5 Global Step: 97550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:15:26,359-Speed 4793.02 samples/sec Loss 6.9318 Epoch: 5 Global Step: 97600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:15:37,300-Speed 4680.18 samples/sec Loss 6.9371 Epoch: 5 Global Step: 97650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 
22:15:47,901-Speed 4830.06 samples/sec Loss 6.9286 Epoch: 5 Global Step: 97700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:15:59,435-Speed 4439.13 samples/sec Loss 6.9344 Epoch: 5 Global Step: 97750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:16:10,278-Speed 4722.42 samples/sec Loss 6.9915 Epoch: 5 Global Step: 97800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:16:21,403-Speed 4602.17 samples/sec Loss 6.9357 Epoch: 5 Global Step: 97850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:16:32,355-Speed 4675.33 samples/sec Loss 6.9669 Epoch: 5 Global Step: 97900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:16:43,063-Speed 4781.98 samples/sec Loss 6.9111 Epoch: 5 Global Step: 97950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:16:53,828-Speed 4756.72 samples/sec Loss 6.9644 Epoch: 5 Global Step: 98000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:17:18,424-[lfw][98000]XNorm: 23.587002 Training: 2021-03-17 22:17:18,424-[lfw][98000]Accuracy-Flip: 0.99583+-0.00310 Training: 2021-03-17 22:17:18,424-[lfw][98000]Accuracy-Highest: 0.99667 Training: 2021-03-17 22:17:45,927-[cfp_fp][98000]XNorm: 19.517006 Training: 2021-03-17 22:17:45,928-[cfp_fp][98000]Accuracy-Flip: 0.93900+-0.01593 Training: 2021-03-17 22:17:45,928-[cfp_fp][98000]Accuracy-Highest: 0.93900 Training: 2021-03-17 22:18:09,687-[agedb_30][98000]XNorm: 22.406967 Training: 2021-03-17 22:18:09,687-[agedb_30][98000]Accuracy-Flip: 0.95000+-0.01090 Training: 2021-03-17 22:18:09,687-[agedb_30][98000]Accuracy-Highest: 0.95550 Training: 2021-03-17 22:18:20,180-Speed 592.93 samples/sec Loss 6.8872 Epoch: 5 Global Step: 98050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:18:30,910-Speed 4771.59 samples/sec Loss 6.9027 Epoch: 5 Global Step: 98100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:18:41,734-Speed 4730.76 samples/sec Loss 6.9370 Epoch: 5 
Global Step: 98150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:18:52,405-Speed 4798.25 samples/sec Loss 6.9184 Epoch: 5 Global Step: 98200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:19:02,944-Speed 4858.47 samples/sec Loss 6.9109 Epoch: 5 Global Step: 98250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:19:13,610-Speed 4800.59 samples/sec Loss 6.9425 Epoch: 5 Global Step: 98300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:19:24,385-Speed 4751.94 samples/sec Loss 6.9101 Epoch: 5 Global Step: 98350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:19:34,993-Speed 4827.02 samples/sec Loss 6.9127 Epoch: 5 Global Step: 98400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:19:45,558-Speed 4846.69 samples/sec Loss 6.9118 Epoch: 5 Global Step: 98450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:19:56,216-Speed 4803.85 samples/sec Loss 6.9370 Epoch: 5 Global Step: 98500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:20:06,791-Speed 4841.94 samples/sec Loss 6.9262 Epoch: 5 Global Step: 98550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:20:17,397-Speed 4828.00 samples/sec Loss 6.9688 Epoch: 5 Global Step: 98600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:20:28,124-Speed 4773.22 samples/sec Loss 6.9197 Epoch: 5 Global Step: 98650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:20:38,983-Speed 4715.23 samples/sec Loss 7.0041 Epoch: 5 Global Step: 98700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:20:50,446-Speed 4466.86 samples/sec Loss 6.9265 Epoch: 5 Global Step: 98750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:21:01,057-Speed 4825.44 samples/sec Loss 6.9088 Epoch: 5 Global Step: 98800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:21:11,685-Speed 4817.93 samples/sec Loss 6.9091 Epoch: 5 Global 
Step: 98850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:21:22,381-Speed 4787.18 samples/sec Loss 6.9759 Epoch: 5 Global Step: 98900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:21:33,141-Speed 4758.73 samples/sec Loss 6.9580 Epoch: 5 Global Step: 98950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:21:43,748-Speed 4827.51 samples/sec Loss 6.9529 Epoch: 5 Global Step: 99000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:21:54,401-Speed 4806.17 samples/sec Loss 6.9413 Epoch: 5 Global Step: 99050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:22:05,114-Speed 4779.36 samples/sec Loss 6.9524 Epoch: 5 Global Step: 99100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:22:15,729-Speed 4823.66 samples/sec Loss 6.9181 Epoch: 5 Global Step: 99150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:22:26,521-Speed 4744.77 samples/sec Loss 6.9183 Epoch: 5 Global Step: 99200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:22:37,318-Speed 4742.35 samples/sec Loss 6.8929 Epoch: 5 Global Step: 99250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:22:48,053-Speed 4769.66 samples/sec Loss 6.9572 Epoch: 5 Global Step: 99300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:22:59,684-Speed 4402.49 samples/sec Loss 6.9002 Epoch: 5 Global Step: 99350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:23:10,652-Speed 4668.31 samples/sec Loss 6.9441 Epoch: 5 Global Step: 99400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:23:21,332-Speed 4794.51 samples/sec Loss 6.9274 Epoch: 5 Global Step: 99450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:23:32,214-Speed 4705.33 samples/sec Loss 6.9420 Epoch: 5 Global Step: 99500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:23:43,044-Speed 4728.00 samples/sec Loss 7.0035 Epoch: 5 Global Step: 99550 
Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:23:53,686-Speed 4811.12 samples/sec Loss 6.8869 Epoch: 5 Global Step: 99600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:24:04,525-Speed 4724.12 samples/sec Loss 6.9011 Epoch: 5 Global Step: 99650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:24:15,496-Speed 4667.14 samples/sec Loss 6.8922 Epoch: 5 Global Step: 99700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:24:27,034-Speed 4437.73 samples/sec Loss 6.9520 Epoch: 5 Global Step: 99750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:24:37,681-Speed 4808.91 samples/sec Loss 6.8931 Epoch: 5 Global Step: 99800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:24:48,469-Speed 4746.26 samples/sec Loss 6.9738 Epoch: 5 Global Step: 99850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:24:59,218-Speed 4763.74 samples/sec Loss 6.9903 Epoch: 5 Global Step: 99900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:25:10,808-Speed 4417.83 samples/sec Loss 6.9041 Epoch: 5 Global Step: 99950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:25:21,560-Speed 4761.85 samples/sec Loss 6.9422 Epoch: 5 Global Step: 100000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:25:46,060-[lfw][100000]XNorm: 22.634965 Training: 2021-03-17 22:25:46,061-[lfw][100000]Accuracy-Flip: 0.99367+-0.00379 Training: 2021-03-17 22:25:46,061-[lfw][100000]Accuracy-Highest: 0.99667 Training: 2021-03-17 22:26:13,541-[cfp_fp][100000]XNorm: 18.380471 Training: 2021-03-17 22:26:13,542-[cfp_fp][100000]Accuracy-Flip: 0.91257+-0.01532 Training: 2021-03-17 22:26:13,542-[cfp_fp][100000]Accuracy-Highest: 0.93900 Training: 2021-03-17 22:26:37,321-[agedb_30][100000]XNorm: 21.580727 Training: 2021-03-17 22:26:37,321-[agedb_30][100000]Accuracy-Flip: 0.95350+-0.00998 Training: 2021-03-17 22:26:37,321-[agedb_30][100000]Accuracy-Highest: 0.95550 
Training: 2021-03-17 22:26:49,718-Speed 580.78 samples/sec Loss 6.9310 Epoch: 5 Global Step: 100050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:27:00,250-Speed 4861.65 samples/sec Loss 6.9656 Epoch: 5 Global Step: 100100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:27:23,524-Speed 2199.97 samples/sec Loss 6.8801 Epoch: 6 Global Step: 100150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:27:35,140-Speed 4407.96 samples/sec Loss 6.2269 Epoch: 6 Global Step: 100200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:27:46,109-Speed 4667.77 samples/sec Loss 6.1912 Epoch: 6 Global Step: 100250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:27:56,834-Speed 4774.28 samples/sec Loss 6.2277 Epoch: 6 Global Step: 100300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:28:07,719-Speed 4704.12 samples/sec Loss 6.2758 Epoch: 6 Global Step: 100350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:28:18,542-Speed 4731.06 samples/sec Loss 6.2991 Epoch: 6 Global Step: 100400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:28:29,183-Speed 4811.67 samples/sec Loss 6.3060 Epoch: 6 Global Step: 100450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:28:39,776-Speed 4834.07 samples/sec Loss 6.3264 Epoch: 6 Global Step: 100500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:28:51,391-Speed 4408.34 samples/sec Loss 6.3525 Epoch: 6 Global Step: 100550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:29:01,858-Speed 4891.53 samples/sec Loss 6.3962 Epoch: 6 Global Step: 100600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:29:12,698-Speed 4723.97 samples/sec Loss 6.4200 Epoch: 6 Global Step: 100650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:29:23,580-Speed 4705.18 samples/sec Loss 6.4235 Epoch: 6 Global Step: 100700 Fp16 Grad Scale: 16384 Required: 17 
hours Training: 2021-03-17 22:29:34,242-Speed 4802.47 samples/sec Loss 6.3912 Epoch: 6 Global Step: 100750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:29:44,973-Speed 4771.14 samples/sec Loss 6.4317 Epoch: 6 Global Step: 100800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:29:55,741-Speed 4755.07 samples/sec Loss 6.4636 Epoch: 6 Global Step: 100850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:30:06,522-Speed 4749.83 samples/sec Loss 6.4877 Epoch: 6 Global Step: 100900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:30:17,386-Speed 4713.14 samples/sec Loss 6.4672 Epoch: 6 Global Step: 100950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:30:28,218-Speed 4726.67 samples/sec Loss 6.4575 Epoch: 6 Global Step: 101000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:30:38,962-Speed 4765.96 samples/sec Loss 6.4836 Epoch: 6 Global Step: 101050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:30:49,601-Speed 4813.06 samples/sec Loss 6.4956 Epoch: 6 Global Step: 101100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:31:00,178-Speed 4840.84 samples/sec Loss 6.5364 Epoch: 6 Global Step: 101150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:31:10,777-Speed 4830.90 samples/sec Loss 6.4847 Epoch: 6 Global Step: 101200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:31:21,445-Speed 4799.58 samples/sec Loss 6.5308 Epoch: 6 Global Step: 101250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:31:32,146-Speed 4785.03 samples/sec Loss 6.5491 Epoch: 6 Global Step: 101300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:31:42,916-Speed 4754.26 samples/sec Loss 6.5035 Epoch: 6 Global Step: 101350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:31:53,608-Speed 4789.18 samples/sec Loss 6.5512 Epoch: 6 Global Step: 101400 Fp16 Grad Scale: 16384 Required: 
17 hours Training: 2021-03-17 22:32:04,492-Speed 4704.31 samples/sec Loss 6.5409 Epoch: 6 Global Step: 101450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:32:15,276-Speed 4748.03 samples/sec Loss 6.5584 Epoch: 6 Global Step: 101500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:32:25,936-Speed 4802.89 samples/sec Loss 6.5294 Epoch: 6 Global Step: 101550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:32:37,297-Speed 4506.96 samples/sec Loss 6.6126 Epoch: 6 Global Step: 101600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:32:48,071-Speed 4752.57 samples/sec Loss 6.5477 Epoch: 6 Global Step: 101650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:32:59,162-Speed 4616.77 samples/sec Loss 6.6102 Epoch: 6 Global Step: 101700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:33:10,158-Speed 4656.39 samples/sec Loss 6.6165 Epoch: 6 Global Step: 101750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:33:20,759-Speed 4830.05 samples/sec Loss 6.5957 Epoch: 6 Global Step: 101800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:33:31,631-Speed 4709.95 samples/sec Loss 6.6083 Epoch: 6 Global Step: 101850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:33:42,260-Speed 4817.24 samples/sec Loss 6.6245 Epoch: 6 Global Step: 101900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:33:52,908-Speed 4808.74 samples/sec Loss 6.6298 Epoch: 6 Global Step: 101950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:34:03,724-Speed 4733.88 samples/sec Loss 6.6075 Epoch: 6 Global Step: 102000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:34:28,095-[lfw][102000]XNorm: 22.637178 Training: 2021-03-17 22:34:28,096-[lfw][102000]Accuracy-Flip: 0.99350+-0.00302 Training: 2021-03-17 22:34:28,096-[lfw][102000]Accuracy-Highest: 0.99667 Training: 2021-03-17 22:34:55,611-[cfp_fp][102000]XNorm: 
18.118861 Training: 2021-03-17 22:34:55,612-[cfp_fp][102000]Accuracy-Flip: 0.92729+-0.00683 Training: 2021-03-17 22:34:55,612-[cfp_fp][102000]Accuracy-Highest: 0.93900 Training: 2021-03-17 22:35:19,410-[agedb_30][102000]XNorm: 21.282621 Training: 2021-03-17 22:35:19,410-[agedb_30][102000]Accuracy-Flip: 0.94883+-0.01090 Training: 2021-03-17 22:35:19,410-[agedb_30][102000]Accuracy-Highest: 0.95550 Training: 2021-03-17 22:35:30,038-Speed 593.19 samples/sec Loss 6.6369 Epoch: 6 Global Step: 102050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:35:41,010-Speed 4667.03 samples/sec Loss 6.6437 Epoch: 6 Global Step: 102100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:35:51,729-Speed 4776.59 samples/sec Loss 6.6461 Epoch: 6 Global Step: 102150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:36:02,549-Speed 4732.25 samples/sec Loss 6.6688 Epoch: 6 Global Step: 102200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:36:13,514-Speed 4669.80 samples/sec Loss 6.6320 Epoch: 6 Global Step: 102250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:36:25,007-Speed 4454.95 samples/sec Loss 6.6209 Epoch: 6 Global Step: 102300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:36:35,669-Speed 4802.59 samples/sec Loss 6.6774 Epoch: 6 Global Step: 102350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:36:46,545-Speed 4708.04 samples/sec Loss 6.6777 Epoch: 6 Global Step: 102400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:36:57,233-Speed 4790.96 samples/sec Loss 6.6837 Epoch: 6 Global Step: 102450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:37:07,984-Speed 4762.18 samples/sec Loss 6.6786 Epoch: 6 Global Step: 102500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:37:18,988-Speed 4653.15 samples/sec Loss 6.7009 Epoch: 6 Global Step: 102550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 
22:37:30,531-Speed 4435.96 samples/sec Loss 6.6679 Epoch: 6 Global Step: 102600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:37:40,965-Speed 4907.58 samples/sec Loss 6.7081 Epoch: 6 Global Step: 102650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-17 22:37:51,493-Speed 4863.42 samples/sec Loss 6.7083 Epoch: 6 Global Step: 102700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:38:02,074-Speed 4838.91 samples/sec Loss 6.6700 Epoch: 6 Global Step: 102750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:38:14,225-Speed 4214.16 samples/sec Loss 6.7021 Epoch: 6 Global Step: 102800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:38:25,157-Speed 4683.57 samples/sec Loss 6.7078 Epoch: 6 Global Step: 102850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:38:36,924-Speed 4351.33 samples/sec Loss 6.7163 Epoch: 6 Global Step: 102900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:38:48,468-Speed 4435.79 samples/sec Loss 6.6764 Epoch: 6 Global Step: 102950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:38:59,141-Speed 4797.05 samples/sec Loss 6.7133 Epoch: 6 Global Step: 103000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:39:09,932-Speed 4744.93 samples/sec Loss 6.7985 Epoch: 6 Global Step: 103050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:39:21,409-Speed 4461.63 samples/sec Loss 6.7224 Epoch: 6 Global Step: 103100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:39:32,040-Speed 4816.34 samples/sec Loss 6.7468 Epoch: 6 Global Step: 103150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:39:42,795-Speed 4761.04 samples/sec Loss 6.7651 Epoch: 6 Global Step: 103200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:39:53,648-Speed 4717.49 samples/sec Loss 6.7261 Epoch: 6 Global Step: 103250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 
2021-03-17 22:40:04,403-Speed 4761.13 samples/sec Loss 6.7407 Epoch: 6 Global Step: 103300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:40:16,074-Speed 4387.19 samples/sec Loss 6.7623 Epoch: 6 Global Step: 103350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:40:26,621-Speed 4854.85 samples/sec Loss 6.7628 Epoch: 6 Global Step: 103400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:40:37,745-Speed 4602.78 samples/sec Loss 6.7434 Epoch: 6 Global Step: 103450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:40:48,290-Speed 4855.84 samples/sec Loss 6.7043 Epoch: 6 Global Step: 103500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:40:58,859-Speed 4844.49 samples/sec Loss 6.7414 Epoch: 6 Global Step: 103550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:41:09,476-Speed 4823.13 samples/sec Loss 6.7792 Epoch: 6 Global Step: 103600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:41:20,162-Speed 4791.59 samples/sec Loss 6.7284 Epoch: 6 Global Step: 103650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:41:31,045-Speed 4704.55 samples/sec Loss 6.8063 Epoch: 6 Global Step: 103700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:41:41,766-Speed 4775.92 samples/sec Loss 6.8338 Epoch: 6 Global Step: 103750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:41:52,440-Speed 4797.25 samples/sec Loss 6.8418 Epoch: 6 Global Step: 103800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:42:03,198-Speed 4759.71 samples/sec Loss 6.8025 Epoch: 6 Global Step: 103850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:42:14,032-Speed 4726.16 samples/sec Loss 6.8279 Epoch: 6 Global Step: 103900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:42:24,753-Speed 4776.13 samples/sec Loss 6.7552 Epoch: 6 Global Step: 103950 Fp16 Grad Scale: 16384 Required: 16 hours 
Training: 2021-03-17 22:42:35,750-Speed 4655.89 samples/sec Loss 6.8088 Epoch: 6 Global Step: 104000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:43:00,265-[lfw][104000]XNorm: 22.266761 Training: 2021-03-17 22:43:00,266-[lfw][104000]Accuracy-Flip: 0.99583+-0.00281 Training: 2021-03-17 22:43:00,266-[lfw][104000]Accuracy-Highest: 0.99667 Training: 2021-03-17 22:43:27,832-[cfp_fp][104000]XNorm: 18.384524 Training: 2021-03-17 22:43:27,832-[cfp_fp][104000]Accuracy-Flip: 0.91957+-0.01379 Training: 2021-03-17 22:43:27,832-[cfp_fp][104000]Accuracy-Highest: 0.93900 Training: 2021-03-17 22:43:51,601-[agedb_30][104000]XNorm: 21.128305 Training: 2021-03-17 22:43:51,601-[agedb_30][104000]Accuracy-Flip: 0.95317+-0.01023 Training: 2021-03-17 22:43:51,601-[agedb_30][104000]Accuracy-Highest: 0.95550 Training: 2021-03-17 22:44:02,370-Speed 591.09 samples/sec Loss 6.7759 Epoch: 6 Global Step: 104050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:44:13,082-Speed 4780.03 samples/sec Loss 6.7508 Epoch: 6 Global Step: 104100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:44:23,756-Speed 4796.94 samples/sec Loss 6.7801 Epoch: 6 Global Step: 104150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:44:34,426-Speed 4798.66 samples/sec Loss 6.7488 Epoch: 6 Global Step: 104200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:44:45,361-Speed 4682.59 samples/sec Loss 6.8161 Epoch: 6 Global Step: 104250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:44:55,926-Speed 4846.31 samples/sec Loss 6.7689 Epoch: 6 Global Step: 104300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:45:06,631-Speed 4783.08 samples/sec Loss 6.8360 Epoch: 6 Global Step: 104350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:45:18,299-Speed 4388.12 samples/sec Loss 6.8620 Epoch: 6 Global Step: 104400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:45:29,193-Speed 
4700.41 samples/sec Loss 6.8407 Epoch: 6 Global Step: 104450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:45:39,864-Speed 4798.23 samples/sec Loss 6.8009 Epoch: 6 Global Step: 104500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:45:50,534-Speed 4799.12 samples/sec Loss 6.7853 Epoch: 6 Global Step: 104550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:46:01,250-Speed 4778.13 samples/sec Loss 6.8044 Epoch: 6 Global Step: 104600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:46:12,113-Speed 4713.46 samples/sec Loss 6.7894 Epoch: 6 Global Step: 104650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:46:23,002-Speed 4702.33 samples/sec Loss 6.8265 Epoch: 6 Global Step: 104700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:46:33,974-Speed 4666.58 samples/sec Loss 6.7991 Epoch: 6 Global Step: 104750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:46:44,881-Speed 4694.80 samples/sec Loss 6.8215 Epoch: 6 Global Step: 104800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:46:55,798-Speed 4690.10 samples/sec Loss 6.8125 Epoch: 6 Global Step: 104850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:47:06,353-Speed 4850.94 samples/sec Loss 6.8211 Epoch: 6 Global Step: 104900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:47:17,086-Speed 4770.64 samples/sec Loss 6.8208 Epoch: 6 Global Step: 104950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:47:27,825-Speed 4767.92 samples/sec Loss 6.8658 Epoch: 6 Global Step: 105000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:47:38,675-Speed 4719.27 samples/sec Loss 6.8434 Epoch: 6 Global Step: 105050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:47:49,495-Speed 4732.28 samples/sec Loss 6.8275 Epoch: 6 Global Step: 105100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 
22:48:00,975-Speed 4460.35 samples/sec Loss 6.8994 Epoch: 6 Global Step: 105150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:48:11,823-Speed 4719.80 samples/sec Loss 6.9236 Epoch: 6 Global Step: 105200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:48:22,673-Speed 4719.12 samples/sec Loss 6.8434 Epoch: 6 Global Step: 105250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:48:33,493-Speed 4732.16 samples/sec Loss 6.8475 Epoch: 6 Global Step: 105300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:48:44,443-Speed 4676.25 samples/sec Loss 6.8306 Epoch: 6 Global Step: 105350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:48:56,090-Speed 4396.09 samples/sec Loss 6.8530 Epoch: 6 Global Step: 105400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:49:06,886-Speed 4743.02 samples/sec Loss 6.8036 Epoch: 6 Global Step: 105450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:49:17,532-Speed 4809.62 samples/sec Loss 6.8433 Epoch: 6 Global Step: 105500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:49:28,068-Speed 4859.68 samples/sec Loss 6.8545 Epoch: 6 Global Step: 105550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:49:38,808-Speed 4767.61 samples/sec Loss 6.8106 Epoch: 6 Global Step: 105600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:49:49,690-Speed 4705.43 samples/sec Loss 6.7742 Epoch: 6 Global Step: 105650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:50:02,281-Speed 4066.42 samples/sec Loss 6.8360 Epoch: 6 Global Step: 105700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:50:12,952-Speed 4798.28 samples/sec Loss 6.8355 Epoch: 6 Global Step: 105750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:50:23,597-Speed 4810.31 samples/sec Loss 6.8887 Epoch: 6 Global Step: 105800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 
2021-03-17 22:50:34,938-Speed 4514.59 samples/sec Loss 6.8730 Epoch: 6 Global Step: 105850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:50:45,794-Speed 4716.46 samples/sec Loss 6.8536 Epoch: 6 Global Step: 105900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:50:56,480-Speed 4791.64 samples/sec Loss 6.8701 Epoch: 6 Global Step: 105950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:51:08,167-Speed 4381.61 samples/sec Loss 6.9207 Epoch: 6 Global Step: 106000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:51:32,749-[lfw][106000]XNorm: 22.185648 Training: 2021-03-17 22:51:32,749-[lfw][106000]Accuracy-Flip: 0.99567+-0.00281 Training: 2021-03-17 22:51:32,749-[lfw][106000]Accuracy-Highest: 0.99667 Training: 2021-03-17 22:52:00,379-[cfp_fp][106000]XNorm: 18.148495 Training: 2021-03-17 22:52:00,379-[cfp_fp][106000]Accuracy-Flip: 0.92543+-0.01195 Training: 2021-03-17 22:52:00,379-[cfp_fp][106000]Accuracy-Highest: 0.93900 Training: 2021-03-17 22:52:24,198-[agedb_30][106000]XNorm: 21.175700 Training: 2021-03-17 22:52:24,198-[agedb_30][106000]Accuracy-Flip: 0.95483+-0.00935 Training: 2021-03-17 22:52:24,198-[agedb_30][106000]Accuracy-Highest: 0.95550 Training: 2021-03-17 22:52:34,888-Speed 590.40 samples/sec Loss 6.8445 Epoch: 6 Global Step: 106050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:52:45,625-Speed 4768.72 samples/sec Loss 6.8227 Epoch: 6 Global Step: 106100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:52:57,366-Speed 4361.00 samples/sec Loss 6.8474 Epoch: 6 Global Step: 106150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:53:08,300-Speed 4682.86 samples/sec Loss 6.8771 Epoch: 6 Global Step: 106200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:53:19,055-Speed 4760.86 samples/sec Loss 6.9047 Epoch: 6 Global Step: 106250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:53:29,797-Speed 4766.71 
samples/sec Loss 6.9523 Epoch: 6 Global Step: 106300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:53:40,707-Speed 4693.27 samples/sec Loss 6.8760 Epoch: 6 Global Step: 106350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:53:51,323-Speed 4823.22 samples/sec Loss 6.8807 Epoch: 6 Global Step: 106400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:54:02,220-Speed 4698.66 samples/sec Loss 6.8708 Epoch: 6 Global Step: 106450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:54:12,977-Speed 4760.04 samples/sec Loss 6.8685 Epoch: 6 Global Step: 106500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:54:23,697-Speed 4776.34 samples/sec Loss 6.9166 Epoch: 6 Global Step: 106550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:54:34,520-Speed 4731.27 samples/sec Loss 6.8150 Epoch: 6 Global Step: 106600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:54:45,353-Speed 4726.40 samples/sec Loss 6.9184 Epoch: 6 Global Step: 106650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:54:56,021-Speed 4799.70 samples/sec Loss 6.8610 Epoch: 6 Global Step: 106700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:55:06,700-Speed 4795.04 samples/sec Loss 6.9164 Epoch: 6 Global Step: 106750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:55:17,660-Speed 4672.14 samples/sec Loss 6.8754 Epoch: 6 Global Step: 106800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:55:28,124-Speed 4892.97 samples/sec Loss 6.8675 Epoch: 6 Global Step: 106850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:55:38,817-Speed 4788.40 samples/sec Loss 6.8403 Epoch: 6 Global Step: 106900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:55:49,384-Speed 4845.89 samples/sec Loss 6.8645 Epoch: 6 Global Step: 106950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:56:00,222-Speed 
4724.35 samples/sec Loss 6.8599 Epoch: 6 Global Step: 107000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:56:11,110-Speed 4703.11 samples/sec Loss 6.8780 Epoch: 6 Global Step: 107050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:56:22,761-Speed 4394.55 samples/sec Loss 6.8823 Epoch: 6 Global Step: 107100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:56:33,312-Speed 4853.17 samples/sec Loss 6.8670 Epoch: 6 Global Step: 107150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:56:43,971-Speed 4803.76 samples/sec Loss 6.8645 Epoch: 6 Global Step: 107200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:56:54,682-Speed 4780.57 samples/sec Loss 6.9012 Epoch: 6 Global Step: 107250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:57:05,263-Speed 4838.99 samples/sec Loss 6.8822 Epoch: 6 Global Step: 107300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:57:16,115-Speed 4718.26 samples/sec Loss 6.9397 Epoch: 6 Global Step: 107350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:57:26,621-Speed 4873.59 samples/sec Loss 6.8700 Epoch: 6 Global Step: 107400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:57:37,370-Speed 4763.97 samples/sec Loss 6.8749 Epoch: 6 Global Step: 107450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:57:48,325-Speed 4673.69 samples/sec Loss 6.9324 Epoch: 6 Global Step: 107500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:57:59,056-Speed 4771.43 samples/sec Loss 6.8847 Epoch: 6 Global Step: 107550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:58:09,847-Speed 4745.23 samples/sec Loss 6.9404 Epoch: 6 Global Step: 107600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:58:20,578-Speed 4771.40 samples/sec Loss 6.8577 Epoch: 6 Global Step: 107650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 
22:58:31,181-Speed 4829.40 samples/sec Loss 6.8775 Epoch: 6 Global Step: 107700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:58:42,040-Speed 4715.12 samples/sec Loss 6.8894 Epoch: 6 Global Step: 107750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:58:52,741-Speed 4785.18 samples/sec Loss 6.9102 Epoch: 6 Global Step: 107800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:59:03,350-Speed 4826.31 samples/sec Loss 6.9062 Epoch: 6 Global Step: 107850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:59:14,224-Speed 4708.42 samples/sec Loss 6.8775 Epoch: 6 Global Step: 107900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:59:25,220-Speed 4656.72 samples/sec Loss 6.9038 Epoch: 6 Global Step: 107950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 22:59:35,950-Speed 4772.04 samples/sec Loss 6.8179 Epoch: 6 Global Step: 108000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:00:00,344-[lfw][108000]XNorm: 21.370210 Training: 2021-03-17 23:00:00,344-[lfw][108000]Accuracy-Flip: 0.99450+-0.00317 Training: 2021-03-17 23:00:00,344-[lfw][108000]Accuracy-Highest: 0.99667 Training: 2021-03-17 23:00:27,906-[cfp_fp][108000]XNorm: 17.483926 Training: 2021-03-17 23:00:27,906-[cfp_fp][108000]Accuracy-Flip: 0.92671+-0.01206 Training: 2021-03-17 23:00:27,906-[cfp_fp][108000]Accuracy-Highest: 0.93900 Training: 2021-03-17 23:00:51,700-[agedb_30][108000]XNorm: 20.792818 Training: 2021-03-17 23:00:51,700-[agedb_30][108000]Accuracy-Flip: 0.95183+-0.00828 Training: 2021-03-17 23:00:51,700-[agedb_30][108000]Accuracy-Highest: 0.95550 Training: 2021-03-17 23:01:02,352-Speed 592.58 samples/sec Loss 6.8870 Epoch: 6 Global Step: 108050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:01:14,119-Speed 4351.48 samples/sec Loss 6.8855 Epoch: 6 Global Step: 108100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:01:24,833-Speed 4778.85 samples/sec 
Loss 6.9067 Epoch: 6 Global Step: 108150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:01:35,682-Speed 4719.65 samples/sec Loss 6.8454 Epoch: 6 Global Step: 108200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:01:46,617-Speed 4682.86 samples/sec Loss 6.9384 Epoch: 6 Global Step: 108250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:01:58,054-Speed 4476.91 samples/sec Loss 6.8569 Epoch: 6 Global Step: 108300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:02:08,824-Speed 4754.12 samples/sec Loss 6.8966 Epoch: 6 Global Step: 108350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:02:19,722-Speed 4698.47 samples/sec Loss 6.8983 Epoch: 6 Global Step: 108400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:02:30,477-Speed 4761.07 samples/sec Loss 6.9206 Epoch: 6 Global Step: 108450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:02:41,323-Speed 4720.66 samples/sec Loss 6.9036 Epoch: 6 Global Step: 108500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:02:52,238-Speed 4691.00 samples/sec Loss 6.8908 Epoch: 6 Global Step: 108550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:03:04,486-Speed 4180.79 samples/sec Loss 6.8911 Epoch: 6 Global Step: 108600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:03:15,302-Speed 4733.58 samples/sec Loss 6.8569 Epoch: 6 Global Step: 108650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:03:26,863-Speed 4428.98 samples/sec Loss 6.8473 Epoch: 6 Global Step: 108700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:03:38,458-Speed 4415.88 samples/sec Loss 6.8218 Epoch: 6 Global Step: 108750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:03:49,370-Speed 4692.41 samples/sec Loss 6.8990 Epoch: 6 Global Step: 108800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:03:59,964-Speed 4833.11 
samples/sec Loss 6.8733 Epoch: 6 Global Step: 108850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:04:11,590-Speed 4404.49 samples/sec Loss 6.8791 Epoch: 6 Global Step: 108900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:04:22,863-Speed 4542.05 samples/sec Loss 6.9246 Epoch: 6 Global Step: 108950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:04:33,476-Speed 4824.59 samples/sec Loss 6.8732 Epoch: 6 Global Step: 109000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:04:44,215-Speed 4768.20 samples/sec Loss 6.9526 Epoch: 6 Global Step: 109050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:04:54,979-Speed 4756.82 samples/sec Loss 6.8557 Epoch: 6 Global Step: 109100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:05:05,947-Speed 4668.58 samples/sec Loss 6.8431 Epoch: 6 Global Step: 109150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:05:16,773-Speed 4729.52 samples/sec Loss 6.8917 Epoch: 6 Global Step: 109200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:05:27,495-Speed 4775.34 samples/sec Loss 6.9013 Epoch: 6 Global Step: 109250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:05:38,436-Speed 4679.94 samples/sec Loss 6.9420 Epoch: 6 Global Step: 109300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:05:49,549-Speed 4607.82 samples/sec Loss 6.8883 Epoch: 6 Global Step: 109350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:06:00,422-Speed 4709.43 samples/sec Loss 6.9170 Epoch: 6 Global Step: 109400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:06:11,096-Speed 4797.13 samples/sec Loss 6.8598 Epoch: 6 Global Step: 109450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:06:21,832-Speed 4769.42 samples/sec Loss 6.8821 Epoch: 6 Global Step: 109500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:06:32,422-Speed 
4834.80 samples/sec Loss 6.9371 Epoch: 6 Global Step: 109550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:06:43,335-Speed 4691.97 samples/sec Loss 6.8521 Epoch: 6 Global Step: 109600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:06:54,202-Speed 4711.69 samples/sec Loss 6.8966 Epoch: 6 Global Step: 109650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:07:04,930-Speed 4772.97 samples/sec Loss 6.8896 Epoch: 6 Global Step: 109700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:07:15,804-Speed 4708.76 samples/sec Loss 6.9082 Epoch: 6 Global Step: 109750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:07:26,443-Speed 4813.07 samples/sec Loss 6.8754 Epoch: 6 Global Step: 109800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:07:37,407-Speed 4669.82 samples/sec Loss 6.8412 Epoch: 6 Global Step: 109850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:07:48,852-Speed 4473.84 samples/sec Loss 6.8704 Epoch: 6 Global Step: 109900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:07:59,480-Speed 4817.79 samples/sec Loss 6.8620 Epoch: 6 Global Step: 109950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:08:10,276-Speed 4742.86 samples/sec Loss 6.9336 Epoch: 6 Global Step: 110000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:08:34,709-[lfw][110000]XNorm: 21.307811 Training: 2021-03-17 23:08:34,710-[lfw][110000]Accuracy-Flip: 0.99250+-0.00318 Training: 2021-03-17 23:08:34,710-[lfw][110000]Accuracy-Highest: 0.99667 Training: 2021-03-17 23:09:02,396-[cfp_fp][110000]XNorm: 17.436788 Training: 2021-03-17 23:09:02,396-[cfp_fp][110000]Accuracy-Flip: 0.92714+-0.01050 Training: 2021-03-17 23:09:02,396-[cfp_fp][110000]Accuracy-Highest: 0.93900 Training: 2021-03-17 23:09:26,231-[agedb_30][110000]XNorm: 20.431218 Training: 2021-03-17 23:09:26,232-[agedb_30][110000]Accuracy-Flip: 0.95550+-0.01131 Training: 
2021-03-17 23:09:26,232-[agedb_30][110000]Accuracy-Highest: 0.95550 Training: 2021-03-17 23:09:36,948-Speed 590.74 samples/sec Loss 6.9083 Epoch: 6 Global Step: 110050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:09:47,486-Speed 4859.01 samples/sec Loss 6.9308 Epoch: 6 Global Step: 110100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:09:58,413-Speed 4685.87 samples/sec Loss 6.9123 Epoch: 6 Global Step: 110150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:10:09,238-Speed 4729.98 samples/sec Loss 6.8403 Epoch: 6 Global Step: 110200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:10:19,755-Speed 4868.54 samples/sec Loss 6.9502 Epoch: 6 Global Step: 110250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:10:30,504-Speed 4763.67 samples/sec Loss 6.8597 Epoch: 6 Global Step: 110300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:10:41,322-Speed 4733.12 samples/sec Loss 6.8959 Epoch: 6 Global Step: 110350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:10:51,992-Speed 4798.92 samples/sec Loss 6.9062 Epoch: 6 Global Step: 110400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:11:02,763-Speed 4753.92 samples/sec Loss 6.9737 Epoch: 6 Global Step: 110450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:11:13,526-Speed 4757.06 samples/sec Loss 6.8211 Epoch: 6 Global Step: 110500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:11:24,185-Speed 4803.94 samples/sec Loss 6.9339 Epoch: 6 Global Step: 110550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:11:35,057-Speed 4709.54 samples/sec Loss 6.9200 Epoch: 6 Global Step: 110600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:11:45,793-Speed 4769.43 samples/sec Loss 6.9108 Epoch: 6 Global Step: 110650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:11:56,505-Speed 4780.04 samples/sec Loss 6.8201 
Epoch: 6 Global Step: 110700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:12:07,216-Speed 4780.04 samples/sec Loss 6.8455 Epoch: 6 Global Step: 110750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:12:18,058-Speed 4722.76 samples/sec Loss 6.8905 Epoch: 6 Global Step: 110800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:12:28,868-Speed 4736.73 samples/sec Loss 6.8777 Epoch: 6 Global Step: 110850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:12:39,607-Speed 4768.25 samples/sec Loss 6.9178 Epoch: 6 Global Step: 110900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:12:50,332-Speed 4774.03 samples/sec Loss 6.8818 Epoch: 6 Global Step: 110950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:13:01,823-Speed 4455.93 samples/sec Loss 6.8788 Epoch: 6 Global Step: 111000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:13:12,506-Speed 4792.70 samples/sec Loss 6.9238 Epoch: 6 Global Step: 111050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:13:23,296-Speed 4745.71 samples/sec Loss 6.9031 Epoch: 6 Global Step: 111100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:13:34,252-Speed 4673.64 samples/sec Loss 6.8805 Epoch: 6 Global Step: 111150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:13:45,839-Speed 4418.74 samples/sec Loss 6.8624 Epoch: 6 Global Step: 111200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:13:56,837-Speed 4656.08 samples/sec Loss 6.8910 Epoch: 6 Global Step: 111250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:14:07,443-Speed 4827.85 samples/sec Loss 6.8963 Epoch: 6 Global Step: 111300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:14:18,291-Speed 4720.17 samples/sec Loss 6.9426 Epoch: 6 Global Step: 111350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:14:29,194-Speed 4696.33 samples/sec Loss 
6.9367 Epoch: 6 Global Step: 111400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:14:40,060-Speed 4711.89 samples/sec Loss 6.8757 Epoch: 6 Global Step: 111450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:14:52,259-Speed 4197.44 samples/sec Loss 6.8887 Epoch: 6 Global Step: 111500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:15:02,930-Speed 4798.44 samples/sec Loss 6.9296 Epoch: 6 Global Step: 111550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:15:14,540-Speed 4410.14 samples/sec Loss 6.8441 Epoch: 6 Global Step: 111600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:15:25,870-Speed 4519.09 samples/sec Loss 6.8633 Epoch: 6 Global Step: 111650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:15:36,576-Speed 4782.72 samples/sec Loss 6.8838 Epoch: 6 Global Step: 111700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:15:48,375-Speed 4339.57 samples/sec Loss 6.8434 Epoch: 6 Global Step: 111750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:15:59,107-Speed 4771.13 samples/sec Loss 6.8579 Epoch: 6 Global Step: 111800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:16:09,891-Speed 4748.09 samples/sec Loss 6.9228 Epoch: 6 Global Step: 111850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:16:20,618-Speed 4773.29 samples/sec Loss 6.9131 Epoch: 6 Global Step: 111900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:16:31,167-Speed 4853.78 samples/sec Loss 6.9243 Epoch: 6 Global Step: 111950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:16:41,977-Speed 4736.83 samples/sec Loss 6.9538 Epoch: 6 Global Step: 112000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:17:05,953-[lfw][112000]XNorm: 21.801429 Training: 2021-03-17 23:17:05,954-[lfw][112000]Accuracy-Flip: 0.99567+-0.00260 Training: 2021-03-17 
23:17:05,954-[lfw][112000]Accuracy-Highest: 0.99667 Training: 2021-03-17 23:17:33,483-[cfp_fp][112000]XNorm: 18.223729 Training: 2021-03-17 23:17:33,483-[cfp_fp][112000]Accuracy-Flip: 0.91057+-0.01388 Training: 2021-03-17 23:17:33,483-[cfp_fp][112000]Accuracy-Highest: 0.93900 Training: 2021-03-17 23:17:57,248-[agedb_30][112000]XNorm: 21.113600 Training: 2021-03-17 23:17:57,249-[agedb_30][112000]Accuracy-Flip: 0.94583+-0.01179 Training: 2021-03-17 23:17:57,249-[agedb_30][112000]Accuracy-Highest: 0.95550 Training: 2021-03-17 23:18:07,811-Speed 596.50 samples/sec Loss 6.9517 Epoch: 6 Global Step: 112050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:18:18,481-Speed 4799.20 samples/sec Loss 6.9005 Epoch: 6 Global Step: 112100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:18:29,194-Speed 4779.27 samples/sec Loss 6.8994 Epoch: 6 Global Step: 112150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:18:40,041-Speed 4720.63 samples/sec Loss 6.8573 Epoch: 6 Global Step: 112200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:18:50,989-Speed 4676.87 samples/sec Loss 6.9019 Epoch: 6 Global Step: 112250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:19:01,806-Speed 4733.69 samples/sec Loss 6.9101 Epoch: 6 Global Step: 112300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:19:12,381-Speed 4841.70 samples/sec Loss 6.8832 Epoch: 6 Global Step: 112350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:19:23,105-Speed 4774.81 samples/sec Loss 6.8559 Epoch: 6 Global Step: 112400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:19:33,752-Speed 4809.10 samples/sec Loss 6.9074 Epoch: 6 Global Step: 112450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:19:44,635-Speed 4704.74 samples/sec Loss 6.9090 Epoch: 6 Global Step: 112500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:19:55,357-Speed 4775.73 samples/sec 
Loss 6.8612 Epoch: 6 Global Step: 112550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:20:05,991-Speed 4814.92 samples/sec Loss 6.8668 Epoch: 6 Global Step: 112600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:20:16,688-Speed 4786.62 samples/sec Loss 6.8794 Epoch: 6 Global Step: 112650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:20:27,359-Speed 4798.11 samples/sec Loss 6.8595 Epoch: 6 Global Step: 112700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:20:38,878-Speed 4445.17 samples/sec Loss 6.9413 Epoch: 6 Global Step: 112750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:20:49,550-Speed 4798.08 samples/sec Loss 6.9107 Epoch: 6 Global Step: 112800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:21:00,478-Speed 4685.49 samples/sec Loss 6.9279 Epoch: 6 Global Step: 112850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:21:11,335-Speed 4716.24 samples/sec Loss 6.8715 Epoch: 6 Global Step: 112900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:21:22,077-Speed 4766.51 samples/sec Loss 6.8854 Epoch: 6 Global Step: 112950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:21:32,915-Speed 4724.73 samples/sec Loss 6.8984 Epoch: 6 Global Step: 113000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:21:43,577-Speed 4802.19 samples/sec Loss 6.9102 Epoch: 6 Global Step: 113050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:21:54,209-Speed 4816.30 samples/sec Loss 6.9255 Epoch: 6 Global Step: 113100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:22:04,885-Speed 4795.75 samples/sec Loss 6.9472 Epoch: 6 Global Step: 113150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:22:15,476-Speed 4834.77 samples/sec Loss 6.8275 Epoch: 6 Global Step: 113200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:22:26,340-Speed 4713.15 
samples/sec Loss 6.9329 Epoch: 6 Global Step: 113250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:22:37,297-Speed 4673.12 samples/sec Loss 6.9729 Epoch: 6 Global Step: 113300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:22:48,162-Speed 4712.72 samples/sec Loss 6.9137 Epoch: 6 Global Step: 113350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:22:58,742-Speed 4839.51 samples/sec Loss 6.8789 Epoch: 6 Global Step: 113400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:23:09,566-Speed 4730.76 samples/sec Loss 6.9110 Epoch: 6 Global Step: 113450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:23:20,425-Speed 4715.27 samples/sec Loss 6.8630 Epoch: 6 Global Step: 113500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:23:31,616-Speed 4575.41 samples/sec Loss 6.8876 Epoch: 6 Global Step: 113550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:23:42,583-Speed 4668.71 samples/sec Loss 6.8718 Epoch: 6 Global Step: 113600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:23:53,269-Speed 4791.60 samples/sec Loss 6.8715 Epoch: 6 Global Step: 113650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:24:04,106-Speed 4724.80 samples/sec Loss 6.8467 Epoch: 6 Global Step: 113700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:24:14,784-Speed 4795.22 samples/sec Loss 6.8974 Epoch: 6 Global Step: 113750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:24:25,534-Speed 4763.26 samples/sec Loss 6.9047 Epoch: 6 Global Step: 113800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:24:36,668-Speed 4598.96 samples/sec Loss 6.9398 Epoch: 6 Global Step: 113850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:24:47,490-Speed 4731.16 samples/sec Loss 6.8988 Epoch: 6 Global Step: 113900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:24:58,976-Speed 
4457.67 samples/sec Loss 6.9299 Epoch: 6 Global Step: 113950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:25:09,682-Speed 4782.62 samples/sec Loss 6.8748 Epoch: 6 Global Step: 114000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:25:34,140-[lfw][114000]XNorm: 23.744195 Training: 2021-03-17 23:25:34,140-[lfw][114000]Accuracy-Flip: 0.99483+-0.00345 Training: 2021-03-17 23:25:34,141-[lfw][114000]Accuracy-Highest: 0.99667 Training: 2021-03-17 23:26:01,647-[cfp_fp][114000]XNorm: 19.420971 Training: 2021-03-17 23:26:01,647-[cfp_fp][114000]Accuracy-Flip: 0.91457+-0.01047 Training: 2021-03-17 23:26:01,647-[cfp_fp][114000]Accuracy-Highest: 0.93900 Training: 2021-03-17 23:26:25,380-[agedb_30][114000]XNorm: 22.973845 Training: 2021-03-17 23:26:25,381-[agedb_30][114000]Accuracy-Flip: 0.95767+-0.00867 Training: 2021-03-17 23:26:25,381-[agedb_30][114000]Accuracy-Highest: 0.95767 Training: 2021-03-17 23:26:36,878-Speed 587.19 samples/sec Loss 6.9022 Epoch: 6 Global Step: 114050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:26:47,505-Speed 4818.09 samples/sec Loss 6.9157 Epoch: 6 Global Step: 114100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:26:58,407-Speed 4696.75 samples/sec Loss 6.8353 Epoch: 6 Global Step: 114150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:27:09,103-Speed 4787.27 samples/sec Loss 6.8494 Epoch: 6 Global Step: 114200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:27:19,585-Speed 4884.83 samples/sec Loss 6.8730 Epoch: 6 Global Step: 114250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:27:30,397-Speed 4735.86 samples/sec Loss 6.8979 Epoch: 6 Global Step: 114300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:27:41,140-Speed 4766.12 samples/sec Loss 6.8813 Epoch: 6 Global Step: 114350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:27:52,843-Speed 4375.09 samples/sec Loss 6.8630 Epoch: 6 
Global Step: 114400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:28:05,285-Speed 4115.46 samples/sec Loss 6.8815 Epoch: 6 Global Step: 114450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:28:17,086-Speed 4338.75 samples/sec Loss 6.8584 Epoch: 6 Global Step: 114500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:28:27,896-Speed 4736.54 samples/sec Loss 6.8063 Epoch: 6 Global Step: 114550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:28:39,942-Speed 4250.77 samples/sec Loss 6.7809 Epoch: 6 Global Step: 114600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:28:50,668-Speed 4773.79 samples/sec Loss 6.9235 Epoch: 6 Global Step: 114650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:29:01,535-Speed 4711.59 samples/sec Loss 6.8984 Epoch: 6 Global Step: 114700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:29:12,412-Speed 4707.75 samples/sec Loss 6.9167 Epoch: 6 Global Step: 114750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:29:23,310-Speed 4698.43 samples/sec Loss 6.8813 Epoch: 6 Global Step: 114800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:29:33,990-Speed 4794.23 samples/sec Loss 6.8989 Epoch: 6 Global Step: 114850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:29:44,732-Speed 4766.31 samples/sec Loss 6.9249 Epoch: 6 Global Step: 114900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:29:55,485-Speed 4762.08 samples/sec Loss 6.8753 Epoch: 6 Global Step: 114950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:30:06,230-Speed 4764.98 samples/sec Loss 6.8817 Epoch: 6 Global Step: 115000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:30:17,029-Speed 4741.45 samples/sec Loss 6.8980 Epoch: 6 Global Step: 115050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:30:27,915-Speed 4703.90 samples/sec Loss 6.9078 Epoch: 
6 Global Step: 115100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:30:38,520-Speed 4828.10 samples/sec Loss 6.9564 Epoch: 6 Global Step: 115150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:30:49,358-Speed 4724.28 samples/sec Loss 6.8613 Epoch: 6 Global Step: 115200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:30:59,934-Speed 4841.35 samples/sec Loss 6.9222 Epoch: 6 Global Step: 115250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:31:10,758-Speed 4730.49 samples/sec Loss 6.8737 Epoch: 6 Global Step: 115300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:31:21,381-Speed 4820.12 samples/sec Loss 6.8875 Epoch: 6 Global Step: 115350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:31:32,117-Speed 4769.25 samples/sec Loss 6.8385 Epoch: 6 Global Step: 115400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:31:42,709-Speed 4834.42 samples/sec Loss 6.9093 Epoch: 6 Global Step: 115450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:31:53,277-Speed 4844.93 samples/sec Loss 6.8591 Epoch: 6 Global Step: 115500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:32:04,156-Speed 4706.74 samples/sec Loss 6.8988 Epoch: 6 Global Step: 115550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:32:16,074-Speed 4296.37 samples/sec Loss 6.8246 Epoch: 6 Global Step: 115600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:32:26,949-Speed 4708.29 samples/sec Loss 6.9375 Epoch: 6 Global Step: 115650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:32:37,578-Speed 4817.62 samples/sec Loss 6.8807 Epoch: 6 Global Step: 115700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:32:48,706-Speed 4601.04 samples/sec Loss 6.8755 Epoch: 6 Global Step: 115750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:32:59,411-Speed 4783.33 samples/sec Loss 6.9311 
Epoch: 6 Global Step: 115800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:33:10,021-Speed 4825.94 samples/sec Loss 6.8912 Epoch: 6 Global Step: 115850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:33:20,849-Speed 4729.01 samples/sec Loss 6.8852 Epoch: 6 Global Step: 115900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:33:31,668-Speed 4732.72 samples/sec Loss 6.8936 Epoch: 6 Global Step: 115950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:33:42,537-Speed 4710.83 samples/sec Loss 6.9014 Epoch: 6 Global Step: 116000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:34:06,652-[lfw][116000]XNorm: 23.489022 Training: 2021-03-17 23:34:06,652-[lfw][116000]Accuracy-Flip: 0.99600+-0.00335 Training: 2021-03-17 23:34:06,652-[lfw][116000]Accuracy-Highest: 0.99667 Training: 2021-03-17 23:34:34,109-[cfp_fp][116000]XNorm: 19.465752 Training: 2021-03-17 23:34:34,109-[cfp_fp][116000]Accuracy-Flip: 0.90443+-0.01757 Training: 2021-03-17 23:34:34,109-[cfp_fp][116000]Accuracy-Highest: 0.93900 Training: 2021-03-17 23:34:57,777-[agedb_30][116000]XNorm: 21.878698 Training: 2021-03-17 23:34:57,777-[agedb_30][116000]Accuracy-Flip: 0.95367+-0.01002 Training: 2021-03-17 23:34:57,777-[agedb_30][116000]Accuracy-Highest: 0.95767 Training: 2021-03-17 23:35:08,437-Speed 596.05 samples/sec Loss 6.9317 Epoch: 6 Global Step: 116050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:35:19,255-Speed 4733.08 samples/sec Loss 6.8629 Epoch: 6 Global Step: 116100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:35:29,999-Speed 4765.49 samples/sec Loss 6.8861 Epoch: 6 Global Step: 116150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:35:40,716-Speed 4777.62 samples/sec Loss 6.8969 Epoch: 6 Global Step: 116200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:35:51,386-Speed 4799.10 samples/sec Loss 6.8654 Epoch: 6 Global Step: 116250 Fp16 Grad 
Scale: 16384 Required: 16 hours Training: 2021-03-17 23:36:02,107-Speed 4775.94 samples/sec Loss 6.8651 Epoch: 6 Global Step: 116300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:36:12,827-Speed 4776.08 samples/sec Loss 6.8176 Epoch: 6 Global Step: 116350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:36:23,558-Speed 4771.91 samples/sec Loss 6.9469 Epoch: 6 Global Step: 116400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:36:34,330-Speed 4753.29 samples/sec Loss 6.8883 Epoch: 6 Global Step: 116450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:36:45,165-Speed 4725.60 samples/sec Loss 6.8667 Epoch: 6 Global Step: 116500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:36:55,984-Speed 4732.91 samples/sec Loss 6.8971 Epoch: 6 Global Step: 116550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-17 23:37:06,814-Speed 4727.66 samples/sec Loss 6.9229 Epoch: 6 Global Step: 116600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:37:17,575-Speed 4758.40 samples/sec Loss 6.8944 Epoch: 6 Global Step: 116650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:37:28,320-Speed 4765.46 samples/sec Loss 6.9111 Epoch: 6 Global Step: 116700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:37:39,196-Speed 4707.50 samples/sec Loss 6.8549 Epoch: 6 Global Step: 116750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:37:50,980-Speed 4345.24 samples/sec Loss 6.8896 Epoch: 6 Global Step: 116800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:38:14,708-Speed 2157.91 samples/sec Loss 6.6161 Epoch: 7 Global Step: 116850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:38:27,372-Speed 4043.05 samples/sec Loss 5.2699 Epoch: 7 Global Step: 116900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:38:38,252-Speed 4706.29 samples/sec Loss 4.9182 Epoch: 7 Global Step: 116950 Fp16 
Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:38:49,111-Speed 4715.45 samples/sec Loss 4.7038 Epoch: 7 Global Step: 117000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:38:59,807-Speed 4786.96 samples/sec Loss 4.6519 Epoch: 7 Global Step: 117050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:39:10,807-Speed 4655.05 samples/sec Loss 4.5142 Epoch: 7 Global Step: 117100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:39:21,483-Speed 4796.03 samples/sec Loss 4.4665 Epoch: 7 Global Step: 117150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:39:32,169-Speed 4791.81 samples/sec Loss 4.4108 Epoch: 7 Global Step: 117200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:39:45,876-Speed 3735.49 samples/sec Loss 4.3320 Epoch: 7 Global Step: 117250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:39:56,663-Speed 4746.81 samples/sec Loss 4.2991 Epoch: 7 Global Step: 117300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:40:07,531-Speed 4711.39 samples/sec Loss 4.2585 Epoch: 7 Global Step: 117350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:40:19,995-Speed 4108.07 samples/sec Loss 4.2821 Epoch: 7 Global Step: 117400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:40:30,670-Speed 4796.87 samples/sec Loss 4.2022 Epoch: 7 Global Step: 117450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:40:41,430-Speed 4758.29 samples/sec Loss 4.1409 Epoch: 7 Global Step: 117500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:40:52,219-Speed 4746.30 samples/sec Loss 4.1476 Epoch: 7 Global Step: 117550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:41:02,986-Speed 4755.61 samples/sec Loss 4.1024 Epoch: 7 Global Step: 117600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:41:13,849-Speed 4713.36 samples/sec Loss 4.0669 Epoch: 7 Global Step: 117650 
Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:41:24,456-Speed 4827.56 samples/sec Loss 4.0743 Epoch: 7 Global Step: 117700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:41:35,240-Speed 4748.03 samples/sec Loss 4.0749 Epoch: 7 Global Step: 117750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:41:45,929-Speed 4790.43 samples/sec Loss 3.9891 Epoch: 7 Global Step: 117800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:41:56,769-Speed 4723.36 samples/sec Loss 4.0504 Epoch: 7 Global Step: 117850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:42:07,544-Speed 4752.06 samples/sec Loss 3.9661 Epoch: 7 Global Step: 117900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:42:18,119-Speed 4841.78 samples/sec Loss 3.9769 Epoch: 7 Global Step: 117950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:42:28,900-Speed 4749.29 samples/sec Loss 3.8724 Epoch: 7 Global Step: 118000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:42:53,345-[lfw][118000]XNorm: 22.970354 Training: 2021-03-17 23:42:53,345-[lfw][118000]Accuracy-Flip: 0.99767+-0.00271 Training: 2021-03-17 23:42:53,345-[lfw][118000]Accuracy-Highest: 0.99767 Training: 2021-03-17 23:43:21,068-[cfp_fp][118000]XNorm: 19.182332 Training: 2021-03-17 23:43:21,068-[cfp_fp][118000]Accuracy-Flip: 0.96671+-0.00862 Training: 2021-03-17 23:43:21,068-[cfp_fp][118000]Accuracy-Highest: 0.96671 Training: 2021-03-17 23:43:45,241-[agedb_30][118000]XNorm: 22.245920 Training: 2021-03-17 23:43:45,241-[agedb_30][118000]Accuracy-Flip: 0.97033+-0.00586 Training: 2021-03-17 23:43:45,241-[agedb_30][118000]Accuracy-Highest: 0.97033 Training: 2021-03-17 23:43:55,957-Speed 588.12 samples/sec Loss 3.9078 Epoch: 7 Global Step: 118050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:44:06,544-Speed 4836.49 samples/sec Loss 3.8811 Epoch: 7 Global Step: 118100 Fp16 Grad Scale: 16384 Required: 15 hours 
Training: 2021-03-17 23:44:17,288-Speed 4765.69 samples/sec Loss 3.8840 Epoch: 7 Global Step: 118150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:44:27,943-Speed 4805.56 samples/sec Loss 3.8479 Epoch: 7 Global Step: 118200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:44:38,786-Speed 4722.45 samples/sec Loss 3.8357 Epoch: 7 Global Step: 118250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:44:49,642-Speed 4716.50 samples/sec Loss 3.8138 Epoch: 7 Global Step: 118300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:45:00,307-Speed 4800.76 samples/sec Loss 3.8364 Epoch: 7 Global Step: 118350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:45:11,053-Speed 4764.98 samples/sec Loss 3.7940 Epoch: 7 Global Step: 118400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:45:22,693-Speed 4398.97 samples/sec Loss 3.7624 Epoch: 7 Global Step: 118450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:45:33,276-Speed 4838.34 samples/sec Loss 3.7482 Epoch: 7 Global Step: 118500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:45:44,076-Speed 4740.75 samples/sec Loss 3.7457 Epoch: 7 Global Step: 118550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:45:54,792-Speed 4778.36 samples/sec Loss 3.7800 Epoch: 7 Global Step: 118600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:46:05,785-Speed 4657.69 samples/sec Loss 3.6982 Epoch: 7 Global Step: 118650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:46:16,657-Speed 4709.62 samples/sec Loss 3.6839 Epoch: 7 Global Step: 118700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:46:27,618-Speed 4671.82 samples/sec Loss 3.6732 Epoch: 7 Global Step: 118750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:46:38,278-Speed 4802.87 samples/sec Loss 3.6837 Epoch: 7 Global Step: 118800 Fp16 Grad Scale: 16384 Required: 15 
hours Training: 2021-03-17 23:46:49,118-Speed 4723.82 samples/sec Loss 3.6641 Epoch: 7 Global Step: 118850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:46:59,730-Speed 4824.68 samples/sec Loss 3.6958 Epoch: 7 Global Step: 118900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:47:10,521-Speed 4745.38 samples/sec Loss 3.6475 Epoch: 7 Global Step: 118950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:47:21,130-Speed 4826.36 samples/sec Loss 3.6309 Epoch: 7 Global Step: 119000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:47:31,943-Speed 4735.25 samples/sec Loss 3.6202 Epoch: 7 Global Step: 119050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:47:42,816-Speed 4709.03 samples/sec Loss 3.6512 Epoch: 7 Global Step: 119100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:47:53,603-Speed 4746.55 samples/sec Loss 3.5888 Epoch: 7 Global Step: 119150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:48:04,467-Speed 4713.10 samples/sec Loss 3.5902 Epoch: 7 Global Step: 119200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:48:15,283-Speed 4734.29 samples/sec Loss 3.6158 Epoch: 7 Global Step: 119250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:48:26,074-Speed 4744.86 samples/sec Loss 3.5971 Epoch: 7 Global Step: 119300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:48:36,744-Speed 4798.81 samples/sec Loss 3.5762 Epoch: 7 Global Step: 119350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:48:47,759-Speed 4648.71 samples/sec Loss 3.5382 Epoch: 7 Global Step: 119400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:48:58,356-Speed 4831.56 samples/sec Loss 3.5575 Epoch: 7 Global Step: 119450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:49:09,232-Speed 4708.11 samples/sec Loss 3.5467 Epoch: 7 Global Step: 119500 Fp16 Grad Scale: 16384 Required: 
15 hours Training: 2021-03-17 23:49:19,977-Speed 4765.19 samples/sec Loss 3.5407 Epoch: 7 Global Step: 119550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:49:30,737-Speed 4758.63 samples/sec Loss 3.5132 Epoch: 7 Global Step: 119600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:49:42,279-Speed 4436.06 samples/sec Loss 3.4824 Epoch: 7 Global Step: 119650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:49:53,700-Speed 4483.21 samples/sec Loss 3.5203 Epoch: 7 Global Step: 119700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:50:04,398-Speed 4786.31 samples/sec Loss 3.4852 Epoch: 7 Global Step: 119750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:50:15,084-Speed 4791.77 samples/sec Loss 3.5215 Epoch: 7 Global Step: 119800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:50:25,908-Speed 4730.42 samples/sec Loss 3.4922 Epoch: 7 Global Step: 119850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:50:36,680-Speed 4753.51 samples/sec Loss 3.4557 Epoch: 7 Global Step: 119900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:50:47,366-Speed 4791.44 samples/sec Loss 3.4722 Epoch: 7 Global Step: 119950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:50:58,120-Speed 4761.30 samples/sec Loss 3.4313 Epoch: 7 Global Step: 120000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:51:22,159-[lfw][120000]XNorm: 22.398188 Training: 2021-03-17 23:51:22,159-[lfw][120000]Accuracy-Flip: 0.99633+-0.00245 Training: 2021-03-17 23:51:22,159-[lfw][120000]Accuracy-Highest: 0.99767 Training: 2021-03-17 23:51:49,676-[cfp_fp][120000]XNorm: 19.117957 Training: 2021-03-17 23:51:49,676-[cfp_fp][120000]Accuracy-Flip: 0.97329+-0.00899 Training: 2021-03-17 23:51:49,676-[cfp_fp][120000]Accuracy-Highest: 0.97329 Training: 2021-03-17 23:52:13,466-[agedb_30][120000]XNorm: 21.782649 Training: 2021-03-17 
23:52:13,466-[agedb_30][120000]Accuracy-Flip: 0.97133+-0.00745 Training: 2021-03-17 23:52:13,466-[agedb_30][120000]Accuracy-Highest: 0.97133 Training: 2021-03-17 23:52:24,161-Speed 595.07 samples/sec Loss 3.4424 Epoch: 7 Global Step: 120050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:52:35,846-Speed 4382.01 samples/sec Loss 3.4235 Epoch: 7 Global Step: 120100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:52:47,454-Speed 4410.88 samples/sec Loss 3.4271 Epoch: 7 Global Step: 120150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:52:58,914-Speed 4467.83 samples/sec Loss 3.4178 Epoch: 7 Global Step: 120200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:53:10,632-Speed 4369.84 samples/sec Loss 3.4082 Epoch: 7 Global Step: 120250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:53:21,302-Speed 4798.87 samples/sec Loss 3.3888 Epoch: 7 Global Step: 120300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:53:33,025-Speed 4367.63 samples/sec Loss 3.4321 Epoch: 7 Global Step: 120350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:53:43,789-Speed 4756.82 samples/sec Loss 3.4172 Epoch: 7 Global Step: 120400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:53:54,280-Speed 4880.54 samples/sec Loss 3.3993 Epoch: 7 Global Step: 120450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:54:04,944-Speed 4801.59 samples/sec Loss 3.3819 Epoch: 7 Global Step: 120500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:54:15,684-Speed 4767.45 samples/sec Loss 3.3603 Epoch: 7 Global Step: 120550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:54:26,363-Speed 4795.03 samples/sec Loss 3.3912 Epoch: 7 Global Step: 120600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:54:37,015-Speed 4806.83 samples/sec Loss 3.4000 Epoch: 7 Global Step: 120650 Fp16 Grad Scale: 16384 Required: 15 hours 
Training: 2021-03-17 23:54:47,977-Speed 4670.68 samples/sec Loss 3.3554 Epoch: 7 Global Step: 120700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:54:58,836-Speed 4715.31 samples/sec Loss 3.3307 Epoch: 7 Global Step: 120750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:55:09,687-Speed 4718.58 samples/sec Loss 3.3608 Epoch: 7 Global Step: 120800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:55:20,707-Speed 4646.61 samples/sec Loss 3.3179 Epoch: 7 Global Step: 120850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:55:31,341-Speed 4815.19 samples/sec Loss 3.3516 Epoch: 7 Global Step: 120900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:55:42,079-Speed 4768.31 samples/sec Loss 3.2889 Epoch: 7 Global Step: 120950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:55:53,038-Speed 4672.12 samples/sec Loss 3.2928 Epoch: 7 Global Step: 121000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:56:04,030-Speed 4658.17 samples/sec Loss 3.3110 Epoch: 7 Global Step: 121050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:56:14,739-Speed 4781.93 samples/sec Loss 3.3243 Epoch: 7 Global Step: 121100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:56:25,370-Speed 4816.24 samples/sec Loss 3.3206 Epoch: 7 Global Step: 121150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:56:36,152-Speed 4748.83 samples/sec Loss 3.3213 Epoch: 7 Global Step: 121200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:56:47,008-Speed 4716.61 samples/sec Loss 3.3123 Epoch: 7 Global Step: 121250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:56:58,639-Speed 4402.21 samples/sec Loss 3.2772 Epoch: 7 Global Step: 121300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:57:09,364-Speed 4774.17 samples/sec Loss 3.3255 Epoch: 7 Global Step: 121350 Fp16 Grad Scale: 16384 Required: 15 
hours Training: 2021-03-17 23:57:20,184-Speed 4732.55 samples/sec Loss 3.3220 Epoch: 7 Global Step: 121400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:57:30,915-Speed 4771.40 samples/sec Loss 3.2768 Epoch: 7 Global Step: 121450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:57:41,572-Speed 4804.62 samples/sec Loss 3.2795 Epoch: 7 Global Step: 121500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:57:52,378-Speed 4738.35 samples/sec Loss 3.2877 Epoch: 7 Global Step: 121550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:58:03,185-Speed 4738.03 samples/sec Loss 3.2615 Epoch: 7 Global Step: 121600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:58:14,147-Speed 4670.81 samples/sec Loss 3.2857 Epoch: 7 Global Step: 121650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:58:25,112-Speed 4669.77 samples/sec Loss 3.2760 Epoch: 7 Global Step: 121700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:58:36,155-Speed 4636.98 samples/sec Loss 3.2370 Epoch: 7 Global Step: 121750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:58:46,897-Speed 4766.41 samples/sec Loss 3.2683 Epoch: 7 Global Step: 121800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:58:57,922-Speed 4644.35 samples/sec Loss 3.2634 Epoch: 7 Global Step: 121850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:59:08,607-Speed 4791.96 samples/sec Loss 3.2842 Epoch: 7 Global Step: 121900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:59:19,430-Speed 4731.14 samples/sec Loss 3.2710 Epoch: 7 Global Step: 121950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:59:30,347-Speed 4690.11 samples/sec Loss 3.2452 Epoch: 7 Global Step: 122000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-17 23:59:54,339-[lfw][122000]XNorm: 22.918114 Training: 2021-03-17 23:59:54,340-[lfw][122000]Accuracy-Flip: 
0.99600+-0.00260 Training: 2021-03-17 23:59:54,340-[lfw][122000]Accuracy-Highest: 0.99767 Training: 2021-03-18 00:00:21,951-[cfp_fp][122000]XNorm: 19.405143 Training: 2021-03-18 00:00:21,951-[cfp_fp][122000]Accuracy-Flip: 0.97143+-0.00720 Training: 2021-03-18 00:00:21,951-[cfp_fp][122000]Accuracy-Highest: 0.97329 Training: 2021-03-18 00:00:45,675-[agedb_30][122000]XNorm: 22.273245 Training: 2021-03-18 00:00:45,676-[agedb_30][122000]Accuracy-Flip: 0.97300+-0.00710 Training: 2021-03-18 00:00:45,676-[agedb_30][122000]Accuracy-Highest: 0.97300 Training: 2021-03-18 00:00:56,369-Speed 595.20 samples/sec Loss 3.2551 Epoch: 7 Global Step: 122050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:01:06,978-Speed 4826.46 samples/sec Loss 3.2288 Epoch: 7 Global Step: 122100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:01:17,672-Speed 4788.25 samples/sec Loss 3.2379 Epoch: 7 Global Step: 122150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:01:28,405-Speed 4770.48 samples/sec Loss 3.2058 Epoch: 7 Global Step: 122200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:01:39,363-Speed 4672.44 samples/sec Loss 3.2651 Epoch: 7 Global Step: 122250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:01:50,003-Speed 4812.67 samples/sec Loss 3.2118 Epoch: 7 Global Step: 122300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:02:00,932-Speed 4684.62 samples/sec Loss 3.2176 Epoch: 7 Global Step: 122350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:02:11,822-Speed 4702.02 samples/sec Loss 3.2302 Epoch: 7 Global Step: 122400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:02:22,687-Speed 4712.57 samples/sec Loss 3.2262 Epoch: 7 Global Step: 122450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:02:34,389-Speed 4375.67 samples/sec Loss 3.2101 Epoch: 7 Global Step: 122500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 
00:02:44,940-Speed 4852.83 samples/sec Loss 3.2157 Epoch: 7 Global Step: 122550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:02:55,981-Speed 4637.90 samples/sec Loss 3.2004 Epoch: 7 Global Step: 122600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:03:07,826-Speed 4322.59 samples/sec Loss 3.2353 Epoch: 7 Global Step: 122650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:03:18,592-Speed 4755.98 samples/sec Loss 3.2131 Epoch: 7 Global Step: 122700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:03:29,397-Speed 4738.75 samples/sec Loss 3.2020 Epoch: 7 Global Step: 122750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:03:40,375-Speed 4664.26 samples/sec Loss 3.1759 Epoch: 7 Global Step: 122800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:03:51,122-Speed 4764.71 samples/sec Loss 3.1672 Epoch: 7 Global Step: 122850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:04:02,848-Speed 4366.46 samples/sec Loss 3.1996 Epoch: 7 Global Step: 122900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:04:13,469-Speed 4821.02 samples/sec Loss 3.1604 Epoch: 7 Global Step: 122950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:04:25,073-Speed 4412.66 samples/sec Loss 3.1710 Epoch: 7 Global Step: 123000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:04:36,455-Speed 4498.41 samples/sec Loss 3.1938 Epoch: 7 Global Step: 123050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:04:47,410-Speed 4674.02 samples/sec Loss 3.1975 Epoch: 7 Global Step: 123100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:04:58,873-Speed 4466.93 samples/sec Loss 3.1915 Epoch: 7 Global Step: 123150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:05:09,683-Speed 4736.35 samples/sec Loss 3.1751 Epoch: 7 Global Step: 123200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 
2021-03-18 00:05:21,240-Speed 4430.80 samples/sec Loss 3.1845 Epoch: 7 Global Step: 123250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:05:32,185-Speed 4678.23 samples/sec Loss 3.1313 Epoch: 7 Global Step: 123300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:05:43,077-Speed 4700.62 samples/sec Loss 3.2060 Epoch: 7 Global Step: 123350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:05:53,956-Speed 4706.57 samples/sec Loss 3.1774 Epoch: 7 Global Step: 123400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:06:04,754-Speed 4741.91 samples/sec Loss 3.1878 Epoch: 7 Global Step: 123450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:06:15,442-Speed 4790.64 samples/sec Loss 3.1600 Epoch: 7 Global Step: 123500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:06:26,365-Speed 4688.00 samples/sec Loss 3.1503 Epoch: 7 Global Step: 123550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:06:37,038-Speed 4797.19 samples/sec Loss 3.1514 Epoch: 7 Global Step: 123600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:06:47,792-Speed 4761.51 samples/sec Loss 3.1507 Epoch: 7 Global Step: 123650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:06:58,348-Speed 4850.55 samples/sec Loss 3.1723 Epoch: 7 Global Step: 123700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:07:09,295-Speed 4677.44 samples/sec Loss 3.1587 Epoch: 7 Global Step: 123750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:07:20,117-Speed 4731.75 samples/sec Loss 3.1538 Epoch: 7 Global Step: 123800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:07:30,888-Speed 4753.38 samples/sec Loss 3.1599 Epoch: 7 Global Step: 123850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:07:41,513-Speed 4819.18 samples/sec Loss 3.1251 Epoch: 7 Global Step: 123900 Fp16 Grad Scale: 16384 Required: 15 hours 
Training: 2021-03-18 00:07:52,349-Speed 4725.42 samples/sec Loss 3.1279 Epoch: 7 Global Step: 123950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:08:02,957-Speed 4826.80 samples/sec Loss 3.1522 Epoch: 7 Global Step: 124000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:08:27,110-[lfw][124000]XNorm: 23.802128 Training: 2021-03-18 00:08:27,110-[lfw][124000]Accuracy-Flip: 0.99717+-0.00308 Training: 2021-03-18 00:08:27,110-[lfw][124000]Accuracy-Highest: 0.99767 Training: 2021-03-18 00:08:54,558-[cfp_fp][124000]XNorm: 20.107713 Training: 2021-03-18 00:08:54,558-[cfp_fp][124000]Accuracy-Flip: 0.97571+-0.00759 Training: 2021-03-18 00:08:54,559-[cfp_fp][124000]Accuracy-Highest: 0.97571 Training: 2021-03-18 00:09:18,251-[agedb_30][124000]XNorm: 22.991490 Training: 2021-03-18 00:09:18,251-[agedb_30][124000]Accuracy-Flip: 0.97317+-0.00545 Training: 2021-03-18 00:09:18,252-[agedb_30][124000]Accuracy-Highest: 0.97317 Training: 2021-03-18 00:09:29,058-Speed 594.66 samples/sec Loss 3.1394 Epoch: 7 Global Step: 124050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:09:40,910-Speed 4320.14 samples/sec Loss 3.1035 Epoch: 7 Global Step: 124100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:09:51,505-Speed 4832.62 samples/sec Loss 3.1579 Epoch: 7 Global Step: 124150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:10:02,440-Speed 4682.73 samples/sec Loss 3.1521 Epoch: 7 Global Step: 124200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:10:13,139-Speed 4785.91 samples/sec Loss 3.1023 Epoch: 7 Global Step: 124250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:10:23,800-Speed 4802.66 samples/sec Loss 3.1454 Epoch: 7 Global Step: 124300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:10:34,457-Speed 4804.60 samples/sec Loss 3.1460 Epoch: 7 Global Step: 124350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:10:45,209-Speed 
4762.51 samples/sec Loss 3.1486 Epoch: 7 Global Step: 124400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:10:56,192-Speed 4661.72 samples/sec Loss 3.1508 Epoch: 7 Global Step: 124450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:11:06,850-Speed 4804.07 samples/sec Loss 3.1175 Epoch: 7 Global Step: 124500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:11:17,694-Speed 4721.83 samples/sec Loss 3.0777 Epoch: 7 Global Step: 124550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:11:28,021-Speed 4958.08 samples/sec Loss 3.0610 Epoch: 7 Global Step: 124600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:11:38,817-Speed 4742.84 samples/sec Loss 3.1373 Epoch: 7 Global Step: 124650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:11:49,315-Speed 4877.44 samples/sec Loss 3.0964 Epoch: 7 Global Step: 124700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:12:00,012-Speed 4786.80 samples/sec Loss 3.1066 Epoch: 7 Global Step: 124750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:12:10,971-Speed 4672.36 samples/sec Loss 3.0923 Epoch: 7 Global Step: 124800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:12:21,765-Speed 4743.44 samples/sec Loss 3.0961 Epoch: 7 Global Step: 124850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:12:32,621-Speed 4716.60 samples/sec Loss 3.0957 Epoch: 7 Global Step: 124900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:12:43,491-Speed 4710.59 samples/sec Loss 3.0783 Epoch: 7 Global Step: 124950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:12:54,309-Speed 4733.12 samples/sec Loss 3.0698 Epoch: 7 Global Step: 125000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:13:04,932-Speed 4819.98 samples/sec Loss 3.1378 Epoch: 7 Global Step: 125050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 
00:13:15,631-Speed 4785.69 samples/sec Loss 3.0989 Epoch: 7 Global Step: 125100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:13:26,198-Speed 4845.87 samples/sec Loss 3.1170 Epoch: 7 Global Step: 125150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:13:36,806-Speed 4826.43 samples/sec Loss 3.0643 Epoch: 7 Global Step: 125200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:13:47,614-Speed 4737.57 samples/sec Loss 3.0872 Epoch: 7 Global Step: 125250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:13:58,438-Speed 4731.07 samples/sec Loss 3.1225 Epoch: 7 Global Step: 125300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:14:09,958-Speed 4444.85 samples/sec Loss 3.1033 Epoch: 7 Global Step: 125350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:14:20,653-Speed 4787.54 samples/sec Loss 3.0627 Epoch: 7 Global Step: 125400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:14:31,352-Speed 4785.58 samples/sec Loss 3.0630 Epoch: 7 Global Step: 125450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:14:43,017-Speed 4389.47 samples/sec Loss 3.0486 Epoch: 7 Global Step: 125500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:14:53,768-Speed 4762.84 samples/sec Loss 3.0733 Epoch: 7 Global Step: 125550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:15:04,533-Speed 4756.20 samples/sec Loss 3.0885 Epoch: 7 Global Step: 125600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:15:15,230-Speed 4786.79 samples/sec Loss 3.1002 Epoch: 7 Global Step: 125650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:15:25,921-Speed 4789.31 samples/sec Loss 3.0678 Epoch: 7 Global Step: 125700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:15:37,526-Speed 4412.40 samples/sec Loss 3.0578 Epoch: 7 Global Step: 125750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 
2021-03-18 00:15:48,079-Speed 4851.80 samples/sec Loss 3.0380 Epoch: 7 Global Step: 125800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:15:58,705-Speed 4818.86 samples/sec Loss 3.0492 Epoch: 7 Global Step: 125850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:16:10,176-Speed 4463.45 samples/sec Loss 3.0932 Epoch: 7 Global Step: 125900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:16:22,706-Speed 4086.35 samples/sec Loss 3.0954 Epoch: 7 Global Step: 125950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:16:33,347-Speed 4812.02 samples/sec Loss 3.1062 Epoch: 7 Global Step: 126000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:16:57,971-[lfw][126000]XNorm: 23.230135 Training: 2021-03-18 00:16:57,971-[lfw][126000]Accuracy-Flip: 0.99683+-0.00329 Training: 2021-03-18 00:16:57,971-[lfw][126000]Accuracy-Highest: 0.99767 Training: 2021-03-18 00:17:25,413-[cfp_fp][126000]XNorm: 19.790265 Training: 2021-03-18 00:17:25,413-[cfp_fp][126000]Accuracy-Flip: 0.97214+-0.00740 Training: 2021-03-18 00:17:25,413-[cfp_fp][126000]Accuracy-Highest: 0.97571 Training: 2021-03-18 00:17:49,108-[agedb_30][126000]XNorm: 22.580101 Training: 2021-03-18 00:17:49,108-[agedb_30][126000]Accuracy-Flip: 0.97117+-0.00619 Training: 2021-03-18 00:17:49,108-[agedb_30][126000]Accuracy-Highest: 0.97317 Training: 2021-03-18 00:17:59,737-Speed 592.67 samples/sec Loss 3.0473 Epoch: 7 Global Step: 126050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:18:10,779-Speed 4636.99 samples/sec Loss 3.0590 Epoch: 7 Global Step: 126100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:18:22,457-Speed 4384.57 samples/sec Loss 3.0501 Epoch: 7 Global Step: 126150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:18:33,170-Speed 4779.64 samples/sec Loss 3.0721 Epoch: 7 Global Step: 126200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:18:44,027-Speed 4715.81 
samples/sec Loss 3.0761 Epoch: 7 Global Step: 126250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:18:54,918-Speed 4701.68 samples/sec Loss 3.0560 Epoch: 7 Global Step: 126300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:19:05,344-Speed 4911.12 samples/sec Loss 3.0313 Epoch: 7 Global Step: 126350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:19:16,310-Speed 4669.17 samples/sec Loss 3.0519 Epoch: 7 Global Step: 126400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:19:27,062-Speed 4762.17 samples/sec Loss 3.0287 Epoch: 7 Global Step: 126450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:19:37,785-Speed 4774.98 samples/sec Loss 3.0559 Epoch: 7 Global Step: 126500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:19:48,450-Speed 4801.47 samples/sec Loss 2.9965 Epoch: 7 Global Step: 126550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:19:59,019-Speed 4844.33 samples/sec Loss 3.0623 Epoch: 7 Global Step: 126600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:20:09,655-Speed 4814.26 samples/sec Loss 3.0325 Epoch: 7 Global Step: 126650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:20:20,350-Speed 4787.49 samples/sec Loss 3.0327 Epoch: 7 Global Step: 126700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:20:31,021-Speed 4798.51 samples/sec Loss 3.0535 Epoch: 7 Global Step: 126750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:20:41,550-Speed 4862.95 samples/sec Loss 3.0243 Epoch: 7 Global Step: 126800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:20:52,196-Speed 4809.56 samples/sec Loss 3.0413 Epoch: 7 Global Step: 126850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:21:02,739-Speed 4856.82 samples/sec Loss 3.0176 Epoch: 7 Global Step: 126900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:21:14,069-Speed 
4519.25 samples/sec Loss 3.0039 Epoch: 7 Global Step: 126950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:21:24,873-Speed 4739.31 samples/sec Loss 3.0063 Epoch: 7 Global Step: 127000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:21:35,717-Speed 4721.61 samples/sec Loss 3.0081 Epoch: 7 Global Step: 127050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:21:46,371-Speed 4806.17 samples/sec Loss 3.0260 Epoch: 7 Global Step: 127100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:21:57,198-Speed 4729.13 samples/sec Loss 3.0105 Epoch: 7 Global Step: 127150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:22:07,900-Speed 4784.43 samples/sec Loss 3.0628 Epoch: 7 Global Step: 127200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:22:18,811-Speed 4692.86 samples/sec Loss 2.9964 Epoch: 7 Global Step: 127250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:22:29,551-Speed 4767.41 samples/sec Loss 3.0061 Epoch: 7 Global Step: 127300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:22:40,130-Speed 4839.97 samples/sec Loss 3.0507 Epoch: 7 Global Step: 127350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:22:51,012-Speed 4705.50 samples/sec Loss 2.9971 Epoch: 7 Global Step: 127400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:23:01,800-Speed 4746.38 samples/sec Loss 2.9871 Epoch: 7 Global Step: 127450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:23:12,509-Speed 4781.04 samples/sec Loss 3.0155 Epoch: 7 Global Step: 127500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:23:23,461-Speed 4675.54 samples/sec Loss 2.9798 Epoch: 7 Global Step: 127550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:23:34,351-Speed 4701.98 samples/sec Loss 2.9903 Epoch: 7 Global Step: 127600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 
00:23:45,114-Speed 4757.36 samples/sec Loss 3.0169 Epoch: 7 Global Step: 127650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:23:55,896-Speed 4748.84 samples/sec Loss 2.9962 Epoch: 7 Global Step: 127700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:24:06,686-Speed 4745.36 samples/sec Loss 3.0080 Epoch: 7 Global Step: 127750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:24:17,407-Speed 4776.30 samples/sec Loss 2.9917 Epoch: 7 Global Step: 127800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:24:28,204-Speed 4741.98 samples/sec Loss 2.9905 Epoch: 7 Global Step: 127850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:24:38,933-Speed 4772.69 samples/sec Loss 3.0010 Epoch: 7 Global Step: 127900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:24:49,630-Speed 4786.83 samples/sec Loss 2.9893 Epoch: 7 Global Step: 127950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:25:00,183-Speed 4851.64 samples/sec Loss 2.9816 Epoch: 7 Global Step: 128000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:25:24,257-[lfw][128000]XNorm: 22.747712 Training: 2021-03-18 00:25:24,257-[lfw][128000]Accuracy-Flip: 0.99617+-0.00334 Training: 2021-03-18 00:25:24,257-[lfw][128000]Accuracy-Highest: 0.99767 Training: 2021-03-18 00:25:51,767-[cfp_fp][128000]XNorm: 19.618311 Training: 2021-03-18 00:25:51,767-[cfp_fp][128000]Accuracy-Flip: 0.97286+-0.00840 Training: 2021-03-18 00:25:51,767-[cfp_fp][128000]Accuracy-Highest: 0.97571 Training: 2021-03-18 00:26:15,579-[agedb_30][128000]XNorm: 22.147227 Training: 2021-03-18 00:26:15,579-[agedb_30][128000]Accuracy-Flip: 0.97317+-0.00787 Training: 2021-03-18 00:26:15,579-[agedb_30][128000]Accuracy-Highest: 0.97317 Training: 2021-03-18 00:26:26,332-Speed 594.33 samples/sec Loss 2.9651 Epoch: 7 Global Step: 128050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:26:37,141-Speed 4736.69 samples/sec 
Loss 2.9503 Epoch: 7 Global Step: 128100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:26:47,926-Speed 4747.86 samples/sec Loss 2.9796 Epoch: 7 Global Step: 128150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:26:59,601-Speed 4385.42 samples/sec Loss 2.9592 Epoch: 7 Global Step: 128200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:27:10,471-Speed 4710.55 samples/sec Loss 3.0246 Epoch: 7 Global Step: 128250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:27:21,124-Speed 4806.33 samples/sec Loss 2.9705 Epoch: 7 Global Step: 128300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:27:31,811-Speed 4791.38 samples/sec Loss 2.9670 Epoch: 7 Global Step: 128350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:27:43,199-Speed 4496.25 samples/sec Loss 3.0035 Epoch: 7 Global Step: 128400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:27:53,930-Speed 4771.61 samples/sec Loss 2.9496 Epoch: 7 Global Step: 128450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:28:04,852-Speed 4687.66 samples/sec Loss 2.9592 Epoch: 7 Global Step: 128500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:28:15,487-Speed 4814.98 samples/sec Loss 2.9647 Epoch: 7 Global Step: 128550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:28:26,270-Speed 4748.11 samples/sec Loss 2.9787 Epoch: 7 Global Step: 128600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:28:36,865-Speed 4832.95 samples/sec Loss 2.9451 Epoch: 7 Global Step: 128650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:28:47,758-Speed 4700.46 samples/sec Loss 2.9460 Epoch: 7 Global Step: 128700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:28:59,106-Speed 4512.02 samples/sec Loss 2.9654 Epoch: 7 Global Step: 128750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:29:09,935-Speed 4728.37 
samples/sec Loss 2.9489 Epoch: 7 Global Step: 128800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:29:22,153-Speed 4190.87 samples/sec Loss 2.9338 Epoch: 7 Global Step: 128850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:29:33,557-Speed 4490.06 samples/sec Loss 2.9657 Epoch: 7 Global Step: 128900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:29:44,202-Speed 4810.01 samples/sec Loss 2.9804 Epoch: 7 Global Step: 128950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:29:54,907-Speed 4782.80 samples/sec Loss 2.9276 Epoch: 7 Global Step: 129000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:30:05,592-Speed 4792.12 samples/sec Loss 2.9456 Epoch: 7 Global Step: 129050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:30:16,915-Speed 4522.21 samples/sec Loss 2.9489 Epoch: 7 Global Step: 129100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:30:27,625-Speed 4780.92 samples/sec Loss 2.9397 Epoch: 7 Global Step: 129150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:30:38,482-Speed 4716.16 samples/sec Loss 2.9361 Epoch: 7 Global Step: 129200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:30:49,293-Speed 4735.95 samples/sec Loss 2.9582 Epoch: 7 Global Step: 129250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:30:59,940-Speed 4809.32 samples/sec Loss 2.9459 Epoch: 7 Global Step: 129300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:31:10,714-Speed 4752.25 samples/sec Loss 2.9300 Epoch: 7 Global Step: 129350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:31:21,584-Speed 4710.58 samples/sec Loss 2.9407 Epoch: 7 Global Step: 129400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:31:32,433-Speed 4719.70 samples/sec Loss 2.9311 Epoch: 7 Global Step: 129450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:31:43,203-Speed 
4754.11 samples/sec Loss 2.9417 Epoch: 7 Global Step: 129500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:31:53,907-Speed 4783.67 samples/sec Loss 2.9210 Epoch: 7 Global Step: 129550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:32:04,742-Speed 4725.68 samples/sec Loss 2.9053 Epoch: 7 Global Step: 129600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:32:15,441-Speed 4785.74 samples/sec Loss 2.9180 Epoch: 7 Global Step: 129650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:32:27,009-Speed 4426.54 samples/sec Loss 2.9414 Epoch: 7 Global Step: 129700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:32:37,869-Speed 4714.77 samples/sec Loss 2.9229 Epoch: 7 Global Step: 129750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:32:48,706-Speed 4724.80 samples/sec Loss 2.9271 Epoch: 7 Global Step: 129800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:32:59,350-Speed 4810.52 samples/sec Loss 2.9173 Epoch: 7 Global Step: 129850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:33:10,113-Speed 4757.42 samples/sec Loss 2.9410 Epoch: 7 Global Step: 129900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:33:20,915-Speed 4740.41 samples/sec Loss 2.9062 Epoch: 7 Global Step: 129950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:33:31,656-Speed 4767.08 samples/sec Loss 2.9102 Epoch: 7 Global Step: 130000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:33:55,943-[lfw][130000]XNorm: 23.340878 Training: 2021-03-18 00:33:55,944-[lfw][130000]Accuracy-Flip: 0.99633+-0.00287 Training: 2021-03-18 00:33:55,944-[lfw][130000]Accuracy-Highest: 0.99767 Training: 2021-03-18 00:34:23,470-[cfp_fp][130000]XNorm: 20.251353 Training: 2021-03-18 00:34:23,470-[cfp_fp][130000]Accuracy-Flip: 0.97229+-0.00836 Training: 2021-03-18 00:34:23,470-[cfp_fp][130000]Accuracy-Highest: 0.97571 Training: 2021-03-18 
00:34:47,136-[agedb_30][130000]XNorm: 22.826998 Training: 2021-03-18 00:34:47,137-[agedb_30][130000]Accuracy-Flip: 0.97317+-0.00560 Training: 2021-03-18 00:34:47,137-[agedb_30][130000]Accuracy-Highest: 0.97317 Training: 2021-03-18 00:34:57,699-Speed 595.06 samples/sec Loss 2.8901 Epoch: 7 Global Step: 130050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:35:08,499-Speed 4740.85 samples/sec Loss 2.9035 Epoch: 7 Global Step: 130100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:35:19,484-Speed 4661.55 samples/sec Loss 2.8882 Epoch: 7 Global Step: 130150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:35:30,260-Speed 4751.52 samples/sec Loss 2.8875 Epoch: 7 Global Step: 130200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:35:40,935-Speed 4796.45 samples/sec Loss 2.8913 Epoch: 7 Global Step: 130250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:35:51,613-Speed 4795.42 samples/sec Loss 2.8823 Epoch: 7 Global Step: 130300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:36:02,523-Speed 4693.11 samples/sec Loss 2.8781 Epoch: 7 Global Step: 130350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:36:13,185-Speed 4802.54 samples/sec Loss 2.8673 Epoch: 7 Global Step: 130400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:36:24,185-Speed 4654.70 samples/sec Loss 2.8856 Epoch: 7 Global Step: 130450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:36:34,863-Speed 4795.25 samples/sec Loss 2.9244 Epoch: 7 Global Step: 130500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:36:45,605-Speed 4766.49 samples/sec Loss 2.8579 Epoch: 7 Global Step: 130550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-18 00:36:56,297-Speed 4788.97 samples/sec Loss 2.9199 Epoch: 7 Global Step: 130600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:37:06,993-Speed 4787.13 samples/sec Loss 2.8690 
Epoch: 7 Global Step: 130650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:37:17,874-Speed 4705.90 samples/sec Loss 2.8923 Epoch: 7 Global Step: 130700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:37:28,655-Speed 4749.07 samples/sec Loss 2.8790 Epoch: 7 Global Step: 130750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:37:39,491-Speed 4725.51 samples/sec Loss 2.8916 Epoch: 7 Global Step: 130800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:37:50,366-Speed 4708.21 samples/sec Loss 2.8821 Epoch: 7 Global Step: 130850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:38:01,092-Speed 4773.92 samples/sec Loss 2.8523 Epoch: 7 Global Step: 130900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:38:11,849-Speed 4760.07 samples/sec Loss 2.8571 Epoch: 7 Global Step: 130950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:38:22,682-Speed 4726.44 samples/sec Loss 2.8964 Epoch: 7 Global Step: 131000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:38:33,319-Speed 4813.96 samples/sec Loss 2.8507 Epoch: 7 Global Step: 131050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:38:43,936-Speed 4822.94 samples/sec Loss 2.8714 Epoch: 7 Global Step: 131100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:38:55,336-Speed 4491.34 samples/sec Loss 2.8618 Epoch: 7 Global Step: 131150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:39:06,219-Speed 4705.17 samples/sec Loss 2.8862 Epoch: 7 Global Step: 131200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:39:16,926-Speed 4782.06 samples/sec Loss 2.8386 Epoch: 7 Global Step: 131250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:39:28,206-Speed 4539.23 samples/sec Loss 2.8948 Epoch: 7 Global Step: 131300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:39:38,998-Speed 4744.93 samples/sec Loss 
2.8754 Epoch: 7 Global Step: 131350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:39:49,587-Speed 4835.33 samples/sec Loss 2.8423 Epoch: 7 Global Step: 131400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:40:00,515-Speed 4685.43 samples/sec Loss 2.8583 Epoch: 7 Global Step: 131450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:40:11,465-Speed 4676.11 samples/sec Loss 2.8480 Epoch: 7 Global Step: 131500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:40:22,303-Speed 4724.51 samples/sec Loss 2.8826 Epoch: 7 Global Step: 131550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:40:33,893-Speed 4417.59 samples/sec Loss 2.8869 Epoch: 7 Global Step: 131600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:40:44,633-Speed 4767.47 samples/sec Loss 2.8541 Epoch: 7 Global Step: 131650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:40:56,149-Speed 4446.37 samples/sec Loss 2.8705 Epoch: 7 Global Step: 131700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:41:08,760-Speed 4060.19 samples/sec Loss 2.8690 Epoch: 7 Global Step: 131750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:41:19,501-Speed 4767.26 samples/sec Loss 2.8605 Epoch: 7 Global Step: 131800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:41:30,436-Speed 4682.39 samples/sec Loss 2.8490 Epoch: 7 Global Step: 131850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:41:41,446-Speed 4650.56 samples/sec Loss 2.8375 Epoch: 7 Global Step: 131900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:41:52,931-Speed 4458.33 samples/sec Loss 2.8630 Epoch: 7 Global Step: 131950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:42:03,693-Speed 4757.86 samples/sec Loss 2.8612 Epoch: 7 Global Step: 132000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:42:27,414-[lfw][132000]XNorm: 
22.472957 Training: 2021-03-18 00:42:27,415-[lfw][132000]Accuracy-Flip: 0.99717+-0.00236 Training: 2021-03-18 00:42:27,415-[lfw][132000]Accuracy-Highest: 0.99767 Training: 2021-03-18 00:42:54,945-[cfp_fp][132000]XNorm: 19.609068 Training: 2021-03-18 00:42:54,946-[cfp_fp][132000]Accuracy-Flip: 0.97414+-0.00882 Training: 2021-03-18 00:42:54,946-[cfp_fp][132000]Accuracy-Highest: 0.97571 Training: 2021-03-18 00:43:18,839-[agedb_30][132000]XNorm: 22.112945 Training: 2021-03-18 00:43:18,839-[agedb_30][132000]Accuracy-Flip: 0.97283+-0.00582 Training: 2021-03-18 00:43:18,839-[agedb_30][132000]Accuracy-Highest: 0.97317 Training: 2021-03-18 00:43:29,502-Speed 596.68 samples/sec Loss 2.8419 Epoch: 7 Global Step: 132050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:43:40,312-Speed 4736.72 samples/sec Loss 2.8714 Epoch: 7 Global Step: 132100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:43:51,042-Speed 4771.98 samples/sec Loss 2.8348 Epoch: 7 Global Step: 132150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:44:01,973-Speed 4683.89 samples/sec Loss 2.8547 Epoch: 7 Global Step: 132200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:44:12,801-Speed 4729.03 samples/sec Loss 2.8169 Epoch: 7 Global Step: 132250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:44:23,692-Speed 4701.38 samples/sec Loss 2.8330 Epoch: 7 Global Step: 132300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:44:34,305-Speed 4824.73 samples/sec Loss 2.8326 Epoch: 7 Global Step: 132350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:44:44,993-Speed 4790.88 samples/sec Loss 2.8453 Epoch: 7 Global Step: 132400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:44:55,817-Speed 4730.24 samples/sec Loss 2.8444 Epoch: 7 Global Step: 132450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:45:06,564-Speed 4764.25 samples/sec Loss 2.8173 Epoch: 7 Global Step: 
132500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:45:17,317-Speed 4761.75 samples/sec Loss 2.8238 Epoch: 7 Global Step: 132550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:45:28,721-Speed 4490.10 samples/sec Loss 2.8387 Epoch: 7 Global Step: 132600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:45:39,490-Speed 4754.67 samples/sec Loss 2.8229 Epoch: 7 Global Step: 132650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:45:50,310-Speed 4732.39 samples/sec Loss 2.8332 Epoch: 7 Global Step: 132700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:46:01,334-Speed 4644.55 samples/sec Loss 2.8203 Epoch: 7 Global Step: 132750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:46:12,140-Speed 4738.40 samples/sec Loss 2.8025 Epoch: 7 Global Step: 132800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:46:23,067-Speed 4686.30 samples/sec Loss 2.8136 Epoch: 7 Global Step: 132850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:46:33,879-Speed 4735.64 samples/sec Loss 2.7975 Epoch: 7 Global Step: 132900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:46:44,788-Speed 4693.85 samples/sec Loss 2.8338 Epoch: 7 Global Step: 132950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:46:55,636-Speed 4720.00 samples/sec Loss 2.8223 Epoch: 7 Global Step: 133000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:47:06,532-Speed 4699.56 samples/sec Loss 2.7929 Epoch: 7 Global Step: 133050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:47:17,332-Speed 4741.03 samples/sec Loss 2.8251 Epoch: 7 Global Step: 133100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:47:28,125-Speed 4743.99 samples/sec Loss 2.8718 Epoch: 7 Global Step: 133150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:47:39,111-Speed 4660.66 samples/sec Loss 2.7714 Epoch: 7 Global 
Step: 133200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:47:49,845-Speed 4770.49 samples/sec Loss 2.8091 Epoch: 7 Global Step: 133250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:48:00,800-Speed 4673.82 samples/sec Loss 2.7803 Epoch: 7 Global Step: 133300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:48:11,560-Speed 4758.76 samples/sec Loss 2.8118 Epoch: 7 Global Step: 133350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:48:22,402-Speed 4722.53 samples/sec Loss 2.7942 Epoch: 7 Global Step: 133400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:48:33,380-Speed 4664.32 samples/sec Loss 2.8085 Epoch: 7 Global Step: 133450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:48:44,329-Speed 4676.42 samples/sec Loss 2.7737 Epoch: 7 Global Step: 133500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:49:07,800-Speed 2181.49 samples/sec Loss 2.6537 Epoch: 8 Global Step: 133550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:49:18,691-Speed 4701.73 samples/sec Loss 2.4486 Epoch: 8 Global Step: 133600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:49:29,541-Speed 4719.14 samples/sec Loss 2.4645 Epoch: 8 Global Step: 133650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:49:40,626-Speed 4619.13 samples/sec Loss 2.4708 Epoch: 8 Global Step: 133700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:49:51,362-Speed 4769.33 samples/sec Loss 2.4556 Epoch: 8 Global Step: 133750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:50:02,229-Speed 4711.63 samples/sec Loss 2.4243 Epoch: 8 Global Step: 133800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:50:13,235-Speed 4652.31 samples/sec Loss 2.4546 Epoch: 8 Global Step: 133850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:50:24,056-Speed 4732.05 samples/sec Loss 2.4376 Epoch: 8 
Global Step: 133900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:50:34,975-Speed 4689.41 samples/sec Loss 2.4588 Epoch: 8 Global Step: 133950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:50:45,678-Speed 4783.99 samples/sec Loss 2.4525 Epoch: 8 Global Step: 134000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:51:10,145-[lfw][134000]XNorm: 23.333119 Training: 2021-03-18 00:51:10,145-[lfw][134000]Accuracy-Flip: 0.99683+-0.00293 Training: 2021-03-18 00:51:10,145-[lfw][134000]Accuracy-Highest: 0.99767 Training: 2021-03-18 00:51:37,695-[cfp_fp][134000]XNorm: 19.985051 Training: 2021-03-18 00:51:37,696-[cfp_fp][134000]Accuracy-Flip: 0.97400+-0.00964 Training: 2021-03-18 00:51:37,696-[cfp_fp][134000]Accuracy-Highest: 0.97571 Training: 2021-03-18 00:52:01,463-[agedb_30][134000]XNorm: 22.471095 Training: 2021-03-18 00:52:01,464-[agedb_30][134000]Accuracy-Flip: 0.97083+-0.00811 Training: 2021-03-18 00:52:01,464-[agedb_30][134000]Accuracy-Highest: 0.97317 Training: 2021-03-18 00:52:13,268-Speed 584.55 samples/sec Loss 2.4311 Epoch: 8 Global Step: 134050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:52:24,204-Speed 4681.98 samples/sec Loss 2.4467 Epoch: 8 Global Step: 134100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:52:35,082-Speed 4707.29 samples/sec Loss 2.4736 Epoch: 8 Global Step: 134150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:52:46,856-Speed 4348.55 samples/sec Loss 2.4670 Epoch: 8 Global Step: 134200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:52:57,656-Speed 4741.29 samples/sec Loss 2.4602 Epoch: 8 Global Step: 134250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:53:08,718-Speed 4628.95 samples/sec Loss 2.4518 Epoch: 8 Global Step: 134300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:53:19,551-Speed 4726.30 samples/sec Loss 2.4769 Epoch: 8 Global Step: 134350 Fp16 Grad Scale: 
16384 Required: 14 hours Training: 2021-03-18 00:53:30,334-Speed 4748.56 samples/sec Loss 2.4494 Epoch: 8 Global Step: 134400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:53:41,870-Speed 4438.73 samples/sec Loss 2.4379 Epoch: 8 Global Step: 134450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:53:52,563-Speed 4788.43 samples/sec Loss 2.4562 Epoch: 8 Global Step: 134500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:54:05,496-Speed 3959.09 samples/sec Loss 2.4671 Epoch: 8 Global Step: 134550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:54:17,087-Speed 4417.61 samples/sec Loss 2.4786 Epoch: 8 Global Step: 134600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:54:27,778-Speed 4789.03 samples/sec Loss 2.4771 Epoch: 8 Global Step: 134650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:54:38,546-Speed 4755.49 samples/sec Loss 2.4716 Epoch: 8 Global Step: 134700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:54:49,161-Speed 4823.78 samples/sec Loss 2.4469 Epoch: 8 Global Step: 134750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:54:59,822-Speed 4802.61 samples/sec Loss 2.5028 Epoch: 8 Global Step: 134800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:55:11,660-Speed 4325.56 samples/sec Loss 2.4954 Epoch: 8 Global Step: 134850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:55:22,415-Speed 4760.50 samples/sec Loss 2.4713 Epoch: 8 Global Step: 134900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:55:33,356-Speed 4680.03 samples/sec Loss 2.4623 Epoch: 8 Global Step: 134950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:55:44,077-Speed 4776.20 samples/sec Loss 2.4789 Epoch: 8 Global Step: 135000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:55:54,962-Speed 4703.87 samples/sec Loss 2.4574 Epoch: 8 Global Step: 135050 Fp16 Grad 
Scale: 16384 Required: 14 hours Training: 2021-03-18 00:56:05,542-Speed 4839.87 samples/sec Loss 2.4521 Epoch: 8 Global Step: 135100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:56:16,382-Speed 4723.47 samples/sec Loss 2.4684 Epoch: 8 Global Step: 135150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:56:27,056-Speed 4796.79 samples/sec Loss 2.5122 Epoch: 8 Global Step: 135200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:56:37,874-Speed 4733.53 samples/sec Loss 2.5087 Epoch: 8 Global Step: 135250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:56:48,747-Speed 4708.91 samples/sec Loss 2.4980 Epoch: 8 Global Step: 135300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:56:59,512-Speed 4756.73 samples/sec Loss 2.4850 Epoch: 8 Global Step: 135350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:57:11,288-Speed 4347.81 samples/sec Loss 2.4776 Epoch: 8 Global Step: 135400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:57:21,973-Speed 4792.01 samples/sec Loss 2.4750 Epoch: 8 Global Step: 135450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:57:32,960-Speed 4660.48 samples/sec Loss 2.4880 Epoch: 8 Global Step: 135500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:57:43,695-Speed 4769.90 samples/sec Loss 2.4653 Epoch: 8 Global Step: 135550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:57:54,332-Speed 4813.38 samples/sec Loss 2.4824 Epoch: 8 Global Step: 135600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:58:05,207-Speed 4708.64 samples/sec Loss 2.5045 Epoch: 8 Global Step: 135650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:58:15,844-Speed 4813.30 samples/sec Loss 2.5170 Epoch: 8 Global Step: 135700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:58:26,866-Speed 4645.60 samples/sec Loss 2.4934 Epoch: 8 Global Step: 135750 Fp16 
Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:58:37,321-Speed 4897.90 samples/sec Loss 2.4544 Epoch: 8 Global Step: 135800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:58:47,825-Speed 4874.79 samples/sec Loss 2.4900 Epoch: 8 Global Step: 135850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:58:58,391-Speed 4845.98 samples/sec Loss 2.4675 Epoch: 8 Global Step: 135900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:59:09,160-Speed 4754.60 samples/sec Loss 2.5132 Epoch: 8 Global Step: 135950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:59:19,692-Speed 4861.70 samples/sec Loss 2.5123 Epoch: 8 Global Step: 136000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 00:59:43,823-[lfw][136000]XNorm: 23.194706 Training: 2021-03-18 00:59:43,823-[lfw][136000]Accuracy-Flip: 0.99600+-0.00335 Training: 2021-03-18 00:59:43,824-[lfw][136000]Accuracy-Highest: 0.99767 Training: 2021-03-18 01:00:11,335-[cfp_fp][136000]XNorm: 20.064068 Training: 2021-03-18 01:00:11,336-[cfp_fp][136000]Accuracy-Flip: 0.97329+-0.00712 Training: 2021-03-18 01:00:11,337-[cfp_fp][136000]Accuracy-Highest: 0.97571 Training: 2021-03-18 01:00:35,562-[agedb_30][136000]XNorm: 22.531059 Training: 2021-03-18 01:00:35,562-[agedb_30][136000]Accuracy-Flip: 0.97083+-0.00800 Training: 2021-03-18 01:00:35,562-[agedb_30][136000]Accuracy-Highest: 0.97317 Training: 2021-03-18 01:00:46,437-Speed 590.24 samples/sec Loss 2.4916 Epoch: 8 Global Step: 136050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:00:57,279-Speed 4723.07 samples/sec Loss 2.4826 Epoch: 8 Global Step: 136100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:01:08,119-Speed 4723.21 samples/sec Loss 2.4965 Epoch: 8 Global Step: 136150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:01:18,882-Speed 4757.59 samples/sec Loss 2.4526 Epoch: 8 Global Step: 136200 Fp16 Grad Scale: 16384 Required: 14 hours 
Training: 2021-03-18 01:01:29,676-Speed 4743.70 samples/sec Loss 2.4736 Epoch: 8 Global Step: 136250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:01:40,457-Speed 4749.55 samples/sec Loss 2.5028 Epoch: 8 Global Step: 136300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:01:51,237-Speed 4749.69 samples/sec Loss 2.4446 Epoch: 8 Global Step: 136350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:02:01,994-Speed 4759.81 samples/sec Loss 2.4466 Epoch: 8 Global Step: 136400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:02:12,741-Speed 4764.28 samples/sec Loss 2.4891 Epoch: 8 Global Step: 136450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:02:23,648-Speed 4694.47 samples/sec Loss 2.4735 Epoch: 8 Global Step: 136500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:02:34,351-Speed 4784.28 samples/sec Loss 2.4707 Epoch: 8 Global Step: 136550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:02:45,262-Speed 4692.52 samples/sec Loss 2.4922 Epoch: 8 Global Step: 136600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:02:55,977-Speed 4778.69 samples/sec Loss 2.5012 Epoch: 8 Global Step: 136650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:03:06,648-Speed 4798.48 samples/sec Loss 2.5079 Epoch: 8 Global Step: 136700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:03:17,430-Speed 4748.83 samples/sec Loss 2.4556 Epoch: 8 Global Step: 136750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:03:28,313-Speed 4704.75 samples/sec Loss 2.4533 Epoch: 8 Global Step: 136800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:03:39,227-Speed 4691.60 samples/sec Loss 2.5159 Epoch: 8 Global Step: 136850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:03:50,559-Speed 4518.35 samples/sec Loss 2.4833 Epoch: 8 Global Step: 136900 Fp16 Grad Scale: 16384 Required: 14 
hours Training: 2021-03-18 01:04:01,326-Speed 4755.58 samples/sec Loss 2.5277 Epoch: 8 Global Step: 136950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:04:12,155-Speed 4728.19 samples/sec Loss 2.4968 Epoch: 8 Global Step: 137000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:04:23,106-Speed 4675.61 samples/sec Loss 2.5348 Epoch: 8 Global Step: 137050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:04:34,710-Speed 4412.68 samples/sec Loss 2.4969 Epoch: 8 Global Step: 137100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:04:45,518-Speed 4737.53 samples/sec Loss 2.5310 Epoch: 8 Global Step: 137150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:04:56,269-Speed 4762.79 samples/sec Loss 2.4860 Epoch: 8 Global Step: 137200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:05:07,209-Speed 4680.45 samples/sec Loss 2.5199 Epoch: 8 Global Step: 137250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:05:18,773-Speed 4427.91 samples/sec Loss 2.4995 Epoch: 8 Global Step: 137300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:05:29,594-Speed 4731.60 samples/sec Loss 2.5030 Epoch: 8 Global Step: 137350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:05:41,351-Speed 4355.52 samples/sec Loss 2.4988 Epoch: 8 Global Step: 137400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:05:52,776-Speed 4481.61 samples/sec Loss 2.4810 Epoch: 8 Global Step: 137450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:06:04,104-Speed 4519.85 samples/sec Loss 2.4892 Epoch: 8 Global Step: 137500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:06:15,616-Speed 4447.92 samples/sec Loss 2.4942 Epoch: 8 Global Step: 137550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:06:26,631-Speed 4648.56 samples/sec Loss 2.5113 Epoch: 8 Global Step: 137600 Fp16 Grad Scale: 16384 Required: 
14 hours Training: 2021-03-18 01:06:37,335-Speed 4783.19 samples/sec Loss 2.4769 Epoch: 8 Global Step: 137650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:06:48,053-Speed 4777.40 samples/sec Loss 2.4897 Epoch: 8 Global Step: 137700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:06:58,881-Speed 4729.01 samples/sec Loss 2.4794 Epoch: 8 Global Step: 137750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:07:10,345-Speed 4466.29 samples/sec Loss 2.4644 Epoch: 8 Global Step: 137800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:07:21,249-Speed 4696.14 samples/sec Loss 2.4829 Epoch: 8 Global Step: 137850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:07:32,015-Speed 4755.72 samples/sec Loss 2.5034 Epoch: 8 Global Step: 137900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:07:42,567-Speed 4852.66 samples/sec Loss 2.5141 Epoch: 8 Global Step: 137950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:07:53,245-Speed 4795.31 samples/sec Loss 2.5094 Epoch: 8 Global Step: 138000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:08:17,697-[lfw][138000]XNorm: 23.721051 Training: 2021-03-18 01:08:17,697-[lfw][138000]Accuracy-Flip: 0.99600+-0.00300 Training: 2021-03-18 01:08:17,697-[lfw][138000]Accuracy-Highest: 0.99767 Training: 2021-03-18 01:08:45,381-[cfp_fp][138000]XNorm: 20.284733 Training: 2021-03-18 01:08:45,381-[cfp_fp][138000]Accuracy-Flip: 0.97229+-0.00726 Training: 2021-03-18 01:08:45,381-[cfp_fp][138000]Accuracy-Highest: 0.97571 Training: 2021-03-18 01:09:09,180-[agedb_30][138000]XNorm: 23.237878 Training: 2021-03-18 01:09:09,181-[agedb_30][138000]Accuracy-Flip: 0.97133+-0.00722 Training: 2021-03-18 01:09:09,181-[agedb_30][138000]Accuracy-Highest: 0.97317 Training: 2021-03-18 01:09:19,976-Speed 590.33 samples/sec Loss 2.5200 Epoch: 8 Global Step: 138050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 
01:09:30,671-Speed 4787.74 samples/sec Loss 2.4950 Epoch: 8 Global Step: 138100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:09:42,266-Speed 4416.04 samples/sec Loss 2.4777 Epoch: 8 Global Step: 138150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:09:52,924-Speed 4804.44 samples/sec Loss 2.4858 Epoch: 8 Global Step: 138200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:10:03,828-Speed 4695.64 samples/sec Loss 2.5071 Epoch: 8 Global Step: 138250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:10:14,762-Speed 4683.08 samples/sec Loss 2.4953 Epoch: 8 Global Step: 138300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:10:26,006-Speed 4553.81 samples/sec Loss 2.5468 Epoch: 8 Global Step: 138350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:10:36,861-Speed 4716.99 samples/sec Loss 2.5061 Epoch: 8 Global Step: 138400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:10:47,471-Speed 4826.08 samples/sec Loss 2.4887 Epoch: 8 Global Step: 138450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:10:58,140-Speed 4799.23 samples/sec Loss 2.4665 Epoch: 8 Global Step: 138500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:11:09,130-Speed 4658.95 samples/sec Loss 2.5123 Epoch: 8 Global Step: 138550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:11:19,939-Speed 4736.90 samples/sec Loss 2.4951 Epoch: 8 Global Step: 138600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:11:30,847-Speed 4694.44 samples/sec Loss 2.4528 Epoch: 8 Global Step: 138650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:11:41,785-Speed 4680.87 samples/sec Loss 2.5007 Epoch: 8 Global Step: 138700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:11:52,719-Speed 4683.29 samples/sec Loss 2.5014 Epoch: 8 Global Step: 138750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 
2021-03-18 01:12:03,686-Speed 4668.57 samples/sec Loss 2.4931 Epoch: 8 Global Step: 138800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:12:14,533-Speed 4720.69 samples/sec Loss 2.5078 Epoch: 8 Global Step: 138850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:12:25,477-Speed 4678.84 samples/sec Loss 2.4845 Epoch: 8 Global Step: 138900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:12:36,285-Speed 4737.47 samples/sec Loss 2.4616 Epoch: 8 Global Step: 138950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:12:47,380-Speed 4614.75 samples/sec Loss 2.4760 Epoch: 8 Global Step: 139000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:12:58,133-Speed 4761.70 samples/sec Loss 2.5019 Epoch: 8 Global Step: 139050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:13:08,973-Speed 4723.54 samples/sec Loss 2.4839 Epoch: 8 Global Step: 139100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:13:19,924-Speed 4675.78 samples/sec Loss 2.4764 Epoch: 8 Global Step: 139150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:13:30,799-Speed 4708.22 samples/sec Loss 2.4586 Epoch: 8 Global Step: 139200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:13:41,844-Speed 4636.10 samples/sec Loss 2.4924 Epoch: 8 Global Step: 139250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:13:52,660-Speed 4733.78 samples/sec Loss 2.5141 Epoch: 8 Global Step: 139300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:14:03,718-Speed 4630.38 samples/sec Loss 2.4804 Epoch: 8 Global Step: 139350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:14:14,475-Speed 4760.15 samples/sec Loss 2.4784 Epoch: 8 Global Step: 139400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:14:25,142-Speed 4799.86 samples/sec Loss 2.4562 Epoch: 8 Global Step: 139450 Fp16 Grad Scale: 16384 Required: 14 hours 
Training: 2021-03-18 01:14:35,875-Speed 4770.89 samples/sec Loss 2.5220 Epoch: 8 Global Step: 139500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:14:46,606-Speed 4771.41 samples/sec Loss 2.4915 Epoch: 8 Global Step: 139550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:14:57,261-Speed 4805.53 samples/sec Loss 2.5170 Epoch: 8 Global Step: 139600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:15:08,199-Speed 4681.43 samples/sec Loss 2.4861 Epoch: 8 Global Step: 139650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:15:19,013-Speed 4734.52 samples/sec Loss 2.4715 Epoch: 8 Global Step: 139700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:15:30,847-Speed 4327.03 samples/sec Loss 2.4859 Epoch: 8 Global Step: 139750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:15:41,464-Speed 4822.72 samples/sec Loss 2.4828 Epoch: 8 Global Step: 139800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:15:52,101-Speed 4813.57 samples/sec Loss 2.4740 Epoch: 8 Global Step: 139850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:16:03,062-Speed 4671.43 samples/sec Loss 2.4962 Epoch: 8 Global Step: 139900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:16:13,680-Speed 4822.45 samples/sec Loss 2.4699 Epoch: 8 Global Step: 139950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:16:25,468-Speed 4343.58 samples/sec Loss 2.5024 Epoch: 8 Global Step: 140000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:16:49,814-[lfw][140000]XNorm: 23.064227 Training: 2021-03-18 01:16:49,814-[lfw][140000]Accuracy-Flip: 0.99683+-0.00283 Training: 2021-03-18 01:16:49,814-[lfw][140000]Accuracy-Highest: 0.99767 Training: 2021-03-18 01:17:17,532-[cfp_fp][140000]XNorm: 19.919815 Training: 2021-03-18 01:17:17,532-[cfp_fp][140000]Accuracy-Flip: 0.97043+-0.00828 Training: 2021-03-18 
01:17:17,533-[cfp_fp][140000]Accuracy-Highest: 0.97571 Training: 2021-03-18 01:17:41,296-[agedb_30][140000]XNorm: 22.350515 Training: 2021-03-18 01:17:41,297-[agedb_30][140000]Accuracy-Flip: 0.97400+-0.00746 Training: 2021-03-18 01:17:41,297-[agedb_30][140000]Accuracy-Highest: 0.97400 Training: 2021-03-18 01:17:51,813-Speed 592.98 samples/sec Loss 2.5015 Epoch: 8 Global Step: 140050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:18:03,321-Speed 4449.06 samples/sec Loss 2.4934 Epoch: 8 Global Step: 140100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:18:13,964-Speed 4811.24 samples/sec Loss 2.5008 Epoch: 8 Global Step: 140150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:18:24,774-Speed 4736.80 samples/sec Loss 2.5215 Epoch: 8 Global Step: 140200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:18:36,501-Speed 4366.10 samples/sec Loss 2.5335 Epoch: 8 Global Step: 140250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:18:47,414-Speed 4691.70 samples/sec Loss 2.4994 Epoch: 8 Global Step: 140300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:18:58,931-Speed 4446.05 samples/sec Loss 2.5397 Epoch: 8 Global Step: 140350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:19:09,664-Speed 4770.90 samples/sec Loss 2.4926 Epoch: 8 Global Step: 140400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:19:21,265-Speed 4413.61 samples/sec Loss 2.5312 Epoch: 8 Global Step: 140450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:19:32,192-Speed 4686.07 samples/sec Loss 2.4879 Epoch: 8 Global Step: 140500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:19:43,027-Speed 4725.82 samples/sec Loss 2.5014 Epoch: 8 Global Step: 140550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:19:53,869-Speed 4722.69 samples/sec Loss 2.5022 Epoch: 8 Global Step: 140600 Fp16 Grad Scale: 16384 Required: 14 
hours Training: 2021-03-18 01:20:05,380-Speed 4448.30 samples/sec Loss 2.5200 Epoch: 8 Global Step: 140650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:20:16,229-Speed 4719.51 samples/sec Loss 2.4792 Epoch: 8 Global Step: 140700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:20:27,186-Speed 4673.06 samples/sec Loss 2.4911 Epoch: 8 Global Step: 140750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:20:37,857-Speed 4798.33 samples/sec Loss 2.4994 Epoch: 8 Global Step: 140800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:20:48,663-Speed 4738.70 samples/sec Loss 2.4665 Epoch: 8 Global Step: 140850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:20:59,385-Speed 4775.30 samples/sec Loss 2.5178 Epoch: 8 Global Step: 140900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:21:10,242-Speed 4716.16 samples/sec Loss 2.4686 Epoch: 8 Global Step: 140950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:21:22,014-Speed 4349.83 samples/sec Loss 2.4941 Epoch: 8 Global Step: 141000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:21:32,986-Speed 4666.51 samples/sec Loss 2.5009 Epoch: 8 Global Step: 141050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:21:43,849-Speed 4713.92 samples/sec Loss 2.4567 Epoch: 8 Global Step: 141100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:21:54,540-Speed 4789.16 samples/sec Loss 2.4909 Epoch: 8 Global Step: 141150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:22:05,229-Speed 4790.37 samples/sec Loss 2.5072 Epoch: 8 Global Step: 141200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:22:16,041-Speed 4735.77 samples/sec Loss 2.4962 Epoch: 8 Global Step: 141250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:22:27,031-Speed 4659.48 samples/sec Loss 2.4950 Epoch: 8 Global Step: 141300 Fp16 Grad Scale: 16384 Required: 
14 hours Training: 2021-03-18 01:22:37,809-Speed 4750.56 samples/sec Loss 2.4938 Epoch: 8 Global Step: 141350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:22:48,782-Speed 4666.18 samples/sec Loss 2.4586 Epoch: 8 Global Step: 141400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:22:59,515-Speed 4770.68 samples/sec Loss 2.4613 Epoch: 8 Global Step: 141450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:23:10,323-Speed 4737.48 samples/sec Loss 2.4532 Epoch: 8 Global Step: 141500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:23:21,137-Speed 4735.06 samples/sec Loss 2.5217 Epoch: 8 Global Step: 141550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:23:32,063-Speed 4686.29 samples/sec Loss 2.5304 Epoch: 8 Global Step: 141600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:23:42,826-Speed 4757.25 samples/sec Loss 2.4825 Epoch: 8 Global Step: 141650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:23:53,671-Speed 4721.63 samples/sec Loss 2.4552 Epoch: 8 Global Step: 141700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:24:04,333-Speed 4802.08 samples/sec Loss 2.4797 Epoch: 8 Global Step: 141750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:24:15,039-Speed 4782.98 samples/sec Loss 2.4854 Epoch: 8 Global Step: 141800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:24:25,854-Speed 4734.62 samples/sec Loss 2.4891 Epoch: 8 Global Step: 141850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:24:36,717-Speed 4713.25 samples/sec Loss 2.4992 Epoch: 8 Global Step: 141900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:24:47,939-Speed 4562.68 samples/sec Loss 2.5221 Epoch: 8 Global Step: 141950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:24:58,678-Speed 4768.05 samples/sec Loss 2.4922 Epoch: 8 Global Step: 142000 Fp16 Grad Scale: 16384 
Required: 14 hours Training: 2021-03-18 01:25:23,173-[lfw][142000]XNorm: 23.246921 Training: 2021-03-18 01:25:23,173-[lfw][142000]Accuracy-Flip: 0.99750+-0.00271 Training: 2021-03-18 01:25:23,173-[lfw][142000]Accuracy-Highest: 0.99767 Training: 2021-03-18 01:25:50,868-[cfp_fp][142000]XNorm: 20.330184 Training: 2021-03-18 01:25:50,868-[cfp_fp][142000]Accuracy-Flip: 0.97229+-0.00952 Training: 2021-03-18 01:25:50,868-[cfp_fp][142000]Accuracy-Highest: 0.97571 Training: 2021-03-18 01:26:14,781-[agedb_30][142000]XNorm: 22.603372 Training: 2021-03-18 01:26:14,781-[agedb_30][142000]Accuracy-Flip: 0.97233+-0.00564 Training: 2021-03-18 01:26:14,781-[agedb_30][142000]Accuracy-Highest: 0.97400 Training: 2021-03-18 01:26:25,951-Speed 586.67 samples/sec Loss 2.4916 Epoch: 8 Global Step: 142050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:26:36,906-Speed 4674.03 samples/sec Loss 2.5069 Epoch: 8 Global Step: 142100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:26:47,546-Speed 4812.38 samples/sec Loss 2.4689 Epoch: 8 Global Step: 142150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:26:58,574-Speed 4643.00 samples/sec Loss 2.4864 Epoch: 8 Global Step: 142200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:27:09,382-Speed 4737.49 samples/sec Loss 2.4806 Epoch: 8 Global Step: 142250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:27:20,183-Speed 4740.63 samples/sec Loss 2.4970 Epoch: 8 Global Step: 142300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:27:30,907-Speed 4774.94 samples/sec Loss 2.4946 Epoch: 8 Global Step: 142350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:27:41,696-Speed 4745.57 samples/sec Loss 2.4708 Epoch: 8 Global Step: 142400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:27:52,207-Speed 4871.68 samples/sec Loss 2.4818 Epoch: 8 Global Step: 142450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 
01:28:02,941-Speed 4769.92 samples/sec Loss 2.5074 Epoch: 8 Global Step: 142500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:28:14,383-Speed 4475.05 samples/sec Loss 2.4784 Epoch: 8 Global Step: 142550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:28:25,398-Speed 4648.30 samples/sec Loss 2.4843 Epoch: 8 Global Step: 142600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:28:36,158-Speed 4758.67 samples/sec Loss 2.4728 Epoch: 8 Global Step: 142650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:28:47,005-Speed 4720.60 samples/sec Loss 2.5184 Epoch: 8 Global Step: 142700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:28:57,958-Speed 4675.09 samples/sec Loss 2.4770 Epoch: 8 Global Step: 142750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:29:08,903-Speed 4678.24 samples/sec Loss 2.4858 Epoch: 8 Global Step: 142800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:29:19,543-Speed 4812.38 samples/sec Loss 2.5031 Epoch: 8 Global Step: 142850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:29:31,207-Speed 4389.53 samples/sec Loss 2.4952 Epoch: 8 Global Step: 142900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:29:43,207-Speed 4267.07 samples/sec Loss 2.4958 Epoch: 8 Global Step: 142950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:29:53,867-Speed 4803.09 samples/sec Loss 2.4518 Epoch: 8 Global Step: 143000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:30:04,949-Speed 4620.64 samples/sec Loss 2.4876 Epoch: 8 Global Step: 143050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:30:15,715-Speed 4756.09 samples/sec Loss 2.4871 Epoch: 8 Global Step: 143100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:30:26,567-Speed 4718.23 samples/sec Loss 2.4741 Epoch: 8 Global Step: 143150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 
2021-03-18 01:30:38,874-Speed 4160.43 samples/sec Loss 2.4829 Epoch: 8 Global Step: 143200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:30:49,664-Speed 4745.44 samples/sec Loss 2.5049 Epoch: 8 Global Step: 143250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:31:00,532-Speed 4711.59 samples/sec Loss 2.4737 Epoch: 8 Global Step: 143300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:31:12,219-Speed 4381.17 samples/sec Loss 2.4962 Epoch: 8 Global Step: 143350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:31:23,256-Speed 4639.30 samples/sec Loss 2.4903 Epoch: 8 Global Step: 143400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:31:34,023-Speed 4755.25 samples/sec Loss 2.4772 Epoch: 8 Global Step: 143450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:31:44,875-Speed 4718.48 samples/sec Loss 2.4906 Epoch: 8 Global Step: 143500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:31:55,722-Speed 4720.59 samples/sec Loss 2.4873 Epoch: 8 Global Step: 143550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:32:07,155-Speed 4478.41 samples/sec Loss 2.4657 Epoch: 8 Global Step: 143600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:32:17,748-Speed 4833.67 samples/sec Loss 2.4821 Epoch: 8 Global Step: 143650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:32:28,697-Speed 4676.60 samples/sec Loss 2.5016 Epoch: 8 Global Step: 143700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:32:40,156-Speed 4468.29 samples/sec Loss 2.5037 Epoch: 8 Global Step: 143750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:32:50,842-Speed 4791.86 samples/sec Loss 2.4812 Epoch: 8 Global Step: 143800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:33:01,309-Speed 4891.70 samples/sec Loss 2.4358 Epoch: 8 Global Step: 143850 Fp16 Grad Scale: 16384 Required: 14 hours 
Training: 2021-03-18 01:33:12,217-Speed 4694.29 samples/sec Loss 2.4660 Epoch: 8 Global Step: 143900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:33:23,146-Speed 4684.99 samples/sec Loss 2.4508 Epoch: 8 Global Step: 143950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:33:33,917-Speed 4753.89 samples/sec Loss 2.4733 Epoch: 8 Global Step: 144000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:33:57,920-[lfw][144000]XNorm: 21.122772 Training: 2021-03-18 01:33:57,920-[lfw][144000]Accuracy-Flip: 0.99750+-0.00250 Training: 2021-03-18 01:33:57,920-[lfw][144000]Accuracy-Highest: 0.99767 Training: 2021-03-18 01:34:25,353-[cfp_fp][144000]XNorm: 18.052257 Training: 2021-03-18 01:34:25,353-[cfp_fp][144000]Accuracy-Flip: 0.97171+-0.00629 Training: 2021-03-18 01:34:25,353-[cfp_fp][144000]Accuracy-Highest: 0.97571 Training: 2021-03-18 01:34:49,059-[agedb_30][144000]XNorm: 21.085779 Training: 2021-03-18 01:34:49,059-[agedb_30][144000]Accuracy-Flip: 0.97350+-0.00728 Training: 2021-03-18 01:34:49,059-[agedb_30][144000]Accuracy-Highest: 0.97400 Training: 2021-03-18 01:34:59,581-Speed 597.69 samples/sec Loss 2.4748 Epoch: 8 Global Step: 144050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:35:10,606-Speed 4644.18 samples/sec Loss 2.4702 Epoch: 8 Global Step: 144100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:35:21,432-Speed 4729.73 samples/sec Loss 2.4902 Epoch: 8 Global Step: 144150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:35:32,306-Speed 4708.39 samples/sec Loss 2.5234 Epoch: 8 Global Step: 144200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:35:43,141-Speed 4726.00 samples/sec Loss 2.4546 Epoch: 8 Global Step: 144250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:35:54,130-Speed 4659.24 samples/sec Loss 2.4718 Epoch: 8 Global Step: 144300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:36:04,910-Speed 
4750.08 samples/sec Loss 2.4974 Epoch: 8 Global Step: 144350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:36:15,586-Speed 4795.96 samples/sec Loss 2.4952 Epoch: 8 Global Step: 144400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:36:26,532-Speed 4677.83 samples/sec Loss 2.4815 Epoch: 8 Global Step: 144450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:36:37,411-Speed 4706.63 samples/sec Loss 2.4980 Epoch: 8 Global Step: 144500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:36:48,484-Speed 4624.43 samples/sec Loss 2.4945 Epoch: 8 Global Step: 144550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-18 01:36:59,211-Speed 4773.27 samples/sec Loss 2.4784 Epoch: 8 Global Step: 144600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:37:09,901-Speed 4789.94 samples/sec Loss 2.4917 Epoch: 8 Global Step: 144650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:37:20,960-Speed 4629.89 samples/sec Loss 2.4746 Epoch: 8 Global Step: 144700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:37:31,748-Speed 4746.22 samples/sec Loss 2.4601 Epoch: 8 Global Step: 144750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:37:42,618-Speed 4710.35 samples/sec Loss 2.4973 Epoch: 8 Global Step: 144800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:37:53,376-Speed 4759.48 samples/sec Loss 2.4767 Epoch: 8 Global Step: 144850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:38:04,405-Speed 4642.82 samples/sec Loss 2.4969 Epoch: 8 Global Step: 144900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:38:15,119-Speed 4779.25 samples/sec Loss 2.4919 Epoch: 8 Global Step: 144950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:38:25,814-Speed 4787.55 samples/sec Loss 2.4655 Epoch: 8 Global Step: 145000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 
01:38:36,666-Speed 4718.28 samples/sec Loss 2.4696 Epoch: 8 Global Step: 145050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:38:47,578-Speed 4692.07 samples/sec Loss 2.4604 Epoch: 8 Global Step: 145100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:38:58,352-Speed 4752.49 samples/sec Loss 2.4889 Epoch: 8 Global Step: 145150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:39:09,475-Speed 4603.59 samples/sec Loss 2.4519 Epoch: 8 Global Step: 145200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:39:20,389-Speed 4691.91 samples/sec Loss 2.5069 Epoch: 8 Global Step: 145250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:39:31,151-Speed 4757.62 samples/sec Loss 2.4666 Epoch: 8 Global Step: 145300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:39:42,960-Speed 4336.13 samples/sec Loss 2.4731 Epoch: 8 Global Step: 145350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:39:53,816-Speed 4716.56 samples/sec Loss 2.4628 Epoch: 8 Global Step: 145400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:40:04,885-Speed 4625.60 samples/sec Loss 2.4808 Epoch: 8 Global Step: 145450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:40:15,586-Speed 4784.95 samples/sec Loss 2.4579 Epoch: 8 Global Step: 145500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:40:26,490-Speed 4695.88 samples/sec Loss 2.5004 Epoch: 8 Global Step: 145550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:40:37,122-Speed 4815.87 samples/sec Loss 2.4782 Epoch: 8 Global Step: 145600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:40:47,954-Speed 4727.14 samples/sec Loss 2.4813 Epoch: 8 Global Step: 145650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:40:58,776-Speed 4731.29 samples/sec Loss 2.5028 Epoch: 8 Global Step: 145700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 
2021-03-18 01:41:09,732-Speed 4673.77 samples/sec Loss 2.4370 Epoch: 8 Global Step: 145750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:41:20,359-Speed 4818.28 samples/sec Loss 2.4697 Epoch: 8 Global Step: 145800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:41:31,202-Speed 4722.00 samples/sec Loss 2.4681 Epoch: 8 Global Step: 145850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:41:43,643-Speed 4115.58 samples/sec Loss 2.4954 Epoch: 8 Global Step: 145900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:41:54,526-Speed 4704.81 samples/sec Loss 2.4917 Epoch: 8 Global Step: 145950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:42:05,467-Speed 4679.91 samples/sec Loss 2.4929 Epoch: 8 Global Step: 146000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:42:29,835-[lfw][146000]XNorm: 22.849433 Training: 2021-03-18 01:42:29,836-[lfw][146000]Accuracy-Flip: 0.99750+-0.00318 Training: 2021-03-18 01:42:29,836-[lfw][146000]Accuracy-Highest: 0.99767 Training: 2021-03-18 01:42:57,333-[cfp_fp][146000]XNorm: 20.062489 Training: 2021-03-18 01:42:57,333-[cfp_fp][146000]Accuracy-Flip: 0.96971+-0.00852 Training: 2021-03-18 01:42:57,333-[cfp_fp][146000]Accuracy-Highest: 0.97571 Training: 2021-03-18 01:43:21,089-[agedb_30][146000]XNorm: 22.120177 Training: 2021-03-18 01:43:21,089-[agedb_30][146000]Accuracy-Flip: 0.97450+-0.00671 Training: 2021-03-18 01:43:21,089-[agedb_30][146000]Accuracy-Highest: 0.97450 Training: 2021-03-18 01:43:32,791-Speed 586.33 samples/sec Loss 2.5229 Epoch: 8 Global Step: 146050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:43:44,501-Speed 4372.39 samples/sec Loss 2.4850 Epoch: 8 Global Step: 146100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:43:55,413-Speed 4692.56 samples/sec Loss 2.4581 Epoch: 8 Global Step: 146150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:44:06,237-Speed 4730.53 
samples/sec Loss 2.4915 Epoch: 8 Global Step: 146200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:44:17,872-Speed 4400.79 samples/sec Loss 2.4856 Epoch: 8 Global Step: 146250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:44:28,760-Speed 4702.78 samples/sec Loss 2.4906 Epoch: 8 Global Step: 146300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:44:39,629-Speed 4710.95 samples/sec Loss 2.4486 Epoch: 8 Global Step: 146350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:44:50,389-Speed 4758.38 samples/sec Loss 2.4598 Epoch: 8 Global Step: 146400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:45:01,190-Speed 4740.72 samples/sec Loss 2.5196 Epoch: 8 Global Step: 146450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:45:12,765-Speed 4423.69 samples/sec Loss 2.4473 Epoch: 8 Global Step: 146500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:45:23,623-Speed 4715.45 samples/sec Loss 2.4799 Epoch: 8 Global Step: 146550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:45:34,701-Speed 4622.17 samples/sec Loss 2.4980 Epoch: 8 Global Step: 146600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:45:46,352-Speed 4394.61 samples/sec Loss 2.4793 Epoch: 8 Global Step: 146650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:45:57,286-Speed 4683.00 samples/sec Loss 2.4831 Epoch: 8 Global Step: 146700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:46:08,239-Speed 4674.84 samples/sec Loss 2.4767 Epoch: 8 Global Step: 146750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:46:19,264-Speed 4644.32 samples/sec Loss 2.4616 Epoch: 8 Global Step: 146800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:46:30,047-Speed 4748.50 samples/sec Loss 2.4446 Epoch: 8 Global Step: 146850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:46:40,830-Speed 
4748.82 samples/sec Loss 2.4789 Epoch: 8 Global Step: 146900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:46:51,748-Speed 4689.68 samples/sec Loss 2.4857 Epoch: 8 Global Step: 146950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:47:02,704-Speed 4673.32 samples/sec Loss 2.4742 Epoch: 8 Global Step: 147000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:47:13,804-Speed 4613.25 samples/sec Loss 2.4762 Epoch: 8 Global Step: 147050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:47:24,682-Speed 4706.99 samples/sec Loss 2.4863 Epoch: 8 Global Step: 147100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:47:35,499-Speed 4733.42 samples/sec Loss 2.4765 Epoch: 8 Global Step: 147150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:47:46,313-Speed 4734.95 samples/sec Loss 2.4676 Epoch: 8 Global Step: 147200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:47:57,448-Speed 4598.49 samples/sec Loss 2.5009 Epoch: 8 Global Step: 147250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:48:08,456-Speed 4651.47 samples/sec Loss 2.4902 Epoch: 8 Global Step: 147300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:48:19,388-Speed 4683.53 samples/sec Loss 2.4664 Epoch: 8 Global Step: 147350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:48:30,329-Speed 4680.23 samples/sec Loss 2.4642 Epoch: 8 Global Step: 147400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:48:41,264-Speed 4682.56 samples/sec Loss 2.4904 Epoch: 8 Global Step: 147450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:48:52,080-Speed 4733.87 samples/sec Loss 2.4674 Epoch: 8 Global Step: 147500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:49:03,174-Speed 4615.28 samples/sec Loss 2.4670 Epoch: 8 Global Step: 147550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 
01:49:14,284-Speed 4608.76 samples/sec Loss 2.4885 Epoch: 8 Global Step: 147600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:49:25,398-Speed 4607.06 samples/sec Loss 2.4946 Epoch: 8 Global Step: 147650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:49:36,376-Speed 4664.44 samples/sec Loss 2.4605 Epoch: 8 Global Step: 147700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:49:47,112-Speed 4769.18 samples/sec Loss 2.4504 Epoch: 8 Global Step: 147750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:49:58,022-Speed 4693.24 samples/sec Loss 2.4659 Epoch: 8 Global Step: 147800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:50:08,949-Speed 4686.04 samples/sec Loss 2.4532 Epoch: 8 Global Step: 147850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:50:19,674-Speed 4773.93 samples/sec Loss 2.4673 Epoch: 8 Global Step: 147900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:50:30,359-Speed 4792.10 samples/sec Loss 2.4591 Epoch: 8 Global Step: 147950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:50:41,364-Speed 4653.05 samples/sec Loss 2.4503 Epoch: 8 Global Step: 148000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:51:05,710-[lfw][148000]XNorm: 23.307557 Training: 2021-03-18 01:51:05,710-[lfw][148000]Accuracy-Flip: 0.99717+-0.00325 Training: 2021-03-18 01:51:05,710-[lfw][148000]Accuracy-Highest: 0.99767 Training: 2021-03-18 01:51:33,297-[cfp_fp][148000]XNorm: 20.118992 Training: 2021-03-18 01:51:33,297-[cfp_fp][148000]Accuracy-Flip: 0.97486+-0.00801 Training: 2021-03-18 01:51:33,297-[cfp_fp][148000]Accuracy-Highest: 0.97571 Training: 2021-03-18 01:51:57,127-[agedb_30][148000]XNorm: 22.652231 Training: 2021-03-18 01:51:57,127-[agedb_30][148000]Accuracy-Flip: 0.97350+-0.00560 Training: 2021-03-18 01:51:57,127-[agedb_30][148000]Accuracy-Highest: 0.97450 Training: 2021-03-18 01:52:07,981-Speed 591.11 samples/sec 
Loss 2.4464 Epoch: 8 Global Step: 148050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:52:19,009-Speed 4643.04 samples/sec Loss 2.4652 Epoch: 8 Global Step: 148100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:52:30,066-Speed 4631.12 samples/sec Loss 2.4525 Epoch: 8 Global Step: 148150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:52:40,889-Speed 4730.77 samples/sec Loss 2.4818 Epoch: 8 Global Step: 148200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:52:52,500-Speed 4409.82 samples/sec Loss 2.4735 Epoch: 8 Global Step: 148250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:53:03,465-Speed 4670.09 samples/sec Loss 2.4825 Epoch: 8 Global Step: 148300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:53:14,317-Speed 4718.10 samples/sec Loss 2.4634 Epoch: 8 Global Step: 148350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:53:25,078-Speed 4758.28 samples/sec Loss 2.4469 Epoch: 8 Global Step: 148400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:53:35,776-Speed 4786.28 samples/sec Loss 2.4869 Epoch: 8 Global Step: 148450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:53:46,479-Speed 4784.20 samples/sec Loss 2.4692 Epoch: 8 Global Step: 148500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:53:57,320-Speed 4723.25 samples/sec Loss 2.4921 Epoch: 8 Global Step: 148550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:54:08,061-Speed 4767.03 samples/sec Loss 2.4682 Epoch: 8 Global Step: 148600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:54:18,743-Speed 4793.24 samples/sec Loss 2.4689 Epoch: 8 Global Step: 148650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:54:30,654-Speed 4298.94 samples/sec Loss 2.4737 Epoch: 8 Global Step: 148700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:54:42,368-Speed 4371.10 
samples/sec Loss 2.4461 Epoch: 8 Global Step: 148750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:54:53,284-Speed 4690.82 samples/sec Loss 2.4642 Epoch: 8 Global Step: 148800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:55:03,814-Speed 4862.76 samples/sec Loss 2.4959 Epoch: 8 Global Step: 148850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:55:16,491-Speed 4039.13 samples/sec Loss 2.4616 Epoch: 8 Global Step: 148900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:55:27,187-Speed 4787.04 samples/sec Loss 2.4528 Epoch: 8 Global Step: 148950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:55:37,919-Speed 4770.80 samples/sec Loss 2.4403 Epoch: 8 Global Step: 149000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:55:48,641-Speed 4776.09 samples/sec Loss 2.4438 Epoch: 8 Global Step: 149050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:55:59,511-Speed 4710.24 samples/sec Loss 2.4766 Epoch: 8 Global Step: 149100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:56:10,996-Speed 4458.30 samples/sec Loss 2.4852 Epoch: 8 Global Step: 149150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:56:21,756-Speed 4758.68 samples/sec Loss 2.4597 Epoch: 8 Global Step: 149200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:56:33,392-Speed 4400.51 samples/sec Loss 2.4525 Epoch: 8 Global Step: 149250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:56:44,055-Speed 4801.58 samples/sec Loss 2.4496 Epoch: 8 Global Step: 149300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:56:54,728-Speed 4797.70 samples/sec Loss 2.4845 Epoch: 8 Global Step: 149350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:57:05,389-Speed 4802.55 samples/sec Loss 2.4722 Epoch: 8 Global Step: 149400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:57:16,495-Speed 
4610.44 samples/sec Loss 2.4547 Epoch: 8 Global Step: 149450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:57:28,048-Speed 4431.99 samples/sec Loss 2.4495 Epoch: 8 Global Step: 149500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:57:38,969-Speed 4688.68 samples/sec Loss 2.4860 Epoch: 8 Global Step: 149550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:57:49,851-Speed 4705.17 samples/sec Loss 2.4318 Epoch: 8 Global Step: 149600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:58:01,279-Speed 4480.41 samples/sec Loss 2.4403 Epoch: 8 Global Step: 149650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:58:11,988-Speed 4781.27 samples/sec Loss 2.4568 Epoch: 8 Global Step: 149700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:58:22,956-Speed 4668.39 samples/sec Loss 2.4587 Epoch: 8 Global Step: 149750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:58:33,672-Speed 4778.49 samples/sec Loss 2.4573 Epoch: 8 Global Step: 149800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:58:44,492-Speed 4731.91 samples/sec Loss 2.4937 Epoch: 8 Global Step: 149850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:58:55,417-Speed 4686.90 samples/sec Loss 2.4734 Epoch: 8 Global Step: 149900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:59:06,275-Speed 4715.57 samples/sec Loss 2.4716 Epoch: 8 Global Step: 149950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:59:17,410-Speed 4598.76 samples/sec Loss 2.4353 Epoch: 8 Global Step: 150000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 01:59:41,729-[lfw][150000]XNorm: 23.035239 Training: 2021-03-18 01:59:41,730-[lfw][150000]Accuracy-Flip: 0.99733+-0.00291 Training: 2021-03-18 01:59:41,730-[lfw][150000]Accuracy-Highest: 0.99767 Training: 2021-03-18 02:00:09,157-[cfp_fp][150000]XNorm: 19.882059 Training: 2021-03-18 
02:00:09,157-[cfp_fp][150000]Accuracy-Flip: 0.97314+-0.00910 Training: 2021-03-18 02:00:09,158-[cfp_fp][150000]Accuracy-Highest: 0.97571 Training: 2021-03-18 02:00:32,845-[agedb_30][150000]XNorm: 22.542566 Training: 2021-03-18 02:00:32,846-[agedb_30][150000]Accuracy-Flip: 0.97317+-0.00462 Training: 2021-03-18 02:00:32,846-[agedb_30][150000]Accuracy-Highest: 0.97450 Training: 2021-03-18 02:00:43,823-Speed 592.51 samples/sec Loss 2.4586 Epoch: 8 Global Step: 150050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:00:54,635-Speed 4735.56 samples/sec Loss 2.4782 Epoch: 8 Global Step: 150100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:01:05,545-Speed 4693.42 samples/sec Loss 2.4404 Epoch: 8 Global Step: 150150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:01:16,785-Speed 4555.27 samples/sec Loss 2.4747 Epoch: 8 Global Step: 150200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:01:40,467-Speed 2162.07 samples/sec Loss 2.2698 Epoch: 9 Global Step: 150250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:01:51,860-Speed 4494.34 samples/sec Loss 2.1387 Epoch: 9 Global Step: 150300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:02:03,162-Speed 4530.60 samples/sec Loss 2.1498 Epoch: 9 Global Step: 150350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:02:14,206-Speed 4636.72 samples/sec Loss 2.1225 Epoch: 9 Global Step: 150400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:02:25,401-Speed 4573.67 samples/sec Loss 2.1293 Epoch: 9 Global Step: 150450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:02:36,351-Speed 4676.07 samples/sec Loss 2.1143 Epoch: 9 Global Step: 150500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:02:47,573-Speed 4563.06 samples/sec Loss 2.1523 Epoch: 9 Global Step: 150550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:02:58,844-Speed 4542.62 samples/sec 
Loss 2.1249 Epoch: 9 Global Step: 150600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:03:10,065-Speed 4562.98 samples/sec Loss 2.1817 Epoch: 9 Global Step: 150650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:03:21,255-Speed 4575.90 samples/sec Loss 2.1277 Epoch: 9 Global Step: 150700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:03:32,359-Speed 4611.40 samples/sec Loss 2.1670 Epoch: 9 Global Step: 150750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:03:43,479-Speed 4604.32 samples/sec Loss 2.1517 Epoch: 9 Global Step: 150800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:03:54,394-Speed 4691.27 samples/sec Loss 2.1745 Epoch: 9 Global Step: 150850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:04:05,284-Speed 4701.74 samples/sec Loss 2.1936 Epoch: 9 Global Step: 150900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:04:16,242-Speed 4672.77 samples/sec Loss 2.1474 Epoch: 9 Global Step: 150950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:04:27,401-Speed 4588.46 samples/sec Loss 2.1842 Epoch: 9 Global Step: 151000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:04:38,344-Speed 4678.96 samples/sec Loss 2.1677 Epoch: 9 Global Step: 151050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:04:49,722-Speed 4500.17 samples/sec Loss 2.1815 Epoch: 9 Global Step: 151100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:05:01,586-Speed 4315.82 samples/sec Loss 2.1545 Epoch: 9 Global Step: 151150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:05:12,876-Speed 4535.54 samples/sec Loss 2.1749 Epoch: 9 Global Step: 151200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:05:24,068-Speed 4574.76 samples/sec Loss 2.2055 Epoch: 9 Global Step: 151250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:05:35,363-Speed 4533.51 
samples/sec Loss 2.1890 Epoch: 9 Global Step: 151300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:05:46,439-Speed 4622.51 samples/sec Loss 2.1565 Epoch: 9 Global Step: 151350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:05:57,676-Speed 4556.78 samples/sec Loss 2.1891 Epoch: 9 Global Step: 151400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:06:08,518-Speed 4722.86 samples/sec Loss 2.1602 Epoch: 9 Global Step: 151450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:06:19,684-Speed 4585.59 samples/sec Loss 2.1635 Epoch: 9 Global Step: 151500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:06:31,323-Speed 4399.25 samples/sec Loss 2.1754 Epoch: 9 Global Step: 151550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:06:42,151-Speed 4728.86 samples/sec Loss 2.1662 Epoch: 9 Global Step: 151600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:06:53,925-Speed 4348.65 samples/sec Loss 2.1905 Epoch: 9 Global Step: 151650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:07:04,970-Speed 4636.26 samples/sec Loss 2.1903 Epoch: 9 Global Step: 151700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:07:17,408-Speed 4116.43 samples/sec Loss 2.1633 Epoch: 9 Global Step: 151750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:07:29,103-Speed 4378.42 samples/sec Loss 2.1864 Epoch: 9 Global Step: 151800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:07:40,435-Speed 4518.14 samples/sec Loss 2.2124 Epoch: 9 Global Step: 151850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:07:51,482-Speed 4635.24 samples/sec Loss 2.1828 Epoch: 9 Global Step: 151900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:08:02,783-Speed 4530.66 samples/sec Loss 2.2052 Epoch: 9 Global Step: 151950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:08:15,060-Speed 
4170.65 samples/sec Loss 2.1894 Epoch: 9 Global Step: 152000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:08:38,976-[lfw][152000]XNorm: 22.785572 Training: 2021-03-18 02:08:38,976-[lfw][152000]Accuracy-Flip: 0.99717+-0.00279 Training: 2021-03-18 02:08:38,976-[lfw][152000]Accuracy-Highest: 0.99767 Training: 2021-03-18 02:09:06,413-[cfp_fp][152000]XNorm: 20.037515 Training: 2021-03-18 02:09:06,413-[cfp_fp][152000]Accuracy-Flip: 0.97314+-0.00987 Training: 2021-03-18 02:09:06,413-[cfp_fp][152000]Accuracy-Highest: 0.97571 Training: 2021-03-18 02:09:30,089-[agedb_30][152000]XNorm: 22.291316 Training: 2021-03-18 02:09:30,089-[agedb_30][152000]Accuracy-Flip: 0.97517+-0.00545 Training: 2021-03-18 02:09:30,091-[agedb_30][152000]Accuracy-Highest: 0.97517 Training: 2021-03-18 02:09:41,826-Speed 590.11 samples/sec Loss 2.1586 Epoch: 9 Global Step: 152050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:09:52,852-Speed 4643.56 samples/sec Loss 2.2304 Epoch: 9 Global Step: 152100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:10:04,282-Speed 4479.99 samples/sec Loss 2.1994 Epoch: 9 Global Step: 152150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:10:15,375-Speed 4615.77 samples/sec Loss 2.2167 Epoch: 9 Global Step: 152200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:10:26,356-Speed 4662.58 samples/sec Loss 2.2103 Epoch: 9 Global Step: 152250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:10:37,268-Speed 4692.57 samples/sec Loss 2.2275 Epoch: 9 Global Step: 152300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:10:49,516-Speed 4180.57 samples/sec Loss 2.1930 Epoch: 9 Global Step: 152350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:11:00,877-Speed 4506.89 samples/sec Loss 2.2319 Epoch: 9 Global Step: 152400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:11:12,028-Speed 4591.63 samples/sec Loss 2.2173 Epoch: 9 
Global Step: 152450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:11:23,088-Speed 4629.55 samples/sec Loss 2.2236 Epoch: 9 Global Step: 152500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:11:34,294-Speed 4569.35 samples/sec Loss 2.1934 Epoch: 9 Global Step: 152550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:11:45,429-Speed 4598.36 samples/sec Loss 2.2168 Epoch: 9 Global Step: 152600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:11:56,820-Speed 4494.81 samples/sec Loss 2.2152 Epoch: 9 Global Step: 152650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:12:07,966-Speed 4593.81 samples/sec Loss 2.2144 Epoch: 9 Global Step: 152700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:12:19,085-Speed 4605.28 samples/sec Loss 2.2317 Epoch: 9 Global Step: 152750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:12:30,365-Speed 4539.19 samples/sec Loss 2.2424 Epoch: 9 Global Step: 152800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:12:41,460-Speed 4614.85 samples/sec Loss 2.2234 Epoch: 9 Global Step: 152850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:12:52,824-Speed 4505.92 samples/sec Loss 2.2073 Epoch: 9 Global Step: 152900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:13:03,953-Speed 4600.78 samples/sec Loss 2.2076 Epoch: 9 Global Step: 152950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:13:15,209-Speed 4548.91 samples/sec Loss 2.2226 Epoch: 9 Global Step: 153000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:13:26,266-Speed 4630.93 samples/sec Loss 2.2530 Epoch: 9 Global Step: 153050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:13:37,533-Speed 4544.46 samples/sec Loss 2.2225 Epoch: 9 Global Step: 153100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:13:48,569-Speed 4639.57 samples/sec Loss 2.2466 Epoch: 
9 Global Step: 153150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:13:59,893-Speed 4521.89 samples/sec Loss 2.2498 Epoch: 9 Global Step: 153200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:14:10,992-Speed 4613.23 samples/sec Loss 2.2824 Epoch: 9 Global Step: 153250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:14:22,149-Speed 4589.35 samples/sec Loss 2.2273 Epoch: 9 Global Step: 153300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:14:33,458-Speed 4527.52 samples/sec Loss 2.2605 Epoch: 9 Global Step: 153350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:14:44,878-Speed 4483.84 samples/sec Loss 2.2226 Epoch: 9 Global Step: 153400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:14:56,179-Speed 4531.03 samples/sec Loss 2.2448 Epoch: 9 Global Step: 153450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:15:07,451-Speed 4542.74 samples/sec Loss 2.2451 Epoch: 9 Global Step: 153500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:15:18,780-Speed 4519.32 samples/sec Loss 2.2339 Epoch: 9 Global Step: 153550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:15:29,794-Speed 4649.00 samples/sec Loss 2.2277 Epoch: 9 Global Step: 153600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:15:40,991-Speed 4573.12 samples/sec Loss 2.2538 Epoch: 9 Global Step: 153650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:15:52,230-Speed 4555.66 samples/sec Loss 2.2547 Epoch: 9 Global Step: 153700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:16:03,292-Speed 4628.75 samples/sec Loss 2.2674 Epoch: 9 Global Step: 153750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:16:14,111-Speed 4732.81 samples/sec Loss 2.2288 Epoch: 9 Global Step: 153800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:16:25,529-Speed 4484.30 samples/sec Loss 2.2257 
Epoch: 9 Global Step: 153850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:16:36,690-Speed 4587.96 samples/sec Loss 2.2600 Epoch: 9 Global Step: 153900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:16:47,873-Speed 4578.59 samples/sec Loss 2.2477 Epoch: 9 Global Step: 153950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:16:59,124-Speed 4551.04 samples/sec Loss 2.2240 Epoch: 9 Global Step: 154000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:17:23,634-[lfw][154000]XNorm: 22.697383 Training: 2021-03-18 02:17:23,634-[lfw][154000]Accuracy-Flip: 0.99667+-0.00258 Training: 2021-03-18 02:17:23,634-[lfw][154000]Accuracy-Highest: 0.99767 Training: 2021-03-18 02:17:51,447-[cfp_fp][154000]XNorm: 20.032131 Training: 2021-03-18 02:17:51,447-[cfp_fp][154000]Accuracy-Flip: 0.96986+-0.00911 Training: 2021-03-18 02:17:51,447-[cfp_fp][154000]Accuracy-Highest: 0.97571 Training: 2021-03-18 02:18:15,231-[agedb_30][154000]XNorm: 22.237084 Training: 2021-03-18 02:18:15,232-[agedb_30][154000]Accuracy-Flip: 0.97300+-0.00702 Training: 2021-03-18 02:18:15,232-[agedb_30][154000]Accuracy-Highest: 0.97517 Training: 2021-03-18 02:18:27,206-Speed 581.28 samples/sec Loss 2.2926 Epoch: 9 Global Step: 154050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:18:38,274-Speed 4626.41 samples/sec Loss 2.2423 Epoch: 9 Global Step: 154100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:18:49,112-Speed 4724.64 samples/sec Loss 2.2513 Epoch: 9 Global Step: 154150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:19:00,388-Speed 4540.88 samples/sec Loss 2.2271 Epoch: 9 Global Step: 154200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:19:11,699-Speed 4526.81 samples/sec Loss 2.2631 Epoch: 9 Global Step: 154250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:19:23,680-Speed 4273.58 samples/sec Loss 2.2362 Epoch: 9 Global Step: 154300 Fp16 Grad 
Scale: 16384 Required: 13 hours Training: 2021-03-18 02:19:34,972-Speed 4534.48 samples/sec Loss 2.2572 Epoch: 9 Global Step: 154350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:19:46,079-Speed 4610.01 samples/sec Loss 2.2593 Epoch: 9 Global Step: 154400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:19:57,432-Speed 4510.39 samples/sec Loss 2.2531 Epoch: 9 Global Step: 154450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:20:08,717-Speed 4537.27 samples/sec Loss 2.2632 Epoch: 9 Global Step: 154500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:20:21,363-Speed 4049.07 samples/sec Loss 2.2639 Epoch: 9 Global Step: 154550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:20:32,463-Speed 4612.83 samples/sec Loss 2.2715 Epoch: 9 Global Step: 154600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:20:44,983-Speed 4089.86 samples/sec Loss 2.2609 Epoch: 9 Global Step: 154650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:20:56,255-Speed 4542.35 samples/sec Loss 2.2882 Epoch: 9 Global Step: 154700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:21:08,413-Speed 4211.45 samples/sec Loss 2.2536 Epoch: 9 Global Step: 154750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:21:19,382-Speed 4668.13 samples/sec Loss 2.2787 Epoch: 9 Global Step: 154800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:21:30,955-Speed 4424.31 samples/sec Loss 2.2768 Epoch: 9 Global Step: 154850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:21:42,031-Speed 4622.94 samples/sec Loss 2.2826 Epoch: 9 Global Step: 154900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:21:53,072-Speed 4637.35 samples/sec Loss 2.2846 Epoch: 9 Global Step: 154950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:22:04,515-Speed 4474.82 samples/sec Loss 2.2604 Epoch: 9 Global Step: 155000 Fp16 
Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:22:15,685-Speed 4584.00 samples/sec Loss 2.2803 Epoch: 9 Global Step: 155050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:22:27,446-Speed 4353.40 samples/sec Loss 2.2670 Epoch: 9 Global Step: 155100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:22:38,678-Speed 4558.96 samples/sec Loss 2.2938 Epoch: 9 Global Step: 155150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:22:50,064-Speed 4496.81 samples/sec Loss 2.2643 Epoch: 9 Global Step: 155200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:23:01,071-Speed 4652.07 samples/sec Loss 2.2622 Epoch: 9 Global Step: 155250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:23:12,158-Speed 4617.96 samples/sec Loss 2.2842 Epoch: 9 Global Step: 155300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:23:23,550-Speed 4494.70 samples/sec Loss 2.2959 Epoch: 9 Global Step: 155350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:23:34,627-Speed 4622.75 samples/sec Loss 2.2754 Epoch: 9 Global Step: 155400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:23:45,554-Speed 4685.61 samples/sec Loss 2.2943 Epoch: 9 Global Step: 155450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:23:56,821-Speed 4544.85 samples/sec Loss 2.2860 Epoch: 9 Global Step: 155500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:24:08,139-Speed 4523.89 samples/sec Loss 2.2635 Epoch: 9 Global Step: 155550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:24:19,611-Speed 4463.27 samples/sec Loss 2.2629 Epoch: 9 Global Step: 155600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:24:30,530-Speed 4689.89 samples/sec Loss 2.2807 Epoch: 9 Global Step: 155650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:24:41,893-Speed 4506.04 samples/sec Loss 2.2865 Epoch: 9 Global Step: 155700 
Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:24:53,016-Speed 4603.19 samples/sec Loss 2.2901 Epoch: 9 Global Step: 155750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:25:03,992-Speed 4665.07 samples/sec Loss 2.2862 Epoch: 9 Global Step: 155800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:25:15,229-Speed 4556.79 samples/sec Loss 2.2966 Epoch: 9 Global Step: 155850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:25:26,280-Speed 4633.51 samples/sec Loss 2.2737 Epoch: 9 Global Step: 155900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:25:37,437-Speed 4589.07 samples/sec Loss 2.2839 Epoch: 9 Global Step: 155950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:25:48,885-Speed 4472.93 samples/sec Loss 2.2832 Epoch: 9 Global Step: 156000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:26:13,203-[lfw][156000]XNorm: 23.575960 Training: 2021-03-18 02:26:13,203-[lfw][156000]Accuracy-Flip: 0.99667+-0.00307 Training: 2021-03-18 02:26:13,203-[lfw][156000]Accuracy-Highest: 0.99767 Training: 2021-03-18 02:26:41,253-[cfp_fp][156000]XNorm: 20.663555 Training: 2021-03-18 02:26:41,253-[cfp_fp][156000]Accuracy-Flip: 0.97114+-0.00912 Training: 2021-03-18 02:26:41,253-[cfp_fp][156000]Accuracy-Highest: 0.97571 Training: 2021-03-18 02:27:05,201-[agedb_30][156000]XNorm: 23.124797 Training: 2021-03-18 02:27:05,202-[agedb_30][156000]Accuracy-Flip: 0.97450+-0.00597 Training: 2021-03-18 02:27:05,202-[agedb_30][156000]Accuracy-Highest: 0.97517 Training: 2021-03-18 02:27:16,314-Speed 585.62 samples/sec Loss 2.2875 Epoch: 9 Global Step: 156050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:27:27,402-Speed 4617.74 samples/sec Loss 2.2902 Epoch: 9 Global Step: 156100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:27:38,727-Speed 4521.51 samples/sec Loss 2.3046 Epoch: 9 Global Step: 156150 Fp16 Grad Scale: 16384 Required: 13 hours 
Training: 2021-03-18 02:27:49,707-Speed 4663.41 samples/sec Loss 2.2864 Epoch: 9 Global Step: 156200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:28:01,007-Speed 4530.95 samples/sec Loss 2.2989 Epoch: 9 Global Step: 156250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:28:11,979-Speed 4666.76 samples/sec Loss 2.2828 Epoch: 9 Global Step: 156300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:28:23,128-Speed 4592.62 samples/sec Loss 2.3125 Epoch: 9 Global Step: 156350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:28:34,307-Speed 4580.58 samples/sec Loss 2.3050 Epoch: 9 Global Step: 156400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:28:45,336-Speed 4642.62 samples/sec Loss 2.2769 Epoch: 9 Global Step: 156450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:28:56,419-Speed 4619.63 samples/sec Loss 2.2736 Epoch: 9 Global Step: 156500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:29:07,347-Speed 4685.44 samples/sec Loss 2.3053 Epoch: 9 Global Step: 156550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:29:18,728-Speed 4499.22 samples/sec Loss 2.3203 Epoch: 9 Global Step: 156600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:29:29,907-Speed 4580.39 samples/sec Loss 2.2799 Epoch: 9 Global Step: 156650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:29:41,392-Speed 4458.19 samples/sec Loss 2.3254 Epoch: 9 Global Step: 156700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:29:52,637-Speed 4553.56 samples/sec Loss 2.3049 Epoch: 9 Global Step: 156750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:30:04,127-Speed 4456.25 samples/sec Loss 2.2888 Epoch: 9 Global Step: 156800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:30:16,142-Speed 4261.42 samples/sec Loss 2.3010 Epoch: 9 Global Step: 156850 Fp16 Grad Scale: 16384 Required: 13 
hours Training: 2021-03-18 02:30:27,191-Speed 4634.17 samples/sec Loss 2.3154 Epoch: 9 Global Step: 156900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:30:38,260-Speed 4626.12 samples/sec Loss 2.2823 Epoch: 9 Global Step: 156950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:30:49,409-Speed 4592.54 samples/sec Loss 2.3009 Epoch: 9 Global Step: 157000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:31:01,217-Speed 4336.54 samples/sec Loss 2.2696 Epoch: 9 Global Step: 157050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:31:12,486-Speed 4543.46 samples/sec Loss 2.3289 Epoch: 9 Global Step: 157100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:31:23,806-Speed 4523.35 samples/sec Loss 2.3466 Epoch: 9 Global Step: 157150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:31:35,167-Speed 4506.98 samples/sec Loss 2.3130 Epoch: 9 Global Step: 157200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:31:47,041-Speed 4312.21 samples/sec Loss 2.2987 Epoch: 9 Global Step: 157250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:31:58,421-Speed 4499.20 samples/sec Loss 2.3132 Epoch: 9 Global Step: 157300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:32:09,570-Speed 4592.79 samples/sec Loss 2.3000 Epoch: 9 Global Step: 157350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:32:21,809-Speed 4183.54 samples/sec Loss 2.2779 Epoch: 9 Global Step: 157400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:32:32,773-Speed 4670.46 samples/sec Loss 2.3081 Epoch: 9 Global Step: 157450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:32:44,316-Speed 4435.83 samples/sec Loss 2.3036 Epoch: 9 Global Step: 157500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:32:57,046-Speed 4022.30 samples/sec Loss 2.2895 Epoch: 9 Global Step: 157550 Fp16 Grad Scale: 16384 Required: 
13 hours Training: 2021-03-18 02:33:08,927-Speed 4309.62 samples/sec Loss 2.3336 Epoch: 9 Global Step: 157600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:33:19,995-Speed 4626.27 samples/sec Loss 2.3179 Epoch: 9 Global Step: 157650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:33:31,277-Speed 4538.69 samples/sec Loss 2.2986 Epoch: 9 Global Step: 157700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:33:42,479-Speed 4571.06 samples/sec Loss 2.3181 Epoch: 9 Global Step: 157750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:33:53,589-Speed 4608.46 samples/sec Loss 2.3399 Epoch: 9 Global Step: 157800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:34:04,656-Speed 4626.70 samples/sec Loss 2.3251 Epoch: 9 Global Step: 157850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:34:15,992-Speed 4516.92 samples/sec Loss 2.3200 Epoch: 9 Global Step: 157900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:34:27,389-Speed 4493.06 samples/sec Loss 2.3137 Epoch: 9 Global Step: 157950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:34:39,242-Speed 4319.68 samples/sec Loss 2.2974 Epoch: 9 Global Step: 158000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:35:03,989-[lfw][158000]XNorm: 23.740186 Training: 2021-03-18 02:35:03,989-[lfw][158000]Accuracy-Flip: 0.99667+-0.00279 Training: 2021-03-18 02:35:03,989-[lfw][158000]Accuracy-Highest: 0.99767 Training: 2021-03-18 02:35:31,780-[cfp_fp][158000]XNorm: 20.403716 Training: 2021-03-18 02:35:31,780-[cfp_fp][158000]Accuracy-Flip: 0.97100+-0.00963 Training: 2021-03-18 02:35:31,780-[cfp_fp][158000]Accuracy-Highest: 0.97571 Training: 2021-03-18 02:35:55,697-[agedb_30][158000]XNorm: 23.301865 Training: 2021-03-18 02:35:55,698-[agedb_30][158000]Accuracy-Flip: 0.97517+-0.00643 Training: 2021-03-18 02:35:55,698-[agedb_30][158000]Accuracy-Highest: 0.97517 Training: 2021-03-18 
02:36:06,812-Speed 584.68 samples/sec Loss 2.3209 Epoch: 9 Global Step: 158050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:36:17,863-Speed 4633.44 samples/sec Loss 2.2910 Epoch: 9 Global Step: 158100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:36:29,205-Speed 4514.30 samples/sec Loss 2.2939 Epoch: 9 Global Step: 158150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:36:40,268-Speed 4628.36 samples/sec Loss 2.3405 Epoch: 9 Global Step: 158200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:36:51,435-Speed 4585.27 samples/sec Loss 2.3031 Epoch: 9 Global Step: 158250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:37:02,535-Speed 4612.85 samples/sec Loss 2.3293 Epoch: 9 Global Step: 158300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:37:13,843-Speed 4527.98 samples/sec Loss 2.3387 Epoch: 9 Global Step: 158350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:37:24,881-Speed 4638.95 samples/sec Loss 2.3212 Epoch: 9 Global Step: 158400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:37:35,905-Speed 4644.72 samples/sec Loss 2.3204 Epoch: 9 Global Step: 158450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:37:47,160-Speed 4549.28 samples/sec Loss 2.3365 Epoch: 9 Global Step: 158500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:37:58,350-Speed 4575.89 samples/sec Loss 2.3278 Epoch: 9 Global Step: 158550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:38:09,315-Speed 4669.64 samples/sec Loss 2.3008 Epoch: 9 Global Step: 158600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:38:20,183-Speed 4711.48 samples/sec Loss 2.2892 Epoch: 9 Global Step: 158650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:38:31,199-Speed 4648.13 samples/sec Loss 2.3285 Epoch: 9 Global Step: 158700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 
2021-03-18 02:38:42,413-Speed 4566.02 samples/sec Loss 2.3176 Epoch: 9 Global Step: 158750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:38:53,666-Speed 4550.23 samples/sec Loss 2.3331 Epoch: 9 Global Step: 158800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:39:04,621-Speed 4674.05 samples/sec Loss 2.3166 Epoch: 9 Global Step: 158850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-18 02:39:16,147-Speed 4442.18 samples/sec Loss 2.3385 Epoch: 9 Global Step: 158900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:39:27,245-Speed 4613.90 samples/sec Loss 2.3100 Epoch: 9 Global Step: 158950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:39:38,621-Speed 4501.08 samples/sec Loss 2.3264 Epoch: 9 Global Step: 159000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:39:49,760-Speed 4596.46 samples/sec Loss 2.3110 Epoch: 9 Global Step: 159050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:40:01,157-Speed 4492.76 samples/sec Loss 2.3279 Epoch: 9 Global Step: 159100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:40:12,180-Speed 4645.41 samples/sec Loss 2.3308 Epoch: 9 Global Step: 159150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:40:23,374-Speed 4574.16 samples/sec Loss 2.3189 Epoch: 9 Global Step: 159200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:40:34,282-Speed 4694.35 samples/sec Loss 2.3342 Epoch: 9 Global Step: 159250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:40:45,425-Speed 4594.92 samples/sec Loss 2.3439 Epoch: 9 Global Step: 159300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:40:56,596-Speed 4583.58 samples/sec Loss 2.3594 Epoch: 9 Global Step: 159350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:41:07,751-Speed 4590.07 samples/sec Loss 2.3328 Epoch: 9 Global Step: 159400 Fp16 Grad Scale: 16384 Required: 12 hours 
Training: 2021-03-18 02:41:18,804-Speed 4632.91 samples/sec Loss 2.3199 Epoch: 9 Global Step: 159450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:41:29,884-Speed 4620.93 samples/sec Loss 2.3232 Epoch: 9 Global Step: 159500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:41:40,986-Speed 4612.28 samples/sec Loss 2.3193 Epoch: 9 Global Step: 159550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:41:52,224-Speed 4556.29 samples/sec Loss 2.3640 Epoch: 9 Global Step: 159600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:42:03,686-Speed 4467.09 samples/sec Loss 2.3242 Epoch: 9 Global Step: 159650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:42:14,609-Speed 4687.76 samples/sec Loss 2.3293 Epoch: 9 Global Step: 159700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:42:26,704-Speed 4233.40 samples/sec Loss 2.3203 Epoch: 9 Global Step: 159750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:42:38,553-Speed 4321.33 samples/sec Loss 2.3238 Epoch: 9 Global Step: 159800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:42:49,738-Speed 4577.63 samples/sec Loss 2.3175 Epoch: 9 Global Step: 159850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:43:00,697-Speed 4672.24 samples/sec Loss 2.3542 Epoch: 9 Global Step: 159900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:43:11,973-Speed 4541.05 samples/sec Loss 2.3259 Epoch: 9 Global Step: 159950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:43:23,248-Speed 4541.33 samples/sec Loss 2.3075 Epoch: 9 Global Step: 160000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:43:47,369-[lfw][160000]XNorm: 23.637213 Training: 2021-03-18 02:43:47,369-[lfw][160000]Accuracy-Flip: 0.99733+-0.00271 Training: 2021-03-18 02:43:47,369-[lfw][160000]Accuracy-Highest: 0.99767 Training: 2021-03-18 02:44:14,948-[cfp_fp][160000]XNorm: 20.356840 
Training: 2021-03-18 02:44:14,948-[cfp_fp][160000]Accuracy-Flip: 0.97500+-0.00834 Training: 2021-03-18 02:44:14,948-[cfp_fp][160000]Accuracy-Highest: 0.97571 Training: 2021-03-18 02:44:38,833-[agedb_30][160000]XNorm: 23.017902 Training: 2021-03-18 02:44:38,833-[agedb_30][160000]Accuracy-Flip: 0.97317+-0.00673 Training: 2021-03-18 02:44:38,833-[agedb_30][160000]Accuracy-Highest: 0.97517 Training: 2021-03-18 02:44:50,676-Speed 585.63 samples/sec Loss 2.3364 Epoch: 9 Global Step: 160050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:45:01,762-Speed 4618.53 samples/sec Loss 2.3537 Epoch: 9 Global Step: 160100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:45:13,191-Speed 4480.30 samples/sec Loss 2.3200 Epoch: 9 Global Step: 160150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:45:25,105-Speed 4297.59 samples/sec Loss 2.3561 Epoch: 9 Global Step: 160200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:45:35,997-Speed 4701.15 samples/sec Loss 2.3481 Epoch: 9 Global Step: 160250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:45:46,989-Speed 4658.09 samples/sec Loss 2.3264 Epoch: 9 Global Step: 160300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:45:58,993-Speed 4265.94 samples/sec Loss 2.3526 Epoch: 9 Global Step: 160350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:46:10,805-Speed 4334.70 samples/sec Loss 2.3444 Epoch: 9 Global Step: 160400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:46:22,935-Speed 4221.23 samples/sec Loss 2.3499 Epoch: 9 Global Step: 160450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:46:34,064-Speed 4600.74 samples/sec Loss 2.3218 Epoch: 9 Global Step: 160500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:46:45,332-Speed 4544.08 samples/sec Loss 2.3389 Epoch: 9 Global Step: 160550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 
02:46:56,411-Speed 4621.72 samples/sec Loss 2.3137 Epoch: 9 Global Step: 160600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:47:07,517-Speed 4610.50 samples/sec Loss 2.3696 Epoch: 9 Global Step: 160650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:47:18,664-Speed 4593.41 samples/sec Loss 2.3413 Epoch: 9 Global Step: 160700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:47:29,717-Speed 4632.43 samples/sec Loss 2.3574 Epoch: 9 Global Step: 160750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:47:41,150-Speed 4478.71 samples/sec Loss 2.3333 Epoch: 9 Global Step: 160800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:47:53,000-Speed 4320.91 samples/sec Loss 2.2883 Epoch: 9 Global Step: 160850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:48:04,280-Speed 4539.29 samples/sec Loss 2.3529 Epoch: 9 Global Step: 160900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:48:15,561-Speed 4538.84 samples/sec Loss 2.3390 Epoch: 9 Global Step: 160950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:48:26,966-Speed 4489.65 samples/sec Loss 2.3249 Epoch: 9 Global Step: 161000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:48:38,179-Speed 4566.40 samples/sec Loss 2.3519 Epoch: 9 Global Step: 161050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:48:49,670-Speed 4456.16 samples/sec Loss 2.3271 Epoch: 9 Global Step: 161100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:49:00,787-Speed 4605.78 samples/sec Loss 2.3521 Epoch: 9 Global Step: 161150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:49:12,183-Speed 4493.23 samples/sec Loss 2.3585 Epoch: 9 Global Step: 161200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:49:23,143-Speed 4672.15 samples/sec Loss 2.3216 Epoch: 9 Global Step: 161250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 
2021-03-18 02:49:34,280-Speed 4597.37 samples/sec Loss 2.3648 Epoch: 9 Global Step: 161300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:49:45,471-Speed 4575.33 samples/sec Loss 2.3507 Epoch: 9 Global Step: 161350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:49:56,675-Speed 4570.38 samples/sec Loss 2.3358 Epoch: 9 Global Step: 161400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:50:07,842-Speed 4585.13 samples/sec Loss 2.3460 Epoch: 9 Global Step: 161450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:50:19,175-Speed 4518.25 samples/sec Loss 2.3363 Epoch: 9 Global Step: 161500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:50:30,175-Speed 4654.65 samples/sec Loss 2.3492 Epoch: 9 Global Step: 161550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:50:41,302-Speed 4602.00 samples/sec Loss 2.3526 Epoch: 9 Global Step: 161600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:50:52,523-Speed 4562.88 samples/sec Loss 2.3614 Epoch: 9 Global Step: 161650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:51:03,577-Speed 4632.59 samples/sec Loss 2.3326 Epoch: 9 Global Step: 161700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:51:15,074-Speed 4453.56 samples/sec Loss 2.3464 Epoch: 9 Global Step: 161750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:51:26,344-Speed 4543.43 samples/sec Loss 2.3315 Epoch: 9 Global Step: 161800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:51:37,860-Speed 4446.19 samples/sec Loss 2.3897 Epoch: 9 Global Step: 161850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:51:49,302-Speed 4474.96 samples/sec Loss 2.3387 Epoch: 9 Global Step: 161900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:52:00,549-Speed 4552.56 samples/sec Loss 2.3650 Epoch: 9 Global Step: 161950 Fp16 Grad Scale: 16384 Required: 12 hours 
Training: 2021-03-18 02:52:11,807-Speed 4548.26 samples/sec Loss 2.3718 Epoch: 9 Global Step: 162000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:52:36,233-[lfw][162000]XNorm: 22.914941 Training: 2021-03-18 02:52:36,234-[lfw][162000]Accuracy-Flip: 0.99717+-0.00269 Training: 2021-03-18 02:52:36,234-[lfw][162000]Accuracy-Highest: 0.99767 Training: 2021-03-18 02:53:03,983-[cfp_fp][162000]XNorm: 19.811574 Training: 2021-03-18 02:53:03,983-[cfp_fp][162000]Accuracy-Flip: 0.96886+-0.00714 Training: 2021-03-18 02:53:03,984-[cfp_fp][162000]Accuracy-Highest: 0.97571 Training: 2021-03-18 02:53:27,950-[agedb_30][162000]XNorm: 22.200286 Training: 2021-03-18 02:53:27,950-[agedb_30][162000]Accuracy-Flip: 0.97417+-0.00817 Training: 2021-03-18 02:53:27,950-[agedb_30][162000]Accuracy-Highest: 0.97517 Training: 2021-03-18 02:53:39,292-Speed 585.25 samples/sec Loss 2.3454 Epoch: 9 Global Step: 162050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:53:50,336-Speed 4636.48 samples/sec Loss 2.3573 Epoch: 9 Global Step: 162100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:54:01,949-Speed 4409.23 samples/sec Loss 2.3238 Epoch: 9 Global Step: 162150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:54:12,958-Speed 4650.64 samples/sec Loss 2.3454 Epoch: 9 Global Step: 162200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:54:24,346-Speed 4496.45 samples/sec Loss 2.3799 Epoch: 9 Global Step: 162250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:54:35,197-Speed 4718.78 samples/sec Loss 2.3664 Epoch: 9 Global Step: 162300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:54:46,374-Speed 4581.29 samples/sec Loss 2.3351 Epoch: 9 Global Step: 162350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:54:57,696-Speed 4522.49 samples/sec Loss 2.3223 Epoch: 9 Global Step: 162400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:55:08,865-Speed 
4584.60 samples/sec Loss 2.3513 Epoch: 9 Global Step: 162450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:55:20,039-Speed 4582.21 samples/sec Loss 2.3750 Epoch: 9 Global Step: 162500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:55:31,170-Speed 4599.82 samples/sec Loss 2.3565 Epoch: 9 Global Step: 162550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:55:43,063-Speed 4305.56 samples/sec Loss 2.3530 Epoch: 9 Global Step: 162600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:55:54,932-Speed 4314.06 samples/sec Loss 2.3391 Epoch: 9 Global Step: 162650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:56:06,254-Speed 4522.39 samples/sec Loss 2.3841 Epoch: 9 Global Step: 162700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:56:17,585-Speed 4518.98 samples/sec Loss 2.3836 Epoch: 9 Global Step: 162750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:56:28,912-Speed 4520.38 samples/sec Loss 2.3334 Epoch: 9 Global Step: 162800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:56:40,061-Speed 4592.50 samples/sec Loss 2.3830 Epoch: 9 Global Step: 162850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:56:52,202-Speed 4217.47 samples/sec Loss 2.3712 Epoch: 9 Global Step: 162900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:57:03,445-Speed 4553.99 samples/sec Loss 2.3472 Epoch: 9 Global Step: 162950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:57:14,948-Speed 4451.51 samples/sec Loss 2.3619 Epoch: 9 Global Step: 163000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:57:25,866-Speed 4689.58 samples/sec Loss 2.3555 Epoch: 9 Global Step: 163050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:57:37,945-Speed 4239.13 samples/sec Loss 2.3657 Epoch: 9 Global Step: 163100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 
02:57:49,020-Speed 4623.17 samples/sec Loss 2.3609 Epoch: 9 Global Step: 163150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:58:01,398-Speed 4136.70 samples/sec Loss 2.3501 Epoch: 9 Global Step: 163200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:58:13,118-Speed 4368.86 samples/sec Loss 2.3979 Epoch: 9 Global Step: 163250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:58:24,960-Speed 4324.08 samples/sec Loss 2.3486 Epoch: 9 Global Step: 163300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:58:35,942-Speed 4662.82 samples/sec Loss 2.3822 Epoch: 9 Global Step: 163350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:58:47,437-Speed 4454.21 samples/sec Loss 2.3567 Epoch: 9 Global Step: 163400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:58:58,585-Speed 4593.00 samples/sec Loss 2.3385 Epoch: 9 Global Step: 163450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:59:10,209-Speed 4405.23 samples/sec Loss 2.3647 Epoch: 9 Global Step: 163500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:59:21,502-Speed 4534.01 samples/sec Loss 2.3837 Epoch: 9 Global Step: 163550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:59:32,600-Speed 4613.79 samples/sec Loss 2.3300 Epoch: 9 Global Step: 163600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:59:43,901-Speed 4530.87 samples/sec Loss 2.3388 Epoch: 9 Global Step: 163650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 02:59:55,985-Speed 4237.21 samples/sec Loss 2.3910 Epoch: 9 Global Step: 163700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:00:07,308-Speed 4521.97 samples/sec Loss 2.3246 Epoch: 9 Global Step: 163750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:00:18,356-Speed 4634.72 samples/sec Loss 2.3440 Epoch: 9 Global Step: 163800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 
2021-03-18 03:00:29,635-Speed 4539.38 samples/sec Loss 2.3469 Epoch: 9 Global Step: 163850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:00:40,978-Speed 4514.36 samples/sec Loss 2.3755 Epoch: 9 Global Step: 163900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:00:52,376-Speed 4492.09 samples/sec Loss 2.3453 Epoch: 9 Global Step: 163950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:01:03,534-Speed 4589.01 samples/sec Loss 2.3911 Epoch: 9 Global Step: 164000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:01:28,223-[lfw][164000]XNorm: 23.702650 Training: 2021-03-18 03:01:28,223-[lfw][164000]Accuracy-Flip: 0.99667+-0.00365 Training: 2021-03-18 03:01:28,223-[lfw][164000]Accuracy-Highest: 0.99767 Training: 2021-03-18 03:01:56,080-[cfp_fp][164000]XNorm: 20.475234 Training: 2021-03-18 03:01:56,080-[cfp_fp][164000]Accuracy-Flip: 0.97543+-0.00763 Training: 2021-03-18 03:01:56,080-[cfp_fp][164000]Accuracy-Highest: 0.97571 Training: 2021-03-18 03:02:19,954-[agedb_30][164000]XNorm: 23.168913 Training: 2021-03-18 03:02:19,954-[agedb_30][164000]Accuracy-Flip: 0.97400+-0.00597 Training: 2021-03-18 03:02:19,954-[agedb_30][164000]Accuracy-Highest: 0.97517 Training: 2021-03-18 03:02:31,049-Speed 585.05 samples/sec Loss 2.3893 Epoch: 9 Global Step: 164050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:02:42,352-Speed 4530.01 samples/sec Loss 2.3629 Epoch: 9 Global Step: 164100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:02:53,427-Speed 4623.46 samples/sec Loss 2.3452 Epoch: 9 Global Step: 164150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:03:04,433-Speed 4652.27 samples/sec Loss 2.3835 Epoch: 9 Global Step: 164200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:03:15,811-Speed 4500.32 samples/sec Loss 2.3719 Epoch: 9 Global Step: 164250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:03:27,134-Speed 4522.01 
samples/sec Loss 2.3718 Epoch: 9 Global Step: 164300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:03:38,392-Speed 4547.92 samples/sec Loss 2.3931 Epoch: 9 Global Step: 164350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:03:49,363-Speed 4667.31 samples/sec Loss 2.3745 Epoch: 9 Global Step: 164400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:04:00,572-Speed 4567.95 samples/sec Loss 2.4062 Epoch: 9 Global Step: 164450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:04:11,742-Speed 4584.23 samples/sec Loss 2.3698 Epoch: 9 Global Step: 164500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:04:23,034-Speed 4534.63 samples/sec Loss 2.3593 Epoch: 9 Global Step: 164550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:04:34,219-Speed 4577.50 samples/sec Loss 2.3776 Epoch: 9 Global Step: 164600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:04:45,456-Speed 4556.50 samples/sec Loss 2.3499 Epoch: 9 Global Step: 164650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:04:56,680-Speed 4562.13 samples/sec Loss 2.3792 Epoch: 9 Global Step: 164700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:05:08,054-Speed 4501.69 samples/sec Loss 2.3587 Epoch: 9 Global Step: 164750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:05:19,278-Speed 4561.82 samples/sec Loss 2.3959 Epoch: 9 Global Step: 164800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:05:30,580-Speed 4530.47 samples/sec Loss 2.3599 Epoch: 9 Global Step: 164850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:05:41,750-Speed 4584.26 samples/sec Loss 2.3698 Epoch: 9 Global Step: 164900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:05:52,958-Speed 4568.43 samples/sec Loss 2.3521 Epoch: 9 Global Step: 164950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:06:04,160-Speed 
4570.97 samples/sec Loss 2.3423 Epoch: 9 Global Step: 165000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:06:15,415-Speed 4549.56 samples/sec Loss 2.3586 Epoch: 9 Global Step: 165050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:06:26,578-Speed 4586.64 samples/sec Loss 2.3903 Epoch: 9 Global Step: 165100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:06:37,693-Speed 4606.92 samples/sec Loss 2.3557 Epoch: 9 Global Step: 165150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:06:48,952-Speed 4547.58 samples/sec Loss 2.3803 Epoch: 9 Global Step: 165200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:07:00,198-Speed 4552.92 samples/sec Loss 2.3912 Epoch: 9 Global Step: 165250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:07:11,473-Speed 4541.40 samples/sec Loss 2.3546 Epoch: 9 Global Step: 165300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:07:23,317-Speed 4323.05 samples/sec Loss 2.3747 Epoch: 9 Global Step: 165350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:07:34,566-Speed 4551.82 samples/sec Loss 2.3791 Epoch: 9 Global Step: 165400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:07:46,623-Speed 4246.59 samples/sec Loss 2.3655 Epoch: 9 Global Step: 165450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:07:57,785-Speed 4587.44 samples/sec Loss 2.3548 Epoch: 9 Global Step: 165500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:08:08,582-Speed 4742.26 samples/sec Loss 2.3672 Epoch: 9 Global Step: 165550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:08:20,111-Speed 4441.17 samples/sec Loss 2.3506 Epoch: 9 Global Step: 165600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:08:31,509-Speed 4492.31 samples/sec Loss 2.3957 Epoch: 9 Global Step: 165650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 
03:08:42,705-Speed 4573.47 samples/sec Loss 2.3605 Epoch: 9 Global Step: 165700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:08:54,398-Speed 4379.04 samples/sec Loss 2.3525 Epoch: 9 Global Step: 165750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:09:05,511-Speed 4607.38 samples/sec Loss 2.3528 Epoch: 9 Global Step: 165800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:09:16,641-Speed 4600.55 samples/sec Loss 2.3860 Epoch: 9 Global Step: 165850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:09:27,631-Speed 4658.93 samples/sec Loss 2.3798 Epoch: 9 Global Step: 165900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:09:39,777-Speed 4215.64 samples/sec Loss 2.4078 Epoch: 9 Global Step: 165950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:09:50,737-Speed 4672.01 samples/sec Loss 2.3690 Epoch: 9 Global Step: 166000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:10:14,913-[lfw][166000]XNorm: 23.163415 Training: 2021-03-18 03:10:14,913-[lfw][166000]Accuracy-Flip: 0.99750+-0.00300 Training: 2021-03-18 03:10:14,913-[lfw][166000]Accuracy-Highest: 0.99767 Training: 2021-03-18 03:10:42,373-[cfp_fp][166000]XNorm: 19.972478 Training: 2021-03-18 03:10:42,373-[cfp_fp][166000]Accuracy-Flip: 0.97329+-0.00811 Training: 2021-03-18 03:10:42,373-[cfp_fp][166000]Accuracy-Highest: 0.97571 Training: 2021-03-18 03:11:06,292-[agedb_30][166000]XNorm: 22.428284 Training: 2021-03-18 03:11:06,292-[agedb_30][166000]Accuracy-Flip: 0.97550+-0.00671 Training: 2021-03-18 03:11:06,292-[agedb_30][166000]Accuracy-Highest: 0.97550 Training: 2021-03-18 03:11:18,172-Speed 585.58 samples/sec Loss 2.3671 Epoch: 9 Global Step: 166050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:11:29,974-Speed 4338.61 samples/sec Loss 2.3592 Epoch: 9 Global Step: 166100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:11:41,010-Speed 4639.39 samples/sec 
Loss 2.3644 Epoch: 9 Global Step: 166150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:11:52,789-Speed 4347.10 samples/sec Loss 2.3926 Epoch: 9 Global Step: 166200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:12:04,192-Speed 4490.32 samples/sec Loss 2.3810 Epoch: 9 Global Step: 166250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:12:15,019-Speed 4729.18 samples/sec Loss 2.3794 Epoch: 9 Global Step: 166300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:12:26,499-Speed 4460.14 samples/sec Loss 2.3862 Epoch: 9 Global Step: 166350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:12:37,616-Speed 4605.81 samples/sec Loss 2.3720 Epoch: 9 Global Step: 166400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:12:48,890-Speed 4541.99 samples/sec Loss 2.4021 Epoch: 9 Global Step: 166450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:13:00,777-Speed 4307.58 samples/sec Loss 2.3760 Epoch: 9 Global Step: 166500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:13:12,052-Speed 4541.17 samples/sec Loss 2.3723 Epoch: 9 Global Step: 166550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:13:23,175-Speed 4603.73 samples/sec Loss 2.3618 Epoch: 9 Global Step: 166600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:13:34,601-Speed 4481.25 samples/sec Loss 2.3706 Epoch: 9 Global Step: 166650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:13:45,882-Speed 4538.62 samples/sec Loss 2.3646 Epoch: 9 Global Step: 166700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:13:57,411-Speed 4441.44 samples/sec Loss 2.3632 Epoch: 9 Global Step: 166750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:14:08,921-Speed 4448.78 samples/sec Loss 2.3820 Epoch: 9 Global Step: 166800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:14:20,152-Speed 4559.10 
samples/sec Loss 2.3445 Epoch: 9 Global Step: 166850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:14:31,525-Speed 4501.77 samples/sec Loss 2.3868 Epoch: 9 Global Step: 166900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:14:55,793-Speed 2109.88 samples/sec Loss 2.0854 Epoch: 10 Global Step: 166950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:15:07,781-Speed 4271.25 samples/sec Loss 2.0380 Epoch: 10 Global Step: 167000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:15:19,171-Speed 4495.25 samples/sec Loss 2.0320 Epoch: 10 Global Step: 167050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:15:30,180-Speed 4651.43 samples/sec Loss 2.0557 Epoch: 10 Global Step: 167100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:15:41,418-Speed 4556.34 samples/sec Loss 2.0403 Epoch: 10 Global Step: 167150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:15:52,468-Speed 4633.86 samples/sec Loss 2.0441 Epoch: 10 Global Step: 167200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:16:03,783-Speed 4524.95 samples/sec Loss 2.0207 Epoch: 10 Global Step: 167250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:16:14,810-Speed 4643.54 samples/sec Loss 2.0592 Epoch: 10 Global Step: 167300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:16:25,840-Speed 4642.04 samples/sec Loss 2.0443 Epoch: 10 Global Step: 167350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:16:36,675-Speed 4725.84 samples/sec Loss 2.0587 Epoch: 10 Global Step: 167400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:16:47,557-Speed 4705.48 samples/sec Loss 2.0692 Epoch: 10 Global Step: 167450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:16:58,604-Speed 4634.91 samples/sec Loss 2.0523 Epoch: 10 Global Step: 167500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 
03:17:09,733-Speed 4601.07 samples/sec Loss 2.0858 Epoch: 10 Global Step: 167550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:17:20,756-Speed 4645.00 samples/sec Loss 2.0837 Epoch: 10 Global Step: 167600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:17:31,759-Speed 4653.52 samples/sec Loss 2.0746 Epoch: 10 Global Step: 167650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:17:42,853-Speed 4615.47 samples/sec Loss 2.1011 Epoch: 10 Global Step: 167700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:17:53,796-Speed 4678.98 samples/sec Loss 2.0573 Epoch: 10 Global Step: 167750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:18:04,638-Speed 4722.82 samples/sec Loss 2.0861 Epoch: 10 Global Step: 167800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:18:15,640-Speed 4653.68 samples/sec Loss 2.0919 Epoch: 10 Global Step: 167850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:18:26,272-Speed 4816.34 samples/sec Loss 2.0848 Epoch: 10 Global Step: 167900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:18:36,919-Speed 4808.94 samples/sec Loss 2.1122 Epoch: 10 Global Step: 167950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:18:47,869-Speed 4676.11 samples/sec Loss 2.1021 Epoch: 10 Global Step: 168000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:19:12,410-[lfw][168000]XNorm: 23.004836 Training: 2021-03-18 03:19:12,411-[lfw][168000]Accuracy-Flip: 0.99750+-0.00281 Training: 2021-03-18 03:19:12,411-[lfw][168000]Accuracy-Highest: 0.99767 Training: 2021-03-18 03:19:40,098-[cfp_fp][168000]XNorm: 19.571947 Training: 2021-03-18 03:19:40,098-[cfp_fp][168000]Accuracy-Flip: 0.97529+-0.00767 Training: 2021-03-18 03:19:40,098-[cfp_fp][168000]Accuracy-Highest: 0.97571 Training: 2021-03-18 03:20:03,980-[agedb_30][168000]XNorm: 22.271462 Training: 2021-03-18 03:20:03,980-[agedb_30][168000]Accuracy-Flip: 
0.97600+-0.00633 Training: 2021-03-18 03:20:03,980-[agedb_30][168000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:20:14,868-Speed 588.52 samples/sec Loss 2.1045 Epoch: 10 Global Step: 168050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:20:25,870-Speed 4653.83 samples/sec Loss 2.0825 Epoch: 10 Global Step: 168100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:20:37,685-Speed 4334.15 samples/sec Loss 2.0862 Epoch: 10 Global Step: 168150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:20:48,631-Speed 4677.63 samples/sec Loss 2.0755 Epoch: 10 Global Step: 168200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:20:59,503-Speed 4709.55 samples/sec Loss 2.1225 Epoch: 10 Global Step: 168250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:21:10,386-Speed 4704.98 samples/sec Loss 2.0895 Epoch: 10 Global Step: 168300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:21:22,091-Speed 4374.48 samples/sec Loss 2.0990 Epoch: 10 Global Step: 168350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:21:32,892-Speed 4740.35 samples/sec Loss 2.1210 Epoch: 10 Global Step: 168400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:21:43,877-Speed 4661.39 samples/sec Loss 2.1266 Epoch: 10 Global Step: 168450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:21:54,843-Speed 4669.36 samples/sec Loss 2.1269 Epoch: 10 Global Step: 168500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:22:06,043-Speed 4571.66 samples/sec Loss 2.1526 Epoch: 10 Global Step: 168550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:22:18,011-Speed 4278.29 samples/sec Loss 2.0986 Epoch: 10 Global Step: 168600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:22:28,977-Speed 4669.22 samples/sec Loss 2.1285 Epoch: 10 Global Step: 168650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 
03:22:39,840-Speed 4713.58 samples/sec Loss 2.1218 Epoch: 10 Global Step: 168700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:22:51,628-Speed 4343.84 samples/sec Loss 2.1461 Epoch: 10 Global Step: 168750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:23:02,640-Speed 4649.70 samples/sec Loss 2.1546 Epoch: 10 Global Step: 168800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:23:15,668-Speed 3929.98 samples/sec Loss 2.1324 Epoch: 10 Global Step: 168850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:23:26,616-Speed 4677.32 samples/sec Loss 2.1192 Epoch: 10 Global Step: 168900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:23:37,497-Speed 4705.49 samples/sec Loss 2.1421 Epoch: 10 Global Step: 168950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:23:48,337-Speed 4723.91 samples/sec Loss 2.1325 Epoch: 10 Global Step: 169000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:23:59,259-Speed 4687.74 samples/sec Loss 2.1245 Epoch: 10 Global Step: 169050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:24:10,879-Speed 4406.44 samples/sec Loss 2.1452 Epoch: 10 Global Step: 169100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:24:22,024-Speed 4594.27 samples/sec Loss 2.1742 Epoch: 10 Global Step: 169150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:24:32,798-Speed 4752.41 samples/sec Loss 2.1538 Epoch: 10 Global Step: 169200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:24:43,607-Speed 4737.22 samples/sec Loss 2.1533 Epoch: 10 Global Step: 169250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:24:55,320-Speed 4371.23 samples/sec Loss 2.1306 Epoch: 10 Global Step: 169300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:25:06,175-Speed 4717.25 samples/sec Loss 2.1822 Epoch: 10 Global Step: 169350 Fp16 Grad Scale: 16384 Required: 12 hours 
Training: 2021-03-18 03:25:17,124-Speed 4676.23 samples/sec Loss 2.1803 Epoch: 10 Global Step: 169400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:25:28,108-Speed 4661.68 samples/sec Loss 2.1500 Epoch: 10 Global Step: 169450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:25:38,947-Speed 4723.87 samples/sec Loss 2.1533 Epoch: 10 Global Step: 169500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:25:49,860-Speed 4692.13 samples/sec Loss 2.1422 Epoch: 10 Global Step: 169550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:26:00,993-Speed 4599.51 samples/sec Loss 2.1462 Epoch: 10 Global Step: 169600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:26:12,216-Speed 4562.21 samples/sec Loss 2.1487 Epoch: 10 Global Step: 169650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:26:23,276-Speed 4629.48 samples/sec Loss 2.1739 Epoch: 10 Global Step: 169700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:26:34,124-Speed 4720.40 samples/sec Loss 2.1649 Epoch: 10 Global Step: 169750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:26:45,068-Speed 4678.61 samples/sec Loss 2.1510 Epoch: 10 Global Step: 169800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:26:56,027-Speed 4672.17 samples/sec Loss 2.1780 Epoch: 10 Global Step: 169850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:27:06,976-Speed 4676.58 samples/sec Loss 2.1971 Epoch: 10 Global Step: 169900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:27:17,989-Speed 4649.13 samples/sec Loss 2.1893 Epoch: 10 Global Step: 169950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:27:28,831-Speed 4722.98 samples/sec Loss 2.1622 Epoch: 10 Global Step: 170000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:27:52,859-[lfw][170000]XNorm: 23.591122 Training: 2021-03-18 03:27:52,859-[lfw][170000]Accuracy-Flip: 
0.99800+-0.00287 Training: 2021-03-18 03:27:52,859-[lfw][170000]Accuracy-Highest: 0.99800 Training: 2021-03-18 03:28:20,672-[cfp_fp][170000]XNorm: 20.492354 Training: 2021-03-18 03:28:20,672-[cfp_fp][170000]Accuracy-Flip: 0.97271+-0.00776 Training: 2021-03-18 03:28:20,672-[cfp_fp][170000]Accuracy-Highest: 0.97571 Training: 2021-03-18 03:28:45,031-[agedb_30][170000]XNorm: 23.093982 Training: 2021-03-18 03:28:45,031-[agedb_30][170000]Accuracy-Flip: 0.97483+-0.00565 Training: 2021-03-18 03:28:45,031-[agedb_30][170000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:28:55,869-Speed 588.25 samples/sec Loss 2.1950 Epoch: 10 Global Step: 170050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:29:06,735-Speed 4712.40 samples/sec Loss 2.1954 Epoch: 10 Global Step: 170100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:29:17,613-Speed 4706.81 samples/sec Loss 2.1768 Epoch: 10 Global Step: 170150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:29:28,614-Speed 4654.36 samples/sec Loss 2.1811 Epoch: 10 Global Step: 170200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:29:39,592-Speed 4664.28 samples/sec Loss 2.1585 Epoch: 10 Global Step: 170250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:29:50,240-Speed 4808.80 samples/sec Loss 2.1728 Epoch: 10 Global Step: 170300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:30:01,158-Speed 4689.71 samples/sec Loss 2.1831 Epoch: 10 Global Step: 170350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:30:11,966-Speed 4737.44 samples/sec Loss 2.2136 Epoch: 10 Global Step: 170400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:30:23,064-Speed 4613.66 samples/sec Loss 2.1849 Epoch: 10 Global Step: 170450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:30:33,934-Speed 4710.30 samples/sec Loss 2.1938 Epoch: 10 Global Step: 170500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 
2021-03-18 03:30:44,754-Speed 4732.38 samples/sec Loss 2.1989 Epoch: 10 Global Step: 170550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:30:55,710-Speed 4673.50 samples/sec Loss 2.2059 Epoch: 10 Global Step: 170600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:31:06,691-Speed 4662.88 samples/sec Loss 2.2141 Epoch: 10 Global Step: 170650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:31:17,814-Speed 4603.29 samples/sec Loss 2.2439 Epoch: 10 Global Step: 170700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:31:28,729-Speed 4691.32 samples/sec Loss 2.2110 Epoch: 10 Global Step: 170750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:31:39,644-Speed 4691.47 samples/sec Loss 2.1879 Epoch: 10 Global Step: 170800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:31:50,597-Speed 4674.71 samples/sec Loss 2.1740 Epoch: 10 Global Step: 170850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:32:02,422-Speed 4330.18 samples/sec Loss 2.1714 Epoch: 10 Global Step: 170900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:32:13,426-Speed 4653.17 samples/sec Loss 2.2066 Epoch: 10 Global Step: 170950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:32:24,357-Speed 4684.07 samples/sec Loss 2.1995 Epoch: 10 Global Step: 171000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:32:35,266-Speed 4693.66 samples/sec Loss 2.2028 Epoch: 10 Global Step: 171050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:32:47,168-Speed 4301.76 samples/sec Loss 2.2057 Epoch: 10 Global Step: 171100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:32:58,167-Speed 4655.31 samples/sec Loss 2.2214 Epoch: 10 Global Step: 171150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:33:09,325-Speed 4588.84 samples/sec Loss 2.2165 Epoch: 10 Global Step: 171200 Fp16 Grad Scale: 16384 Required: 12 
hours Training: 2021-03-18 03:33:20,200-Speed 4708.27 samples/sec Loss 2.2217 Epoch: 10 Global Step: 171250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:33:30,936-Speed 4769.36 samples/sec Loss 2.2142 Epoch: 10 Global Step: 171300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:33:41,971-Speed 4640.41 samples/sec Loss 2.1997 Epoch: 10 Global Step: 171350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:33:52,847-Speed 4707.89 samples/sec Loss 2.1852 Epoch: 10 Global Step: 171400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:34:03,722-Speed 4708.23 samples/sec Loss 2.2062 Epoch: 10 Global Step: 171450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:34:14,695-Speed 4666.01 samples/sec Loss 2.2315 Epoch: 10 Global Step: 171500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:34:26,515-Speed 4332.10 samples/sec Loss 2.1942 Epoch: 10 Global Step: 171550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:34:37,368-Speed 4717.68 samples/sec Loss 2.2193 Epoch: 10 Global Step: 171600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:34:49,897-Speed 4086.67 samples/sec Loss 2.1719 Epoch: 10 Global Step: 171650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:35:00,819-Speed 4688.21 samples/sec Loss 2.2281 Epoch: 10 Global Step: 171700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:35:12,877-Speed 4246.51 samples/sec Loss 2.1975 Epoch: 10 Global Step: 171750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:35:23,608-Speed 4771.48 samples/sec Loss 2.2343 Epoch: 10 Global Step: 171800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:35:34,399-Speed 4744.84 samples/sec Loss 2.2359 Epoch: 10 Global Step: 171850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:35:45,405-Speed 4652.30 samples/sec Loss 2.2403 Epoch: 10 Global Step: 171900 Fp16 Grad Scale: 
16384 Required: 12 hours Training: 2021-03-18 03:35:57,074-Speed 4388.13 samples/sec Loss 2.2647 Epoch: 10 Global Step: 171950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:36:07,831-Speed 4759.91 samples/sec Loss 2.2527 Epoch: 10 Global Step: 172000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:36:31,519-[lfw][172000]XNorm: 22.389622 Training: 2021-03-18 03:36:31,519-[lfw][172000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-18 03:36:31,519-[lfw][172000]Accuracy-Highest: 0.99800 Training: 2021-03-18 03:36:58,966-[cfp_fp][172000]XNorm: 19.325133 Training: 2021-03-18 03:36:58,966-[cfp_fp][172000]Accuracy-Flip: 0.97414+-0.00875 Training: 2021-03-18 03:36:58,966-[cfp_fp][172000]Accuracy-Highest: 0.97571 Training: 2021-03-18 03:37:22,862-[agedb_30][172000]XNorm: 21.899889 Training: 2021-03-18 03:37:22,862-[agedb_30][172000]Accuracy-Flip: 0.97450+-0.00687 Training: 2021-03-18 03:37:22,862-[agedb_30][172000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:37:34,613-Speed 589.98 samples/sec Loss 2.2419 Epoch: 10 Global Step: 172050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:37:45,464-Speed 4718.72 samples/sec Loss 2.2439 Epoch: 10 Global Step: 172100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:37:56,365-Speed 4697.00 samples/sec Loss 2.2497 Epoch: 10 Global Step: 172150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:38:07,139-Speed 4752.51 samples/sec Loss 2.2229 Epoch: 10 Global Step: 172200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:38:18,008-Speed 4710.96 samples/sec Loss 2.2418 Epoch: 10 Global Step: 172250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:38:28,968-Speed 4671.92 samples/sec Loss 2.2224 Epoch: 10 Global Step: 172300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:38:39,822-Speed 4717.17 samples/sec Loss 2.2063 Epoch: 10 Global Step: 172350 Fp16 Grad Scale: 16384 Required: 12 hours 
Training: 2021-03-18 03:38:50,683-Speed 4714.57 samples/sec Loss 2.2357 Epoch: 10 Global Step: 172400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:39:01,420-Speed 4768.79 samples/sec Loss 2.2372 Epoch: 10 Global Step: 172450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:39:12,418-Speed 4655.63 samples/sec Loss 2.2405 Epoch: 10 Global Step: 172500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:39:23,322-Speed 4695.75 samples/sec Loss 2.2355 Epoch: 10 Global Step: 172550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:39:34,392-Speed 4625.49 samples/sec Loss 2.2218 Epoch: 10 Global Step: 172600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:39:45,255-Speed 4713.47 samples/sec Loss 2.2376 Epoch: 10 Global Step: 172650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:39:56,043-Speed 4746.49 samples/sec Loss 2.2376 Epoch: 10 Global Step: 172700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:40:06,842-Speed 4741.74 samples/sec Loss 2.2687 Epoch: 10 Global Step: 172750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:40:17,651-Speed 4737.00 samples/sec Loss 2.2382 Epoch: 10 Global Step: 172800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:40:28,530-Speed 4706.43 samples/sec Loss 2.2614 Epoch: 10 Global Step: 172850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:40:39,502-Speed 4666.85 samples/sec Loss 2.2679 Epoch: 10 Global Step: 172900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:40:50,384-Speed 4705.32 samples/sec Loss 2.2854 Epoch: 10 Global Step: 172950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:41:01,054-Speed 4799.01 samples/sec Loss 2.2418 Epoch: 10 Global Step: 173000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:41:11,830-Speed 4751.59 samples/sec Loss 2.2398 Epoch: 10 Global Step: 173050 Fp16 Grad Scale: 16384 
Required: 12 hours Training: 2021-03-18 03:41:22,773-Speed 4678.95 samples/sec Loss 2.2456 Epoch: 10 Global Step: 173100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-18 03:41:33,736-Speed 4670.51 samples/sec Loss 2.2841 Epoch: 10 Global Step: 173150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:41:44,758-Speed 4645.35 samples/sec Loss 2.2598 Epoch: 10 Global Step: 173200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:41:55,870-Speed 4608.06 samples/sec Loss 2.2714 Epoch: 10 Global Step: 173250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:42:06,771-Speed 4696.87 samples/sec Loss 2.2458 Epoch: 10 Global Step: 173300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:42:17,630-Speed 4715.51 samples/sec Loss 2.2661 Epoch: 10 Global Step: 173350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:42:28,555-Speed 4686.67 samples/sec Loss 2.2683 Epoch: 10 Global Step: 173400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:42:39,464-Speed 4693.43 samples/sec Loss 2.2501 Epoch: 10 Global Step: 173450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:42:50,495-Speed 4642.03 samples/sec Loss 2.2657 Epoch: 10 Global Step: 173500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:43:01,303-Speed 4737.69 samples/sec Loss 2.2462 Epoch: 10 Global Step: 173550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:43:12,597-Speed 4533.38 samples/sec Loss 2.2727 Epoch: 10 Global Step: 173600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:43:23,775-Speed 4580.79 samples/sec Loss 2.2524 Epoch: 10 Global Step: 173650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:43:34,890-Speed 4606.51 samples/sec Loss 2.2319 Epoch: 10 Global Step: 173700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:43:45,598-Speed 4781.86 samples/sec Loss 2.2759 Epoch: 10 Global Step: 173750 Fp16 
Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:43:57,310-Speed 4371.58 samples/sec Loss 2.3162 Epoch: 10 Global Step: 173800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:44:08,316-Speed 4652.65 samples/sec Loss 2.2905 Epoch: 10 Global Step: 173850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:44:19,301-Speed 4661.12 samples/sec Loss 2.2647 Epoch: 10 Global Step: 173900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:44:30,128-Speed 4729.30 samples/sec Loss 2.2894 Epoch: 10 Global Step: 173950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:44:41,750-Speed 4405.45 samples/sec Loss 2.2568 Epoch: 10 Global Step: 174000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:45:05,672-[lfw][174000]XNorm: 23.415569 Training: 2021-03-18 03:45:05,672-[lfw][174000]Accuracy-Flip: 0.99733+-0.00291 Training: 2021-03-18 03:45:05,672-[lfw][174000]Accuracy-Highest: 0.99800 Training: 2021-03-18 03:45:33,163-[cfp_fp][174000]XNorm: 20.645427 Training: 2021-03-18 03:45:33,163-[cfp_fp][174000]Accuracy-Flip: 0.97286+-0.00883 Training: 2021-03-18 03:45:33,163-[cfp_fp][174000]Accuracy-Highest: 0.97571 Training: 2021-03-18 03:45:56,850-[agedb_30][174000]XNorm: 22.780860 Training: 2021-03-18 03:45:56,851-[agedb_30][174000]Accuracy-Flip: 0.97533+-0.00774 Training: 2021-03-18 03:45:56,851-[agedb_30][174000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:46:07,727-Speed 595.52 samples/sec Loss 2.2911 Epoch: 10 Global Step: 174050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:46:18,638-Speed 4692.76 samples/sec Loss 2.2400 Epoch: 10 Global Step: 174100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:46:29,650-Speed 4649.77 samples/sec Loss 2.3202 Epoch: 10 Global Step: 174150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:46:40,483-Speed 4726.50 samples/sec Loss 2.2762 Epoch: 10 Global Step: 174200 Fp16 Grad Scale: 16384 Required: 11 
hours Training: 2021-03-18 03:46:51,504-Speed 4645.98 samples/sec Loss 2.2986 Epoch: 10 Global Step: 174250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:47:02,308-Speed 4739.43 samples/sec Loss 2.2643 Epoch: 10 Global Step: 174300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:47:13,214-Speed 4694.81 samples/sec Loss 2.2670 Epoch: 10 Global Step: 174350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:47:24,407-Speed 4574.44 samples/sec Loss 2.2512 Epoch: 10 Global Step: 174400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:47:37,016-Speed 4061.06 samples/sec Loss 2.3118 Epoch: 10 Global Step: 174450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:47:48,820-Speed 4337.72 samples/sec Loss 2.2893 Epoch: 10 Global Step: 174500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:47:59,686-Speed 4712.01 samples/sec Loss 2.2941 Epoch: 10 Global Step: 174550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:48:10,593-Speed 4694.47 samples/sec Loss 2.2730 Epoch: 10 Global Step: 174600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:48:22,375-Speed 4345.84 samples/sec Loss 2.2899 Epoch: 10 Global Step: 174650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:48:33,184-Speed 4737.28 samples/sec Loss 2.2578 Epoch: 10 Global Step: 174700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:48:44,018-Speed 4725.98 samples/sec Loss 2.2892 Epoch: 10 Global Step: 174750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:48:55,034-Speed 4647.95 samples/sec Loss 2.2526 Epoch: 10 Global Step: 174800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:49:06,569-Speed 4439.18 samples/sec Loss 2.2804 Epoch: 10 Global Step: 174850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:49:18,224-Speed 4393.17 samples/sec Loss 2.2977 Epoch: 10 Global Step: 174900 Fp16 Grad Scale: 
16384 Required: 11 hours Training: 2021-03-18 03:49:29,123-Speed 4697.71 samples/sec Loss 2.2548 Epoch: 10 Global Step: 174950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:49:40,048-Speed 4686.77 samples/sec Loss 2.2736 Epoch: 10 Global Step: 175000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:49:51,052-Speed 4653.22 samples/sec Loss 2.2978 Epoch: 10 Global Step: 175050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:50:02,206-Speed 4590.82 samples/sec Loss 2.2936 Epoch: 10 Global Step: 175100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:50:12,998-Speed 4744.53 samples/sec Loss 2.2947 Epoch: 10 Global Step: 175150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:50:24,113-Speed 4606.55 samples/sec Loss 2.3057 Epoch: 10 Global Step: 175200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:50:34,932-Speed 4732.71 samples/sec Loss 2.2856 Epoch: 10 Global Step: 175250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:50:45,976-Speed 4636.47 samples/sec Loss 2.2947 Epoch: 10 Global Step: 175300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:50:57,016-Speed 4637.78 samples/sec Loss 2.2748 Epoch: 10 Global Step: 175350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:51:07,755-Speed 4768.20 samples/sec Loss 2.2776 Epoch: 10 Global Step: 175400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:51:18,661-Speed 4694.99 samples/sec Loss 2.3161 Epoch: 10 Global Step: 175450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:51:29,475-Speed 4734.98 samples/sec Loss 2.3243 Epoch: 10 Global Step: 175500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:51:40,501-Speed 4643.89 samples/sec Loss 2.2797 Epoch: 10 Global Step: 175550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:51:51,598-Speed 4614.09 samples/sec Loss 2.3170 Epoch: 10 Global Step: 175600 
Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:52:02,387-Speed 4745.81 samples/sec Loss 2.2946 Epoch: 10 Global Step: 175650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:52:13,367-Speed 4663.57 samples/sec Loss 2.2933 Epoch: 10 Global Step: 175700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:52:24,373-Speed 4651.92 samples/sec Loss 2.3002 Epoch: 10 Global Step: 175750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:52:35,261-Speed 4703.15 samples/sec Loss 2.3201 Epoch: 10 Global Step: 175800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:52:46,161-Speed 4697.44 samples/sec Loss 2.3118 Epoch: 10 Global Step: 175850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:52:57,200-Speed 4638.32 samples/sec Loss 2.3196 Epoch: 10 Global Step: 175900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:53:08,031-Speed 4727.67 samples/sec Loss 2.2781 Epoch: 10 Global Step: 175950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:53:18,742-Speed 4780.09 samples/sec Loss 2.2900 Epoch: 10 Global Step: 176000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:53:42,610-[lfw][176000]XNorm: 23.745472 Training: 2021-03-18 03:53:42,611-[lfw][176000]Accuracy-Flip: 0.99750+-0.00281 Training: 2021-03-18 03:53:42,611-[lfw][176000]Accuracy-Highest: 0.99800 Training: 2021-03-18 03:54:10,058-[cfp_fp][176000]XNorm: 20.622878 Training: 2021-03-18 03:54:10,058-[cfp_fp][176000]Accuracy-Flip: 0.97529+-0.01042 Training: 2021-03-18 03:54:10,058-[cfp_fp][176000]Accuracy-Highest: 0.97571 Training: 2021-03-18 03:54:33,828-[agedb_30][176000]XNorm: 23.099543 Training: 2021-03-18 03:54:33,829-[agedb_30][176000]Accuracy-Flip: 0.97467+-0.00846 Training: 2021-03-18 03:54:33,829-[agedb_30][176000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:54:44,656-Speed 595.95 samples/sec Loss 2.3256 Epoch: 10 Global Step: 176050 Fp16 Grad Scale: 16384 Required: 
11 hours Training: 2021-03-18 03:54:55,556-Speed 4697.49 samples/sec Loss 2.2855 Epoch: 10 Global Step: 176100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:55:06,152-Speed 4832.29 samples/sec Loss 2.2958 Epoch: 10 Global Step: 176150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:55:17,077-Speed 4686.58 samples/sec Loss 2.3209 Epoch: 10 Global Step: 176200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:55:27,908-Speed 4727.56 samples/sec Loss 2.3419 Epoch: 10 Global Step: 176250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:55:39,158-Speed 4551.20 samples/sec Loss 2.2903 Epoch: 10 Global Step: 176300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:55:49,957-Speed 4741.70 samples/sec Loss 2.3059 Epoch: 10 Global Step: 176350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:56:01,173-Speed 4565.00 samples/sec Loss 2.3124 Epoch: 10 Global Step: 176400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:56:12,115-Speed 4679.38 samples/sec Loss 2.3167 Epoch: 10 Global Step: 176450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:56:23,125-Speed 4650.55 samples/sec Loss 2.3256 Epoch: 10 Global Step: 176500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:56:34,155-Speed 4642.32 samples/sec Loss 2.3245 Epoch: 10 Global Step: 176550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:56:44,890-Speed 4769.49 samples/sec Loss 2.3082 Epoch: 10 Global Step: 176600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:56:56,722-Speed 4327.59 samples/sec Loss 2.3045 Epoch: 10 Global Step: 176650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:57:07,568-Speed 4721.04 samples/sec Loss 2.2794 Epoch: 10 Global Step: 176700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:57:18,468-Speed 4697.46 samples/sec Loss 2.3189 Epoch: 10 Global Step: 176750 Fp16 Grad Scale: 
16384 Required: 11 hours Training: 2021-03-18 03:57:29,311-Speed 4722.35 samples/sec Loss 2.3364 Epoch: 10 Global Step: 176800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:57:40,242-Speed 4684.32 samples/sec Loss 2.3150 Epoch: 10 Global Step: 176850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:57:51,778-Speed 4438.68 samples/sec Loss 2.3139 Epoch: 10 Global Step: 176900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:58:02,767-Speed 4659.58 samples/sec Loss 2.3257 Epoch: 10 Global Step: 176950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:58:13,737-Speed 4667.47 samples/sec Loss 2.3231 Epoch: 10 Global Step: 177000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:58:24,624-Speed 4703.06 samples/sec Loss 2.3243 Epoch: 10 Global Step: 177050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:58:35,361-Speed 4768.79 samples/sec Loss 2.3302 Epoch: 10 Global Step: 177100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:58:46,172-Speed 4736.52 samples/sec Loss 2.3242 Epoch: 10 Global Step: 177150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:58:57,011-Speed 4723.73 samples/sec Loss 2.3295 Epoch: 10 Global Step: 177200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:59:07,929-Speed 4689.97 samples/sec Loss 2.2978 Epoch: 10 Global Step: 177250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:59:19,794-Speed 4315.24 samples/sec Loss 2.3060 Epoch: 10 Global Step: 177300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:59:31,486-Speed 4379.34 samples/sec Loss 2.3400 Epoch: 10 Global Step: 177350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:59:43,118-Speed 4402.08 samples/sec Loss 2.3311 Epoch: 10 Global Step: 177400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 03:59:53,995-Speed 4707.37 samples/sec Loss 2.3231 Epoch: 10 Global Step: 177450 
Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:00:05,090-Speed 4615.15 samples/sec Loss 2.3322 Epoch: 10 Global Step: 177500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:00:16,209-Speed 4605.02 samples/sec Loss 2.3455 Epoch: 10 Global Step: 177550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:00:27,918-Speed 4372.68 samples/sec Loss 2.3323 Epoch: 10 Global Step: 177600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:00:39,450-Speed 4440.08 samples/sec Loss 2.3606 Epoch: 10 Global Step: 177650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:00:50,668-Speed 4564.70 samples/sec Loss 2.3261 Epoch: 10 Global Step: 177700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:01:02,189-Speed 4444.05 samples/sec Loss 2.3057 Epoch: 10 Global Step: 177750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:01:13,270-Speed 4620.80 samples/sec Loss 2.3309 Epoch: 10 Global Step: 177800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:01:24,296-Speed 4643.96 samples/sec Loss 2.3366 Epoch: 10 Global Step: 177850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:01:35,328-Speed 4641.64 samples/sec Loss 2.3589 Epoch: 10 Global Step: 177900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:01:46,292-Speed 4670.11 samples/sec Loss 2.3297 Epoch: 10 Global Step: 177950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:01:57,285-Speed 4657.80 samples/sec Loss 2.3077 Epoch: 10 Global Step: 178000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:02:21,202-[lfw][178000]XNorm: 22.726904 Training: 2021-03-18 04:02:21,202-[lfw][178000]Accuracy-Flip: 0.99667+-0.00298 Training: 2021-03-18 04:02:21,202-[lfw][178000]Accuracy-Highest: 0.99800 Training: 2021-03-18 04:02:48,778-[cfp_fp][178000]XNorm: 19.819716 Training: 2021-03-18 04:02:48,778-[cfp_fp][178000]Accuracy-Flip: 0.97657+-0.00780 
Training: 2021-03-18 04:02:48,778-[cfp_fp][178000]Accuracy-Highest: 0.97657 Training: 2021-03-18 04:03:12,537-[agedb_30][178000]XNorm: 22.259821 Training: 2021-03-18 04:03:12,537-[agedb_30][178000]Accuracy-Flip: 0.97467+-0.00627 Training: 2021-03-18 04:03:12,539-[agedb_30][178000]Accuracy-Highest: 0.97600 Training: 2021-03-18 04:03:23,356-Speed 594.86 samples/sec Loss 2.3415 Epoch: 10 Global Step: 178050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:03:34,115-Speed 4759.10 samples/sec Loss 2.3486 Epoch: 10 Global Step: 178100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:03:44,836-Speed 4775.93 samples/sec Loss 2.3249 Epoch: 10 Global Step: 178150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:03:55,878-Speed 4637.21 samples/sec Loss 2.3217 Epoch: 10 Global Step: 178200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:04:06,850-Speed 4666.42 samples/sec Loss 2.3089 Epoch: 10 Global Step: 178250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:04:17,830-Speed 4663.48 samples/sec Loss 2.3589 Epoch: 10 Global Step: 178300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:04:28,778-Speed 4676.91 samples/sec Loss 2.3408 Epoch: 10 Global Step: 178350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:04:39,478-Speed 4785.45 samples/sec Loss 2.3351 Epoch: 10 Global Step: 178400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:04:50,410-Speed 4683.64 samples/sec Loss 2.3558 Epoch: 10 Global Step: 178450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:05:01,268-Speed 4715.67 samples/sec Loss 2.3326 Epoch: 10 Global Step: 178500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:05:12,088-Speed 4732.00 samples/sec Loss 2.3391 Epoch: 10 Global Step: 178550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:05:23,211-Speed 4603.67 samples/sec Loss 2.3347 Epoch: 10 Global Step: 178600 Fp16 
Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:05:33,950-Speed 4767.56 samples/sec Loss 2.3826 Epoch: 10 Global Step: 178650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:05:45,129-Speed 4580.34 samples/sec Loss 2.3312 Epoch: 10 Global Step: 178700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:05:56,323-Speed 4574.32 samples/sec Loss 2.3163 Epoch: 10 Global Step: 178750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:06:06,980-Speed 4804.51 samples/sec Loss 2.3389 Epoch: 10 Global Step: 178800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:06:17,569-Speed 4835.58 samples/sec Loss 2.3336 Epoch: 10 Global Step: 178850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:06:28,476-Speed 4694.56 samples/sec Loss 2.3637 Epoch: 10 Global Step: 178900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:06:39,338-Speed 4714.09 samples/sec Loss 2.3376 Epoch: 10 Global Step: 178950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:06:50,510-Speed 4582.94 samples/sec Loss 2.3498 Epoch: 10 Global Step: 179000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:07:01,375-Speed 4712.58 samples/sec Loss 2.3470 Epoch: 10 Global Step: 179050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:07:12,336-Speed 4671.53 samples/sec Loss 2.3238 Epoch: 10 Global Step: 179100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:07:23,333-Speed 4656.37 samples/sec Loss 2.3565 Epoch: 10 Global Step: 179150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:07:34,461-Speed 4600.99 samples/sec Loss 2.3525 Epoch: 10 Global Step: 179200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:07:45,226-Speed 4756.47 samples/sec Loss 2.3326 Epoch: 10 Global Step: 179250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:07:56,184-Speed 4672.44 samples/sec Loss 2.3330 Epoch: 10 Global 
Step: 179300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:08:07,094-Speed 4693.52 samples/sec Loss 2.3429 Epoch: 10 Global Step: 179350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:08:17,951-Speed 4716.27 samples/sec Loss 2.3587 Epoch: 10 Global Step: 179400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:08:29,539-Speed 4418.59 samples/sec Loss 2.3624 Epoch: 10 Global Step: 179450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:08:40,479-Speed 4680.57 samples/sec Loss 2.3211 Epoch: 10 Global Step: 179500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:08:51,436-Speed 4673.07 samples/sec Loss 2.3626 Epoch: 10 Global Step: 179550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:09:02,275-Speed 4723.91 samples/sec Loss 2.3208 Epoch: 10 Global Step: 179600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:09:13,254-Speed 4663.92 samples/sec Loss 2.3713 Epoch: 10 Global Step: 179650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:09:24,057-Speed 4739.50 samples/sec Loss 2.3712 Epoch: 10 Global Step: 179700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:09:35,756-Speed 4377.06 samples/sec Loss 2.3734 Epoch: 10 Global Step: 179750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:09:46,758-Speed 4654.01 samples/sec Loss 2.3653 Epoch: 10 Global Step: 179800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:09:57,727-Speed 4667.75 samples/sec Loss 2.3769 Epoch: 10 Global Step: 179850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:10:08,800-Speed 4624.31 samples/sec Loss 2.3456 Epoch: 10 Global Step: 179900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:10:19,612-Speed 4735.82 samples/sec Loss 2.3614 Epoch: 10 Global Step: 179950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:10:30,496-Speed 4704.46 samples/sec Loss 2.3843 
Epoch: 10 Global Step: 180000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:10:54,832-[lfw][180000]XNorm: 23.189939 Training: 2021-03-18 04:10:54,832-[lfw][180000]Accuracy-Flip: 0.99733+-0.00281 Training: 2021-03-18 04:10:54,833-[lfw][180000]Accuracy-Highest: 0.99800 Training: 2021-03-18 04:11:22,323-[cfp_fp][180000]XNorm: 20.544607 Training: 2021-03-18 04:11:22,323-[cfp_fp][180000]Accuracy-Flip: 0.97129+-0.00749 Training: 2021-03-18 04:11:22,323-[cfp_fp][180000]Accuracy-Highest: 0.97657 Training: 2021-03-18 04:11:46,085-[agedb_30][180000]XNorm: 22.720856 Training: 2021-03-18 04:11:46,086-[agedb_30][180000]Accuracy-Flip: 0.97617+-0.00691 Training: 2021-03-18 04:11:46,086-[agedb_30][180000]Accuracy-Highest: 0.97617 Training: 2021-03-18 04:11:56,731-Speed 593.73 samples/sec Loss 2.3644 Epoch: 10 Global Step: 180050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:12:07,659-Speed 4685.66 samples/sec Loss 2.3621 Epoch: 10 Global Step: 180100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:12:18,379-Speed 4776.04 samples/sec Loss 2.3424 Epoch: 10 Global Step: 180150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:12:30,208-Speed 4328.81 samples/sec Loss 2.3693 Epoch: 10 Global Step: 180200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:12:42,021-Speed 4334.35 samples/sec Loss 2.3603 Epoch: 10 Global Step: 180250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:12:53,794-Speed 4349.31 samples/sec Loss 2.3415 Epoch: 10 Global Step: 180300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:13:04,691-Speed 4698.56 samples/sec Loss 2.3441 Epoch: 10 Global Step: 180350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:13:15,647-Speed 4673.67 samples/sec Loss 2.3436 Epoch: 10 Global Step: 180400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:13:26,476-Speed 4728.27 samples/sec Loss 2.3514 Epoch: 10 Global Step: 180450 
Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:13:39,078-Speed 4063.19 samples/sec Loss 2.3675 Epoch: 10 Global Step: 180500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:13:49,894-Speed 4734.00 samples/sec Loss 2.3546 Epoch: 10 Global Step: 180550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:14:01,501-Speed 4411.27 samples/sec Loss 2.3759 Epoch: 10 Global Step: 180600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:14:12,529-Speed 4642.98 samples/sec Loss 2.3716 Epoch: 10 Global Step: 180650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:14:23,548-Speed 4646.85 samples/sec Loss 2.3582 Epoch: 10 Global Step: 180700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:14:34,563-Speed 4648.39 samples/sec Loss 2.3833 Epoch: 10 Global Step: 180750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:14:45,548-Speed 4661.31 samples/sec Loss 2.3580 Epoch: 10 Global Step: 180800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:14:56,329-Speed 4749.51 samples/sec Loss 2.3477 Epoch: 10 Global Step: 180850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:15:07,377-Speed 4634.48 samples/sec Loss 2.3689 Epoch: 10 Global Step: 180900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:15:18,463-Speed 4618.62 samples/sec Loss 2.3878 Epoch: 10 Global Step: 180950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:15:29,230-Speed 4755.82 samples/sec Loss 2.3858 Epoch: 10 Global Step: 181000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:15:40,068-Speed 4724.28 samples/sec Loss 2.3665 Epoch: 10 Global Step: 181050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:15:50,976-Speed 4694.01 samples/sec Loss 2.3428 Epoch: 10 Global Step: 181100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:16:02,035-Speed 4630.26 samples/sec Loss 2.3525 Epoch: 10 
Global Step: 181150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:16:13,043-Speed 4651.07 samples/sec Loss 2.3234 Epoch: 10 Global Step: 181200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:16:23,743-Speed 4785.54 samples/sec Loss 2.4066 Epoch: 10 Global Step: 181250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:16:34,634-Speed 4701.40 samples/sec Loss 2.3420 Epoch: 10 Global Step: 181300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:16:45,604-Speed 4667.42 samples/sec Loss 2.3920 Epoch: 10 Global Step: 181350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:16:56,737-Speed 4599.53 samples/sec Loss 2.3763 Epoch: 10 Global Step: 181400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:17:07,845-Speed 4609.50 samples/sec Loss 2.3542 Epoch: 10 Global Step: 181450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:17:18,893-Speed 4634.37 samples/sec Loss 2.3628 Epoch: 10 Global Step: 181500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:17:29,877-Speed 4662.03 samples/sec Loss 2.3642 Epoch: 10 Global Step: 181550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:17:40,887-Speed 4650.44 samples/sec Loss 2.3738 Epoch: 10 Global Step: 181600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:17:52,039-Speed 4591.41 samples/sec Loss 2.3191 Epoch: 10 Global Step: 181650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:18:03,266-Speed 4561.02 samples/sec Loss 2.3838 Epoch: 10 Global Step: 181700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:18:13,821-Speed 4850.83 samples/sec Loss 2.3584 Epoch: 10 Global Step: 181750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:18:24,984-Speed 4587.04 samples/sec Loss 2.3758 Epoch: 10 Global Step: 181800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:18:35,833-Speed 4719.67 samples/sec Loss 
2.3638 Epoch: 10 Global Step: 181850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:18:46,857-Speed 4644.69 samples/sec Loss 2.3737 Epoch: 10 Global Step: 181900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:18:58,015-Speed 4588.87 samples/sec Loss 2.3344 Epoch: 10 Global Step: 181950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:19:08,918-Speed 4696.08 samples/sec Loss 2.3598 Epoch: 10 Global Step: 182000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:19:32,877-[lfw][182000]XNorm: 22.520193 Training: 2021-03-18 04:19:32,878-[lfw][182000]Accuracy-Flip: 0.99733+-0.00300 Training: 2021-03-18 04:19:32,878-[lfw][182000]Accuracy-Highest: 0.99800 Training: 2021-03-18 04:20:00,386-[cfp_fp][182000]XNorm: 19.585551 Training: 2021-03-18 04:20:00,387-[cfp_fp][182000]Accuracy-Flip: 0.97371+-0.00948 Training: 2021-03-18 04:20:00,387-[cfp_fp][182000]Accuracy-Highest: 0.97657 Training: 2021-03-18 04:20:24,161-[agedb_30][182000]XNorm: 21.780116 Training: 2021-03-18 04:20:24,161-[agedb_30][182000]Accuracy-Flip: 0.97250+-0.00655 Training: 2021-03-18 04:20:24,161-[agedb_30][182000]Accuracy-Highest: 0.97617 Training: 2021-03-18 04:20:35,021-Speed 594.65 samples/sec Loss 2.3785 Epoch: 10 Global Step: 182050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:20:46,069-Speed 4634.34 samples/sec Loss 2.3830 Epoch: 10 Global Step: 182100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:20:56,909-Speed 4723.71 samples/sec Loss 2.3890 Epoch: 10 Global Step: 182150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:21:07,780-Speed 4710.06 samples/sec Loss 2.3782 Epoch: 10 Global Step: 182200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:21:19,512-Speed 4364.33 samples/sec Loss 2.3601 Epoch: 10 Global Step: 182250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:21:30,527-Speed 4648.65 samples/sec Loss 2.4009 Epoch: 10 Global Step: 
182300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:21:41,402-Speed 4707.94 samples/sec Loss 2.3955 Epoch: 10 Global Step: 182350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:21:52,425-Speed 4645.20 samples/sec Loss 2.3690 Epoch: 10 Global Step: 182400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:22:03,323-Speed 4698.48 samples/sec Loss 2.3497 Epoch: 10 Global Step: 182450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:22:14,175-Speed 4718.30 samples/sec Loss 2.3727 Epoch: 10 Global Step: 182500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:22:25,992-Speed 4333.07 samples/sec Loss 2.3446 Epoch: 10 Global Step: 182550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:22:37,071-Speed 4621.60 samples/sec Loss 2.3780 Epoch: 10 Global Step: 182600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:22:48,053-Speed 4662.55 samples/sec Loss 2.3797 Epoch: 10 Global Step: 182650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:22:59,072-Speed 4646.63 samples/sec Loss 2.3455 Epoch: 10 Global Step: 182700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:23:09,972-Speed 4697.86 samples/sec Loss 2.3896 Epoch: 10 Global Step: 182750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:23:20,874-Speed 4696.67 samples/sec Loss 2.3749 Epoch: 10 Global Step: 182800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:23:31,539-Speed 4800.97 samples/sec Loss 2.3761 Epoch: 10 Global Step: 182850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:23:42,459-Speed 4688.74 samples/sec Loss 2.3776 Epoch: 10 Global Step: 182900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:23:53,134-Speed 4796.63 samples/sec Loss 2.3641 Epoch: 10 Global Step: 182950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:24:04,138-Speed 4653.44 samples/sec Loss 2.3692 Epoch: 
10 Global Step: 183000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:24:14,991-Speed 4717.90 samples/sec Loss 2.3701 Epoch: 10 Global Step: 183050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:24:26,706-Speed 4370.67 samples/sec Loss 2.3917 Epoch: 10 Global Step: 183100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:24:39,170-Speed 4108.25 samples/sec Loss 2.3818 Epoch: 10 Global Step: 183150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:24:50,600-Speed 4479.80 samples/sec Loss 2.3918 Epoch: 10 Global Step: 183200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:25:02,385-Speed 4344.72 samples/sec Loss 2.3811 Epoch: 10 Global Step: 183250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:25:13,094-Speed 4781.07 samples/sec Loss 2.3885 Epoch: 10 Global Step: 183300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:25:24,661-Speed 4426.81 samples/sec Loss 2.3846 Epoch: 10 Global Step: 183350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:25:35,681-Speed 4646.32 samples/sec Loss 2.3779 Epoch: 10 Global Step: 183400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:25:47,587-Speed 4300.39 samples/sec Loss 2.3693 Epoch: 10 Global Step: 183450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:25:58,373-Speed 4747.54 samples/sec Loss 2.3943 Epoch: 10 Global Step: 183500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:26:09,396-Speed 4644.78 samples/sec Loss 2.3829 Epoch: 10 Global Step: 183550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:26:20,939-Speed 4435.83 samples/sec Loss 2.3471 Epoch: 10 Global Step: 183600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:26:44,558-Speed 2167.87 samples/sec Loss 1.8537 Epoch: 11 Global Step: 183650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:26:55,469-Speed 4692.72 samples/sec 
Loss 1.7171 Epoch: 11 Global Step: 183700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:27:06,272-Speed 4739.93 samples/sec Loss 1.6866 Epoch: 11 Global Step: 183750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:27:17,242-Speed 4667.33 samples/sec Loss 1.6626 Epoch: 11 Global Step: 183800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:27:27,977-Speed 4769.76 samples/sec Loss 1.6170 Epoch: 11 Global Step: 183850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:27:38,679-Speed 4784.64 samples/sec Loss 1.6199 Epoch: 11 Global Step: 183900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:27:49,466-Speed 4746.78 samples/sec Loss 1.5814 Epoch: 11 Global Step: 183950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:28:00,303-Speed 4724.76 samples/sec Loss 1.5758 Epoch: 11 Global Step: 184000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:28:24,780-[lfw][184000]XNorm: 22.866468 Training: 2021-03-18 04:28:24,781-[lfw][184000]Accuracy-Flip: 0.99767+-0.00291 Training: 2021-03-18 04:28:24,781-[lfw][184000]Accuracy-Highest: 0.99800 Training: 2021-03-18 04:28:52,366-[cfp_fp][184000]XNorm: 20.228370 Training: 2021-03-18 04:28:52,366-[cfp_fp][184000]Accuracy-Flip: 0.97857+-0.00731 Training: 2021-03-18 04:28:52,366-[cfp_fp][184000]Accuracy-Highest: 0.97857 Training: 2021-03-18 04:29:16,205-[agedb_30][184000]XNorm: 22.465259 Training: 2021-03-18 04:29:16,205-[agedb_30][184000]Accuracy-Flip: 0.97650+-0.00584 Training: 2021-03-18 04:29:16,205-[agedb_30][184000]Accuracy-Highest: 0.97650 Training: 2021-03-18 04:29:26,913-Speed 591.16 samples/sec Loss 1.5756 Epoch: 11 Global Step: 184050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:29:37,797-Speed 4704.79 samples/sec Loss 1.5815 Epoch: 11 Global Step: 184100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:29:48,412-Speed 4823.43 samples/sec Loss 1.5557 Epoch: 11 Global 
Step: 184150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:29:59,307-Speed 4699.94 samples/sec Loss 1.5451 Epoch: 11 Global Step: 184200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:30:10,223-Speed 4690.41 samples/sec Loss 1.5353 Epoch: 11 Global Step: 184250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:30:21,092-Speed 4710.87 samples/sec Loss 1.5167 Epoch: 11 Global Step: 184300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:30:31,836-Speed 4766.07 samples/sec Loss 1.5170 Epoch: 11 Global Step: 184350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:30:42,636-Speed 4741.19 samples/sec Loss 1.5177 Epoch: 11 Global Step: 184400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:30:53,421-Speed 4747.61 samples/sec Loss 1.5282 Epoch: 11 Global Step: 184450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:31:04,123-Speed 4784.35 samples/sec Loss 1.5182 Epoch: 11 Global Step: 184500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:31:15,087-Speed 4670.36 samples/sec Loss 1.4810 Epoch: 11 Global Step: 184550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:31:25,672-Speed 4837.04 samples/sec Loss 1.5047 Epoch: 11 Global Step: 184600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:31:36,633-Speed 4671.50 samples/sec Loss 1.4544 Epoch: 11 Global Step: 184650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:31:47,517-Speed 4704.72 samples/sec Loss 1.4655 Epoch: 11 Global Step: 184700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:31:58,584-Speed 4626.66 samples/sec Loss 1.4800 Epoch: 11 Global Step: 184750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:32:09,285-Speed 4784.78 samples/sec Loss 1.4429 Epoch: 11 Global Step: 184800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:32:20,037-Speed 4762.27 samples/sec Loss 1.4593 
Epoch: 11 Global Step: 184850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:32:30,810-Speed 4752.76 samples/sec Loss 1.4666 Epoch: 11 Global Step: 184900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:32:41,873-Speed 4628.61 samples/sec Loss 1.4752 Epoch: 11 Global Step: 184950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:32:52,675-Speed 4739.84 samples/sec Loss 1.4726 Epoch: 11 Global Step: 185000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:33:03,545-Speed 4710.81 samples/sec Loss 1.4446 Epoch: 11 Global Step: 185050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:33:15,139-Speed 4416.12 samples/sec Loss 1.4520 Epoch: 11 Global Step: 185100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:33:25,659-Speed 4867.21 samples/sec Loss 1.4483 Epoch: 11 Global Step: 185150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:33:36,548-Speed 4702.60 samples/sec Loss 1.4411 Epoch: 11 Global Step: 185200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:33:47,265-Speed 4777.53 samples/sec Loss 1.4464 Epoch: 11 Global Step: 185250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:33:58,029-Speed 4757.00 samples/sec Loss 1.4308 Epoch: 11 Global Step: 185300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:34:08,566-Speed 4859.29 samples/sec Loss 1.4426 Epoch: 11 Global Step: 185350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:34:19,314-Speed 4764.39 samples/sec Loss 1.4151 Epoch: 11 Global Step: 185400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:34:30,641-Speed 4520.30 samples/sec Loss 1.4074 Epoch: 11 Global Step: 185450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:34:41,766-Speed 4602.33 samples/sec Loss 1.4279 Epoch: 11 Global Step: 185500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:34:52,655-Speed 4702.43 
samples/sec Loss 1.4223 Epoch: 11 Global Step: 185550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:35:03,411-Speed 4760.74 samples/sec Loss 1.4401 Epoch: 11 Global Step: 185600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:35:14,170-Speed 4758.81 samples/sec Loss 1.4328 Epoch: 11 Global Step: 185650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:35:24,822-Speed 4806.95 samples/sec Loss 1.4284 Epoch: 11 Global Step: 185700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:35:35,771-Speed 4676.51 samples/sec Loss 1.4108 Epoch: 11 Global Step: 185750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:35:46,624-Speed 4718.11 samples/sec Loss 1.4061 Epoch: 11 Global Step: 185800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:35:57,349-Speed 4774.00 samples/sec Loss 1.4230 Epoch: 11 Global Step: 185850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:36:08,273-Speed 4687.20 samples/sec Loss 1.4157 Epoch: 11 Global Step: 185900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:36:19,068-Speed 4743.32 samples/sec Loss 1.4160 Epoch: 11 Global Step: 185950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:36:29,912-Speed 4721.89 samples/sec Loss 1.4064 Epoch: 11 Global Step: 186000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:36:54,238-[lfw][186000]XNorm: 22.858912 Training: 2021-03-18 04:36:54,239-[lfw][186000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 04:36:54,239-[lfw][186000]Accuracy-Highest: 0.99800 Training: 2021-03-18 04:37:21,740-[cfp_fp][186000]XNorm: 20.452906 Training: 2021-03-18 04:37:21,740-[cfp_fp][186000]Accuracy-Flip: 0.98157+-0.00587 Training: 2021-03-18 04:37:21,740-[cfp_fp][186000]Accuracy-Highest: 0.98157 Training: 2021-03-18 04:37:45,564-[agedb_30][186000]XNorm: 22.520119 Training: 2021-03-18 04:37:45,565-[agedb_30][186000]Accuracy-Flip: 0.97917+-0.00672 Training: 
2021-03-18 04:37:45,565-[agedb_30][186000]Accuracy-Highest: 0.97917 Training: 2021-03-18 04:37:57,142-Speed 586.96 samples/sec Loss 1.4328 Epoch: 11 Global Step: 186050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:38:09,462-Speed 4156.31 samples/sec Loss 1.3995 Epoch: 11 Global Step: 186100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:38:20,202-Speed 4767.38 samples/sec Loss 1.4115 Epoch: 11 Global Step: 186150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:38:33,083-Speed 3975.07 samples/sec Loss 1.4186 Epoch: 11 Global Step: 186200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:38:43,744-Speed 4803.15 samples/sec Loss 1.4083 Epoch: 11 Global Step: 186250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:38:54,541-Speed 4742.50 samples/sec Loss 1.3882 Epoch: 11 Global Step: 186300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:39:06,390-Speed 4321.03 samples/sec Loss 1.3945 Epoch: 11 Global Step: 186350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:39:17,365-Speed 4665.53 samples/sec Loss 1.3894 Epoch: 11 Global Step: 186400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:39:28,076-Speed 4780.37 samples/sec Loss 1.3763 Epoch: 11 Global Step: 186450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:39:38,935-Speed 4715.60 samples/sec Loss 1.4139 Epoch: 11 Global Step: 186500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:39:49,771-Speed 4724.88 samples/sec Loss 1.3813 Epoch: 11 Global Step: 186550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:40:00,464-Speed 4788.85 samples/sec Loss 1.3609 Epoch: 11 Global Step: 186600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:40:11,348-Speed 4704.09 samples/sec Loss 1.3850 Epoch: 11 Global Step: 186650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:40:22,136-Speed 4746.35 samples/sec 
Loss 1.3929 Epoch: 11 Global Step: 186700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:40:32,842-Speed 4782.99 samples/sec Loss 1.3849 Epoch: 11 Global Step: 186750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:40:43,502-Speed 4803.11 samples/sec Loss 1.3884 Epoch: 11 Global Step: 186800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:40:54,259-Speed 4759.84 samples/sec Loss 1.3831 Epoch: 11 Global Step: 186850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:41:05,002-Speed 4766.26 samples/sec Loss 1.3580 Epoch: 11 Global Step: 186900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:41:15,585-Speed 4838.15 samples/sec Loss 1.3851 Epoch: 11 Global Step: 186950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:41:26,274-Speed 4790.21 samples/sec Loss 1.3776 Epoch: 11 Global Step: 187000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:41:37,129-Speed 4717.23 samples/sec Loss 1.3730 Epoch: 11 Global Step: 187050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:41:48,156-Speed 4643.50 samples/sec Loss 1.3719 Epoch: 11 Global Step: 187100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:41:59,156-Speed 4654.84 samples/sec Loss 1.3425 Epoch: 11 Global Step: 187150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-18 04:42:10,069-Speed 4692.12 samples/sec Loss 1.3550 Epoch: 11 Global Step: 187200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:42:20,826-Speed 4759.88 samples/sec Loss 1.3568 Epoch: 11 Global Step: 187250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:42:31,775-Speed 4676.33 samples/sec Loss 1.3641 Epoch: 11 Global Step: 187300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:42:42,690-Speed 4691.28 samples/sec Loss 1.3642 Epoch: 11 Global Step: 187350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:42:53,470-Speed 
4749.91 samples/sec Loss 1.3593 Epoch: 11 Global Step: 187400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:43:04,234-Speed 4756.72 samples/sec Loss 1.3493 Epoch: 11 Global Step: 187450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:43:15,064-Speed 4727.96 samples/sec Loss 1.3425 Epoch: 11 Global Step: 187500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:43:25,896-Speed 4727.07 samples/sec Loss 1.3600 Epoch: 11 Global Step: 187550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:43:36,533-Speed 4813.93 samples/sec Loss 1.3659 Epoch: 11 Global Step: 187600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:43:47,084-Speed 4852.66 samples/sec Loss 1.3529 Epoch: 11 Global Step: 187650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:43:57,767-Speed 4793.04 samples/sec Loss 1.3377 Epoch: 11 Global Step: 187700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:44:08,471-Speed 4783.28 samples/sec Loss 1.3470 Epoch: 11 Global Step: 187750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:44:19,551-Speed 4621.40 samples/sec Loss 1.3608 Epoch: 11 Global Step: 187800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:44:30,288-Speed 4768.83 samples/sec Loss 1.3507 Epoch: 11 Global Step: 187850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:44:41,211-Speed 4687.78 samples/sec Loss 1.3421 Epoch: 11 Global Step: 187900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:44:51,968-Speed 4760.57 samples/sec Loss 1.3507 Epoch: 11 Global Step: 187950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:45:02,760-Speed 4744.50 samples/sec Loss 1.3617 Epoch: 11 Global Step: 188000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:45:27,198-[lfw][188000]XNorm: 22.720995 Training: 2021-03-18 04:45:27,198-[lfw][188000]Accuracy-Flip: 0.99783+-0.00289 Training: 2021-03-18 
04:45:27,200-[lfw][188000]Accuracy-Highest: 0.99800 Training: 2021-03-18 04:45:54,843-[cfp_fp][188000]XNorm: 20.421880 Training: 2021-03-18 04:45:54,843-[cfp_fp][188000]Accuracy-Flip: 0.98343+-0.00600 Training: 2021-03-18 04:45:54,843-[cfp_fp][188000]Accuracy-Highest: 0.98343 Training: 2021-03-18 04:46:18,780-[agedb_30][188000]XNorm: 22.458207 Training: 2021-03-18 04:46:18,780-[agedb_30][188000]Accuracy-Flip: 0.97900+-0.00597 Training: 2021-03-18 04:46:18,780-[agedb_30][188000]Accuracy-Highest: 0.97917 Training: 2021-03-18 04:46:30,349-Speed 584.55 samples/sec Loss 1.3548 Epoch: 11 Global Step: 188050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:46:40,817-Speed 4891.09 samples/sec Loss 1.3677 Epoch: 11 Global Step: 188100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:46:51,625-Speed 4737.90 samples/sec Loss 1.3578 Epoch: 11 Global Step: 188150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:47:02,347-Speed 4775.44 samples/sec Loss 1.3521 Epoch: 11 Global Step: 188200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:47:13,261-Speed 4691.40 samples/sec Loss 1.3497 Epoch: 11 Global Step: 188250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:47:24,077-Speed 4734.23 samples/sec Loss 1.3488 Epoch: 11 Global Step: 188300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:47:35,659-Speed 4420.75 samples/sec Loss 1.3364 Epoch: 11 Global Step: 188350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:47:46,485-Speed 4729.71 samples/sec Loss 1.3388 Epoch: 11 Global Step: 188400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:47:57,419-Speed 4682.55 samples/sec Loss 1.3251 Epoch: 11 Global Step: 188450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:48:08,077-Speed 4804.53 samples/sec Loss 1.3219 Epoch: 11 Global Step: 188500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:48:18,949-Speed 4709.56 
samples/sec Loss 1.3370 Epoch: 11 Global Step: 188550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:48:29,802-Speed 4717.98 samples/sec Loss 1.3301 Epoch: 11 Global Step: 188600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:48:40,570-Speed 4754.80 samples/sec Loss 1.3352 Epoch: 11 Global Step: 188650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:48:51,454-Speed 4704.39 samples/sec Loss 1.3059 Epoch: 11 Global Step: 188700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:49:02,347-Speed 4700.71 samples/sec Loss 1.3351 Epoch: 11 Global Step: 188750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:49:13,206-Speed 4715.36 samples/sec Loss 1.3519 Epoch: 11 Global Step: 188800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:49:24,137-Speed 4684.54 samples/sec Loss 1.3431 Epoch: 11 Global Step: 188850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:49:35,698-Speed 4428.69 samples/sec Loss 1.3357 Epoch: 11 Global Step: 188900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:49:47,097-Speed 4492.21 samples/sec Loss 1.3287 Epoch: 11 Global Step: 188950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:49:57,802-Speed 4783.02 samples/sec Loss 1.3244 Epoch: 11 Global Step: 189000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:50:10,943-Speed 3896.33 samples/sec Loss 1.3309 Epoch: 11 Global Step: 189050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:50:21,744-Speed 4740.83 samples/sec Loss 1.3222 Epoch: 11 Global Step: 189100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:50:32,635-Speed 4701.00 samples/sec Loss 1.3337 Epoch: 11 Global Step: 189150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:50:43,480-Speed 4721.67 samples/sec Loss 1.3366 Epoch: 11 Global Step: 189200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 
04:50:55,081-Speed 4413.86 samples/sec Loss 1.3433 Epoch: 11 Global Step: 189250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:51:05,803-Speed 4775.16 samples/sec Loss 1.3379 Epoch: 11 Global Step: 189300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:51:16,532-Speed 4772.69 samples/sec Loss 1.3161 Epoch: 11 Global Step: 189350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:51:27,242-Speed 4781.11 samples/sec Loss 1.3188 Epoch: 11 Global Step: 189400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:51:38,140-Speed 4698.13 samples/sec Loss 1.3296 Epoch: 11 Global Step: 189450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:51:48,991-Speed 4718.83 samples/sec Loss 1.3290 Epoch: 11 Global Step: 189500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:51:59,792-Speed 4740.71 samples/sec Loss 1.3529 Epoch: 11 Global Step: 189550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:52:10,607-Speed 4734.25 samples/sec Loss 1.3259 Epoch: 11 Global Step: 189600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:52:21,455-Speed 4720.41 samples/sec Loss 1.3101 Epoch: 11 Global Step: 189650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:52:32,255-Speed 4740.91 samples/sec Loss 1.3270 Epoch: 11 Global Step: 189700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:52:42,933-Speed 4795.20 samples/sec Loss 1.3172 Epoch: 11 Global Step: 189750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:52:53,795-Speed 4713.90 samples/sec Loss 1.3082 Epoch: 11 Global Step: 189800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:53:04,523-Speed 4772.92 samples/sec Loss 1.3197 Epoch: 11 Global Step: 189850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:53:15,455-Speed 4683.69 samples/sec Loss 1.3238 Epoch: 11 Global Step: 189900 Fp16 Grad Scale: 16384 Required: 10 hours 
Training: 2021-03-18 04:53:26,210-Speed 4760.88 samples/sec Loss 1.3051 Epoch: 11 Global Step: 189950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:53:37,235-Speed 4644.16 samples/sec Loss 1.3243 Epoch: 11 Global Step: 190000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:54:01,276-[lfw][190000]XNorm: 22.788365 Training: 2021-03-18 04:54:01,276-[lfw][190000]Accuracy-Flip: 0.99800+-0.00277 Training: 2021-03-18 04:54:01,276-[lfw][190000]Accuracy-Highest: 0.99800 Training: 2021-03-18 04:54:28,858-[cfp_fp][190000]XNorm: 20.329545 Training: 2021-03-18 04:54:28,858-[cfp_fp][190000]Accuracy-Flip: 0.98357+-0.00579 Training: 2021-03-18 04:54:28,858-[cfp_fp][190000]Accuracy-Highest: 0.98357 Training: 2021-03-18 04:54:52,721-[agedb_30][190000]XNorm: 22.468311 Training: 2021-03-18 04:54:52,721-[agedb_30][190000]Accuracy-Flip: 0.98083+-0.00597 Training: 2021-03-18 04:54:52,722-[agedb_30][190000]Accuracy-Highest: 0.98083 Training: 2021-03-18 04:55:03,413-Speed 594.12 samples/sec Loss 1.3323 Epoch: 11 Global Step: 190050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:55:14,256-Speed 4722.61 samples/sec Loss 1.3171 Epoch: 11 Global Step: 190100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:55:25,203-Speed 4677.26 samples/sec Loss 1.3152 Epoch: 11 Global Step: 190150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:55:36,141-Speed 4680.92 samples/sec Loss 1.3070 Epoch: 11 Global Step: 190200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:55:46,977-Speed 4725.42 samples/sec Loss 1.3180 Epoch: 11 Global Step: 190250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:55:57,762-Speed 4747.59 samples/sec Loss 1.3207 Epoch: 11 Global Step: 190300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:56:08,779-Speed 4647.33 samples/sec Loss 1.3457 Epoch: 11 Global Step: 190350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 
04:56:19,507-Speed 4772.81 samples/sec Loss 1.3149 Epoch: 11 Global Step: 190400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:56:30,149-Speed 4811.70 samples/sec Loss 1.2989 Epoch: 11 Global Step: 190450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:56:41,008-Speed 4715.22 samples/sec Loss 1.2890 Epoch: 11 Global Step: 190500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:56:51,735-Speed 4773.31 samples/sec Loss 1.3022 Epoch: 11 Global Step: 190550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:57:02,545-Speed 4736.88 samples/sec Loss 1.2821 Epoch: 11 Global Step: 190600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:57:13,358-Speed 4735.18 samples/sec Loss 1.3190 Epoch: 11 Global Step: 190650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:57:24,101-Speed 4766.06 samples/sec Loss 1.3060 Epoch: 11 Global Step: 190700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:57:35,038-Speed 4681.60 samples/sec Loss 1.3158 Epoch: 11 Global Step: 190750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:57:45,636-Speed 4831.84 samples/sec Loss 1.2982 Epoch: 11 Global Step: 190800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:57:56,265-Speed 4817.26 samples/sec Loss 1.3031 Epoch: 11 Global Step: 190850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:58:07,650-Speed 4497.42 samples/sec Loss 1.3033 Epoch: 11 Global Step: 190900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:58:18,431-Speed 4749.00 samples/sec Loss 1.3113 Epoch: 11 Global Step: 190950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:58:29,200-Speed 4754.62 samples/sec Loss 1.2994 Epoch: 11 Global Step: 191000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:58:40,071-Speed 4710.42 samples/sec Loss 1.3106 Epoch: 11 Global Step: 191050 Fp16 Grad Scale: 16384 Required: 10 hours 
Training: 2021-03-18 04:58:51,656-Speed 4419.71 samples/sec Loss 1.2891 Epoch: 11 Global Step: 191100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:59:02,390-Speed 4769.93 samples/sec Loss 1.3046 Epoch: 11 Global Step: 191150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:59:13,096-Speed 4782.97 samples/sec Loss 1.3070 Epoch: 11 Global Step: 191200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:59:23,845-Speed 4763.63 samples/sec Loss 1.2721 Epoch: 11 Global Step: 191250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:59:34,634-Speed 4745.55 samples/sec Loss 1.2895 Epoch: 11 Global Step: 191300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:59:45,320-Speed 4791.81 samples/sec Loss 1.2892 Epoch: 11 Global Step: 191350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 04:59:56,149-Speed 4728.55 samples/sec Loss 1.3109 Epoch: 11 Global Step: 191400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:00:07,057-Speed 4694.08 samples/sec Loss 1.3148 Epoch: 11 Global Step: 191450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:00:17,736-Speed 4794.38 samples/sec Loss 1.2919 Epoch: 11 Global Step: 191500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:00:28,735-Speed 4655.40 samples/sec Loss 1.2989 Epoch: 11 Global Step: 191550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:00:39,554-Speed 4732.72 samples/sec Loss 1.3007 Epoch: 11 Global Step: 191600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:00:50,383-Speed 4728.58 samples/sec Loss 1.2912 Epoch: 11 Global Step: 191650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:01:01,042-Speed 4803.64 samples/sec Loss 1.2823 Epoch: 11 Global Step: 191700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:01:11,681-Speed 4812.85 samples/sec Loss 1.2871 Epoch: 11 Global Step: 191750 Fp16 Grad Scale: 16384 
Required: 10 hours Training: 2021-03-18 05:01:22,404-Speed 4775.04 samples/sec Loss 1.2958 Epoch: 11 Global Step: 191800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:01:35,063-Speed 4044.63 samples/sec Loss 1.3040 Epoch: 11 Global Step: 191850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:01:47,633-Speed 4073.33 samples/sec Loss 1.2895 Epoch: 11 Global Step: 191900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:01:59,238-Speed 4412.30 samples/sec Loss 1.2926 Epoch: 11 Global Step: 191950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:02:10,110-Speed 4709.35 samples/sec Loss 1.2998 Epoch: 11 Global Step: 192000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:02:34,673-[lfw][192000]XNorm: 22.921137 Training: 2021-03-18 05:02:34,674-[lfw][192000]Accuracy-Flip: 0.99783+-0.00289 Training: 2021-03-18 05:02:34,674-[lfw][192000]Accuracy-Highest: 0.99800 Training: 2021-03-18 05:03:02,270-[cfp_fp][192000]XNorm: 20.940846 Training: 2021-03-18 05:03:02,270-[cfp_fp][192000]Accuracy-Flip: 0.98300+-0.00540 Training: 2021-03-18 05:03:02,270-[cfp_fp][192000]Accuracy-Highest: 0.98357 Training: 2021-03-18 05:03:26,121-[agedb_30][192000]XNorm: 22.832664 Training: 2021-03-18 05:03:26,122-[agedb_30][192000]Accuracy-Flip: 0.98017+-0.00575 Training: 2021-03-18 05:03:26,122-[agedb_30][192000]Accuracy-Highest: 0.98083 Training: 2021-03-18 05:03:36,913-Speed 589.85 samples/sec Loss 1.2876 Epoch: 11 Global Step: 192050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:03:47,879-Speed 4669.39 samples/sec Loss 1.2807 Epoch: 11 Global Step: 192100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:03:58,654-Speed 4752.12 samples/sec Loss 1.2990 Epoch: 11 Global Step: 192150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:04:09,477-Speed 4730.89 samples/sec Loss 1.3032 Epoch: 11 Global Step: 192200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 
2021-03-18 05:04:21,117-Speed 4398.91 samples/sec Loss 1.2989 Epoch: 11 Global Step: 192250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:04:31,982-Speed 4712.39 samples/sec Loss 1.3019 Epoch: 11 Global Step: 192300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:04:42,783-Speed 4740.64 samples/sec Loss 1.3086 Epoch: 11 Global Step: 192350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:04:53,568-Speed 4747.55 samples/sec Loss 1.2761 Epoch: 11 Global Step: 192400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:05:04,661-Speed 4615.93 samples/sec Loss 1.2823 Epoch: 11 Global Step: 192450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:05:15,531-Speed 4710.82 samples/sec Loss 1.2890 Epoch: 11 Global Step: 192500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:05:26,233-Speed 4784.34 samples/sec Loss 1.2924 Epoch: 11 Global Step: 192550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:05:37,249-Speed 4648.29 samples/sec Loss 1.2764 Epoch: 11 Global Step: 192600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:05:47,926-Speed 4795.81 samples/sec Loss 1.2820 Epoch: 11 Global Step: 192650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:05:58,780-Speed 4717.15 samples/sec Loss 1.2931 Epoch: 11 Global Step: 192700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:06:09,532-Speed 4762.10 samples/sec Loss 1.2850 Epoch: 11 Global Step: 192750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:06:20,276-Speed 4765.81 samples/sec Loss 1.2749 Epoch: 11 Global Step: 192800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:06:30,978-Speed 4784.72 samples/sec Loss 1.2861 Epoch: 11 Global Step: 192850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:06:41,922-Speed 4678.56 samples/sec Loss 1.2964 Epoch: 11 Global Step: 192900 Fp16 Grad Scale: 16384 Required: 10 
hours Training: 2021-03-18 05:06:52,766-Speed 4722.03 samples/sec Loss 1.2891 Epoch: 11 Global Step: 192950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:07:03,579-Speed 4735.11 samples/sec Loss 1.2816 Epoch: 11 Global Step: 193000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:07:14,522-Speed 4678.96 samples/sec Loss 1.2846 Epoch: 11 Global Step: 193050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:07:25,261-Speed 4768.16 samples/sec Loss 1.2773 Epoch: 11 Global Step: 193100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:07:36,016-Speed 4761.12 samples/sec Loss 1.3044 Epoch: 11 Global Step: 193150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:07:46,892-Speed 4707.72 samples/sec Loss 1.2864 Epoch: 11 Global Step: 193200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:07:57,680-Speed 4746.30 samples/sec Loss 1.2729 Epoch: 11 Global Step: 193250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:08:08,606-Speed 4686.66 samples/sec Loss 1.2650 Epoch: 11 Global Step: 193300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:08:19,300-Speed 4788.10 samples/sec Loss 1.2688 Epoch: 11 Global Step: 193350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:08:29,939-Speed 4812.63 samples/sec Loss 1.2769 Epoch: 11 Global Step: 193400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:08:40,646-Speed 4782.56 samples/sec Loss 1.2850 Epoch: 11 Global Step: 193450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:08:51,409-Speed 4757.06 samples/sec Loss 1.2685 Epoch: 11 Global Step: 193500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:09:02,100-Speed 4789.50 samples/sec Loss 1.2753 Epoch: 11 Global Step: 193550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:09:12,887-Speed 4746.71 samples/sec Loss 1.2898 Epoch: 11 Global Step: 193600 Fp16 Grad Scale: 
16384 Required: 10 hours Training: 2021-03-18 05:09:23,716-Speed 4728.09 samples/sec Loss 1.2623 Epoch: 11 Global Step: 193650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:09:34,676-Speed 4671.87 samples/sec Loss 1.2749 Epoch: 11 Global Step: 193700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:09:46,400-Speed 4367.24 samples/sec Loss 1.2664 Epoch: 11 Global Step: 193750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:09:57,206-Speed 4738.49 samples/sec Loss 1.2674 Epoch: 11 Global Step: 193800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:10:08,018-Speed 4735.81 samples/sec Loss 1.2778 Epoch: 11 Global Step: 193850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:10:19,583-Speed 4427.36 samples/sec Loss 1.2515 Epoch: 11 Global Step: 193900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:10:30,393-Speed 4736.82 samples/sec Loss 1.2832 Epoch: 11 Global Step: 193950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:10:41,334-Speed 4679.94 samples/sec Loss 1.2930 Epoch: 11 Global Step: 194000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:11:05,704-[lfw][194000]XNorm: 22.846172 Training: 2021-03-18 05:11:05,705-[lfw][194000]Accuracy-Flip: 0.99750+-0.00318 Training: 2021-03-18 05:11:05,705-[lfw][194000]Accuracy-Highest: 0.99800 Training: 2021-03-18 05:11:33,150-[cfp_fp][194000]XNorm: 20.739265 Training: 2021-03-18 05:11:33,150-[cfp_fp][194000]Accuracy-Flip: 0.98286+-0.00582 Training: 2021-03-18 05:11:33,150-[cfp_fp][194000]Accuracy-Highest: 0.98357 Training: 2021-03-18 05:11:56,866-[agedb_30][194000]XNorm: 22.646311 Training: 2021-03-18 05:11:56,867-[agedb_30][194000]Accuracy-Flip: 0.98067+-0.00642 Training: 2021-03-18 05:11:56,867-[agedb_30][194000]Accuracy-Highest: 0.98083 Training: 2021-03-18 05:12:07,267-Speed 595.82 samples/sec Loss 1.2648 Epoch: 11 Global Step: 194050 Fp16 Grad Scale: 16384 Required: 10 hours 
Training: 2021-03-18 05:12:18,271-Speed 4653.15 samples/sec Loss 1.2596 Epoch: 11 Global Step: 194100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:12:29,280-Speed 4650.98 samples/sec Loss 1.2549 Epoch: 11 Global Step: 194150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:12:40,292-Speed 4649.43 samples/sec Loss 1.2818 Epoch: 11 Global Step: 194200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:12:51,086-Speed 4743.90 samples/sec Loss 1.2767 Epoch: 11 Global Step: 194250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:13:01,966-Speed 4706.26 samples/sec Loss 1.2538 Epoch: 11 Global Step: 194300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:13:13,022-Speed 4631.33 samples/sec Loss 1.2558 Epoch: 11 Global Step: 194350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:13:24,020-Speed 4655.71 samples/sec Loss 1.2703 Epoch: 11 Global Step: 194400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:13:34,763-Speed 4766.25 samples/sec Loss 1.2809 Epoch: 11 Global Step: 194450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:13:45,725-Speed 4670.78 samples/sec Loss 1.2614 Epoch: 11 Global Step: 194500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:13:56,831-Speed 4610.60 samples/sec Loss 1.2880 Epoch: 11 Global Step: 194550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:14:07,731-Speed 4697.24 samples/sec Loss 1.2835 Epoch: 11 Global Step: 194600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:14:18,657-Speed 4686.64 samples/sec Loss 1.2604 Epoch: 11 Global Step: 194650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:14:29,545-Speed 4702.76 samples/sec Loss 1.2658 Epoch: 11 Global Step: 194700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:14:42,112-Speed 4074.19 samples/sec Loss 1.2878 Epoch: 11 Global Step: 194750 Fp16 Grad Scale: 16384 
Required: 10 hours Training: 2021-03-18 05:14:54,390-Speed 4170.38 samples/sec Loss 1.2693 Epoch: 11 Global Step: 194800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:15:05,944-Speed 4431.37 samples/sec Loss 1.2528 Epoch: 11 Global Step: 194850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:15:16,818-Speed 4708.94 samples/sec Loss 1.2785 Epoch: 11 Global Step: 194900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:15:27,863-Speed 4635.99 samples/sec Loss 1.2750 Epoch: 11 Global Step: 194950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:15:38,496-Speed 4815.51 samples/sec Loss 1.2629 Epoch: 11 Global Step: 195000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:15:49,485-Speed 4659.59 samples/sec Loss 1.2661 Epoch: 11 Global Step: 195050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:16:01,339-Speed 4319.31 samples/sec Loss 1.2742 Epoch: 11 Global Step: 195100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:16:12,286-Speed 4677.44 samples/sec Loss 1.2473 Epoch: 11 Global Step: 195150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:16:23,138-Speed 4718.26 samples/sec Loss 1.2673 Epoch: 11 Global Step: 195200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:16:34,053-Speed 4691.09 samples/sec Loss 1.2573 Epoch: 11 Global Step: 195250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:16:45,007-Speed 4674.17 samples/sec Loss 1.2784 Epoch: 11 Global Step: 195300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:16:55,844-Speed 4725.16 samples/sec Loss 1.2433 Epoch: 11 Global Step: 195350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:17:06,808-Speed 4670.01 samples/sec Loss 1.2647 Epoch: 11 Global Step: 195400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:17:17,719-Speed 4692.70 samples/sec Loss 1.2570 Epoch: 11 Global Step: 195450 Fp16 
Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:17:28,800-Speed 4620.84 samples/sec Loss 1.2533 Epoch: 11 Global Step: 195500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:17:39,654-Speed 4717.50 samples/sec Loss 1.2619 Epoch: 11 Global Step: 195550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:17:50,562-Speed 4694.32 samples/sec Loss 1.2513 Epoch: 11 Global Step: 195600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:18:01,512-Speed 4676.05 samples/sec Loss 1.2566 Epoch: 11 Global Step: 195650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:18:12,259-Speed 4764.40 samples/sec Loss 1.2409 Epoch: 11 Global Step: 195700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:18:23,093-Speed 4726.00 samples/sec Loss 1.2669 Epoch: 11 Global Step: 195750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:18:34,007-Speed 4691.81 samples/sec Loss 1.2449 Epoch: 11 Global Step: 195800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:18:44,964-Speed 4672.77 samples/sec Loss 1.2360 Epoch: 11 Global Step: 195850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:18:55,918-Speed 4674.53 samples/sec Loss 1.2579 Epoch: 11 Global Step: 195900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:19:06,781-Speed 4713.85 samples/sec Loss 1.2571 Epoch: 11 Global Step: 195950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:19:17,611-Speed 4728.01 samples/sec Loss 1.2450 Epoch: 11 Global Step: 196000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:19:42,039-[lfw][196000]XNorm: 22.486781 Training: 2021-03-18 05:19:42,039-[lfw][196000]Accuracy-Flip: 0.99800+-0.00287 Training: 2021-03-18 05:19:42,039-[lfw][196000]Accuracy-Highest: 0.99800 Training: 2021-03-18 05:20:10,055-[cfp_fp][196000]XNorm: 20.229124 Training: 2021-03-18 05:20:10,055-[cfp_fp][196000]Accuracy-Flip: 0.98457+-0.00619 Training: 
2021-03-18 05:20:10,055-[cfp_fp][196000]Accuracy-Highest: 0.98457 Training: 2021-03-18 05:20:33,772-[agedb_30][196000]XNorm: 22.396741 Training: 2021-03-18 05:20:33,772-[agedb_30][196000]Accuracy-Flip: 0.97900+-0.00559 Training: 2021-03-18 05:20:33,772-[agedb_30][196000]Accuracy-Highest: 0.98083 Training: 2021-03-18 05:20:44,439-Speed 589.67 samples/sec Loss 1.2609 Epoch: 11 Global Step: 196050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:20:55,307-Speed 4711.45 samples/sec Loss 1.2783 Epoch: 11 Global Step: 196100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:21:06,051-Speed 4765.44 samples/sec Loss 1.2432 Epoch: 11 Global Step: 196150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:21:17,048-Speed 4656.06 samples/sec Loss 1.2565 Epoch: 11 Global Step: 196200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:21:28,109-Speed 4629.41 samples/sec Loss 1.2478 Epoch: 11 Global Step: 196250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:21:39,058-Speed 4676.35 samples/sec Loss 1.2544 Epoch: 11 Global Step: 196300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:21:49,929-Speed 4710.07 samples/sec Loss 1.2167 Epoch: 11 Global Step: 196350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:22:00,917-Speed 4659.98 samples/sec Loss 1.2447 Epoch: 11 Global Step: 196400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:22:11,888-Speed 4666.85 samples/sec Loss 1.2786 Epoch: 11 Global Step: 196450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:22:23,101-Speed 4566.36 samples/sec Loss 1.2547 Epoch: 11 Global Step: 196500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:22:33,876-Speed 4752.07 samples/sec Loss 1.2521 Epoch: 11 Global Step: 196550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:22:44,902-Speed 4643.92 samples/sec Loss 1.2217 Epoch: 11 Global Step: 196600 Fp16 Grad Scale: 
16384 Required: 10 hours Training: 2021-03-18 05:22:56,548-Speed 4396.77 samples/sec Loss 1.2602 Epoch: 11 Global Step: 196650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:23:08,335-Speed 4344.10 samples/sec Loss 1.2634 Epoch: 11 Global Step: 196700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:23:19,118-Speed 4748.37 samples/sec Loss 1.2537 Epoch: 11 Global Step: 196750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:23:30,098-Speed 4663.66 samples/sec Loss 1.2532 Epoch: 11 Global Step: 196800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:23:41,044-Speed 4677.48 samples/sec Loss 1.2765 Epoch: 11 Global Step: 196850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:23:51,840-Speed 4742.80 samples/sec Loss 1.2665 Epoch: 11 Global Step: 196900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:24:02,751-Speed 4693.11 samples/sec Loss 1.2581 Epoch: 11 Global Step: 196950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:24:13,620-Speed 4710.61 samples/sec Loss 1.2604 Epoch: 11 Global Step: 197000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:24:24,613-Speed 4658.04 samples/sec Loss 1.2620 Epoch: 11 Global Step: 197050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:24:35,583-Speed 4667.45 samples/sec Loss 1.2748 Epoch: 11 Global Step: 197100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:24:46,566-Speed 4662.00 samples/sec Loss 1.2474 Epoch: 11 Global Step: 197150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:24:57,567-Speed 4654.10 samples/sec Loss 1.2513 Epoch: 11 Global Step: 197200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:25:08,556-Speed 4659.58 samples/sec Loss 1.2505 Epoch: 11 Global Step: 197250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:25:19,331-Speed 4751.70 samples/sec Loss 1.2550 Epoch: 11 Global Step: 197300 
Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:25:30,210-Speed 4706.64 samples/sec Loss 1.2611 Epoch: 11 Global Step: 197350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:25:41,346-Speed 4597.92 samples/sec Loss 1.2515 Epoch: 11 Global Step: 197400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:25:52,249-Speed 4696.50 samples/sec Loss 1.2551 Epoch: 11 Global Step: 197450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:26:03,359-Speed 4608.50 samples/sec Loss 1.2514 Epoch: 11 Global Step: 197500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:26:14,287-Speed 4685.75 samples/sec Loss 1.2520 Epoch: 11 Global Step: 197550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:26:26,035-Speed 4358.33 samples/sec Loss 1.2483 Epoch: 11 Global Step: 197600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:26:38,443-Speed 4126.76 samples/sec Loss 1.2418 Epoch: 11 Global Step: 197650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:26:50,166-Speed 4367.66 samples/sec Loss 1.2403 Epoch: 11 Global Step: 197700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:27:01,299-Speed 4599.06 samples/sec Loss 1.2502 Epoch: 11 Global Step: 197750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:27:12,829-Speed 4440.62 samples/sec Loss 1.2701 Epoch: 11 Global Step: 197800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:27:23,776-Speed 4677.54 samples/sec Loss 1.2374 Epoch: 11 Global Step: 197850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:27:34,603-Speed 4729.07 samples/sec Loss 1.2617 Epoch: 11 Global Step: 197900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:27:45,470-Speed 4712.18 samples/sec Loss 1.2522 Epoch: 11 Global Step: 197950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:27:57,134-Speed 4389.61 samples/sec Loss 1.2553 Epoch: 11 
Global Step: 198000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:28:21,297-[lfw][198000]XNorm: 22.293106 Training: 2021-03-18 05:28:21,297-[lfw][198000]Accuracy-Flip: 0.99767+-0.00291 Training: 2021-03-18 05:28:21,297-[lfw][198000]Accuracy-Highest: 0.99800 Training: 2021-03-18 05:28:48,872-[cfp_fp][198000]XNorm: 20.147679 Training: 2021-03-18 05:28:48,873-[cfp_fp][198000]Accuracy-Flip: 0.98514+-0.00561 Training: 2021-03-18 05:28:48,873-[cfp_fp][198000]Accuracy-Highest: 0.98514 Training: 2021-03-18 05:29:12,619-[agedb_30][198000]XNorm: 22.117486 Training: 2021-03-18 05:29:12,619-[agedb_30][198000]Accuracy-Flip: 0.98100+-0.00569 Training: 2021-03-18 05:29:12,619-[agedb_30][198000]Accuracy-Highest: 0.98100 Training: 2021-03-18 05:29:23,370-Speed 593.73 samples/sec Loss 1.2349 Epoch: 11 Global Step: 198050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:29:34,349-Speed 4663.45 samples/sec Loss 1.2352 Epoch: 11 Global Step: 198100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:29:45,419-Speed 4625.60 samples/sec Loss 1.2434 Epoch: 11 Global Step: 198150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:29:56,333-Speed 4691.27 samples/sec Loss 1.2517 Epoch: 11 Global Step: 198200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:30:07,254-Speed 4688.52 samples/sec Loss 1.2391 Epoch: 11 Global Step: 198250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:30:17,996-Speed 4766.68 samples/sec Loss 1.2271 Epoch: 11 Global Step: 198300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:30:28,939-Speed 4679.16 samples/sec Loss 1.2553 Epoch: 11 Global Step: 198350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:30:39,991-Speed 4632.77 samples/sec Loss 1.2379 Epoch: 11 Global Step: 198400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:30:50,775-Speed 4748.22 samples/sec Loss 1.2184 Epoch: 11 Global Step: 198450 Fp16 Grad 
Scale: 16384 Required: 10 hours Training: 2021-03-18 05:31:01,838-Speed 4628.40 samples/sec Loss 1.2326 Epoch: 11 Global Step: 198500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:31:12,912-Speed 4623.71 samples/sec Loss 1.2373 Epoch: 11 Global Step: 198550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:31:24,002-Speed 4616.75 samples/sec Loss 1.2471 Epoch: 11 Global Step: 198600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:31:35,164-Speed 4587.32 samples/sec Loss 1.2331 Epoch: 11 Global Step: 198650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:31:45,995-Speed 4727.36 samples/sec Loss 1.2506 Epoch: 11 Global Step: 198700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:31:57,279-Speed 4537.79 samples/sec Loss 1.2341 Epoch: 11 Global Step: 198750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:32:08,082-Speed 4740.20 samples/sec Loss 1.2306 Epoch: 11 Global Step: 198800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:32:19,108-Speed 4643.50 samples/sec Loss 1.2575 Epoch: 11 Global Step: 198850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:32:29,985-Speed 4707.61 samples/sec Loss 1.2523 Epoch: 11 Global Step: 198900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:32:40,976-Speed 4658.46 samples/sec Loss 1.2523 Epoch: 11 Global Step: 198950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:32:51,907-Speed 4684.40 samples/sec Loss 1.2183 Epoch: 11 Global Step: 199000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:33:02,847-Speed 4680.16 samples/sec Loss 1.2096 Epoch: 11 Global Step: 199050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:33:13,859-Speed 4649.79 samples/sec Loss 1.2533 Epoch: 11 Global Step: 199100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:33:24,795-Speed 4682.05 samples/sec Loss 1.2377 Epoch: 11 Global Step: 
199150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:33:35,994-Speed 4572.31 samples/sec Loss 1.2553 Epoch: 11 Global Step: 199200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:33:46,697-Speed 4783.73 samples/sec Loss 1.2264 Epoch: 11 Global Step: 199250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:33:57,637-Speed 4680.31 samples/sec Loss 1.2691 Epoch: 11 Global Step: 199300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:34:08,646-Speed 4651.03 samples/sec Loss 1.2252 Epoch: 11 Global Step: 199350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:34:19,594-Speed 4677.32 samples/sec Loss 1.2455 Epoch: 11 Global Step: 199400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:34:30,520-Speed 4686.25 samples/sec Loss 1.2346 Epoch: 11 Global Step: 199450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:34:42,145-Speed 4404.55 samples/sec Loss 1.2483 Epoch: 11 Global Step: 199500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:34:53,823-Speed 4384.67 samples/sec Loss 1.2434 Epoch: 11 Global Step: 199550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:35:04,831-Speed 4651.42 samples/sec Loss 1.2280 Epoch: 11 Global Step: 199600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:35:15,734-Speed 4695.97 samples/sec Loss 1.2521 Epoch: 11 Global Step: 199650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:35:26,967-Speed 4558.27 samples/sec Loss 1.2142 Epoch: 11 Global Step: 199700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:35:38,056-Speed 4617.84 samples/sec Loss 1.2604 Epoch: 11 Global Step: 199750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:35:49,117-Speed 4629.09 samples/sec Loss 1.2302 Epoch: 11 Global Step: 199800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:35:59,978-Speed 4714.34 samples/sec Loss 1.2596 Epoch: 
11 Global Step: 199850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:36:10,923-Speed 4678.23 samples/sec Loss 1.2169 Epoch: 11 Global Step: 199900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:36:21,991-Speed 4626.19 samples/sec Loss 1.2290 Epoch: 11 Global Step: 199950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:36:32,974-Speed 4661.80 samples/sec Loss 1.2594 Epoch: 11 Global Step: 200000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:36:56,709-[lfw][200000]XNorm: 22.904079 Training: 2021-03-18 05:36:56,709-[lfw][200000]Accuracy-Flip: 0.99783+-0.00308 Training: 2021-03-18 05:36:56,709-[lfw][200000]Accuracy-Highest: 0.99800 Training: 2021-03-18 05:37:24,113-[cfp_fp][200000]XNorm: 20.569515 Training: 2021-03-18 05:37:24,113-[cfp_fp][200000]Accuracy-Flip: 0.98357+-0.00569 Training: 2021-03-18 05:37:24,113-[cfp_fp][200000]Accuracy-Highest: 0.98514 Training: 2021-03-18 05:37:47,973-[agedb_30][200000]XNorm: 22.720827 Training: 2021-03-18 05:37:47,973-[agedb_30][200000]Accuracy-Flip: 0.98050+-0.00641 Training: 2021-03-18 05:37:47,973-[agedb_30][200000]Accuracy-Highest: 0.98100 Training: 2021-03-18 05:37:58,864-Speed 596.12 samples/sec Loss 1.2457 Epoch: 11 Global Step: 200050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:38:09,788-Speed 4687.11 samples/sec Loss 1.2192 Epoch: 11 Global Step: 200100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:38:20,865-Speed 4622.15 samples/sec Loss 1.2339 Epoch: 11 Global Step: 200150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:38:31,899-Speed 4640.86 samples/sec Loss 1.2505 Epoch: 11 Global Step: 200200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:38:42,836-Speed 4681.66 samples/sec Loss 1.2179 Epoch: 11 Global Step: 200250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:39:06,536-Speed 2160.34 samples/sec Loss 1.2312 Epoch: 12 Global Step: 200300 Fp16 Grad 
Scale: 16384 Required: 10 hours Training: 2021-03-18 05:39:18,347-Speed 4335.36 samples/sec Loss 1.1582 Epoch: 12 Global Step: 200350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:39:32,071-Speed 3730.99 samples/sec Loss 1.1449 Epoch: 12 Global Step: 200400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:39:44,694-Speed 4055.99 samples/sec Loss 1.1509 Epoch: 12 Global Step: 200450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:39:56,310-Speed 4408.00 samples/sec Loss 1.1439 Epoch: 12 Global Step: 200500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:40:07,519-Speed 4568.07 samples/sec Loss 1.1511 Epoch: 12 Global Step: 200550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:40:19,593-Speed 4241.11 samples/sec Loss 1.1558 Epoch: 12 Global Step: 200600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:40:30,411-Speed 4733.16 samples/sec Loss 1.1501 Epoch: 12 Global Step: 200650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:40:42,108-Speed 4377.48 samples/sec Loss 1.1459 Epoch: 12 Global Step: 200700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:40:53,269-Speed 4587.65 samples/sec Loss 1.1542 Epoch: 12 Global Step: 200750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:41:04,255-Speed 4660.72 samples/sec Loss 1.1438 Epoch: 12 Global Step: 200800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:41:15,267-Speed 4649.90 samples/sec Loss 1.1457 Epoch: 12 Global Step: 200850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:41:26,201-Speed 4682.96 samples/sec Loss 1.1447 Epoch: 12 Global Step: 200900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:41:37,259-Speed 4630.28 samples/sec Loss 1.1542 Epoch: 12 Global Step: 200950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:41:49,370-Speed 4227.56 samples/sec Loss 1.1465 Epoch: 12 Global Step: 
201000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:42:00,463-Speed 4615.97 samples/sec Loss 1.1459 Epoch: 12 Global Step: 201050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:42:11,348-Speed 4704.06 samples/sec Loss 1.1302 Epoch: 12 Global Step: 201100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:42:22,399-Speed 4633.12 samples/sec Loss 1.1311 Epoch: 12 Global Step: 201150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-18 05:42:33,354-Speed 4673.96 samples/sec Loss 1.1719 Epoch: 12 Global Step: 201200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:42:44,220-Speed 4712.46 samples/sec Loss 1.1415 Epoch: 12 Global Step: 201250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:42:55,458-Speed 4556.10 samples/sec Loss 1.1444 Epoch: 12 Global Step: 201300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:43:06,607-Speed 4592.79 samples/sec Loss 1.1480 Epoch: 12 Global Step: 201350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:43:17,584-Speed 4664.33 samples/sec Loss 1.1580 Epoch: 12 Global Step: 201400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:43:28,557-Speed 4666.54 samples/sec Loss 1.1542 Epoch: 12 Global Step: 201450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:43:39,562-Speed 4652.61 samples/sec Loss 1.1474 Epoch: 12 Global Step: 201500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:43:50,367-Speed 4738.74 samples/sec Loss 1.1477 Epoch: 12 Global Step: 201550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:44:01,167-Speed 4741.18 samples/sec Loss 1.1299 Epoch: 12 Global Step: 201600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:44:12,438-Speed 4542.89 samples/sec Loss 1.1480 Epoch: 12 Global Step: 201650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:44:23,462-Speed 4644.60 samples/sec Loss 1.1383 Epoch: 12 Global 
Step: 201700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:44:34,561-Speed 4613.22 samples/sec Loss 1.1627 Epoch: 12 Global Step: 201750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:44:45,628-Speed 4626.73 samples/sec Loss 1.1388 Epoch: 12 Global Step: 201800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:44:56,531-Speed 4696.18 samples/sec Loss 1.1533 Epoch: 12 Global Step: 201850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:45:07,415-Speed 4704.61 samples/sec Loss 1.1353 Epoch: 12 Global Step: 201900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:45:18,433-Speed 4646.88 samples/sec Loss 1.1472 Epoch: 12 Global Step: 201950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:45:29,302-Speed 4711.20 samples/sec Loss 1.1375 Epoch: 12 Global Step: 202000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:45:53,509-[lfw][202000]XNorm: 22.500783 Training: 2021-03-18 05:45:53,509-[lfw][202000]Accuracy-Flip: 0.99783+-0.00279 Training: 2021-03-18 05:45:53,509-[lfw][202000]Accuracy-Highest: 0.99800 Training: 2021-03-18 05:46:21,037-[cfp_fp][202000]XNorm: 20.506734 Training: 2021-03-18 05:46:21,037-[cfp_fp][202000]Accuracy-Flip: 0.98471+-0.00599 Training: 2021-03-18 05:46:21,037-[cfp_fp][202000]Accuracy-Highest: 0.98514 Training: 2021-03-18 05:46:44,749-[agedb_30][202000]XNorm: 22.445482 Training: 2021-03-18 05:46:44,750-[agedb_30][202000]Accuracy-Flip: 0.98117+-0.00703 Training: 2021-03-18 05:46:44,750-[agedb_30][202000]Accuracy-Highest: 0.98117 Training: 2021-03-18 05:46:55,766-Speed 592.16 samples/sec Loss 1.1624 Epoch: 12 Global Step: 202050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:47:06,766-Speed 4654.49 samples/sec Loss 1.1585 Epoch: 12 Global Step: 202100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:47:17,751-Speed 4661.27 samples/sec Loss 1.1652 Epoch: 12 Global Step: 202150 Fp16 Grad Scale: 16384 
Required: 9 hours Training: 2021-03-18 05:47:28,953-Speed 4570.86 samples/sec Loss 1.1727 Epoch: 12 Global Step: 202200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:47:40,073-Speed 4604.84 samples/sec Loss 1.1525 Epoch: 12 Global Step: 202250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:47:51,699-Speed 4404.04 samples/sec Loss 1.1431 Epoch: 12 Global Step: 202300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:48:03,391-Speed 4379.58 samples/sec Loss 1.1310 Epoch: 12 Global Step: 202350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:48:14,261-Speed 4710.39 samples/sec Loss 1.1535 Epoch: 12 Global Step: 202400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:48:25,268-Speed 4651.84 samples/sec Loss 1.1543 Epoch: 12 Global Step: 202450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:48:36,027-Speed 4759.24 samples/sec Loss 1.1434 Epoch: 12 Global Step: 202500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:48:47,154-Speed 4601.40 samples/sec Loss 1.1394 Epoch: 12 Global Step: 202550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:48:57,900-Speed 4764.73 samples/sec Loss 1.1691 Epoch: 12 Global Step: 202600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:49:08,984-Speed 4619.76 samples/sec Loss 1.1390 Epoch: 12 Global Step: 202650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:49:20,031-Speed 4635.02 samples/sec Loss 1.1852 Epoch: 12 Global Step: 202700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:49:30,770-Speed 4768.10 samples/sec Loss 1.1571 Epoch: 12 Global Step: 202750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:49:41,757-Speed 4659.96 samples/sec Loss 1.1595 Epoch: 12 Global Step: 202800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:49:52,714-Speed 4673.19 samples/sec Loss 1.1374 Epoch: 12 Global Step: 202850 Fp16 Grad Scale: 
16384 Required: 9 hours Training: 2021-03-18 05:50:03,524-Speed 4736.90 samples/sec Loss 1.1384 Epoch: 12 Global Step: 202900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:50:14,451-Speed 4686.05 samples/sec Loss 1.1633 Epoch: 12 Global Step: 202950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:50:25,209-Speed 4759.51 samples/sec Loss 1.1624 Epoch: 12 Global Step: 203000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:50:36,401-Speed 4574.85 samples/sec Loss 1.1518 Epoch: 12 Global Step: 203050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:50:47,343-Speed 4679.39 samples/sec Loss 1.1415 Epoch: 12 Global Step: 203100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:50:58,483-Speed 4596.26 samples/sec Loss 1.1449 Epoch: 12 Global Step: 203150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:51:10,126-Speed 4398.07 samples/sec Loss 1.1506 Epoch: 12 Global Step: 203200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:51:21,837-Speed 4372.11 samples/sec Loss 1.1520 Epoch: 12 Global Step: 203250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:51:33,907-Speed 4242.03 samples/sec Loss 1.1636 Epoch: 12 Global Step: 203300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:51:44,904-Speed 4656.10 samples/sec Loss 1.1413 Epoch: 12 Global Step: 203350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:51:55,563-Speed 4803.78 samples/sec Loss 1.1692 Epoch: 12 Global Step: 203400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:52:06,511-Speed 4676.73 samples/sec Loss 1.1578 Epoch: 12 Global Step: 203450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:52:18,372-Speed 4317.12 samples/sec Loss 1.1445 Epoch: 12 Global Step: 203500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:52:29,920-Speed 4433.63 samples/sec Loss 1.1491 Epoch: 12 Global Step: 203550 Fp16 Grad 
Scale: 16384 Required: 9 hours Training: 2021-03-18 05:52:40,610-Speed 4789.92 samples/sec Loss 1.1479 Epoch: 12 Global Step: 203600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:52:51,457-Speed 4720.60 samples/sec Loss 1.1614 Epoch: 12 Global Step: 203650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:53:02,451-Speed 4657.11 samples/sec Loss 1.1654 Epoch: 12 Global Step: 203700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:53:13,263-Speed 4735.82 samples/sec Loss 1.1523 Epoch: 12 Global Step: 203750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:53:24,429-Speed 4585.87 samples/sec Loss 1.1343 Epoch: 12 Global Step: 203800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:53:36,323-Speed 4305.00 samples/sec Loss 1.1554 Epoch: 12 Global Step: 203850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:53:47,004-Speed 4793.68 samples/sec Loss 1.1577 Epoch: 12 Global Step: 203900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:53:57,926-Speed 4687.91 samples/sec Loss 1.1434 Epoch: 12 Global Step: 203950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:54:08,983-Speed 4630.95 samples/sec Loss 1.1347 Epoch: 12 Global Step: 204000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:54:32,759-[lfw][204000]XNorm: 22.620661 Training: 2021-03-18 05:54:32,760-[lfw][204000]Accuracy-Flip: 0.99767+-0.00291 Training: 2021-03-18 05:54:32,760-[lfw][204000]Accuracy-Highest: 0.99800 Training: 2021-03-18 05:55:00,356-[cfp_fp][204000]XNorm: 20.555154 Training: 2021-03-18 05:55:00,357-[cfp_fp][204000]Accuracy-Flip: 0.98457+-0.00571 Training: 2021-03-18 05:55:00,357-[cfp_fp][204000]Accuracy-Highest: 0.98514 Training: 2021-03-18 05:55:24,221-[agedb_30][204000]XNorm: 22.456696 Training: 2021-03-18 05:55:24,221-[agedb_30][204000]Accuracy-Flip: 0.98067+-0.00583 Training: 2021-03-18 05:55:24,221-[agedb_30][204000]Accuracy-Highest: 0.98117 Training: 
2021-03-18 05:55:35,272-Speed 593.36 samples/sec Loss 1.1508 Epoch: 12 Global Step: 204050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:55:46,182-Speed 4693.28 samples/sec Loss 1.1520 Epoch: 12 Global Step: 204100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:55:57,088-Speed 4694.83 samples/sec Loss 1.1813 Epoch: 12 Global Step: 204150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:56:08,170-Speed 4620.58 samples/sec Loss 1.1458 Epoch: 12 Global Step: 204200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:56:19,075-Speed 4695.18 samples/sec Loss 1.1636 Epoch: 12 Global Step: 204250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:56:29,914-Speed 4723.99 samples/sec Loss 1.1562 Epoch: 12 Global Step: 204300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:56:40,972-Speed 4630.18 samples/sec Loss 1.1611 Epoch: 12 Global Step: 204350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:56:52,113-Speed 4595.87 samples/sec Loss 1.1253 Epoch: 12 Global Step: 204400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:57:03,040-Speed 4686.18 samples/sec Loss 1.1437 Epoch: 12 Global Step: 204450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:57:13,945-Speed 4695.27 samples/sec Loss 1.1522 Epoch: 12 Global Step: 204500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:57:25,292-Speed 4512.42 samples/sec Loss 1.1732 Epoch: 12 Global Step: 204550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:57:36,159-Speed 4711.84 samples/sec Loss 1.1571 Epoch: 12 Global Step: 204600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:57:47,138-Speed 4663.76 samples/sec Loss 1.1403 Epoch: 12 Global Step: 204650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:57:58,112-Speed 4665.91 samples/sec Loss 1.1664 Epoch: 12 Global Step: 204700 Fp16 Grad Scale: 16384 Required: 9 hours 
Training: 2021-03-18 05:58:09,221-Speed 4609.26 samples/sec Loss 1.1629 Epoch: 12 Global Step: 204750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:58:20,163-Speed 4679.15 samples/sec Loss 1.1285 Epoch: 12 Global Step: 204800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:58:31,252-Speed 4617.86 samples/sec Loss 1.1436 Epoch: 12 Global Step: 204850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:58:42,049-Speed 4742.17 samples/sec Loss 1.1401 Epoch: 12 Global Step: 204900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:58:52,894-Speed 4721.55 samples/sec Loss 1.1632 Epoch: 12 Global Step: 204950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:59:03,776-Speed 4704.84 samples/sec Loss 1.1481 Epoch: 12 Global Step: 205000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:59:14,807-Speed 4641.88 samples/sec Loss 1.1528 Epoch: 12 Global Step: 205050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:59:25,915-Speed 4609.52 samples/sec Loss 1.1680 Epoch: 12 Global Step: 205100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:59:37,478-Speed 4428.18 samples/sec Loss 1.1672 Epoch: 12 Global Step: 205150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:59:48,377-Speed 4698.17 samples/sec Loss 1.1557 Epoch: 12 Global Step: 205200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 05:59:59,944-Speed 4426.43 samples/sec Loss 1.1705 Epoch: 12 Global Step: 205250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:00:10,829-Speed 4703.83 samples/sec Loss 1.1492 Epoch: 12 Global Step: 205300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:00:21,610-Speed 4749.35 samples/sec Loss 1.1497 Epoch: 12 Global Step: 205350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:00:32,367-Speed 4760.08 samples/sec Loss 1.1360 Epoch: 12 Global Step: 205400 Fp16 Grad Scale: 16384 Required: 9 
hours Training: 2021-03-18 06:00:43,369-Speed 4653.88 samples/sec Loss 1.1525 Epoch: 12 Global Step: 205450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:00:54,367-Speed 4655.86 samples/sec Loss 1.1293 Epoch: 12 Global Step: 205500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:01:05,273-Speed 4694.67 samples/sec Loss 1.1506 Epoch: 12 Global Step: 205550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:01:16,122-Speed 4719.65 samples/sec Loss 1.1595 Epoch: 12 Global Step: 205600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:01:27,054-Speed 4683.86 samples/sec Loss 1.1560 Epoch: 12 Global Step: 205650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:01:38,252-Speed 4572.21 samples/sec Loss 1.1639 Epoch: 12 Global Step: 205700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:01:49,183-Speed 4684.34 samples/sec Loss 1.1162 Epoch: 12 Global Step: 205750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:02:00,121-Speed 4681.07 samples/sec Loss 1.1429 Epoch: 12 Global Step: 205800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:02:11,143-Speed 4645.58 samples/sec Loss 1.1621 Epoch: 12 Global Step: 205850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:02:22,243-Speed 4613.16 samples/sec Loss 1.1500 Epoch: 12 Global Step: 205900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:02:33,167-Speed 4686.86 samples/sec Loss 1.1596 Epoch: 12 Global Step: 205950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:02:44,913-Speed 4359.32 samples/sec Loss 1.1292 Epoch: 12 Global Step: 206000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:03:09,160-[lfw][206000]XNorm: 22.859314 Training: 2021-03-18 06:03:09,160-[lfw][206000]Accuracy-Flip: 0.99767+-0.00291 Training: 2021-03-18 06:03:09,160-[lfw][206000]Accuracy-Highest: 0.99800 Training: 2021-03-18 06:03:37,013-[cfp_fp][206000]XNorm: 
20.686961 Training: 2021-03-18 06:03:37,013-[cfp_fp][206000]Accuracy-Flip: 0.98314+-0.00560 Training: 2021-03-18 06:03:37,014-[cfp_fp][206000]Accuracy-Highest: 0.98514 Training: 2021-03-18 06:04:00,884-[agedb_30][206000]XNorm: 22.662190 Training: 2021-03-18 06:04:00,885-[agedb_30][206000]Accuracy-Flip: 0.98033+-0.00618 Training: 2021-03-18 06:04:00,885-[agedb_30][206000]Accuracy-Highest: 0.98117 Training: 2021-03-18 06:04:11,673-Speed 590.14 samples/sec Loss 1.1449 Epoch: 12 Global Step: 206050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:04:23,632-Speed 4281.31 samples/sec Loss 1.1404 Epoch: 12 Global Step: 206100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:04:35,696-Speed 4244.25 samples/sec Loss 1.1425 Epoch: 12 Global Step: 206150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:04:46,673-Speed 4664.52 samples/sec Loss 1.1632 Epoch: 12 Global Step: 206200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:04:57,832-Speed 4588.24 samples/sec Loss 1.1407 Epoch: 12 Global Step: 206250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:05:08,844-Speed 4649.88 samples/sec Loss 1.1265 Epoch: 12 Global Step: 206300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:05:19,793-Speed 4676.54 samples/sec Loss 1.1520 Epoch: 12 Global Step: 206350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:05:30,573-Speed 4749.48 samples/sec Loss 1.1512 Epoch: 12 Global Step: 206400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:05:43,120-Speed 4080.89 samples/sec Loss 1.1399 Epoch: 12 Global Step: 206450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:05:54,158-Speed 4638.98 samples/sec Loss 1.1337 Epoch: 12 Global Step: 206500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:06:05,108-Speed 4676.16 samples/sec Loss 1.1459 Epoch: 12 Global Step: 206550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 
06:06:15,973-Speed 4712.47 samples/sec Loss 1.1633 Epoch: 12 Global Step: 206600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:06:26,926-Speed 4674.84 samples/sec Loss 1.1463 Epoch: 12 Global Step: 206650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:06:38,115-Speed 4576.17 samples/sec Loss 1.1586 Epoch: 12 Global Step: 206700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:06:49,809-Speed 4378.51 samples/sec Loss 1.1736 Epoch: 12 Global Step: 206750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:07:00,584-Speed 4752.24 samples/sec Loss 1.1402 Epoch: 12 Global Step: 206800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:07:11,399-Speed 4734.31 samples/sec Loss 1.1387 Epoch: 12 Global Step: 206850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:07:22,488-Speed 4617.56 samples/sec Loss 1.1501 Epoch: 12 Global Step: 206900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:07:33,335-Speed 4720.54 samples/sec Loss 1.1411 Epoch: 12 Global Step: 206950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:07:44,092-Speed 4760.29 samples/sec Loss 1.1536 Epoch: 12 Global Step: 207000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:07:54,837-Speed 4764.97 samples/sec Loss 1.1687 Epoch: 12 Global Step: 207050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:08:06,031-Speed 4574.44 samples/sec Loss 1.1450 Epoch: 12 Global Step: 207100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:08:17,074-Speed 4636.57 samples/sec Loss 1.1265 Epoch: 12 Global Step: 207150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:08:28,049-Speed 4665.31 samples/sec Loss 1.1435 Epoch: 12 Global Step: 207200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:08:38,931-Speed 4705.46 samples/sec Loss 1.1331 Epoch: 12 Global Step: 207250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 
2021-03-18 06:08:49,718-Speed 4746.54 samples/sec Loss 1.1576 Epoch: 12 Global Step: 207300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:09:00,566-Speed 4720.20 samples/sec Loss 1.1448 Epoch: 12 Global Step: 207350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:09:11,577-Speed 4650.07 samples/sec Loss 1.1484 Epoch: 12 Global Step: 207400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:09:22,422-Speed 4721.49 samples/sec Loss 1.1331 Epoch: 12 Global Step: 207450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:09:33,388-Speed 4669.43 samples/sec Loss 1.1350 Epoch: 12 Global Step: 207500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:09:44,360-Speed 4666.59 samples/sec Loss 1.1440 Epoch: 12 Global Step: 207550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:09:55,352-Speed 4658.23 samples/sec Loss 1.1597 Epoch: 12 Global Step: 207600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:10:06,213-Speed 4714.20 samples/sec Loss 1.1528 Epoch: 12 Global Step: 207650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:10:17,462-Speed 4551.79 samples/sec Loss 1.1642 Epoch: 12 Global Step: 207700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:10:28,572-Speed 4608.68 samples/sec Loss 1.1499 Epoch: 12 Global Step: 207750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:10:39,475-Speed 4696.25 samples/sec Loss 1.1548 Epoch: 12 Global Step: 207800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:10:50,513-Speed 4638.82 samples/sec Loss 1.1409 Epoch: 12 Global Step: 207850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:11:01,618-Speed 4610.73 samples/sec Loss 1.1319 Epoch: 12 Global Step: 207900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:11:13,229-Speed 4409.70 samples/sec Loss 1.1585 Epoch: 12 Global Step: 207950 Fp16 Grad Scale: 16384 Required: 9 hours 
Training: 2021-03-18 06:11:23,864-Speed 4814.50 samples/sec Loss 1.1320 Epoch: 12 Global Step: 208000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:11:47,762-[lfw][208000]XNorm: 22.347355 Training: 2021-03-18 06:11:47,762-[lfw][208000]Accuracy-Flip: 0.99783+-0.00289 Training: 2021-03-18 06:11:47,762-[lfw][208000]Accuracy-Highest: 0.99800 Training: 2021-03-18 06:12:15,287-[cfp_fp][208000]XNorm: 20.413324 Training: 2021-03-18 06:12:15,287-[cfp_fp][208000]Accuracy-Flip: 0.98514+-0.00569 Training: 2021-03-18 06:12:15,287-[cfp_fp][208000]Accuracy-Highest: 0.98514 Training: 2021-03-18 06:12:39,095-[agedb_30][208000]XNorm: 22.220976 Training: 2021-03-18 06:12:39,096-[agedb_30][208000]Accuracy-Flip: 0.98183+-0.00575 Training: 2021-03-18 06:12:39,096-[agedb_30][208000]Accuracy-Highest: 0.98183 Training: 2021-03-18 06:12:50,061-Speed 593.99 samples/sec Loss 1.1422 Epoch: 12 Global Step: 208050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:13:01,642-Speed 4421.23 samples/sec Loss 1.1461 Epoch: 12 Global Step: 208100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:13:12,420-Speed 4750.66 samples/sec Loss 1.1456 Epoch: 12 Global Step: 208150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:13:23,413-Speed 4657.81 samples/sec Loss 1.1374 Epoch: 12 Global Step: 208200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:13:34,109-Speed 4787.14 samples/sec Loss 1.1444 Epoch: 12 Global Step: 208250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:13:45,160-Speed 4633.43 samples/sec Loss 1.1411 Epoch: 12 Global Step: 208300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:13:56,221-Speed 4629.13 samples/sec Loss 1.1515 Epoch: 12 Global Step: 208350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:14:07,258-Speed 4638.93 samples/sec Loss 1.1563 Epoch: 12 Global Step: 208400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:14:18,207-Speed 
4676.81 samples/sec Loss 1.1410 Epoch: 12 Global Step: 208450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:14:29,383-Speed 4581.72 samples/sec Loss 1.1606 Epoch: 12 Global Step: 208500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:14:40,109-Speed 4773.59 samples/sec Loss 1.1484 Epoch: 12 Global Step: 208550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:14:51,085-Speed 4665.29 samples/sec Loss 1.1606 Epoch: 12 Global Step: 208600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:15:01,994-Speed 4693.26 samples/sec Loss 1.1328 Epoch: 12 Global Step: 208650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:15:12,972-Speed 4664.52 samples/sec Loss 1.1596 Epoch: 12 Global Step: 208700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:15:23,821-Speed 4719.58 samples/sec Loss 1.1314 Epoch: 12 Global Step: 208750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:15:35,115-Speed 4533.36 samples/sec Loss 1.1268 Epoch: 12 Global Step: 208800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:15:46,631-Speed 4446.19 samples/sec Loss 1.1553 Epoch: 12 Global Step: 208850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:15:57,507-Speed 4708.00 samples/sec Loss 1.1568 Epoch: 12 Global Step: 208900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:16:09,152-Speed 4397.11 samples/sec Loss 1.1519 Epoch: 12 Global Step: 208950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:16:20,156-Speed 4652.95 samples/sec Loss 1.1371 Epoch: 12 Global Step: 209000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:16:31,801-Speed 4397.28 samples/sec Loss 1.1421 Epoch: 12 Global Step: 209050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:16:42,560-Speed 4759.01 samples/sec Loss 1.1382 Epoch: 12 Global Step: 209100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 
06:16:53,451-Speed 4701.38 samples/sec Loss 1.1331 Epoch: 12 Global Step: 209150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:17:04,486-Speed 4640.10 samples/sec Loss 1.1451 Epoch: 12 Global Step: 209200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:17:15,966-Speed 4460.45 samples/sec Loss 1.1394 Epoch: 12 Global Step: 209250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:17:26,653-Speed 4791.08 samples/sec Loss 1.1410 Epoch: 12 Global Step: 209300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:17:38,274-Speed 4406.24 samples/sec Loss 1.1400 Epoch: 12 Global Step: 209350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:17:49,030-Speed 4760.59 samples/sec Loss 1.1483 Epoch: 12 Global Step: 209400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:18:00,212-Speed 4578.77 samples/sec Loss 1.1502 Epoch: 12 Global Step: 209450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:18:11,270-Speed 4630.41 samples/sec Loss 1.1329 Epoch: 12 Global Step: 209500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:18:22,281-Speed 4650.32 samples/sec Loss 1.1383 Epoch: 12 Global Step: 209550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:18:33,178-Speed 4698.84 samples/sec Loss 1.1602 Epoch: 12 Global Step: 209600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:18:44,025-Speed 4720.64 samples/sec Loss 1.1469 Epoch: 12 Global Step: 209650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:18:55,865-Speed 4324.46 samples/sec Loss 1.1446 Epoch: 12 Global Step: 209700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:19:06,676-Speed 4735.98 samples/sec Loss 1.1289 Epoch: 12 Global Step: 209750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:19:17,657-Speed 4663.14 samples/sec Loss 1.1543 Epoch: 12 Global Step: 209800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 
2021-03-18 06:19:28,512-Speed 4716.96 samples/sec Loss 1.1467 Epoch: 12 Global Step: 209850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:19:39,326-Speed 4734.95 samples/sec Loss 1.1443 Epoch: 12 Global Step: 209900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:19:50,308-Speed 4662.83 samples/sec Loss 1.1393 Epoch: 12 Global Step: 209950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:20:01,221-Speed 4691.83 samples/sec Loss 1.1470 Epoch: 12 Global Step: 210000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:20:25,059-[lfw][210000]XNorm: 22.584214 Training: 2021-03-18 06:20:25,059-[lfw][210000]Accuracy-Flip: 0.99767+-0.00281 Training: 2021-03-18 06:20:25,059-[lfw][210000]Accuracy-Highest: 0.99800 Training: 2021-03-18 06:20:52,630-[cfp_fp][210000]XNorm: 20.679225 Training: 2021-03-18 06:20:52,631-[cfp_fp][210000]Accuracy-Flip: 0.98543+-0.00446 Training: 2021-03-18 06:20:52,631-[cfp_fp][210000]Accuracy-Highest: 0.98543 Training: 2021-03-18 06:21:16,375-[agedb_30][210000]XNorm: 22.587242 Training: 2021-03-18 06:21:16,375-[agedb_30][210000]Accuracy-Flip: 0.98183+-0.00673 Training: 2021-03-18 06:21:16,375-[agedb_30][210000]Accuracy-Highest: 0.98183 Training: 2021-03-18 06:21:27,262-Speed 595.07 samples/sec Loss 1.1386 Epoch: 12 Global Step: 210050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:21:38,292-Speed 4642.24 samples/sec Loss 1.1450 Epoch: 12 Global Step: 210100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:21:49,059-Speed 4755.27 samples/sec Loss 1.1356 Epoch: 12 Global Step: 210150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:22:00,095-Speed 4639.52 samples/sec Loss 1.1552 Epoch: 12 Global Step: 210200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:22:10,998-Speed 4696.48 samples/sec Loss 1.1326 Epoch: 12 Global Step: 210250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:22:21,883-Speed 4703.82 
samples/sec Loss 1.1509 Epoch: 12 Global Step: 210300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:22:32,932-Speed 4634.25 samples/sec Loss 1.1522 Epoch: 12 Global Step: 210350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:22:43,881-Speed 4676.80 samples/sec Loss 1.1253 Epoch: 12 Global Step: 210400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:22:54,732-Speed 4718.78 samples/sec Loss 1.1323 Epoch: 12 Global Step: 210450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:23:05,731-Speed 4655.03 samples/sec Loss 1.1483 Epoch: 12 Global Step: 210500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:23:16,669-Speed 4681.30 samples/sec Loss 1.1610 Epoch: 12 Global Step: 210550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:23:27,571-Speed 4696.71 samples/sec Loss 1.1547 Epoch: 12 Global Step: 210600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:23:38,590-Speed 4646.63 samples/sec Loss 1.1258 Epoch: 12 Global Step: 210650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:23:50,339-Speed 4357.97 samples/sec Loss 1.1484 Epoch: 12 Global Step: 210700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:24:01,390-Speed 4633.55 samples/sec Loss 1.1377 Epoch: 12 Global Step: 210750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:24:12,280-Speed 4701.54 samples/sec Loss 1.1358 Epoch: 12 Global Step: 210800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:24:23,209-Speed 4685.25 samples/sec Loss 1.1633 Epoch: 12 Global Step: 210850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:24:34,964-Speed 4355.92 samples/sec Loss 1.1474 Epoch: 12 Global Step: 210900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:24:45,928-Speed 4670.05 samples/sec Loss 1.1579 Epoch: 12 Global Step: 210950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:24:56,817-Speed 
4702.42 samples/sec Loss 1.1561 Epoch: 12 Global Step: 211000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:25:07,728-Speed 4692.82 samples/sec Loss 1.1374 Epoch: 12 Global Step: 211050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:25:18,769-Speed 4637.29 samples/sec Loss 1.1453 Epoch: 12 Global Step: 211100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:25:29,670-Speed 4697.23 samples/sec Loss 1.1293 Epoch: 12 Global Step: 211150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:25:40,440-Speed 4754.17 samples/sec Loss 1.1214 Epoch: 12 Global Step: 211200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:25:51,283-Speed 4722.36 samples/sec Loss 1.1347 Epoch: 12 Global Step: 211250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:26:02,310-Speed 4643.25 samples/sec Loss 1.1241 Epoch: 12 Global Step: 211300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:26:13,293-Speed 4662.12 samples/sec Loss 1.1385 Epoch: 12 Global Step: 211350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:26:24,279-Speed 4660.67 samples/sec Loss 1.1519 Epoch: 12 Global Step: 211400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:26:35,244-Speed 4669.69 samples/sec Loss 1.1469 Epoch: 12 Global Step: 211450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:26:45,966-Speed 4775.58 samples/sec Loss 1.1557 Epoch: 12 Global Step: 211500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:26:56,891-Speed 4686.55 samples/sec Loss 1.1480 Epoch: 12 Global Step: 211550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:27:08,840-Speed 4285.24 samples/sec Loss 1.1185 Epoch: 12 Global Step: 211600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:27:19,703-Speed 4713.55 samples/sec Loss 1.1489 Epoch: 12 Global Step: 211650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 
06:27:30,603-Speed 4697.64 samples/sec Loss 1.1426 Epoch: 12 Global Step: 211700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:27:41,536-Speed 4683.51 samples/sec Loss 1.1465 Epoch: 12 Global Step: 211750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:27:53,204-Speed 4388.32 samples/sec Loss 1.1294 Epoch: 12 Global Step: 211800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:28:03,982-Speed 4750.62 samples/sec Loss 1.1475 Epoch: 12 Global Step: 211850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:28:15,912-Speed 4292.06 samples/sec Loss 1.1457 Epoch: 12 Global Step: 211900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:28:26,668-Speed 4760.59 samples/sec Loss 1.1370 Epoch: 12 Global Step: 211950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:28:37,463-Speed 4743.02 samples/sec Loss 1.1550 Epoch: 12 Global Step: 212000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:29:01,250-[lfw][212000]XNorm: 22.515858 Training: 2021-03-18 06:29:01,250-[lfw][212000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 06:29:01,251-[lfw][212000]Accuracy-Highest: 0.99800 Training: 2021-03-18 06:29:28,792-[cfp_fp][212000]XNorm: 20.518523 Training: 2021-03-18 06:29:28,792-[cfp_fp][212000]Accuracy-Flip: 0.98571+-0.00561 Training: 2021-03-18 06:29:28,792-[cfp_fp][212000]Accuracy-Highest: 0.98571 Training: 2021-03-18 06:29:52,486-[agedb_30][212000]XNorm: 22.564163 Training: 2021-03-18 06:29:52,486-[agedb_30][212000]Accuracy-Flip: 0.98100+-0.00624 Training: 2021-03-18 06:29:52,486-[agedb_30][212000]Accuracy-Highest: 0.98183 Training: 2021-03-18 06:30:03,349-Speed 596.15 samples/sec Loss 1.1532 Epoch: 12 Global Step: 212050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:30:14,274-Speed 4686.64 samples/sec Loss 1.1486 Epoch: 12 Global Step: 212100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:30:26,122-Speed 4321.58 samples/sec 
Loss 1.1634 Epoch: 12 Global Step: 212150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:30:37,344-Speed 4563.07 samples/sec Loss 1.1453 Epoch: 12 Global Step: 212200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:30:49,148-Speed 4337.88 samples/sec Loss 1.1560 Epoch: 12 Global Step: 212250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:31:00,253-Speed 4610.86 samples/sec Loss 1.1262 Epoch: 12 Global Step: 212300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:31:11,113-Speed 4714.79 samples/sec Loss 1.1486 Epoch: 12 Global Step: 212350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:31:21,865-Speed 4762.21 samples/sec Loss 1.1546 Epoch: 12 Global Step: 212400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:31:32,889-Speed 4644.67 samples/sec Loss 1.1479 Epoch: 12 Global Step: 212450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:31:43,989-Speed 4612.82 samples/sec Loss 1.1389 Epoch: 12 Global Step: 212500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:31:54,956-Speed 4668.66 samples/sec Loss 1.1500 Epoch: 12 Global Step: 212550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:32:06,726-Speed 4350.51 samples/sec Loss 1.1666 Epoch: 12 Global Step: 212600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:32:17,717-Speed 4658.44 samples/sec Loss 1.1246 Epoch: 12 Global Step: 212650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:32:28,636-Speed 4689.55 samples/sec Loss 1.1411 Epoch: 12 Global Step: 212700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:32:39,533-Speed 4698.88 samples/sec Loss 1.1691 Epoch: 12 Global Step: 212750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:32:50,466-Speed 4683.14 samples/sec Loss 1.1434 Epoch: 12 Global Step: 212800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:33:01,538-Speed 4624.55 
samples/sec Loss 1.1276 Epoch: 12 Global Step: 212850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:33:12,751-Speed 4566.68 samples/sec Loss 1.1381 Epoch: 12 Global Step: 212900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:33:24,004-Speed 4549.87 samples/sec Loss 1.1488 Epoch: 12 Global Step: 212950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:33:35,004-Speed 4655.13 samples/sec Loss 1.1479 Epoch: 12 Global Step: 213000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:33:45,914-Speed 4693.01 samples/sec Loss 1.1373 Epoch: 12 Global Step: 213050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:33:56,961-Speed 4635.30 samples/sec Loss 1.1537 Epoch: 12 Global Step: 213100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:34:07,811-Speed 4718.88 samples/sec Loss 1.1434 Epoch: 12 Global Step: 213150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:34:18,717-Speed 4695.10 samples/sec Loss 1.1404 Epoch: 12 Global Step: 213200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:34:29,709-Speed 4658.31 samples/sec Loss 1.1386 Epoch: 12 Global Step: 213250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:34:40,694-Speed 4661.05 samples/sec Loss 1.1229 Epoch: 12 Global Step: 213300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:34:51,650-Speed 4673.52 samples/sec Loss 1.1285 Epoch: 12 Global Step: 213350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:35:02,662-Speed 4649.69 samples/sec Loss 1.1069 Epoch: 12 Global Step: 213400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:35:13,630-Speed 4668.71 samples/sec Loss 1.1351 Epoch: 12 Global Step: 213450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:35:25,361-Speed 4364.72 samples/sec Loss 1.1558 Epoch: 12 Global Step: 213500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:35:35,834-Speed 
4888.99 samples/sec Loss 1.1200 Epoch: 12 Global Step: 213550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:35:46,687-Speed 4717.92 samples/sec Loss 1.1581 Epoch: 12 Global Step: 213600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:35:57,825-Speed 4597.08 samples/sec Loss 1.1375 Epoch: 12 Global Step: 213650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:36:09,560-Speed 4363.05 samples/sec Loss 1.1273 Epoch: 12 Global Step: 213700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:36:20,706-Speed 4594.02 samples/sec Loss 1.1266 Epoch: 12 Global Step: 213750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:36:31,599-Speed 4700.47 samples/sec Loss 1.1453 Epoch: 12 Global Step: 213800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:36:42,575-Speed 4665.14 samples/sec Loss 1.1416 Epoch: 12 Global Step: 213850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:36:53,562-Speed 4660.27 samples/sec Loss 1.1391 Epoch: 12 Global Step: 213900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:37:04,480-Speed 4689.86 samples/sec Loss 1.1346 Epoch: 12 Global Step: 213950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:37:15,297-Speed 4733.66 samples/sec Loss 1.1485 Epoch: 12 Global Step: 214000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:37:39,409-[lfw][214000]XNorm: 22.329682 Training: 2021-03-18 06:37:39,410-[lfw][214000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 06:37:39,410-[lfw][214000]Accuracy-Highest: 0.99800 Training: 2021-03-18 06:38:06,999-[cfp_fp][214000]XNorm: 20.585716 Training: 2021-03-18 06:38:06,999-[cfp_fp][214000]Accuracy-Flip: 0.98414+-0.00650 Training: 2021-03-18 06:38:06,999-[cfp_fp][214000]Accuracy-Highest: 0.98571 Training: 2021-03-18 06:38:30,700-[agedb_30][214000]XNorm: 22.335969 Training: 2021-03-18 06:38:30,700-[agedb_30][214000]Accuracy-Flip: 0.98183+-0.00685 Training: 
2021-03-18 06:38:30,700-[agedb_30][214000]Accuracy-Highest: 0.98183 Training: 2021-03-18 06:38:41,268-Speed 595.55 samples/sec Loss 1.1628 Epoch: 12 Global Step: 214050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:38:52,122-Speed 4717.33 samples/sec Loss 1.1246 Epoch: 12 Global Step: 214100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:39:03,021-Speed 4698.44 samples/sec Loss 1.1405 Epoch: 12 Global Step: 214150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:39:13,747-Speed 4773.46 samples/sec Loss 1.1473 Epoch: 12 Global Step: 214200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:39:24,714-Speed 4668.87 samples/sec Loss 1.1361 Epoch: 12 Global Step: 214250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:39:35,670-Speed 4673.69 samples/sec Loss 1.1287 Epoch: 12 Global Step: 214300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:39:47,332-Speed 4390.49 samples/sec Loss 1.1482 Epoch: 12 Global Step: 214350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:39:58,377-Speed 4635.85 samples/sec Loss 1.1343 Epoch: 12 Global Step: 214400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:40:09,539-Speed 4587.49 samples/sec Loss 1.1370 Epoch: 12 Global Step: 214450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:40:20,409-Speed 4710.27 samples/sec Loss 1.1431 Epoch: 12 Global Step: 214500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:40:31,502-Speed 4616.02 samples/sec Loss 1.1353 Epoch: 12 Global Step: 214550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:40:43,411-Speed 4299.27 samples/sec Loss 1.1360 Epoch: 12 Global Step: 214600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:40:54,214-Speed 4739.71 samples/sec Loss 1.1413 Epoch: 12 Global Step: 214650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:41:05,142-Speed 4685.81 samples/sec Loss 1.1205 
Epoch: 12 Global Step: 214700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:41:16,087-Speed 4678.09 samples/sec Loss 1.1465 Epoch: 12 Global Step: 214750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:41:28,023-Speed 4289.84 samples/sec Loss 1.1383 Epoch: 12 Global Step: 214800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:41:38,991-Speed 4668.06 samples/sec Loss 1.1330 Epoch: 12 Global Step: 214850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:41:49,916-Speed 4686.83 samples/sec Loss 1.1543 Epoch: 12 Global Step: 214900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:42:01,888-Speed 4276.66 samples/sec Loss 1.1358 Epoch: 12 Global Step: 214950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:42:12,630-Speed 4766.81 samples/sec Loss 1.1187 Epoch: 12 Global Step: 215000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:42:23,478-Speed 4719.78 samples/sec Loss 1.1271 Epoch: 12 Global Step: 215050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:42:35,142-Speed 4390.16 samples/sec Loss 1.1598 Epoch: 12 Global Step: 215100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:42:46,260-Speed 4605.64 samples/sec Loss 1.1447 Epoch: 12 Global Step: 215150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-18 06:42:57,133-Speed 4708.87 samples/sec Loss 1.1199 Epoch: 12 Global Step: 215200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:43:07,989-Speed 4716.57 samples/sec Loss 1.1479 Epoch: 12 Global Step: 215250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:43:19,194-Speed 4569.79 samples/sec Loss 1.1312 Epoch: 12 Global Step: 215300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:43:30,150-Speed 4673.28 samples/sec Loss 1.1248 Epoch: 12 Global Step: 215350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:43:41,995-Speed 4322.67 samples/sec Loss 
1.1438 Epoch: 12 Global Step: 215400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:43:52,979-Speed 4661.69 samples/sec Loss 1.1500 Epoch: 12 Global Step: 215450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:44:03,914-Speed 4682.70 samples/sec Loss 1.1199 Epoch: 12 Global Step: 215500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:44:14,725-Speed 4736.33 samples/sec Loss 1.1593 Epoch: 12 Global Step: 215550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:44:25,715-Speed 4658.89 samples/sec Loss 1.1609 Epoch: 12 Global Step: 215600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:44:36,547-Speed 4727.06 samples/sec Loss 1.1187 Epoch: 12 Global Step: 215650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:44:47,562-Speed 4648.34 samples/sec Loss 1.1451 Epoch: 12 Global Step: 215700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:44:58,297-Speed 4770.06 samples/sec Loss 1.1395 Epoch: 12 Global Step: 215750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:45:09,247-Speed 4676.08 samples/sec Loss 1.1353 Epoch: 12 Global Step: 215800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:45:20,105-Speed 4715.63 samples/sec Loss 1.1678 Epoch: 12 Global Step: 215850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:45:30,879-Speed 4752.57 samples/sec Loss 1.1424 Epoch: 12 Global Step: 215900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:45:41,811-Speed 4683.46 samples/sec Loss 1.1344 Epoch: 12 Global Step: 215950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:45:52,619-Speed 4737.66 samples/sec Loss 1.1500 Epoch: 12 Global Step: 216000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:46:16,778-[lfw][216000]XNorm: 22.631110 Training: 2021-03-18 06:46:16,778-[lfw][216000]Accuracy-Flip: 0.99750+-0.00281 Training: 2021-03-18 
06:46:16,778-[lfw][216000]Accuracy-Highest: 0.99800 Training: 2021-03-18 06:46:44,278-[cfp_fp][216000]XNorm: 20.801794 Training: 2021-03-18 06:46:44,278-[cfp_fp][216000]Accuracy-Flip: 0.98443+-0.00562 Training: 2021-03-18 06:46:44,278-[cfp_fp][216000]Accuracy-Highest: 0.98571 Training: 2021-03-18 06:47:07,984-[agedb_30][216000]XNorm: 22.734777 Training: 2021-03-18 06:47:07,984-[agedb_30][216000]Accuracy-Flip: 0.98083+-0.00583 Training: 2021-03-18 06:47:07,984-[agedb_30][216000]Accuracy-Highest: 0.98183 Training: 2021-03-18 06:47:18,904-Speed 593.39 samples/sec Loss 1.1295 Epoch: 12 Global Step: 216050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:47:29,982-Speed 4622.00 samples/sec Loss 1.1423 Epoch: 12 Global Step: 216100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:47:41,138-Speed 4589.64 samples/sec Loss 1.1322 Epoch: 12 Global Step: 216150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:47:52,239-Speed 4612.26 samples/sec Loss 1.1350 Epoch: 12 Global Step: 216200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:48:03,345-Speed 4610.55 samples/sec Loss 1.1588 Epoch: 12 Global Step: 216250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:48:14,417-Speed 4624.63 samples/sec Loss 1.1525 Epoch: 12 Global Step: 216300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:48:26,387-Speed 4277.35 samples/sec Loss 1.1292 Epoch: 12 Global Step: 216350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:48:37,287-Speed 4697.71 samples/sec Loss 1.1339 Epoch: 12 Global Step: 216400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:48:48,235-Speed 4676.83 samples/sec Loss 1.1401 Epoch: 12 Global Step: 216450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:48:59,462-Speed 4560.78 samples/sec Loss 1.1377 Epoch: 12 Global Step: 216500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:49:10,988-Speed 4442.41 samples/sec 
Loss 1.1460 Epoch: 12 Global Step: 216550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:49:21,939-Speed 4675.27 samples/sec Loss 1.1206 Epoch: 12 Global Step: 216600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:49:32,832-Speed 4700.87 samples/sec Loss 1.1556 Epoch: 12 Global Step: 216650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:49:43,677-Speed 4721.60 samples/sec Loss 1.1538 Epoch: 12 Global Step: 216700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:49:54,561-Speed 4704.16 samples/sec Loss 1.1237 Epoch: 12 Global Step: 216750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:50:05,298-Speed 4768.96 samples/sec Loss 1.1176 Epoch: 12 Global Step: 216800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:50:16,245-Speed 4677.38 samples/sec Loss 1.1459 Epoch: 12 Global Step: 216850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:50:27,070-Speed 4730.32 samples/sec Loss 1.1079 Epoch: 12 Global Step: 216900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:50:37,961-Speed 4701.11 samples/sec Loss 1.1347 Epoch: 12 Global Step: 216950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:51:01,783-Speed 2149.40 samples/sec Loss 1.1168 Epoch: 13 Global Step: 217000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:51:13,144-Speed 4506.99 samples/sec Loss 1.0490 Epoch: 13 Global Step: 217050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:51:24,316-Speed 4583.15 samples/sec Loss 1.0662 Epoch: 13 Global Step: 217100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:51:35,370-Speed 4632.30 samples/sec Loss 1.0576 Epoch: 13 Global Step: 217150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:51:47,581-Speed 4193.10 samples/sec Loss 1.0320 Epoch: 13 Global Step: 217200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:51:58,602-Speed 4646.10 
samples/sec Loss 1.0618 Epoch: 13 Global Step: 217250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:52:09,721-Speed 4605.39 samples/sec Loss 1.0485 Epoch: 13 Global Step: 217300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:52:20,905-Speed 4578.12 samples/sec Loss 1.0533 Epoch: 13 Global Step: 217350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:52:31,952-Speed 4634.97 samples/sec Loss 1.0572 Epoch: 13 Global Step: 217400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:52:42,780-Speed 4728.94 samples/sec Loss 1.0316 Epoch: 13 Global Step: 217450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:52:54,411-Speed 4402.37 samples/sec Loss 1.0436 Epoch: 13 Global Step: 217500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:53:05,338-Speed 4685.54 samples/sec Loss 1.0570 Epoch: 13 Global Step: 217550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:53:16,444-Speed 4610.56 samples/sec Loss 1.0456 Epoch: 13 Global Step: 217600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:53:28,210-Speed 4351.75 samples/sec Loss 1.0444 Epoch: 13 Global Step: 217650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:53:38,932-Speed 4775.43 samples/sec Loss 1.0788 Epoch: 13 Global Step: 217700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:53:50,017-Speed 4619.10 samples/sec Loss 1.0604 Epoch: 13 Global Step: 217750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:54:00,765-Speed 4763.93 samples/sec Loss 1.0585 Epoch: 13 Global Step: 217800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:54:11,440-Speed 4796.54 samples/sec Loss 1.0824 Epoch: 13 Global Step: 217850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:54:23,418-Speed 4274.83 samples/sec Loss 1.0415 Epoch: 13 Global Step: 217900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:54:34,404-Speed 
4660.79 samples/sec Loss 1.0597 Epoch: 13 Global Step: 217950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:54:46,155-Speed 4357.27 samples/sec Loss 1.0550 Epoch: 13 Global Step: 218000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:55:10,167-[lfw][218000]XNorm: 22.341751 Training: 2021-03-18 06:55:10,168-[lfw][218000]Accuracy-Flip: 0.99733+-0.00271 Training: 2021-03-18 06:55:10,168-[lfw][218000]Accuracy-Highest: 0.99800 Training: 2021-03-18 06:55:37,618-[cfp_fp][218000]XNorm: 20.540421 Training: 2021-03-18 06:55:37,618-[cfp_fp][218000]Accuracy-Flip: 0.98386+-0.00673 Training: 2021-03-18 06:55:37,618-[cfp_fp][218000]Accuracy-Highest: 0.98571 Training: 2021-03-18 06:56:01,320-[agedb_30][218000]XNorm: 22.273084 Training: 2021-03-18 06:56:01,320-[agedb_30][218000]Accuracy-Flip: 0.98017+-0.00634 Training: 2021-03-18 06:56:01,320-[agedb_30][218000]Accuracy-Highest: 0.98183 Training: 2021-03-18 06:56:12,136-Speed 595.49 samples/sec Loss 1.0625 Epoch: 13 Global Step: 218050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:56:23,090-Speed 4674.34 samples/sec Loss 1.0557 Epoch: 13 Global Step: 218100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:56:34,041-Speed 4675.79 samples/sec Loss 1.0637 Epoch: 13 Global Step: 218150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:56:44,819-Speed 4750.70 samples/sec Loss 1.0676 Epoch: 13 Global Step: 218200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:56:56,293-Speed 4462.32 samples/sec Loss 1.0496 Epoch: 13 Global Step: 218250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:57:06,915-Speed 4820.39 samples/sec Loss 1.0756 Epoch: 13 Global Step: 218300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:57:17,890-Speed 4665.54 samples/sec Loss 1.0617 Epoch: 13 Global Step: 218350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:57:28,861-Speed 4666.96 samples/sec Loss 1.0525 Epoch: 13 
Global Step: 218400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:57:39,957-Speed 4614.91 samples/sec Loss 1.0605 Epoch: 13 Global Step: 218450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:57:50,833-Speed 4707.81 samples/sec Loss 1.0595 Epoch: 13 Global Step: 218500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:58:01,879-Speed 4635.60 samples/sec Loss 1.0649 Epoch: 13 Global Step: 218550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:58:12,728-Speed 4719.65 samples/sec Loss 1.0581 Epoch: 13 Global Step: 218600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:58:23,610-Speed 4705.03 samples/sec Loss 1.0623 Epoch: 13 Global Step: 218650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:58:34,314-Speed 4783.73 samples/sec Loss 1.0733 Epoch: 13 Global Step: 218700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:58:45,316-Speed 4654.01 samples/sec Loss 1.0598 Epoch: 13 Global Step: 218750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:58:56,179-Speed 4713.75 samples/sec Loss 1.0640 Epoch: 13 Global Step: 218800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:59:07,273-Speed 4615.31 samples/sec Loss 1.0532 Epoch: 13 Global Step: 218850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:59:18,280-Speed 4652.08 samples/sec Loss 1.0529 Epoch: 13 Global Step: 218900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:59:29,281-Speed 4654.73 samples/sec Loss 1.0614 Epoch: 13 Global Step: 218950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:59:40,346-Speed 4627.34 samples/sec Loss 1.0630 Epoch: 13 Global Step: 219000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 06:59:51,110-Speed 4756.82 samples/sec Loss 1.0636 Epoch: 13 Global Step: 219050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:00:02,177-Speed 4626.71 samples/sec Loss 1.0601 Epoch: 
13 Global Step: 219100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:00:13,095-Speed 4689.98 samples/sec Loss 1.0721 Epoch: 13 Global Step: 219150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:00:25,074-Speed 4274.37 samples/sec Loss 1.0537 Epoch: 13 Global Step: 219200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:00:35,968-Speed 4699.91 samples/sec Loss 1.0511 Epoch: 13 Global Step: 219250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:00:46,837-Speed 4711.13 samples/sec Loss 1.0543 Epoch: 13 Global Step: 219300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:00:57,778-Speed 4679.70 samples/sec Loss 1.0818 Epoch: 13 Global Step: 219350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:01:09,667-Speed 4306.79 samples/sec Loss 1.0742 Epoch: 13 Global Step: 219400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:01:20,640-Speed 4666.42 samples/sec Loss 1.0515 Epoch: 13 Global Step: 219450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:01:31,673-Speed 4640.81 samples/sec Loss 1.0689 Epoch: 13 Global Step: 219500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:01:42,906-Speed 4558.20 samples/sec Loss 1.0745 Epoch: 13 Global Step: 219550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:01:54,024-Speed 4605.44 samples/sec Loss 1.0715 Epoch: 13 Global Step: 219600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:02:04,976-Speed 4675.41 samples/sec Loss 1.0843 Epoch: 13 Global Step: 219650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:02:15,968-Speed 4658.10 samples/sec Loss 1.0866 Epoch: 13 Global Step: 219700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:02:27,112-Speed 4594.66 samples/sec Loss 1.0746 Epoch: 13 Global Step: 219750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:02:37,997-Speed 4704.30 samples/sec Loss 1.0670 
Epoch: 13 Global Step: 219800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:02:48,778-Speed 4749.30 samples/sec Loss 1.0422 Epoch: 13 Global Step: 219850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:02:59,792-Speed 4649.09 samples/sec Loss 1.0655 Epoch: 13 Global Step: 219900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:03:11,750-Speed 4281.99 samples/sec Loss 1.0731 Epoch: 13 Global Step: 219950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:03:21,896-Speed 5046.54 samples/sec Loss 1.0432 Epoch: 13 Global Step: 220000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:03:46,509-[lfw][220000]XNorm: 22.876906 Training: 2021-03-18 07:03:46,509-[lfw][220000]Accuracy-Flip: 0.99767+-0.00291 Training: 2021-03-18 07:03:46,509-[lfw][220000]Accuracy-Highest: 0.99800 Training: 2021-03-18 07:04:14,198-[cfp_fp][220000]XNorm: 21.093298 Training: 2021-03-18 07:04:14,198-[cfp_fp][220000]Accuracy-Flip: 0.98557+-0.00587 Training: 2021-03-18 07:04:14,198-[cfp_fp][220000]Accuracy-Highest: 0.98571 Training: 2021-03-18 07:04:37,898-[agedb_30][220000]XNorm: 22.787288 Training: 2021-03-18 07:04:37,898-[agedb_30][220000]Accuracy-Flip: 0.98200+-0.00682 Training: 2021-03-18 07:04:37,898-[agedb_30][220000]Accuracy-Highest: 0.98200 Training: 2021-03-18 07:04:48,765-Speed 589.40 samples/sec Loss 1.0775 Epoch: 13 Global Step: 220050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:04:59,568-Speed 4740.07 samples/sec Loss 1.0689 Epoch: 13 Global Step: 220100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:05:10,602-Speed 4640.23 samples/sec Loss 1.0696 Epoch: 13 Global Step: 220150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:05:21,605-Speed 4653.47 samples/sec Loss 1.0748 Epoch: 13 Global Step: 220200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:05:32,458-Speed 4718.05 samples/sec Loss 1.0626 Epoch: 13 Global Step: 220250 Fp16 Grad 
Scale: 16384 Required: 8 hours Training: 2021-03-18 07:05:43,343-Speed 4703.80 samples/sec Loss 1.0631 Epoch: 13 Global Step: 220300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:05:55,178-Speed 4326.71 samples/sec Loss 1.0764 Epoch: 13 Global Step: 220350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:06:06,315-Speed 4597.45 samples/sec Loss 1.0781 Epoch: 13 Global Step: 220400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:06:17,174-Speed 4715.40 samples/sec Loss 1.0725 Epoch: 13 Global Step: 220450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:06:29,073-Speed 4303.25 samples/sec Loss 1.0594 Epoch: 13 Global Step: 220500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:06:39,738-Speed 4800.96 samples/sec Loss 1.0703 Epoch: 13 Global Step: 220550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:06:50,561-Speed 4730.80 samples/sec Loss 1.0684 Epoch: 13 Global Step: 220600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:07:01,603-Speed 4637.21 samples/sec Loss 1.0561 Epoch: 13 Global Step: 220650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:07:12,618-Speed 4648.11 samples/sec Loss 1.0849 Epoch: 13 Global Step: 220700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:07:23,542-Speed 4687.42 samples/sec Loss 1.0602 Epoch: 13 Global Step: 220750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:07:35,090-Speed 4434.08 samples/sec Loss 1.0654 Epoch: 13 Global Step: 220800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:07:47,162-Speed 4241.42 samples/sec Loss 1.0715 Epoch: 13 Global Step: 220850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:07:58,185-Speed 4645.13 samples/sec Loss 1.0455 Epoch: 13 Global Step: 220900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:08:09,144-Speed 4672.29 samples/sec Loss 1.0696 Epoch: 13 Global Step: 220950 Fp16 
Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:08:20,091-Speed 4677.13 samples/sec Loss 1.0713 Epoch: 13 Global Step: 221000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:08:30,983-Speed 4700.88 samples/sec Loss 1.0734 Epoch: 13 Global Step: 221050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:08:41,949-Speed 4669.22 samples/sec Loss 1.0822 Epoch: 13 Global Step: 221100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:08:53,574-Speed 4404.81 samples/sec Loss 1.0810 Epoch: 13 Global Step: 221150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:09:04,431-Speed 4716.01 samples/sec Loss 1.0723 Epoch: 13 Global Step: 221200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:09:15,275-Speed 4721.49 samples/sec Loss 1.0546 Epoch: 13 Global Step: 221250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:09:26,077-Speed 4740.54 samples/sec Loss 1.1005 Epoch: 13 Global Step: 221300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:09:37,275-Speed 4572.18 samples/sec Loss 1.0553 Epoch: 13 Global Step: 221350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:09:48,164-Speed 4702.34 samples/sec Loss 1.0716 Epoch: 13 Global Step: 221400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:09:59,409-Speed 4553.11 samples/sec Loss 1.0586 Epoch: 13 Global Step: 221450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:10:10,324-Speed 4691.06 samples/sec Loss 1.0758 Epoch: 13 Global Step: 221500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:10:21,498-Speed 4582.18 samples/sec Loss 1.0531 Epoch: 13 Global Step: 221550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:10:32,575-Speed 4622.88 samples/sec Loss 1.0727 Epoch: 13 Global Step: 221600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:10:43,577-Speed 4653.99 samples/sec Loss 1.0688 Epoch: 13 Global Step: 221650 
Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:10:54,663-Speed 4618.41 samples/sec Loss 1.0769 Epoch: 13 Global Step: 221700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:11:05,547-Speed 4704.59 samples/sec Loss 1.0594 Epoch: 13 Global Step: 221750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:11:16,658-Speed 4608.34 samples/sec Loss 1.0548 Epoch: 13 Global Step: 221800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:11:27,563-Speed 4695.23 samples/sec Loss 1.0651 Epoch: 13 Global Step: 221850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:11:38,577-Speed 4648.71 samples/sec Loss 1.0869 Epoch: 13 Global Step: 221900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:11:49,554-Speed 4664.82 samples/sec Loss 1.0916 Epoch: 13 Global Step: 221950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:12:01,340-Speed 4344.34 samples/sec Loss 1.0845 Epoch: 13 Global Step: 222000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:12:25,200-[lfw][222000]XNorm: 22.732059 Training: 2021-03-18 07:12:25,200-[lfw][222000]Accuracy-Flip: 0.99767+-0.00291 Training: 2021-03-18 07:12:25,200-[lfw][222000]Accuracy-Highest: 0.99800 Training: 2021-03-18 07:12:52,776-[cfp_fp][222000]XNorm: 21.020170 Training: 2021-03-18 07:12:52,777-[cfp_fp][222000]Accuracy-Flip: 0.98429+-0.00717 Training: 2021-03-18 07:12:52,777-[cfp_fp][222000]Accuracy-Highest: 0.98571 Training: 2021-03-18 07:13:16,546-[agedb_30][222000]XNorm: 22.741520 Training: 2021-03-18 07:13:16,546-[agedb_30][222000]Accuracy-Flip: 0.98167+-0.00658 Training: 2021-03-18 07:13:16,547-[agedb_30][222000]Accuracy-Highest: 0.98200 Training: 2021-03-18 07:13:27,473-Speed 594.43 samples/sec Loss 1.0562 Epoch: 13 Global Step: 222050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:13:38,522-Speed 4634.20 samples/sec Loss 1.0718 Epoch: 13 Global Step: 222100 Fp16 Grad Scale: 16384 Required: 8 hours 
Training: 2021-03-18 07:13:49,331-Speed 4737.06 samples/sec Loss 1.0886 Epoch: 13 Global Step: 222150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:14:00,346-Speed 4648.35 samples/sec Loss 1.0590 Epoch: 13 Global Step: 222200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:14:12,003-Speed 4392.69 samples/sec Loss 1.0592 Epoch: 13 Global Step: 222250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:14:22,756-Speed 4761.74 samples/sec Loss 1.0665 Epoch: 13 Global Step: 222300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:14:33,578-Speed 4731.33 samples/sec Loss 1.0587 Epoch: 13 Global Step: 222350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:14:44,417-Speed 4723.85 samples/sec Loss 1.0636 Epoch: 13 Global Step: 222400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:14:55,272-Speed 4716.89 samples/sec Loss 1.0733 Epoch: 13 Global Step: 222450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:15:06,303-Speed 4641.78 samples/sec Loss 1.0985 Epoch: 13 Global Step: 222500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:15:17,313-Speed 4650.70 samples/sec Loss 1.0678 Epoch: 13 Global Step: 222550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:15:28,176-Speed 4713.74 samples/sec Loss 1.0917 Epoch: 13 Global Step: 222600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:15:39,297-Speed 4604.29 samples/sec Loss 1.0779 Epoch: 13 Global Step: 222650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:15:49,937-Speed 4812.16 samples/sec Loss 1.0880 Epoch: 13 Global Step: 222700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:16:01,754-Speed 4332.89 samples/sec Loss 1.0884 Epoch: 13 Global Step: 222750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:16:12,340-Speed 4836.73 samples/sec Loss 1.0562 Epoch: 13 Global Step: 222800 Fp16 Grad Scale: 16384 Required: 8 
hours Training: 2021-03-18 07:16:23,539-Speed 4572.42 samples/sec Loss 1.0548 Epoch: 13 Global Step: 222850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:16:34,091-Speed 4852.02 samples/sec Loss 1.0688 Epoch: 13 Global Step: 222900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:16:45,181-Speed 4617.10 samples/sec Loss 1.0780 Epoch: 13 Global Step: 222950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:16:56,076-Speed 4699.61 samples/sec Loss 1.0760 Epoch: 13 Global Step: 223000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:17:07,044-Speed 4668.47 samples/sec Loss 1.0779 Epoch: 13 Global Step: 223050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:17:17,959-Speed 4691.15 samples/sec Loss 1.0887 Epoch: 13 Global Step: 223100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:17:28,975-Speed 4648.50 samples/sec Loss 1.0629 Epoch: 13 Global Step: 223150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:17:40,875-Speed 4302.90 samples/sec Loss 1.0747 Epoch: 13 Global Step: 223200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:17:51,900-Speed 4644.27 samples/sec Loss 1.0611 Epoch: 13 Global Step: 223250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:18:03,012-Speed 4607.72 samples/sec Loss 1.0690 Epoch: 13 Global Step: 223300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:18:13,931-Speed 4689.23 samples/sec Loss 1.0771 Epoch: 13 Global Step: 223350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:18:25,827-Speed 4304.33 samples/sec Loss 1.0827 Epoch: 13 Global Step: 223400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:18:37,020-Speed 4574.97 samples/sec Loss 1.0506 Epoch: 13 Global Step: 223450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:18:47,968-Speed 4676.98 samples/sec Loss 1.0659 Epoch: 13 Global Step: 223500 Fp16 Grad Scale: 16384 Required: 
8 hours Training: 2021-03-18 07:18:58,836-Speed 4711.45 samples/sec Loss 1.0626 Epoch: 13 Global Step: 223550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:19:09,686-Speed 4719.10 samples/sec Loss 1.0579 Epoch: 13 Global Step: 223600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:19:20,550-Speed 4713.00 samples/sec Loss 1.0719 Epoch: 13 Global Step: 223650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:19:32,224-Speed 4386.09 samples/sec Loss 1.0797 Epoch: 13 Global Step: 223700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:19:44,079-Speed 4319.17 samples/sec Loss 1.0819 Epoch: 13 Global Step: 223750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:19:55,046-Speed 4668.64 samples/sec Loss 1.0850 Epoch: 13 Global Step: 223800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:20:06,034-Speed 4660.16 samples/sec Loss 1.0798 Epoch: 13 Global Step: 223850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:20:17,375-Speed 4514.74 samples/sec Loss 1.0774 Epoch: 13 Global Step: 223900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:20:28,241-Speed 4712.20 samples/sec Loss 1.0908 Epoch: 13 Global Step: 223950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:20:40,037-Speed 4340.68 samples/sec Loss 1.0593 Epoch: 13 Global Step: 224000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:21:03,785-[lfw][224000]XNorm: 22.749085 Training: 2021-03-18 07:21:03,785-[lfw][224000]Accuracy-Flip: 0.99733+-0.00281 Training: 2021-03-18 07:21:03,785-[lfw][224000]Accuracy-Highest: 0.99800 Training: 2021-03-18 07:21:31,222-[cfp_fp][224000]XNorm: 20.884031 Training: 2021-03-18 07:21:31,223-[cfp_fp][224000]Accuracy-Flip: 0.98500+-0.00516 Training: 2021-03-18 07:21:31,223-[cfp_fp][224000]Accuracy-Highest: 0.98571 Training: 2021-03-18 07:21:54,883-[agedb_30][224000]XNorm: 22.782635 Training: 2021-03-18 
07:21:54,883-[agedb_30][224000]Accuracy-Flip: 0.98150+-0.00685 Training: 2021-03-18 07:21:54,883-[agedb_30][224000]Accuracy-Highest: 0.98200 Training: 2021-03-18 07:22:05,610-Speed 598.33 samples/sec Loss 1.0668 Epoch: 13 Global Step: 224050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:22:16,462-Speed 4718.25 samples/sec Loss 1.0810 Epoch: 13 Global Step: 224100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:22:27,526-Speed 4627.53 samples/sec Loss 1.0851 Epoch: 13 Global Step: 224150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:22:38,470-Speed 4679.02 samples/sec Loss 1.0838 Epoch: 13 Global Step: 224200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:22:49,323-Speed 4717.89 samples/sec Loss 1.0838 Epoch: 13 Global Step: 224250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:23:00,334-Speed 4650.03 samples/sec Loss 1.0733 Epoch: 13 Global Step: 224300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:23:11,405-Speed 4624.95 samples/sec Loss 1.0723 Epoch: 13 Global Step: 224350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:23:22,287-Speed 4705.30 samples/sec Loss 1.0686 Epoch: 13 Global Step: 224400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:23:33,089-Speed 4740.12 samples/sec Loss 1.0646 Epoch: 13 Global Step: 224450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:23:44,072-Speed 4661.86 samples/sec Loss 1.0526 Epoch: 13 Global Step: 224500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:23:55,149-Speed 4622.53 samples/sec Loss 1.0959 Epoch: 13 Global Step: 224550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:24:06,059-Speed 4693.30 samples/sec Loss 1.0574 Epoch: 13 Global Step: 224600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:24:16,918-Speed 4715.60 samples/sec Loss 1.0706 Epoch: 13 Global Step: 224650 Fp16 Grad Scale: 16384 Required: 8 hours 
Training: 2021-03-18 07:24:28,010-Speed 4615.98 samples/sec Loss 1.0714 Epoch: 13 Global Step: 224700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:24:38,848-Speed 4724.38 samples/sec Loss 1.0644 Epoch: 13 Global Step: 224750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:24:50,514-Speed 4389.42 samples/sec Loss 1.0671 Epoch: 13 Global Step: 224800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:25:01,643-Speed 4600.96 samples/sec Loss 1.0802 Epoch: 13 Global Step: 224850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:25:12,887-Speed 4553.91 samples/sec Loss 1.0778 Epoch: 13 Global Step: 224900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:25:23,757-Speed 4710.57 samples/sec Loss 1.0772 Epoch: 13 Global Step: 224950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:25:34,679-Speed 4687.82 samples/sec Loss 1.0699 Epoch: 13 Global Step: 225000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:25:45,624-Speed 4678.23 samples/sec Loss 1.0624 Epoch: 13 Global Step: 225050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:25:56,678-Speed 4632.13 samples/sec Loss 1.0892 Epoch: 13 Global Step: 225100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:26:08,513-Speed 4326.72 samples/sec Loss 1.0733 Epoch: 13 Global Step: 225150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:26:19,464-Speed 4675.41 samples/sec Loss 1.0852 Epoch: 13 Global Step: 225200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:26:30,333-Speed 4711.15 samples/sec Loss 1.0520 Epoch: 13 Global Step: 225250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:26:41,202-Speed 4710.64 samples/sec Loss 1.0728 Epoch: 13 Global Step: 225300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:26:52,146-Speed 4678.79 samples/sec Loss 1.0796 Epoch: 13 Global Step: 225350 Fp16 Grad Scale: 16384 Required: 8 
hours Training: 2021-03-18 07:27:03,023-Speed 4707.29 samples/sec Loss 1.0671 Epoch: 13 Global Step: 225400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:27:14,127-Speed 4611.45 samples/sec Loss 1.0784 Epoch: 13 Global Step: 225450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:27:25,840-Speed 4371.68 samples/sec Loss 1.0786 Epoch: 13 Global Step: 225500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:27:36,671-Speed 4727.31 samples/sec Loss 1.0840 Epoch: 13 Global Step: 225550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:27:47,623-Speed 4675.39 samples/sec Loss 1.0579 Epoch: 13 Global Step: 225600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:27:58,589-Speed 4669.05 samples/sec Loss 1.0762 Epoch: 13 Global Step: 225650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:28:09,396-Speed 4738.00 samples/sec Loss 1.0661 Epoch: 13 Global Step: 225700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:28:20,256-Speed 4715.01 samples/sec Loss 1.0816 Epoch: 13 Global Step: 225750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:28:30,843-Speed 4836.10 samples/sec Loss 1.0749 Epoch: 13 Global Step: 225800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:28:41,681-Speed 4724.69 samples/sec Loss 1.0944 Epoch: 13 Global Step: 225850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:28:52,379-Speed 4785.86 samples/sec Loss 1.0775 Epoch: 13 Global Step: 225900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:29:03,306-Speed 4686.07 samples/sec Loss 1.0695 Epoch: 13 Global Step: 225950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:29:14,329-Speed 4645.12 samples/sec Loss 1.0767 Epoch: 13 Global Step: 226000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:29:38,714-[lfw][226000]XNorm: 22.778304 Training: 2021-03-18 07:29:38,714-[lfw][226000]Accuracy-Flip: 
0.99817+-0.00273 Training: 2021-03-18 07:29:38,714-[lfw][226000]Accuracy-Highest: 0.99817 Training: 2021-03-18 07:30:06,265-[cfp_fp][226000]XNorm: 20.910753 Training: 2021-03-18 07:30:06,265-[cfp_fp][226000]Accuracy-Flip: 0.98300+-0.00611 Training: 2021-03-18 07:30:06,267-[cfp_fp][226000]Accuracy-Highest: 0.98571 Training: 2021-03-18 07:30:30,117-[agedb_30][226000]XNorm: 22.636992 Training: 2021-03-18 07:30:30,118-[agedb_30][226000]Accuracy-Flip: 0.97983+-0.00677 Training: 2021-03-18 07:30:30,118-[agedb_30][226000]Accuracy-Highest: 0.98200 Training: 2021-03-18 07:30:41,653-Speed 586.33 samples/sec Loss 1.0735 Epoch: 13 Global Step: 226050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:30:52,525-Speed 4709.89 samples/sec Loss 1.0790 Epoch: 13 Global Step: 226100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:31:03,474-Speed 4676.56 samples/sec Loss 1.0742 Epoch: 13 Global Step: 226150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:31:14,510-Speed 4639.48 samples/sec Loss 1.0732 Epoch: 13 Global Step: 226200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:31:25,715-Speed 4569.55 samples/sec Loss 1.0609 Epoch: 13 Global Step: 226250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:31:36,841-Speed 4602.16 samples/sec Loss 1.1023 Epoch: 13 Global Step: 226300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:31:48,374-Speed 4439.50 samples/sec Loss 1.0647 Epoch: 13 Global Step: 226350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:31:59,418-Speed 4636.43 samples/sec Loss 1.0731 Epoch: 13 Global Step: 226400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:32:10,235-Speed 4733.52 samples/sec Loss 1.0682 Epoch: 13 Global Step: 226450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:32:21,934-Speed 4376.57 samples/sec Loss 1.0660 Epoch: 13 Global Step: 226500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 
07:32:32,899-Speed 4669.65 samples/sec Loss 1.0793 Epoch: 13 Global Step: 226550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:32:43,801-Speed 4696.65 samples/sec Loss 1.0866 Epoch: 13 Global Step: 226600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:32:55,616-Speed 4333.56 samples/sec Loss 1.0734 Epoch: 13 Global Step: 226650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:33:06,614-Speed 4655.59 samples/sec Loss 1.0833 Epoch: 13 Global Step: 226700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:33:17,675-Speed 4629.42 samples/sec Loss 1.0652 Epoch: 13 Global Step: 226750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:33:28,518-Speed 4722.09 samples/sec Loss 1.0870 Epoch: 13 Global Step: 226800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:33:40,228-Speed 4372.62 samples/sec Loss 1.0812 Epoch: 13 Global Step: 226850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:33:51,258-Speed 4642.30 samples/sec Loss 1.0576 Epoch: 13 Global Step: 226900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:34:02,001-Speed 4766.26 samples/sec Loss 1.0847 Epoch: 13 Global Step: 226950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:34:13,000-Speed 4654.89 samples/sec Loss 1.0843 Epoch: 13 Global Step: 227000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:34:23,897-Speed 4698.95 samples/sec Loss 1.0831 Epoch: 13 Global Step: 227050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:34:34,822-Speed 4686.60 samples/sec Loss 1.0690 Epoch: 13 Global Step: 227100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:34:46,056-Speed 4557.88 samples/sec Loss 1.0889 Epoch: 13 Global Step: 227150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:34:56,916-Speed 4715.06 samples/sec Loss 1.0808 Epoch: 13 Global Step: 227200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 
2021-03-18 07:35:07,798-Speed 4705.19 samples/sec Loss 1.0748 Epoch: 13 Global Step: 227250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:35:18,795-Speed 4656.04 samples/sec Loss 1.0843 Epoch: 13 Global Step: 227300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:35:29,714-Speed 4689.56 samples/sec Loss 1.0745 Epoch: 13 Global Step: 227350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:35:40,598-Speed 4704.22 samples/sec Loss 1.0589 Epoch: 13 Global Step: 227400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:35:51,582-Speed 4661.47 samples/sec Loss 1.0830 Epoch: 13 Global Step: 227450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:36:02,755-Speed 4582.96 samples/sec Loss 1.0607 Epoch: 13 Global Step: 227500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:36:13,633-Speed 4707.02 samples/sec Loss 1.0635 Epoch: 13 Global Step: 227550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:36:24,630-Speed 4655.99 samples/sec Loss 1.0817 Epoch: 13 Global Step: 227600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:36:36,509-Speed 4310.51 samples/sec Loss 1.0901 Epoch: 13 Global Step: 227650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:36:47,355-Speed 4720.67 samples/sec Loss 1.0861 Epoch: 13 Global Step: 227700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:36:58,176-Speed 4731.65 samples/sec Loss 1.0928 Epoch: 13 Global Step: 227750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:37:09,194-Speed 4647.20 samples/sec Loss 1.0794 Epoch: 13 Global Step: 227800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:37:20,323-Speed 4600.93 samples/sec Loss 1.0885 Epoch: 13 Global Step: 227850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:37:31,082-Speed 4759.31 samples/sec Loss 1.0705 Epoch: 13 Global Step: 227900 Fp16 Grad Scale: 16384 Required: 8 hours 
Training: 2021-03-18 07:37:42,100-Speed 4647.19 samples/sec Loss 1.0886 Epoch: 13 Global Step: 227950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:37:53,723-Speed 4405.31 samples/sec Loss 1.0795 Epoch: 13 Global Step: 228000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:38:17,942-[lfw][228000]XNorm: 22.609551 Training: 2021-03-18 07:38:17,942-[lfw][228000]Accuracy-Flip: 0.99800+-0.00277 Training: 2021-03-18 07:38:17,942-[lfw][228000]Accuracy-Highest: 0.99817 Training: 2021-03-18 07:38:45,513-[cfp_fp][228000]XNorm: 20.697802 Training: 2021-03-18 07:38:45,514-[cfp_fp][228000]Accuracy-Flip: 0.98500+-0.00614 Training: 2021-03-18 07:38:45,514-[cfp_fp][228000]Accuracy-Highest: 0.98571 Training: 2021-03-18 07:39:09,404-[agedb_30][228000]XNorm: 22.487113 Training: 2021-03-18 07:39:09,404-[agedb_30][228000]Accuracy-Flip: 0.98083+-0.00583 Training: 2021-03-18 07:39:09,405-[agedb_30][228000]Accuracy-Highest: 0.98200 Training: 2021-03-18 07:39:20,189-Speed 592.14 samples/sec Loss 1.0752 Epoch: 13 Global Step: 228050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:39:31,361-Speed 4583.29 samples/sec Loss 1.0883 Epoch: 13 Global Step: 228100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:39:42,235-Speed 4708.72 samples/sec Loss 1.0967 Epoch: 13 Global Step: 228150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:39:53,212-Speed 4664.45 samples/sec Loss 1.0765 Epoch: 13 Global Step: 228200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:40:04,235-Speed 4645.39 samples/sec Loss 1.0971 Epoch: 13 Global Step: 228250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:40:15,308-Speed 4624.01 samples/sec Loss 1.0732 Epoch: 13 Global Step: 228300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:40:26,952-Speed 4397.44 samples/sec Loss 1.0953 Epoch: 13 Global Step: 228350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:40:37,956-Speed 
4653.10 samples/sec Loss 1.0590 Epoch: 13 Global Step: 228400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:40:48,839-Speed 4705.16 samples/sec Loss 1.0773 Epoch: 13 Global Step: 228450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:40:59,816-Speed 4664.42 samples/sec Loss 1.0803 Epoch: 13 Global Step: 228500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:41:10,804-Speed 4659.75 samples/sec Loss 1.0814 Epoch: 13 Global Step: 228550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:41:21,800-Speed 4657.01 samples/sec Loss 1.0770 Epoch: 13 Global Step: 228600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:41:33,128-Speed 4519.70 samples/sec Loss 1.0714 Epoch: 13 Global Step: 228650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:41:44,100-Speed 4666.62 samples/sec Loss 1.0683 Epoch: 13 Global Step: 228700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:41:55,105-Speed 4652.84 samples/sec Loss 1.0777 Epoch: 13 Global Step: 228750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:42:06,144-Speed 4638.12 samples/sec Loss 1.0666 Epoch: 13 Global Step: 228800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:42:17,187-Speed 4636.66 samples/sec Loss 1.0804 Epoch: 13 Global Step: 228850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:42:29,042-Speed 4319.22 samples/sec Loss 1.0704 Epoch: 13 Global Step: 228900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:42:40,128-Speed 4618.55 samples/sec Loss 1.0767 Epoch: 13 Global Step: 228950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:42:51,071-Speed 4679.13 samples/sec Loss 1.0784 Epoch: 13 Global Step: 229000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:43:01,986-Speed 4691.05 samples/sec Loss 1.0611 Epoch: 13 Global Step: 229050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 
07:43:12,892-Speed 4694.63 samples/sec Loss 1.0918 Epoch: 13 Global Step: 229100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:43:23,829-Speed 4681.96 samples/sec Loss 1.0905 Epoch: 13 Global Step: 229150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-18 07:43:35,637-Speed 4336.01 samples/sec Loss 1.0785 Epoch: 13 Global Step: 229200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:43:46,476-Speed 4724.37 samples/sec Loss 1.0619 Epoch: 13 Global Step: 229250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:43:57,484-Speed 4651.50 samples/sec Loss 1.0816 Epoch: 13 Global Step: 229300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:44:08,659-Speed 4581.76 samples/sec Loss 1.0876 Epoch: 13 Global Step: 229350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:44:19,721-Speed 4628.54 samples/sec Loss 1.0825 Epoch: 13 Global Step: 229400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:44:31,417-Speed 4377.85 samples/sec Loss 1.0689 Epoch: 13 Global Step: 229450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:44:42,312-Speed 4699.92 samples/sec Loss 1.0823 Epoch: 13 Global Step: 229500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:44:53,423-Speed 4608.51 samples/sec Loss 1.1018 Epoch: 13 Global Step: 229550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:45:05,346-Speed 4294.33 samples/sec Loss 1.0882 Epoch: 13 Global Step: 229600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:45:16,276-Speed 4684.81 samples/sec Loss 1.0725 Epoch: 13 Global Step: 229650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:45:27,366-Speed 4616.95 samples/sec Loss 1.0963 Epoch: 13 Global Step: 229700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:45:38,351-Speed 4661.61 samples/sec Loss 1.0942 Epoch: 13 Global Step: 229750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 
2021-03-18 07:45:49,313-Speed 4670.90 samples/sec Loss 1.0907 Epoch: 13 Global Step: 229800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:46:01,435-Speed 4223.88 samples/sec Loss 1.0671 Epoch: 13 Global Step: 229850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:46:12,457-Speed 4645.44 samples/sec Loss 1.0694 Epoch: 13 Global Step: 229900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:46:23,378-Speed 4688.62 samples/sec Loss 1.0689 Epoch: 13 Global Step: 229950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:46:34,331-Speed 4674.74 samples/sec Loss 1.0806 Epoch: 13 Global Step: 230000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:46:58,150-[lfw][230000]XNorm: 22.334656 Training: 2021-03-18 07:46:58,150-[lfw][230000]Accuracy-Flip: 0.99733+-0.00281 Training: 2021-03-18 07:46:58,151-[lfw][230000]Accuracy-Highest: 0.99817 Training: 2021-03-18 07:47:25,610-[cfp_fp][230000]XNorm: 20.421025 Training: 2021-03-18 07:47:25,610-[cfp_fp][230000]Accuracy-Flip: 0.98471+-0.00531 Training: 2021-03-18 07:47:25,611-[cfp_fp][230000]Accuracy-Highest: 0.98571 Training: 2021-03-18 07:47:49,308-[agedb_30][230000]XNorm: 22.222701 Training: 2021-03-18 07:47:49,308-[agedb_30][230000]Accuracy-Flip: 0.98050+-0.00637 Training: 2021-03-18 07:47:49,308-[agedb_30][230000]Accuracy-Highest: 0.98200 Training: 2021-03-18 07:48:00,046-Speed 597.33 samples/sec Loss 1.0690 Epoch: 13 Global Step: 230050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:48:10,967-Speed 4688.44 samples/sec Loss 1.0687 Epoch: 13 Global Step: 230100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:48:21,723-Speed 4760.63 samples/sec Loss 1.0781 Epoch: 13 Global Step: 230150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:48:32,605-Speed 4705.31 samples/sec Loss 1.0900 Epoch: 13 Global Step: 230200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:48:43,555-Speed 4676.00 
samples/sec Loss 1.0731 Epoch: 13 Global Step: 230250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:48:54,392-Speed 4724.98 samples/sec Loss 1.0908 Epoch: 13 Global Step: 230300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:49:05,278-Speed 4703.73 samples/sec Loss 1.0728 Epoch: 13 Global Step: 230350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:49:16,324-Speed 4635.46 samples/sec Loss 1.0783 Epoch: 13 Global Step: 230400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:49:27,257-Speed 4683.12 samples/sec Loss 1.0799 Epoch: 13 Global Step: 230450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:49:38,988-Speed 4364.79 samples/sec Loss 1.0877 Epoch: 13 Global Step: 230500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:49:50,077-Speed 4617.42 samples/sec Loss 1.0870 Epoch: 13 Global Step: 230550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:50:00,865-Speed 4746.08 samples/sec Loss 1.0807 Epoch: 13 Global Step: 230600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:50:11,820-Speed 4673.97 samples/sec Loss 1.0748 Epoch: 13 Global Step: 230650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:50:22,907-Speed 4618.01 samples/sec Loss 1.0956 Epoch: 13 Global Step: 230700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:50:33,935-Speed 4643.39 samples/sec Loss 1.0894 Epoch: 13 Global Step: 230750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:50:46,079-Speed 4216.31 samples/sec Loss 1.0958 Epoch: 13 Global Step: 230800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:50:56,902-Speed 4730.97 samples/sec Loss 1.0915 Epoch: 13 Global Step: 230850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:51:07,826-Speed 4687.24 samples/sec Loss 1.0930 Epoch: 13 Global Step: 230900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:51:18,727-Speed 
4696.96 samples/sec Loss 1.0866 Epoch: 13 Global Step: 230950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:51:29,708-Speed 4662.73 samples/sec Loss 1.0864 Epoch: 13 Global Step: 231000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:51:40,763-Speed 4631.80 samples/sec Loss 1.0777 Epoch: 13 Global Step: 231050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:51:51,656-Speed 4700.78 samples/sec Loss 1.0736 Epoch: 13 Global Step: 231100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:52:03,286-Speed 4402.58 samples/sec Loss 1.1061 Epoch: 13 Global Step: 231150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:52:14,145-Speed 4715.31 samples/sec Loss 1.0643 Epoch: 13 Global Step: 231200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:52:25,302-Speed 4589.26 samples/sec Loss 1.0601 Epoch: 13 Global Step: 231250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:52:36,286-Speed 4661.67 samples/sec Loss 1.0963 Epoch: 13 Global Step: 231300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:52:47,226-Speed 4680.28 samples/sec Loss 1.0645 Epoch: 13 Global Step: 231350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:52:58,256-Speed 4642.22 samples/sec Loss 1.0749 Epoch: 13 Global Step: 231400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:53:09,119-Speed 4713.39 samples/sec Loss 1.0868 Epoch: 13 Global Step: 231450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:53:20,250-Speed 4599.88 samples/sec Loss 1.0920 Epoch: 13 Global Step: 231500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:53:31,275-Speed 4644.65 samples/sec Loss 1.0792 Epoch: 13 Global Step: 231550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:53:42,112-Speed 4724.46 samples/sec Loss 1.0812 Epoch: 13 Global Step: 231600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 
07:53:53,071-Speed 4672.63 samples/sec Loss 1.0936 Epoch: 13 Global Step: 231650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:54:03,964-Speed 4700.22 samples/sec Loss 1.0852 Epoch: 13 Global Step: 231700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:54:15,633-Speed 4387.81 samples/sec Loss 1.0750 Epoch: 13 Global Step: 231750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:54:26,498-Speed 4712.72 samples/sec Loss 1.0851 Epoch: 13 Global Step: 231800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:54:37,506-Speed 4651.29 samples/sec Loss 1.0875 Epoch: 13 Global Step: 231850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:54:48,539-Speed 4641.20 samples/sec Loss 1.0897 Epoch: 13 Global Step: 231900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:54:59,510-Speed 4667.18 samples/sec Loss 1.0757 Epoch: 13 Global Step: 231950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:55:11,284-Speed 4348.65 samples/sec Loss 1.0949 Epoch: 13 Global Step: 232000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:55:35,415-[lfw][232000]XNorm: 22.416691 Training: 2021-03-18 07:55:35,416-[lfw][232000]Accuracy-Flip: 0.99783+-0.00269 Training: 2021-03-18 07:55:35,416-[lfw][232000]Accuracy-Highest: 0.99817 Training: 2021-03-18 07:56:02,860-[cfp_fp][232000]XNorm: 20.706631 Training: 2021-03-18 07:56:02,861-[cfp_fp][232000]Accuracy-Flip: 0.98457+-0.00490 Training: 2021-03-18 07:56:02,861-[cfp_fp][232000]Accuracy-Highest: 0.98571 Training: 2021-03-18 07:56:26,593-[agedb_30][232000]XNorm: 22.438286 Training: 2021-03-18 07:56:26,593-[agedb_30][232000]Accuracy-Flip: 0.98167+-0.00699 Training: 2021-03-18 07:56:26,593-[agedb_30][232000]Accuracy-Highest: 0.98200 Training: 2021-03-18 07:56:37,355-Speed 594.86 samples/sec Loss 1.0993 Epoch: 13 Global Step: 232050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:56:48,254-Speed 4697.66 samples/sec 
Loss 1.0641 Epoch: 13 Global Step: 232100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:56:59,251-Speed 4656.18 samples/sec Loss 1.1091 Epoch: 13 Global Step: 232150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:57:10,242-Speed 4658.65 samples/sec Loss 1.0781 Epoch: 13 Global Step: 232200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:57:21,088-Speed 4720.82 samples/sec Loss 1.0761 Epoch: 13 Global Step: 232250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:57:32,231-Speed 4595.27 samples/sec Loss 1.0935 Epoch: 13 Global Step: 232300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:57:43,596-Speed 4505.18 samples/sec Loss 1.0706 Epoch: 13 Global Step: 232350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:57:54,772-Speed 4581.72 samples/sec Loss 1.0825 Epoch: 13 Global Step: 232400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:58:06,522-Speed 4357.37 samples/sec Loss 1.1087 Epoch: 13 Global Step: 232450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:58:17,333-Speed 4736.17 samples/sec Loss 1.0973 Epoch: 13 Global Step: 232500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:58:28,122-Speed 4746.13 samples/sec Loss 1.0730 Epoch: 13 Global Step: 232550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:58:39,023-Speed 4697.06 samples/sec Loss 1.0848 Epoch: 13 Global Step: 232600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:58:50,180-Speed 4589.41 samples/sec Loss 1.0855 Epoch: 13 Global Step: 232650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:59:00,910-Speed 4772.08 samples/sec Loss 1.0906 Epoch: 13 Global Step: 232700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:59:12,688-Speed 4347.08 samples/sec Loss 1.0905 Epoch: 13 Global Step: 232750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:59:23,455-Speed 4755.42 
samples/sec Loss 1.0770 Epoch: 13 Global Step: 232800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:59:34,285-Speed 4728.17 samples/sec Loss 1.0929 Epoch: 13 Global Step: 232850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:59:45,075-Speed 4745.43 samples/sec Loss 1.0771 Epoch: 13 Global Step: 232900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 07:59:56,209-Speed 4598.64 samples/sec Loss 1.0713 Epoch: 13 Global Step: 232950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:00:07,016-Speed 4738.16 samples/sec Loss 1.0855 Epoch: 13 Global Step: 233000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:00:18,017-Speed 4654.28 samples/sec Loss 1.1006 Epoch: 13 Global Step: 233050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:00:29,248-Speed 4559.07 samples/sec Loss 1.1013 Epoch: 13 Global Step: 233100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:00:40,002-Speed 4761.43 samples/sec Loss 1.0719 Epoch: 13 Global Step: 233150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:00:50,923-Speed 4688.56 samples/sec Loss 1.0829 Epoch: 13 Global Step: 233200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:01:02,581-Speed 4392.02 samples/sec Loss 1.0838 Epoch: 13 Global Step: 233250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:01:13,381-Speed 4741.10 samples/sec Loss 1.0751 Epoch: 13 Global Step: 233300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:01:24,424-Speed 4636.70 samples/sec Loss 1.0839 Epoch: 13 Global Step: 233350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:01:35,405-Speed 4662.88 samples/sec Loss 1.0719 Epoch: 13 Global Step: 233400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:01:46,547-Speed 4595.22 samples/sec Loss 1.0940 Epoch: 13 Global Step: 233450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:01:57,668-Speed 
4604.24 samples/sec Loss 1.0804 Epoch: 13 Global Step: 233500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:02:08,649-Speed 4662.85 samples/sec Loss 1.0928 Epoch: 13 Global Step: 233550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:02:19,755-Speed 4610.58 samples/sec Loss 1.0925 Epoch: 13 Global Step: 233600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:02:30,780-Speed 4643.94 samples/sec Loss 1.0729 Epoch: 13 Global Step: 233650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:02:55,458-Speed 2074.82 samples/sec Loss 1.0417 Epoch: 14 Global Step: 233700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:03:06,545-Speed 4618.21 samples/sec Loss 0.9639 Epoch: 14 Global Step: 233750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:03:17,581-Speed 4639.63 samples/sec Loss 0.9654 Epoch: 14 Global Step: 233800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:03:28,580-Speed 4655.40 samples/sec Loss 0.9651 Epoch: 14 Global Step: 233850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:03:40,546-Speed 4278.98 samples/sec Loss 0.9610 Epoch: 14 Global Step: 233900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:03:51,551-Speed 4652.64 samples/sec Loss 0.9683 Epoch: 14 Global Step: 233950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:04:02,722-Speed 4583.93 samples/sec Loss 0.9734 Epoch: 14 Global Step: 234000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:04:27,069-[lfw][234000]XNorm: 22.570215 Training: 2021-03-18 08:04:27,070-[lfw][234000]Accuracy-Flip: 0.99800+-0.00277 Training: 2021-03-18 08:04:27,070-[lfw][234000]Accuracy-Highest: 0.99817 Training: 2021-03-18 08:04:54,555-[cfp_fp][234000]XNorm: 20.691838 Training: 2021-03-18 08:04:54,555-[cfp_fp][234000]Accuracy-Flip: 0.98514+-0.00466 Training: 2021-03-18 08:04:54,555-[cfp_fp][234000]Accuracy-Highest: 0.98571 Training: 2021-03-18 
08:05:18,379-[agedb_30][234000]XNorm: 22.605851 Training: 2021-03-18 08:05:18,379-[agedb_30][234000]Accuracy-Flip: 0.98250+-0.00659 Training: 2021-03-18 08:05:18,379-[agedb_30][234000]Accuracy-Highest: 0.98250 Training: 2021-03-18 08:05:29,240-Speed 591.79 samples/sec Loss 0.9390 Epoch: 14 Global Step: 234050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:05:40,276-Speed 4639.92 samples/sec Loss 0.9431 Epoch: 14 Global Step: 234100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:05:51,255-Speed 4663.56 samples/sec Loss 0.9463 Epoch: 14 Global Step: 234150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:06:02,081-Speed 4729.57 samples/sec Loss 0.9361 Epoch: 14 Global Step: 234200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:06:13,027-Speed 4677.64 samples/sec Loss 0.9479 Epoch: 14 Global Step: 234250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:06:24,046-Speed 4646.93 samples/sec Loss 0.9663 Epoch: 14 Global Step: 234300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:06:35,028-Speed 4662.67 samples/sec Loss 0.9362 Epoch: 14 Global Step: 234350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:06:46,010-Speed 4662.46 samples/sec Loss 0.9492 Epoch: 14 Global Step: 234400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:06:56,918-Speed 4694.12 samples/sec Loss 0.9295 Epoch: 14 Global Step: 234450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:07:07,892-Speed 4665.79 samples/sec Loss 0.9498 Epoch: 14 Global Step: 234500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:07:18,953-Speed 4629.22 samples/sec Loss 0.9409 Epoch: 14 Global Step: 234550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:07:30,464-Speed 4448.25 samples/sec Loss 0.9160 Epoch: 14 Global Step: 234600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:07:41,470-Speed 4652.26 samples/sec Loss 0.9211 
Epoch: 14 Global Step: 234650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:07:52,419-Speed 4676.36 samples/sec Loss 0.9243 Epoch: 14 Global Step: 234700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:08:03,209-Speed 4745.49 samples/sec Loss 0.9286 Epoch: 14 Global Step: 234750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:08:14,185-Speed 4665.09 samples/sec Loss 0.9431 Epoch: 14 Global Step: 234800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:08:25,165-Speed 4663.32 samples/sec Loss 0.9435 Epoch: 14 Global Step: 234850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:08:36,125-Speed 4671.84 samples/sec Loss 0.9320 Epoch: 14 Global Step: 234900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:08:47,805-Speed 4383.93 samples/sec Loss 0.9094 Epoch: 14 Global Step: 234950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:08:58,790-Speed 4661.35 samples/sec Loss 0.9349 Epoch: 14 Global Step: 235000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:09:09,830-Speed 4637.90 samples/sec Loss 0.9350 Epoch: 14 Global Step: 235050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:09:20,973-Speed 4594.90 samples/sec Loss 0.9180 Epoch: 14 Global Step: 235100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:09:32,007-Speed 4640.58 samples/sec Loss 0.9336 Epoch: 14 Global Step: 235150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:09:43,047-Speed 4638.03 samples/sec Loss 0.9202 Epoch: 14 Global Step: 235200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:09:54,730-Speed 4382.58 samples/sec Loss 0.9220 Epoch: 14 Global Step: 235250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:10:05,704-Speed 4666.09 samples/sec Loss 0.9295 Epoch: 14 Global Step: 235300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:10:16,750-Speed 4635.45 samples/sec Loss 
0.9298 Epoch: 14 Global Step: 235350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:10:28,571-Speed 4331.37 samples/sec Loss 0.9241 Epoch: 14 Global Step: 235400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:10:39,428-Speed 4716.16 samples/sec Loss 0.9372 Epoch: 14 Global Step: 235450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:10:50,538-Speed 4608.62 samples/sec Loss 0.9194 Epoch: 14 Global Step: 235500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:11:01,705-Speed 4585.32 samples/sec Loss 0.9272 Epoch: 14 Global Step: 235550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:11:13,813-Speed 4228.57 samples/sec Loss 0.9313 Epoch: 14 Global Step: 235600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:11:24,656-Speed 4722.73 samples/sec Loss 0.9312 Epoch: 14 Global Step: 235650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:11:35,907-Speed 4550.85 samples/sec Loss 0.9380 Epoch: 14 Global Step: 235700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:11:46,706-Speed 4741.58 samples/sec Loss 0.9246 Epoch: 14 Global Step: 235750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:11:57,570-Speed 4712.94 samples/sec Loss 0.9425 Epoch: 14 Global Step: 235800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:12:08,617-Speed 4635.35 samples/sec Loss 0.9053 Epoch: 14 Global Step: 235850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:12:19,663-Speed 4635.34 samples/sec Loss 0.9224 Epoch: 14 Global Step: 235900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:12:30,872-Speed 4568.26 samples/sec Loss 0.9130 Epoch: 14 Global Step: 235950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:12:41,514-Speed 4811.36 samples/sec Loss 0.9209 Epoch: 14 Global Step: 236000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:13:05,705-[lfw][236000]XNorm: 
22.484318 Training: 2021-03-18 08:13:05,706-[lfw][236000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 08:13:05,706-[lfw][236000]Accuracy-Highest: 0.99817 Training: 2021-03-18 08:13:33,386-[cfp_fp][236000]XNorm: 20.750522 Training: 2021-03-18 08:13:33,386-[cfp_fp][236000]Accuracy-Flip: 0.98686+-0.00481 Training: 2021-03-18 08:13:33,386-[cfp_fp][236000]Accuracy-Highest: 0.98686 Training: 2021-03-18 08:13:57,036-[agedb_30][236000]XNorm: 22.546291 Training: 2021-03-18 08:13:57,036-[agedb_30][236000]Accuracy-Flip: 0.98283+-0.00671 Training: 2021-03-18 08:13:57,036-[agedb_30][236000]Accuracy-Highest: 0.98283 Training: 2021-03-18 08:14:07,935-Speed 592.46 samples/sec Loss 0.9191 Epoch: 14 Global Step: 236050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:14:18,831-Speed 4699.01 samples/sec Loss 0.9152 Epoch: 14 Global Step: 236100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:14:30,296-Speed 4466.35 samples/sec Loss 0.9402 Epoch: 14 Global Step: 236150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:14:41,224-Speed 4685.14 samples/sec Loss 0.9170 Epoch: 14 Global Step: 236200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:14:51,901-Speed 4795.95 samples/sec Loss 0.9325 Epoch: 14 Global Step: 236250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:15:02,836-Speed 4682.33 samples/sec Loss 0.9134 Epoch: 14 Global Step: 236300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:15:13,951-Speed 4606.84 samples/sec Loss 0.9086 Epoch: 14 Global Step: 236350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:15:24,886-Speed 4682.40 samples/sec Loss 0.9175 Epoch: 14 Global Step: 236400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:15:35,763-Speed 4707.37 samples/sec Loss 0.9125 Epoch: 14 Global Step: 236450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:15:47,601-Speed 4325.58 samples/sec Loss 0.9174 Epoch: 14 Global Step: 
236500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:15:58,568-Speed 4668.91 samples/sec Loss 0.9213 Epoch: 14 Global Step: 236550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:16:09,635-Speed 4626.47 samples/sec Loss 0.9093 Epoch: 14 Global Step: 236600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:16:20,712-Speed 4622.75 samples/sec Loss 0.9045 Epoch: 14 Global Step: 236650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:16:32,258-Speed 4434.45 samples/sec Loss 0.9289 Epoch: 14 Global Step: 236700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:16:43,207-Speed 4676.58 samples/sec Loss 0.9280 Epoch: 14 Global Step: 236750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:16:54,278-Speed 4625.15 samples/sec Loss 0.9021 Epoch: 14 Global Step: 236800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:17:05,045-Speed 4755.46 samples/sec Loss 0.9180 Epoch: 14 Global Step: 236850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:17:16,144-Speed 4613.36 samples/sec Loss 0.9047 Epoch: 14 Global Step: 236900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:17:27,180-Speed 4639.42 samples/sec Loss 0.9311 Epoch: 14 Global Step: 236950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:17:38,112-Speed 4683.93 samples/sec Loss 0.9112 Epoch: 14 Global Step: 237000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:17:48,997-Speed 4704.10 samples/sec Loss 0.9098 Epoch: 14 Global Step: 237050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:17:59,866-Speed 4711.12 samples/sec Loss 0.9127 Epoch: 14 Global Step: 237100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:18:10,909-Speed 4636.77 samples/sec Loss 0.9226 Epoch: 14 Global Step: 237150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:18:21,772-Speed 4713.47 samples/sec Loss 0.9248 Epoch: 14 Global 
Step: 237200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:18:32,829-Speed 4630.85 samples/sec Loss 0.9066 Epoch: 14 Global Step: 237250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:18:44,010-Speed 4579.20 samples/sec Loss 0.9032 Epoch: 14 Global Step: 237300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:18:54,850-Speed 4723.54 samples/sec Loss 0.9102 Epoch: 14 Global Step: 237350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:19:05,772-Speed 4688.42 samples/sec Loss 0.9040 Epoch: 14 Global Step: 237400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:19:17,500-Speed 4365.59 samples/sec Loss 0.9391 Epoch: 14 Global Step: 237450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:19:28,265-Speed 4756.40 samples/sec Loss 0.9111 Epoch: 14 Global Step: 237500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:19:39,203-Speed 4681.49 samples/sec Loss 0.9162 Epoch: 14 Global Step: 237550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:19:50,089-Speed 4703.59 samples/sec Loss 0.8982 Epoch: 14 Global Step: 237600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:20:00,872-Speed 4748.34 samples/sec Loss 0.9211 Epoch: 14 Global Step: 237650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:20:11,974-Speed 4612.22 samples/sec Loss 0.9051 Epoch: 14 Global Step: 237700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:20:22,930-Speed 4673.80 samples/sec Loss 0.9053 Epoch: 14 Global Step: 237750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:20:33,779-Speed 4719.32 samples/sec Loss 0.9146 Epoch: 14 Global Step: 237800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:20:45,602-Speed 4330.91 samples/sec Loss 0.9085 Epoch: 14 Global Step: 237850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:20:56,602-Speed 4654.47 samples/sec Loss 0.9009 Epoch: 14 
Global Step: 237900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:21:07,511-Speed 4693.92 samples/sec Loss 0.9073 Epoch: 14 Global Step: 237950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:21:18,619-Speed 4609.30 samples/sec Loss 0.9179 Epoch: 14 Global Step: 238000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:21:42,464-[lfw][238000]XNorm: 22.551650 Training: 2021-03-18 08:21:42,464-[lfw][238000]Accuracy-Flip: 0.99800+-0.00277 Training: 2021-03-18 08:21:42,464-[lfw][238000]Accuracy-Highest: 0.99817 Training: 2021-03-18 08:22:09,982-[cfp_fp][238000]XNorm: 20.895990 Training: 2021-03-18 08:22:09,982-[cfp_fp][238000]Accuracy-Flip: 0.98657+-0.00439 Training: 2021-03-18 08:22:09,982-[cfp_fp][238000]Accuracy-Highest: 0.98686 Training: 2021-03-18 08:22:33,904-[agedb_30][238000]XNorm: 22.683853 Training: 2021-03-18 08:22:33,904-[agedb_30][238000]Accuracy-Flip: 0.98200+-0.00670 Training: 2021-03-18 08:22:33,904-[agedb_30][238000]Accuracy-Highest: 0.98283 Training: 2021-03-18 08:22:44,664-Speed 595.04 samples/sec Loss 0.9007 Epoch: 14 Global Step: 238050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:22:56,661-Speed 4268.05 samples/sec Loss 0.9046 Epoch: 14 Global Step: 238100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:23:07,778-Speed 4605.69 samples/sec Loss 0.9165 Epoch: 14 Global Step: 238150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:23:18,843-Speed 4627.61 samples/sec Loss 0.9174 Epoch: 14 Global Step: 238200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:23:29,905-Speed 4628.50 samples/sec Loss 0.9320 Epoch: 14 Global Step: 238250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:23:40,869-Speed 4670.36 samples/sec Loss 0.9011 Epoch: 14 Global Step: 238300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:23:52,688-Speed 4332.10 samples/sec Loss 0.9133 Epoch: 14 Global Step: 238350 Fp16 Grad Scale: 
16384 Required: 7 hours Training: 2021-03-18 08:24:03,651-Speed 4670.55 samples/sec Loss 0.8952 Epoch: 14 Global Step: 238400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:24:14,642-Speed 4658.67 samples/sec Loss 0.9108 Epoch: 14 Global Step: 238450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:24:25,731-Speed 4617.33 samples/sec Loss 0.9144 Epoch: 14 Global Step: 238500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:24:37,696-Speed 4279.35 samples/sec Loss 0.9195 Epoch: 14 Global Step: 238550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:24:48,573-Speed 4707.76 samples/sec Loss 0.9067 Epoch: 14 Global Step: 238600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:24:59,774-Speed 4571.36 samples/sec Loss 0.9005 Epoch: 14 Global Step: 238650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:25:10,986-Speed 4566.58 samples/sec Loss 0.9077 Epoch: 14 Global Step: 238700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:25:21,992-Speed 4652.08 samples/sec Loss 0.9111 Epoch: 14 Global Step: 238750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:25:33,033-Speed 4637.90 samples/sec Loss 0.9168 Epoch: 14 Global Step: 238800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:25:43,768-Speed 4769.45 samples/sec Loss 0.9027 Epoch: 14 Global Step: 238850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:25:55,480-Speed 4371.89 samples/sec Loss 0.9070 Epoch: 14 Global Step: 238900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:26:06,414-Speed 4683.07 samples/sec Loss 0.9130 Epoch: 14 Global Step: 238950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:26:17,419-Speed 4652.62 samples/sec Loss 0.9083 Epoch: 14 Global Step: 239000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:26:28,252-Speed 4726.74 samples/sec Loss 0.9165 Epoch: 14 Global Step: 239050 Fp16 Grad 
Scale: 16384 Required: 7 hours Training: 2021-03-18 08:26:39,167-Speed 4691.04 samples/sec Loss 0.9237 Epoch: 14 Global Step: 239100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:26:50,034-Speed 4711.58 samples/sec Loss 0.9081 Epoch: 14 Global Step: 239150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:27:00,842-Speed 4737.82 samples/sec Loss 0.9136 Epoch: 14 Global Step: 239200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:27:11,944-Speed 4611.85 samples/sec Loss 0.9068 Epoch: 14 Global Step: 239250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:27:22,787-Speed 4722.34 samples/sec Loss 0.9153 Epoch: 14 Global Step: 239300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:27:33,967-Speed 4579.64 samples/sec Loss 0.9128 Epoch: 14 Global Step: 239350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:27:45,783-Speed 4333.54 samples/sec Loss 0.9117 Epoch: 14 Global Step: 239400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:27:56,671-Speed 4702.55 samples/sec Loss 0.9064 Epoch: 14 Global Step: 239450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:28:08,505-Speed 4326.79 samples/sec Loss 0.8942 Epoch: 14 Global Step: 239500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:28:19,334-Speed 4728.19 samples/sec Loss 0.9176 Epoch: 14 Global Step: 239550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:28:30,232-Speed 4698.81 samples/sec Loss 0.9178 Epoch: 14 Global Step: 239600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:28:41,312-Speed 4621.21 samples/sec Loss 0.9045 Epoch: 14 Global Step: 239650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:28:52,339-Speed 4643.49 samples/sec Loss 0.8991 Epoch: 14 Global Step: 239700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:29:03,121-Speed 4748.67 samples/sec Loss 0.9048 Epoch: 14 Global Step: 239750 Fp16 
Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:29:13,989-Speed 4711.62 samples/sec Loss 0.8959 Epoch: 14 Global Step: 239800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:29:24,937-Speed 4676.71 samples/sec Loss 0.8974 Epoch: 14 Global Step: 239850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:29:36,021-Speed 4619.66 samples/sec Loss 0.9187 Epoch: 14 Global Step: 239900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:29:47,128-Speed 4609.76 samples/sec Loss 0.9080 Epoch: 14 Global Step: 239950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:29:58,010-Speed 4705.52 samples/sec Loss 0.9006 Epoch: 14 Global Step: 240000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:30:21,899-[lfw][240000]XNorm: 22.452478 Training: 2021-03-18 08:30:21,900-[lfw][240000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-18 08:30:21,900-[lfw][240000]Accuracy-Highest: 0.99817 Training: 2021-03-18 08:30:49,384-[cfp_fp][240000]XNorm: 20.776148 Training: 2021-03-18 08:30:49,384-[cfp_fp][240000]Accuracy-Flip: 0.98586+-0.00540 Training: 2021-03-18 08:30:49,384-[cfp_fp][240000]Accuracy-Highest: 0.98686 Training: 2021-03-18 08:31:13,121-[agedb_30][240000]XNorm: 22.533606 Training: 2021-03-18 08:31:13,121-[agedb_30][240000]Accuracy-Flip: 0.98317+-0.00669 Training: 2021-03-18 08:31:13,121-[agedb_30][240000]Accuracy-Highest: 0.98317 Training: 2021-03-18 08:31:23,854-Speed 596.43 samples/sec Loss 0.9045 Epoch: 14 Global Step: 240050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:31:34,886-Speed 4641.55 samples/sec Loss 0.9092 Epoch: 14 Global Step: 240100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:31:46,170-Speed 4537.64 samples/sec Loss 0.9026 Epoch: 14 Global Step: 240150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:31:57,129-Speed 4672.16 samples/sec Loss 0.8999 Epoch: 14 Global Step: 240200 Fp16 Grad Scale: 16384 Required: 7 hours 
Training: 2021-03-18 08:32:08,122-Speed 4657.80 samples/sec Loss 0.9166 Epoch: 14 Global Step: 240250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:32:19,829-Speed 4373.71 samples/sec Loss 0.8890 Epoch: 14 Global Step: 240300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:32:30,795-Speed 4669.25 samples/sec Loss 0.8984 Epoch: 14 Global Step: 240350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:32:41,979-Speed 4578.16 samples/sec Loss 0.9023 Epoch: 14 Global Step: 240400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:32:53,154-Speed 4581.96 samples/sec Loss 0.8985 Epoch: 14 Global Step: 240450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:33:04,105-Speed 4675.87 samples/sec Loss 0.9043 Epoch: 14 Global Step: 240500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:33:14,995-Speed 4701.65 samples/sec Loss 0.9051 Epoch: 14 Global Step: 240550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:33:26,782-Speed 4343.94 samples/sec Loss 0.9077 Epoch: 14 Global Step: 240600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:33:37,693-Speed 4693.17 samples/sec Loss 0.9196 Epoch: 14 Global Step: 240650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:33:48,543-Speed 4719.16 samples/sec Loss 0.8964 Epoch: 14 Global Step: 240700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:33:59,521-Speed 4663.87 samples/sec Loss 0.8857 Epoch: 14 Global Step: 240750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:34:10,676-Speed 4590.27 samples/sec Loss 0.8987 Epoch: 14 Global Step: 240800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:34:21,307-Speed 4816.31 samples/sec Loss 0.9166 Epoch: 14 Global Step: 240850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:34:32,347-Speed 4638.00 samples/sec Loss 0.9060 Epoch: 14 Global Step: 240900 Fp16 Grad Scale: 16384 Required: 7 
hours Training: 2021-03-18 08:34:43,231-Speed 4704.59 samples/sec Loss 0.8928 Epoch: 14 Global Step: 240950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:34:54,773-Speed 4436.07 samples/sec Loss 0.8915 Epoch: 14 Global Step: 241000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:35:05,642-Speed 4710.98 samples/sec Loss 0.9039 Epoch: 14 Global Step: 241050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:35:16,436-Speed 4743.96 samples/sec Loss 0.8996 Epoch: 14 Global Step: 241100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:35:27,368-Speed 4683.64 samples/sec Loss 0.8935 Epoch: 14 Global Step: 241150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:35:38,303-Speed 4682.54 samples/sec Loss 0.9071 Epoch: 14 Global Step: 241200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:35:49,989-Speed 4381.34 samples/sec Loss 0.9076 Epoch: 14 Global Step: 241250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:36:01,112-Speed 4603.53 samples/sec Loss 0.9022 Epoch: 14 Global Step: 241300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:36:12,011-Speed 4698.27 samples/sec Loss 0.9073 Epoch: 14 Global Step: 241350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:36:22,808-Speed 4742.44 samples/sec Loss 0.9094 Epoch: 14 Global Step: 241400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:36:33,572-Speed 4756.84 samples/sec Loss 0.9305 Epoch: 14 Global Step: 241450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:36:44,490-Speed 4689.68 samples/sec Loss 0.8989 Epoch: 14 Global Step: 241500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:36:56,307-Speed 4333.18 samples/sec Loss 0.9031 Epoch: 14 Global Step: 241550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:37:07,147-Speed 4723.42 samples/sec Loss 0.8874 Epoch: 14 Global Step: 241600 Fp16 Grad Scale: 16384 Required: 
7 hours Training: 2021-03-18 08:37:18,241-Speed 4615.44 samples/sec Loss 0.9114 Epoch: 14 Global Step: 241650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:37:30,111-Speed 4313.46 samples/sec Loss 0.9024 Epoch: 14 Global Step: 241700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:37:41,155-Speed 4636.24 samples/sec Loss 0.8989 Epoch: 14 Global Step: 241750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:37:52,398-Speed 4554.18 samples/sec Loss 0.9046 Epoch: 14 Global Step: 241800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:38:03,480-Speed 4620.43 samples/sec Loss 0.9160 Epoch: 14 Global Step: 241850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:38:14,527-Speed 4635.09 samples/sec Loss 0.9157 Epoch: 14 Global Step: 241900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:38:25,722-Speed 4573.94 samples/sec Loss 0.8993 Epoch: 14 Global Step: 241950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:38:37,019-Speed 4532.21 samples/sec Loss 0.9021 Epoch: 14 Global Step: 242000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:39:00,807-[lfw][242000]XNorm: 22.504009 Training: 2021-03-18 08:39:00,807-[lfw][242000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 08:39:00,807-[lfw][242000]Accuracy-Highest: 0.99817 Training: 2021-03-18 08:39:28,329-[cfp_fp][242000]XNorm: 20.826371 Training: 2021-03-18 08:39:28,330-[cfp_fp][242000]Accuracy-Flip: 0.98671+-0.00465 Training: 2021-03-18 08:39:28,330-[cfp_fp][242000]Accuracy-Highest: 0.98686 Training: 2021-03-18 08:39:52,057-[agedb_30][242000]XNorm: 22.597715 Training: 2021-03-18 08:39:52,058-[agedb_30][242000]Accuracy-Flip: 0.98250+-0.00704 Training: 2021-03-18 08:39:52,066-[agedb_30][242000]Accuracy-Highest: 0.98317 Training: 2021-03-18 08:40:02,963-Speed 595.74 samples/sec Loss 0.9056 Epoch: 14 Global Step: 242050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 
08:40:13,850-Speed 4702.95 samples/sec Loss 0.8908 Epoch: 14 Global Step: 242100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:40:25,164-Speed 4525.65 samples/sec Loss 0.9066 Epoch: 14 Global Step: 242150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:40:37,005-Speed 4324.30 samples/sec Loss 0.9086 Epoch: 14 Global Step: 242200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:40:47,820-Speed 4734.55 samples/sec Loss 0.8964 Epoch: 14 Global Step: 242250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:40:59,403-Speed 4420.30 samples/sec Loss 0.8994 Epoch: 14 Global Step: 242300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:41:10,322-Speed 4689.56 samples/sec Loss 0.8998 Epoch: 14 Global Step: 242350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:41:21,364-Speed 4637.08 samples/sec Loss 0.9046 Epoch: 14 Global Step: 242400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:41:32,322-Speed 4672.93 samples/sec Loss 0.9111 Epoch: 14 Global Step: 242450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:41:43,285-Speed 4670.40 samples/sec Loss 0.9165 Epoch: 14 Global Step: 242500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:41:54,153-Speed 4711.25 samples/sec Loss 0.9035 Epoch: 14 Global Step: 242550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:42:05,275-Speed 4603.83 samples/sec Loss 0.8958 Epoch: 14 Global Step: 242600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:42:16,244-Speed 4668.18 samples/sec Loss 0.9199 Epoch: 14 Global Step: 242650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:42:27,330-Speed 4618.83 samples/sec Loss 0.9036 Epoch: 14 Global Step: 242700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:42:38,353-Speed 4644.89 samples/sec Loss 0.8989 Epoch: 14 Global Step: 242750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 
2021-03-18 08:42:49,385-Speed 4641.43 samples/sec Loss 0.8897 Epoch: 14 Global Step: 242800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:43:00,392-Speed 4651.63 samples/sec Loss 0.8963 Epoch: 14 Global Step: 242850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:43:11,590-Speed 4572.73 samples/sec Loss 0.9087 Epoch: 14 Global Step: 242900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:43:22,590-Speed 4654.55 samples/sec Loss 0.9123 Epoch: 14 Global Step: 242950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:43:33,468-Speed 4707.35 samples/sec Loss 0.8812 Epoch: 14 Global Step: 243000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:43:44,439-Speed 4667.25 samples/sec Loss 0.8961 Epoch: 14 Global Step: 243050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:43:55,791-Speed 4510.25 samples/sec Loss 0.9089 Epoch: 14 Global Step: 243100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:44:07,678-Speed 4307.76 samples/sec Loss 0.9196 Epoch: 14 Global Step: 243150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-18 08:44:18,950-Speed 4542.16 samples/sec Loss 0.8959 Epoch: 14 Global Step: 243200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:44:30,011-Speed 4629.31 samples/sec Loss 0.9141 Epoch: 14 Global Step: 243250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:44:40,803-Speed 4744.42 samples/sec Loss 0.9080 Epoch: 14 Global Step: 243300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:44:51,654-Speed 4719.03 samples/sec Loss 0.9094 Epoch: 14 Global Step: 243350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:45:02,808-Speed 4590.53 samples/sec Loss 0.8807 Epoch: 14 Global Step: 243400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:45:13,835-Speed 4643.43 samples/sec Loss 0.9185 Epoch: 14 Global Step: 243450 Fp16 Grad Scale: 16384 Required: 6 hours 
Training: 2021-03-18 08:45:25,778-Speed 4287.04 samples/sec Loss 0.9070 Epoch: 14 Global Step: 243500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:45:36,849-Speed 4625.11 samples/sec Loss 0.9050 Epoch: 14 Global Step: 243550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:45:47,943-Speed 4615.62 samples/sec Loss 0.9101 Epoch: 14 Global Step: 243600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:45:58,774-Speed 4727.18 samples/sec Loss 0.8992 Epoch: 14 Global Step: 243650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:46:09,752-Speed 4664.15 samples/sec Loss 0.8881 Epoch: 14 Global Step: 243700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:46:20,999-Speed 4552.47 samples/sec Loss 0.9104 Epoch: 14 Global Step: 243750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:46:31,914-Speed 4691.25 samples/sec Loss 0.8947 Epoch: 14 Global Step: 243800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:46:43,980-Speed 4243.57 samples/sec Loss 0.8951 Epoch: 14 Global Step: 243850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:46:55,072-Speed 4616.51 samples/sec Loss 0.9024 Epoch: 14 Global Step: 243900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:47:06,008-Speed 4682.16 samples/sec Loss 0.8885 Epoch: 14 Global Step: 243950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:47:16,936-Speed 4685.11 samples/sec Loss 0.9038 Epoch: 14 Global Step: 244000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:47:40,650-[lfw][244000]XNorm: 22.558241 Training: 2021-03-18 08:47:40,650-[lfw][244000]Accuracy-Flip: 0.99800+-0.00277 Training: 2021-03-18 08:47:40,650-[lfw][244000]Accuracy-Highest: 0.99817 Training: 2021-03-18 08:48:08,347-[cfp_fp][244000]XNorm: 20.941718 Training: 2021-03-18 08:48:08,348-[cfp_fp][244000]Accuracy-Flip: 0.98543+-0.00481 Training: 2021-03-18 
08:48:08,348-[cfp_fp][244000]Accuracy-Highest: 0.98686 Training: 2021-03-18 08:48:32,250-[agedb_30][244000]XNorm: 22.643840 Training: 2021-03-18 08:48:32,250-[agedb_30][244000]Accuracy-Flip: 0.98167+-0.00764 Training: 2021-03-18 08:48:32,250-[agedb_30][244000]Accuracy-Highest: 0.98317 Training: 2021-03-18 08:48:43,328-Speed 592.65 samples/sec Loss 0.9053 Epoch: 14 Global Step: 244050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:48:54,422-Speed 4615.58 samples/sec Loss 0.9128 Epoch: 14 Global Step: 244100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:49:06,221-Speed 4339.43 samples/sec Loss 0.9157 Epoch: 14 Global Step: 244150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:49:16,987-Speed 4756.23 samples/sec Loss 0.8902 Epoch: 14 Global Step: 244200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:49:28,167-Speed 4579.66 samples/sec Loss 0.9090 Epoch: 14 Global Step: 244250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:49:39,484-Speed 4524.31 samples/sec Loss 0.8982 Epoch: 14 Global Step: 244300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:49:50,460-Speed 4665.04 samples/sec Loss 0.9052 Epoch: 14 Global Step: 244350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:50:01,761-Speed 4531.09 samples/sec Loss 0.9178 Epoch: 14 Global Step: 244400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:50:13,840-Speed 4238.65 samples/sec Loss 0.9048 Epoch: 14 Global Step: 244450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:50:25,530-Speed 4380.24 samples/sec Loss 0.8937 Epoch: 14 Global Step: 244500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:50:36,418-Speed 4702.70 samples/sec Loss 0.9060 Epoch: 14 Global Step: 244550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:50:47,474-Speed 4631.38 samples/sec Loss 0.8985 Epoch: 14 Global Step: 244600 Fp16 Grad Scale: 16384 Required: 6 
hours Training: 2021-03-18 08:50:58,741-Speed 4544.34 samples/sec Loss 0.8848 Epoch: 14 Global Step: 244650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:51:09,643-Speed 4696.31 samples/sec Loss 0.8899 Epoch: 14 Global Step: 244700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:51:20,573-Speed 4684.61 samples/sec Loss 0.8994 Epoch: 14 Global Step: 244750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:51:31,730-Speed 4589.21 samples/sec Loss 0.9033 Epoch: 14 Global Step: 244800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:51:42,771-Speed 4637.80 samples/sec Loss 0.9036 Epoch: 14 Global Step: 244850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:51:53,752-Speed 4662.59 samples/sec Loss 0.9050 Epoch: 14 Global Step: 244900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:52:04,767-Speed 4648.89 samples/sec Loss 0.9045 Epoch: 14 Global Step: 244950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:52:15,707-Speed 4680.34 samples/sec Loss 0.9035 Epoch: 14 Global Step: 245000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:52:27,439-Speed 4364.15 samples/sec Loss 0.9091 Epoch: 14 Global Step: 245050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:52:38,240-Speed 4740.55 samples/sec Loss 0.8931 Epoch: 14 Global Step: 245100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:52:50,165-Speed 4293.78 samples/sec Loss 0.9054 Epoch: 14 Global Step: 245150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:53:01,227-Speed 4628.86 samples/sec Loss 0.9104 Epoch: 14 Global Step: 245200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:53:12,141-Speed 4691.27 samples/sec Loss 0.9050 Epoch: 14 Global Step: 245250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:53:23,037-Speed 4699.42 samples/sec Loss 0.8895 Epoch: 14 Global Step: 245300 Fp16 Grad Scale: 16384 Required: 
6 hours Training: 2021-03-18 08:53:34,138-Speed 4612.52 samples/sec Loss 0.8977 Epoch: 14 Global Step: 245350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:53:45,005-Speed 4711.62 samples/sec Loss 0.9027 Epoch: 14 Global Step: 245400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:53:55,930-Speed 4686.73 samples/sec Loss 0.8942 Epoch: 14 Global Step: 245450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:54:06,768-Speed 4724.63 samples/sec Loss 0.9041 Epoch: 14 Global Step: 245500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:54:17,895-Speed 4601.53 samples/sec Loss 0.9032 Epoch: 14 Global Step: 245550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:54:28,939-Speed 4636.45 samples/sec Loss 0.8939 Epoch: 14 Global Step: 245600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:54:39,728-Speed 4745.85 samples/sec Loss 0.9157 Epoch: 14 Global Step: 245650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:54:50,619-Speed 4701.48 samples/sec Loss 0.8853 Epoch: 14 Global Step: 245700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:55:01,729-Speed 4608.37 samples/sec Loss 0.8824 Epoch: 14 Global Step: 245750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:55:12,607-Speed 4707.25 samples/sec Loss 0.8849 Epoch: 14 Global Step: 245800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:55:24,364-Speed 4355.09 samples/sec Loss 0.9001 Epoch: 14 Global Step: 245850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:55:35,323-Speed 4672.35 samples/sec Loss 0.8951 Epoch: 14 Global Step: 245900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:55:46,560-Speed 4556.31 samples/sec Loss 0.9101 Epoch: 14 Global Step: 245950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:55:57,780-Speed 4563.75 samples/sec Loss 0.9105 Epoch: 14 Global Step: 246000 Fp16 Grad Scale: 16384 
Required: 6 hours Training: 2021-03-18 08:56:21,927-[lfw][246000]XNorm: 22.553324 Training: 2021-03-18 08:56:21,928-[lfw][246000]Accuracy-Flip: 0.99800+-0.00277 Training: 2021-03-18 08:56:21,928-[lfw][246000]Accuracy-Highest: 0.99817 Training: 2021-03-18 08:56:49,457-[cfp_fp][246000]XNorm: 20.960151 Training: 2021-03-18 08:56:49,457-[cfp_fp][246000]Accuracy-Flip: 0.98514+-0.00516 Training: 2021-03-18 08:56:49,457-[cfp_fp][246000]Accuracy-Highest: 0.98686 Training: 2021-03-18 08:57:13,211-[agedb_30][246000]XNorm: 22.653225 Training: 2021-03-18 08:57:13,211-[agedb_30][246000]Accuracy-Flip: 0.98267+-0.00633 Training: 2021-03-18 08:57:13,211-[agedb_30][246000]Accuracy-Highest: 0.98317 Training: 2021-03-18 08:57:23,826-Speed 595.04 samples/sec Loss 0.8918 Epoch: 14 Global Step: 246050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:57:34,871-Speed 4635.80 samples/sec Loss 0.8924 Epoch: 14 Global Step: 246100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:57:45,733-Speed 4713.89 samples/sec Loss 0.8902 Epoch: 14 Global Step: 246150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:57:56,824-Speed 4616.52 samples/sec Loss 0.9070 Epoch: 14 Global Step: 246200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:58:08,168-Speed 4513.44 samples/sec Loss 0.8884 Epoch: 14 Global Step: 246250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:58:19,040-Speed 4709.73 samples/sec Loss 0.9085 Epoch: 14 Global Step: 246300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:58:29,970-Speed 4684.43 samples/sec Loss 0.8957 Epoch: 14 Global Step: 246350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:58:41,617-Speed 4396.51 samples/sec Loss 0.8969 Epoch: 14 Global Step: 246400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:58:52,831-Speed 4565.90 samples/sec Loss 0.8843 Epoch: 14 Global Step: 246450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 
08:59:03,666-Speed 4725.89 samples/sec Loss 0.8986 Epoch: 14 Global Step: 246500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:59:14,661-Speed 4656.87 samples/sec Loss 0.8928 Epoch: 14 Global Step: 246550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:59:25,856-Speed 4573.79 samples/sec Loss 0.8994 Epoch: 14 Global Step: 246600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:59:37,346-Speed 4456.38 samples/sec Loss 0.8829 Epoch: 14 Global Step: 246650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:59:48,286-Speed 4680.35 samples/sec Loss 0.8908 Epoch: 14 Global Step: 246700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 08:59:59,154-Speed 4711.39 samples/sec Loss 0.8949 Epoch: 14 Global Step: 246750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:00:10,242-Speed 4617.86 samples/sec Loss 0.9020 Epoch: 14 Global Step: 246800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:00:21,300-Speed 4630.30 samples/sec Loss 0.9159 Epoch: 14 Global Step: 246850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:00:32,268-Speed 4668.35 samples/sec Loss 0.9031 Epoch: 14 Global Step: 246900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:00:43,214-Speed 4677.96 samples/sec Loss 0.9029 Epoch: 14 Global Step: 246950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:00:54,109-Speed 4700.15 samples/sec Loss 0.9036 Epoch: 14 Global Step: 247000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:01:04,931-Speed 4730.97 samples/sec Loss 0.8979 Epoch: 14 Global Step: 247050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:01:16,713-Speed 4345.89 samples/sec Loss 0.8859 Epoch: 14 Global Step: 247100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:01:27,681-Speed 4668.44 samples/sec Loss 0.9019 Epoch: 14 Global Step: 247150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-18 09:01:38,630-Speed 4676.41 samples/sec Loss 0.9038 Epoch: 14 Global Step: 247200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:01:49,428-Speed 4742.25 samples/sec Loss 0.8848 Epoch: 14 Global Step: 247250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:02:01,144-Speed 4370.08 samples/sec Loss 0.9048 Epoch: 14 Global Step: 247300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:02:12,740-Speed 4415.75 samples/sec Loss 0.9078 Epoch: 14 Global Step: 247350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:02:23,685-Speed 4678.27 samples/sec Loss 0.8873 Epoch: 14 Global Step: 247400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:02:34,769-Speed 4619.48 samples/sec Loss 0.8986 Epoch: 14 Global Step: 247450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:02:45,461-Speed 4788.91 samples/sec Loss 0.9011 Epoch: 14 Global Step: 247500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:02:56,180-Speed 4776.87 samples/sec Loss 0.8946 Epoch: 14 Global Step: 247550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:03:07,117-Speed 4681.58 samples/sec Loss 0.8890 Epoch: 14 Global Step: 247600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:03:17,873-Speed 4760.81 samples/sec Loss 0.9019 Epoch: 14 Global Step: 247650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:03:28,985-Speed 4608.01 samples/sec Loss 0.9179 Epoch: 14 Global Step: 247700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:03:39,936-Speed 4675.45 samples/sec Loss 0.8853 Epoch: 14 Global Step: 247750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:03:50,950-Speed 4648.61 samples/sec Loss 0.9114 Epoch: 14 Global Step: 247800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:04:01,825-Speed 4708.48 samples/sec Loss 0.9068 Epoch: 14 Global Step: 247850 Fp16 Grad Scale: 16384 Required: 6 hours 
Training: 2021-03-18 09:04:13,755-Speed 4292.15 samples/sec Loss 0.8853 Epoch: 14 Global Step: 247900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:04:24,594-Speed 4723.99 samples/sec Loss 0.8949 Epoch: 14 Global Step: 247950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:04:36,523-Speed 4292.19 samples/sec Loss 0.8942 Epoch: 14 Global Step: 248000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:05:00,619-[lfw][248000]XNorm: 22.483837 Training: 2021-03-18 09:05:00,620-[lfw][248000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-18 09:05:00,620-[lfw][248000]Accuracy-Highest: 0.99817 Training: 2021-03-18 09:05:28,537-[cfp_fp][248000]XNorm: 20.933383 Training: 2021-03-18 09:05:28,537-[cfp_fp][248000]Accuracy-Flip: 0.98529+-0.00478 Training: 2021-03-18 09:05:28,537-[cfp_fp][248000]Accuracy-Highest: 0.98686 Training: 2021-03-18 09:05:52,338-[agedb_30][248000]XNorm: 22.628704 Training: 2021-03-18 09:05:52,338-[agedb_30][248000]Accuracy-Flip: 0.98167+-0.00654 Training: 2021-03-18 09:05:52,338-[agedb_30][248000]Accuracy-Highest: 0.98317 Training: 2021-03-18 09:06:03,225-Speed 590.53 samples/sec Loss 0.9018 Epoch: 14 Global Step: 248050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:06:14,094-Speed 4711.08 samples/sec Loss 0.8957 Epoch: 14 Global Step: 248100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:06:25,037-Speed 4678.80 samples/sec Loss 0.8987 Epoch: 14 Global Step: 248150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:06:35,785-Speed 4764.06 samples/sec Loss 0.9111 Epoch: 14 Global Step: 248200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:06:46,899-Speed 4607.11 samples/sec Loss 0.8870 Epoch: 14 Global Step: 248250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:06:57,933-Speed 4640.47 samples/sec Loss 0.9087 Epoch: 14 Global Step: 248300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:07:09,020-Speed 
4618.39 samples/sec Loss 0.8998 Epoch: 14 Global Step: 248350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:07:19,895-Speed 4708.22 samples/sec Loss 0.8791 Epoch: 14 Global Step: 248400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:07:30,795-Speed 4697.58 samples/sec Loss 0.8822 Epoch: 14 Global Step: 248450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:07:41,756-Speed 4671.45 samples/sec Loss 0.9019 Epoch: 14 Global Step: 248500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:07:52,736-Speed 4663.06 samples/sec Loss 0.9037 Epoch: 14 Global Step: 248550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:08:03,705-Speed 4668.13 samples/sec Loss 0.9051 Epoch: 14 Global Step: 248600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:08:14,763-Speed 4630.31 samples/sec Loss 0.9015 Epoch: 14 Global Step: 248650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:08:25,824-Speed 4629.17 samples/sec Loss 0.9075 Epoch: 14 Global Step: 248700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:08:37,494-Speed 4387.44 samples/sec Loss 0.8909 Epoch: 14 Global Step: 248750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:08:48,565-Speed 4625.22 samples/sec Loss 0.8979 Epoch: 14 Global Step: 248800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:08:59,469-Speed 4695.66 samples/sec Loss 0.9045 Epoch: 14 Global Step: 248850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:09:10,582-Speed 4607.63 samples/sec Loss 0.8966 Epoch: 14 Global Step: 248900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:09:21,654-Speed 4624.57 samples/sec Loss 0.9013 Epoch: 14 Global Step: 248950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:09:32,625-Speed 4666.75 samples/sec Loss 0.9056 Epoch: 14 Global Step: 249000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 
09:09:43,763-Speed 4597.08 samples/sec Loss 0.8831 Epoch: 14 Global Step: 249050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:09:54,872-Speed 4609.05 samples/sec Loss 0.8952 Epoch: 14 Global Step: 249100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:10:05,835-Speed 4670.64 samples/sec Loss 0.9114 Epoch: 14 Global Step: 249150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:10:16,833-Speed 4655.84 samples/sec Loss 0.8904 Epoch: 14 Global Step: 249200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:10:27,941-Speed 4609.60 samples/sec Loss 0.8926 Epoch: 14 Global Step: 249250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:10:39,671-Speed 4365.08 samples/sec Loss 0.9052 Epoch: 14 Global Step: 249300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:10:50,691-Speed 4646.53 samples/sec Loss 0.8959 Epoch: 14 Global Step: 249350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:11:01,780-Speed 4617.26 samples/sec Loss 0.8951 Epoch: 14 Global Step: 249400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:11:12,570-Speed 4745.82 samples/sec Loss 0.9161 Epoch: 14 Global Step: 249450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:11:23,554-Speed 4661.56 samples/sec Loss 0.8815 Epoch: 14 Global Step: 249500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:11:34,509-Speed 4673.66 samples/sec Loss 0.9019 Epoch: 14 Global Step: 249550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:11:45,213-Speed 4783.78 samples/sec Loss 0.9093 Epoch: 14 Global Step: 249600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:11:57,072-Speed 4317.57 samples/sec Loss 0.9057 Epoch: 14 Global Step: 249650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:12:07,871-Speed 4741.58 samples/sec Loss 0.8938 Epoch: 14 Global Step: 249700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-18 09:12:18,900-Speed 4642.40 samples/sec Loss 0.9150 Epoch: 14 Global Step: 249750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:12:30,033-Speed 4599.58 samples/sec Loss 0.8929 Epoch: 14 Global Step: 249800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:12:41,026-Speed 4658.01 samples/sec Loss 0.8977 Epoch: 14 Global Step: 249850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:12:52,072-Speed 4635.23 samples/sec Loss 0.8978 Epoch: 14 Global Step: 249900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:13:03,837-Speed 4352.17 samples/sec Loss 0.8865 Epoch: 14 Global Step: 249950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:13:14,757-Speed 4688.85 samples/sec Loss 0.8969 Epoch: 14 Global Step: 250000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:13:38,529-[lfw][250000]XNorm: 22.499173 Training: 2021-03-18 09:13:38,530-[lfw][250000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-18 09:13:38,530-[lfw][250000]Accuracy-Highest: 0.99817 Training: 2021-03-18 09:14:06,049-[cfp_fp][250000]XNorm: 20.801161 Training: 2021-03-18 09:14:06,050-[cfp_fp][250000]Accuracy-Flip: 0.98586+-0.00463 Training: 2021-03-18 09:14:06,050-[cfp_fp][250000]Accuracy-Highest: 0.98686 Training: 2021-03-18 09:14:29,907-[agedb_30][250000]XNorm: 22.619705 Training: 2021-03-18 09:14:29,907-[agedb_30][250000]Accuracy-Flip: 0.98250+-0.00544 Training: 2021-03-18 09:14:29,907-[agedb_30][250000]Accuracy-Highest: 0.98317 Training: 2021-03-18 09:14:40,663-Speed 596.01 samples/sec Loss 0.8981 Epoch: 14 Global Step: 250050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:14:51,401-Speed 4768.46 samples/sec Loss 0.8903 Epoch: 14 Global Step: 250100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:15:03,225-Speed 4330.33 samples/sec Loss 0.9120 Epoch: 14 Global Step: 250150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:15:15,003-Speed 4347.22 
samples/sec Loss 0.8870 Epoch: 14 Global Step: 250200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:15:25,959-Speed 4673.73 samples/sec Loss 0.8914 Epoch: 14 Global Step: 250250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:15:36,927-Speed 4668.23 samples/sec Loss 0.8845 Epoch: 14 Global Step: 250300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:15:47,768-Speed 4723.00 samples/sec Loss 0.8988 Epoch: 14 Global Step: 250350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:16:11,947-Speed 2117.60 samples/sec Loss 0.8753 Epoch: 15 Global Step: 250400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:16:23,995-Speed 4249.98 samples/sec Loss 0.8795 Epoch: 15 Global Step: 250450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:16:35,350-Speed 4509.29 samples/sec Loss 0.8781 Epoch: 15 Global Step: 250500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:16:47,160-Speed 4335.68 samples/sec Loss 0.8708 Epoch: 15 Global Step: 250550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:16:58,593-Speed 4478.48 samples/sec Loss 0.8827 Epoch: 15 Global Step: 250600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:17:09,950-Speed 4508.41 samples/sec Loss 0.8832 Epoch: 15 Global Step: 250650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:17:22,206-Speed 4177.99 samples/sec Loss 0.8904 Epoch: 15 Global Step: 250700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:17:34,336-Speed 4221.13 samples/sec Loss 0.8823 Epoch: 15 Global Step: 250750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:17:45,779-Speed 4474.53 samples/sec Loss 0.9061 Epoch: 15 Global Step: 250800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:17:56,945-Speed 4585.99 samples/sec Loss 0.8768 Epoch: 15 Global Step: 250850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:18:08,419-Speed 
4462.45 samples/sec Loss 0.8932 Epoch: 15 Global Step: 250900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:18:19,622-Speed 4570.58 samples/sec Loss 0.8929 Epoch: 15 Global Step: 250950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:18:31,011-Speed 4495.88 samples/sec Loss 0.8599 Epoch: 15 Global Step: 251000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:18:42,551-Speed 4436.99 samples/sec Loss 0.8889 Epoch: 15 Global Step: 251050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:18:53,808-Speed 4548.39 samples/sec Loss 0.8950 Epoch: 15 Global Step: 251100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:19:05,180-Speed 4502.74 samples/sec Loss 0.8774 Epoch: 15 Global Step: 251150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:19:16,408-Speed 4560.05 samples/sec Loss 0.8785 Epoch: 15 Global Step: 251200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:19:27,449-Speed 4637.84 samples/sec Loss 0.8730 Epoch: 15 Global Step: 251250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:19:38,673-Speed 4562.07 samples/sec Loss 0.8777 Epoch: 15 Global Step: 251300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:19:50,058-Speed 4497.43 samples/sec Loss 0.8792 Epoch: 15 Global Step: 251350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:20:01,812-Speed 4356.20 samples/sec Loss 0.8836 Epoch: 15 Global Step: 251400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:20:13,173-Speed 4506.95 samples/sec Loss 0.8880 Epoch: 15 Global Step: 251450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:20:24,850-Speed 4385.06 samples/sec Loss 0.8989 Epoch: 15 Global Step: 251500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:20:36,190-Speed 4515.04 samples/sec Loss 0.8818 Epoch: 15 Global Step: 251550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 
09:20:47,652-Speed 4467.31 samples/sec Loss 0.9075 Epoch: 15 Global Step: 251600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:20:59,582-Speed 4292.04 samples/sec Loss 0.8768 Epoch: 15 Global Step: 251650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:21:11,065-Speed 4459.26 samples/sec Loss 0.8762 Epoch: 15 Global Step: 251700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:21:22,439-Speed 4501.53 samples/sec Loss 0.8837 Epoch: 15 Global Step: 251750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:21:34,024-Speed 4420.00 samples/sec Loss 0.8777 Epoch: 15 Global Step: 251800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:21:45,469-Speed 4473.58 samples/sec Loss 0.8699 Epoch: 15 Global Step: 251850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:21:57,172-Speed 4375.49 samples/sec Loss 0.8813 Epoch: 15 Global Step: 251900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:22:08,778-Speed 4411.65 samples/sec Loss 0.8828 Epoch: 15 Global Step: 251950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:22:20,194-Speed 4485.22 samples/sec Loss 0.8766 Epoch: 15 Global Step: 252000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:22:44,345-[lfw][252000]XNorm: 22.482675 Training: 2021-03-18 09:22:44,345-[lfw][252000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 09:22:44,347-[lfw][252000]Accuracy-Highest: 0.99817 Training: 2021-03-18 09:23:11,815-[cfp_fp][252000]XNorm: 20.854595 Training: 2021-03-18 09:23:11,815-[cfp_fp][252000]Accuracy-Flip: 0.98629+-0.00475 Training: 2021-03-18 09:23:11,815-[cfp_fp][252000]Accuracy-Highest: 0.98686 Training: 2021-03-18 09:23:35,601-[agedb_30][252000]XNorm: 22.577558 Training: 2021-03-18 09:23:35,602-[agedb_30][252000]Accuracy-Flip: 0.98267+-0.00716 Training: 2021-03-18 09:23:35,602-[agedb_30][252000]Accuracy-Highest: 0.98317 Training: 2021-03-18 09:23:46,528-Speed 593.05 samples/sec 
Loss 0.8675 Epoch: 15 Global Step: 252050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:23:58,602-Speed 4240.94 samples/sec Loss 0.8909 Epoch: 15 Global Step: 252100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:24:09,916-Speed 4525.76 samples/sec Loss 0.8865 Epoch: 15 Global Step: 252150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:24:21,217-Speed 4530.77 samples/sec Loss 0.8766 Epoch: 15 Global Step: 252200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:24:32,670-Speed 4470.63 samples/sec Loss 0.8835 Epoch: 15 Global Step: 252250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:24:43,919-Speed 4551.92 samples/sec Loss 0.8787 Epoch: 15 Global Step: 252300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:24:55,054-Speed 4598.08 samples/sec Loss 0.8788 Epoch: 15 Global Step: 252350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:25:06,500-Speed 4473.64 samples/sec Loss 0.8603 Epoch: 15 Global Step: 252400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:25:18,803-Speed 4161.59 samples/sec Loss 0.8727 Epoch: 15 Global Step: 252450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:25:30,086-Speed 4538.45 samples/sec Loss 0.8725 Epoch: 15 Global Step: 252500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:25:41,611-Speed 4442.60 samples/sec Loss 0.8827 Epoch: 15 Global Step: 252550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:25:52,998-Speed 4496.73 samples/sec Loss 0.8743 Epoch: 15 Global Step: 252600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:26:04,437-Speed 4476.24 samples/sec Loss 0.8806 Epoch: 15 Global Step: 252650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:26:15,779-Speed 4514.42 samples/sec Loss 0.8916 Epoch: 15 Global Step: 252700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:26:27,087-Speed 4527.95 
samples/sec Loss 0.8844 Epoch: 15 Global Step: 252750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:26:38,309-Speed 4562.78 samples/sec Loss 0.8888 Epoch: 15 Global Step: 252800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:26:49,754-Speed 4473.77 samples/sec Loss 0.8761 Epoch: 15 Global Step: 252850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:27:01,954-Speed 4196.95 samples/sec Loss 0.8877 Epoch: 15 Global Step: 252900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:27:14,257-Speed 4161.60 samples/sec Loss 0.8836 Epoch: 15 Global Step: 252950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:27:26,664-Speed 4126.93 samples/sec Loss 0.8823 Epoch: 15 Global Step: 253000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:27:37,654-Speed 4659.25 samples/sec Loss 0.8800 Epoch: 15 Global Step: 253050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:27:48,697-Speed 4636.85 samples/sec Loss 0.8771 Epoch: 15 Global Step: 253100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:28:00,124-Speed 4480.64 samples/sec Loss 0.8709 Epoch: 15 Global Step: 253150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:28:11,399-Speed 4541.33 samples/sec Loss 0.8923 Epoch: 15 Global Step: 253200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:28:22,968-Speed 4425.81 samples/sec Loss 0.8759 Epoch: 15 Global Step: 253250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:28:34,266-Speed 4532.09 samples/sec Loss 0.8962 Epoch: 15 Global Step: 253300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:28:45,547-Speed 4539.00 samples/sec Loss 0.8672 Epoch: 15 Global Step: 253350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:28:56,892-Speed 4513.62 samples/sec Loss 0.8854 Epoch: 15 Global Step: 253400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:29:08,387-Speed 
4454.01 samples/sec Loss 0.8665 Epoch: 15 Global Step: 253450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:29:19,910-Speed 4443.95 samples/sec Loss 0.8803 Epoch: 15 Global Step: 253500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:29:31,246-Speed 4517.03 samples/sec Loss 0.8792 Epoch: 15 Global Step: 253550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:29:44,139-Speed 3971.46 samples/sec Loss 0.8942 Epoch: 15 Global Step: 253600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:29:55,292-Speed 4590.91 samples/sec Loss 0.8835 Epoch: 15 Global Step: 253650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:30:06,470-Speed 4580.63 samples/sec Loss 0.8983 Epoch: 15 Global Step: 253700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:30:17,525-Speed 4631.90 samples/sec Loss 0.8941 Epoch: 15 Global Step: 253750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:30:28,827-Speed 4530.23 samples/sec Loss 0.8660 Epoch: 15 Global Step: 253800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:30:39,981-Speed 4590.76 samples/sec Loss 0.8752 Epoch: 15 Global Step: 253850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:30:51,399-Speed 4484.44 samples/sec Loss 0.8908 Epoch: 15 Global Step: 253900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:31:02,525-Speed 4602.20 samples/sec Loss 0.8800 Epoch: 15 Global Step: 253950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:31:13,914-Speed 4495.73 samples/sec Loss 0.8865 Epoch: 15 Global Step: 254000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:31:37,561-[lfw][254000]XNorm: 22.511161 Training: 2021-03-18 09:31:37,562-[lfw][254000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-18 09:31:37,562-[lfw][254000]Accuracy-Highest: 0.99817 Training: 2021-03-18 09:32:04,923-[cfp_fp][254000]XNorm: 20.898707 Training: 2021-03-18 
09:32:04,923-[cfp_fp][254000]Accuracy-Flip: 0.98586+-0.00555 Training: 2021-03-18 09:32:04,923-[cfp_fp][254000]Accuracy-Highest: 0.98686 Training: 2021-03-18 09:32:28,527-[agedb_30][254000]XNorm: 22.600917 Training: 2021-03-18 09:32:28,527-[agedb_30][254000]Accuracy-Flip: 0.98317+-0.00673 Training: 2021-03-18 09:32:28,527-[agedb_30][254000]Accuracy-Highest: 0.98317 Training: 2021-03-18 09:32:39,892-Speed 595.50 samples/sec Loss 0.8790 Epoch: 15 Global Step: 254050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:32:51,509-Speed 4407.84 samples/sec Loss 0.8891 Epoch: 15 Global Step: 254100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:33:02,577-Speed 4626.14 samples/sec Loss 0.8794 Epoch: 15 Global Step: 254150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:33:14,110-Speed 4440.30 samples/sec Loss 0.8949 Epoch: 15 Global Step: 254200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:33:25,397-Speed 4536.45 samples/sec Loss 0.8885 Epoch: 15 Global Step: 254250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:33:36,746-Speed 4511.69 samples/sec Loss 0.8927 Epoch: 15 Global Step: 254300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:33:47,776-Speed 4642.14 samples/sec Loss 0.8828 Epoch: 15 Global Step: 254350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:33:58,822-Speed 4635.31 samples/sec Loss 0.8832 Epoch: 15 Global Step: 254400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:34:10,721-Speed 4303.53 samples/sec Loss 0.8762 Epoch: 15 Global Step: 254450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:34:22,109-Speed 4496.07 samples/sec Loss 0.8784 Epoch: 15 Global Step: 254500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:34:33,312-Speed 4570.79 samples/sec Loss 0.8960 Epoch: 15 Global Step: 254550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:34:45,071-Speed 4354.17 samples/sec 
Loss 0.8823 Epoch: 15 Global Step: 254600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:34:56,323-Speed 4550.54 samples/sec Loss 0.8665 Epoch: 15 Global Step: 254650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:35:07,738-Speed 4485.63 samples/sec Loss 0.8904 Epoch: 15 Global Step: 254700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:35:19,273-Speed 4439.53 samples/sec Loss 0.8818 Epoch: 15 Global Step: 254750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:35:30,339-Speed 4627.09 samples/sec Loss 0.8701 Epoch: 15 Global Step: 254800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:35:41,735-Speed 4493.18 samples/sec Loss 0.8731 Epoch: 15 Global Step: 254850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:35:53,001-Speed 4544.75 samples/sec Loss 0.8780 Epoch: 15 Global Step: 254900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:36:05,190-Speed 4200.73 samples/sec Loss 0.8825 Epoch: 15 Global Step: 254950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:36:16,501-Speed 4526.88 samples/sec Loss 0.8774 Epoch: 15 Global Step: 255000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:36:28,259-Speed 4355.04 samples/sec Loss 0.8824 Epoch: 15 Global Step: 255050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:36:39,200-Speed 4679.60 samples/sec Loss 0.8612 Epoch: 15 Global Step: 255100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:36:50,408-Speed 4568.49 samples/sec Loss 0.8641 Epoch: 15 Global Step: 255150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:37:02,212-Speed 4337.84 samples/sec Loss 0.8819 Epoch: 15 Global Step: 255200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:37:13,376-Speed 4586.53 samples/sec Loss 0.8780 Epoch: 15 Global Step: 255250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:37:24,569-Speed 4574.33 
samples/sec Loss 0.8717 Epoch: 15 Global Step: 255300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:37:36,191-Speed 4405.97 samples/sec Loss 0.8809 Epoch: 15 Global Step: 255350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:37:47,666-Speed 4462.04 samples/sec Loss 0.8803 Epoch: 15 Global Step: 255400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:37:59,246-Speed 4421.41 samples/sec Loss 0.8926 Epoch: 15 Global Step: 255450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:38:10,220-Speed 4666.05 samples/sec Loss 0.8782 Epoch: 15 Global Step: 255500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:38:21,048-Speed 4728.82 samples/sec Loss 0.8594 Epoch: 15 Global Step: 255550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:38:33,365-Speed 4157.15 samples/sec Loss 0.8861 Epoch: 15 Global Step: 255600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:38:44,741-Speed 4500.91 samples/sec Loss 0.8799 Epoch: 15 Global Step: 255650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:38:56,106-Speed 4505.66 samples/sec Loss 0.8827 Epoch: 15 Global Step: 255700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:39:08,411-Speed 4161.28 samples/sec Loss 0.8840 Epoch: 15 Global Step: 255750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:39:19,541-Speed 4600.45 samples/sec Loss 0.8765 Epoch: 15 Global Step: 255800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:39:31,476-Speed 4290.20 samples/sec Loss 0.8731 Epoch: 15 Global Step: 255850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:39:42,474-Speed 4655.40 samples/sec Loss 0.8898 Epoch: 15 Global Step: 255900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:39:53,529-Speed 4631.95 samples/sec Loss 0.8665 Epoch: 15 Global Step: 255950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:40:04,652-Speed 
4603.41 samples/sec Loss 0.8831 Epoch: 15 Global Step: 256000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:40:29,365-[lfw][256000]XNorm: 22.579679 Training: 2021-03-18 09:40:29,366-[lfw][256000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-18 09:40:29,366-[lfw][256000]Accuracy-Highest: 0.99817 Training: 2021-03-18 09:40:56,884-[cfp_fp][256000]XNorm: 20.922217 Training: 2021-03-18 09:40:56,884-[cfp_fp][256000]Accuracy-Flip: 0.98643+-0.00516 Training: 2021-03-18 09:40:56,884-[cfp_fp][256000]Accuracy-Highest: 0.98686 Training: 2021-03-18 09:41:20,623-[agedb_30][256000]XNorm: 22.715832 Training: 2021-03-18 09:41:20,624-[agedb_30][256000]Accuracy-Flip: 0.98167+-0.00650 Training: 2021-03-18 09:41:20,624-[agedb_30][256000]Accuracy-Highest: 0.98317 Training: 2021-03-18 09:41:31,596-Speed 588.89 samples/sec Loss 0.9093 Epoch: 15 Global Step: 256050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:41:42,744-Speed 4592.95 samples/sec Loss 0.8676 Epoch: 15 Global Step: 256100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:41:53,847-Speed 4611.76 samples/sec Loss 0.8862 Epoch: 15 Global Step: 256150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:42:04,825-Speed 4664.39 samples/sec Loss 0.8910 Epoch: 15 Global Step: 256200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:42:15,843-Speed 4647.28 samples/sec Loss 0.8974 Epoch: 15 Global Step: 256250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:42:26,997-Speed 4590.50 samples/sec Loss 0.9020 Epoch: 15 Global Step: 256300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:42:38,179-Speed 4579.37 samples/sec Loss 0.8529 Epoch: 15 Global Step: 256350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:42:51,052-Speed 3977.55 samples/sec Loss 0.8897 Epoch: 15 Global Step: 256400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:43:02,076-Speed 4644.41 samples/sec Loss 0.8833 Epoch: 15 
Global Step: 256450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:43:13,300-Speed 4562.04 samples/sec Loss 0.8684 Epoch: 15 Global Step: 256500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:43:24,504-Speed 4570.33 samples/sec Loss 0.8743 Epoch: 15 Global Step: 256550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:43:35,890-Speed 4497.13 samples/sec Loss 0.8873 Epoch: 15 Global Step: 256600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:43:47,101-Speed 4567.07 samples/sec Loss 0.8879 Epoch: 15 Global Step: 256650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:43:58,389-Speed 4535.88 samples/sec Loss 0.8814 Epoch: 15 Global Step: 256700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:44:09,719-Speed 4519.25 samples/sec Loss 0.8732 Epoch: 15 Global Step: 256750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:44:21,244-Speed 4442.78 samples/sec Loss 0.8731 Epoch: 15 Global Step: 256800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:44:32,343-Speed 4613.40 samples/sec Loss 0.8739 Epoch: 15 Global Step: 256850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:44:43,565-Speed 4562.65 samples/sec Loss 0.8808 Epoch: 15 Global Step: 256900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:44:54,702-Speed 4597.77 samples/sec Loss 0.8839 Epoch: 15 Global Step: 256950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:45:05,868-Speed 4585.75 samples/sec Loss 0.8831 Epoch: 15 Global Step: 257000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:45:17,316-Speed 4472.79 samples/sec Loss 0.8817 Epoch: 15 Global Step: 257050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:45:28,561-Speed 4553.59 samples/sec Loss 0.8740 Epoch: 15 Global Step: 257100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:45:40,025-Speed 4466.32 samples/sec Loss 0.8799 Epoch: 
15 Global Step: 257150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:45:51,874-Speed 4321.38 samples/sec Loss 0.8701 Epoch: 15 Global Step: 257200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-18 09:46:03,289-Speed 4485.51 samples/sec Loss 0.8933 Epoch: 15 Global Step: 257250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:46:14,392-Speed 4611.62 samples/sec Loss 0.8749 Epoch: 15 Global Step: 257300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:46:25,798-Speed 4489.16 samples/sec Loss 0.8820 Epoch: 15 Global Step: 257350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:46:37,042-Speed 4554.21 samples/sec Loss 0.8795 Epoch: 15 Global Step: 257400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:46:48,457-Speed 4485.56 samples/sec Loss 0.8723 Epoch: 15 Global Step: 257450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:46:59,729-Speed 4542.44 samples/sec Loss 0.8872 Epoch: 15 Global Step: 257500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:47:11,149-Speed 4484.03 samples/sec Loss 0.8839 Epoch: 15 Global Step: 257550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:47:22,286-Speed 4597.54 samples/sec Loss 0.8778 Epoch: 15 Global Step: 257600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:47:33,719-Speed 4478.63 samples/sec Loss 0.8713 Epoch: 15 Global Step: 257650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:47:45,509-Speed 4342.96 samples/sec Loss 0.8666 Epoch: 15 Global Step: 257700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:47:56,708-Speed 4571.98 samples/sec Loss 0.8803 Epoch: 15 Global Step: 257750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:48:07,817-Speed 4609.22 samples/sec Loss 0.8851 Epoch: 15 Global Step: 257800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:48:18,755-Speed 4681.37 samples/sec Loss 0.8941 
Epoch: 15 Global Step: 257850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:48:29,956-Speed 4571.25 samples/sec Loss 0.8814 Epoch: 15 Global Step: 257900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:48:41,180-Speed 4562.04 samples/sec Loss 0.8739 Epoch: 15 Global Step: 257950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:48:53,272-Speed 4234.52 samples/sec Loss 0.8873 Epoch: 15 Global Step: 258000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:49:17,593-[lfw][258000]XNorm: 22.505361 Training: 2021-03-18 09:49:17,593-[lfw][258000]Accuracy-Flip: 0.99800+-0.00277 Training: 2021-03-18 09:49:17,593-[lfw][258000]Accuracy-Highest: 0.99817 Training: 2021-03-18 09:49:45,065-[cfp_fp][258000]XNorm: 20.873915 Training: 2021-03-18 09:49:45,065-[cfp_fp][258000]Accuracy-Flip: 0.98686+-0.00490 Training: 2021-03-18 09:49:45,065-[cfp_fp][258000]Accuracy-Highest: 0.98686 Training: 2021-03-18 09:50:08,720-[agedb_30][258000]XNorm: 22.618689 Training: 2021-03-18 09:50:08,720-[agedb_30][258000]Accuracy-Flip: 0.98267+-0.00616 Training: 2021-03-18 09:50:08,720-[agedb_30][258000]Accuracy-Highest: 0.98317 Training: 2021-03-18 09:50:20,063-Speed 589.93 samples/sec Loss 0.8864 Epoch: 15 Global Step: 258050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:50:31,500-Speed 4476.76 samples/sec Loss 0.8910 Epoch: 15 Global Step: 258100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:50:42,568-Speed 4626.43 samples/sec Loss 0.8860 Epoch: 15 Global Step: 258150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:50:53,934-Speed 4504.83 samples/sec Loss 0.8664 Epoch: 15 Global Step: 258200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:51:04,915-Speed 4662.93 samples/sec Loss 0.8838 Epoch: 15 Global Step: 258250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:51:16,083-Speed 4584.69 samples/sec Loss 0.8560 Epoch: 15 Global Step: 258300 Fp16 Grad 
Scale: 16384 Required: 5 hours Training: 2021-03-18 09:51:26,845-Speed 4757.84 samples/sec Loss 0.8794 Epoch: 15 Global Step: 258350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:51:37,704-Speed 4715.23 samples/sec Loss 0.8900 Epoch: 15 Global Step: 258400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:51:49,070-Speed 4504.90 samples/sec Loss 0.8684 Epoch: 15 Global Step: 258450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:52:01,158-Speed 4235.74 samples/sec Loss 0.8815 Epoch: 15 Global Step: 258500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:52:12,433-Speed 4541.64 samples/sec Loss 0.8756 Epoch: 15 Global Step: 258550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:52:23,982-Speed 4433.53 samples/sec Loss 0.8877 Epoch: 15 Global Step: 258600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:52:36,381-Speed 4129.38 samples/sec Loss 0.8703 Epoch: 15 Global Step: 258650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:52:48,319-Speed 4289.10 samples/sec Loss 0.8848 Epoch: 15 Global Step: 258700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:52:59,294-Speed 4665.45 samples/sec Loss 0.8892 Epoch: 15 Global Step: 258750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:53:10,529-Speed 4557.43 samples/sec Loss 0.8803 Epoch: 15 Global Step: 258800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:53:21,463-Speed 4683.12 samples/sec Loss 0.8633 Epoch: 15 Global Step: 258850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:53:32,899-Speed 4477.36 samples/sec Loss 0.9093 Epoch: 15 Global Step: 258900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:53:43,926-Speed 4643.34 samples/sec Loss 0.8745 Epoch: 15 Global Step: 258950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:53:55,217-Speed 4534.81 samples/sec Loss 0.8598 Epoch: 15 Global Step: 259000 Fp16 
Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:54:06,267-Speed 4633.59 samples/sec Loss 0.8741 Epoch: 15 Global Step: 259050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:54:17,420-Speed 4591.16 samples/sec Loss 0.8771 Epoch: 15 Global Step: 259100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:54:28,748-Speed 4520.34 samples/sec Loss 0.8807 Epoch: 15 Global Step: 259150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:54:39,790-Speed 4637.24 samples/sec Loss 0.8801 Epoch: 15 Global Step: 259200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:54:51,040-Speed 4551.26 samples/sec Loss 0.8708 Epoch: 15 Global Step: 259250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:55:03,704-Speed 4043.39 samples/sec Loss 0.8975 Epoch: 15 Global Step: 259300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:55:14,922-Speed 4564.39 samples/sec Loss 0.8671 Epoch: 15 Global Step: 259350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:55:26,133-Speed 4567.01 samples/sec Loss 0.8770 Epoch: 15 Global Step: 259400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:55:37,375-Speed 4554.83 samples/sec Loss 0.8815 Epoch: 15 Global Step: 259450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:55:48,548-Speed 4582.74 samples/sec Loss 0.8634 Epoch: 15 Global Step: 259500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:56:00,094-Speed 4434.61 samples/sec Loss 0.8724 Epoch: 15 Global Step: 259550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:56:11,311-Speed 4564.93 samples/sec Loss 0.8799 Epoch: 15 Global Step: 259600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:56:22,326-Speed 4648.38 samples/sec Loss 0.8753 Epoch: 15 Global Step: 259650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:56:33,320-Speed 4657.68 samples/sec Loss 0.8625 Epoch: 15 Global Step: 259700 
Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:56:44,556-Speed 4557.13 samples/sec Loss 0.8826 Epoch: 15 Global Step: 259750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:56:55,784-Speed 4560.05 samples/sec Loss 0.8743 Epoch: 15 Global Step: 259800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:57:06,977-Speed 4574.83 samples/sec Loss 0.8871 Epoch: 15 Global Step: 259850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:57:18,265-Speed 4535.73 samples/sec Loss 0.8754 Epoch: 15 Global Step: 259900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:57:29,528-Speed 4546.16 samples/sec Loss 0.8860 Epoch: 15 Global Step: 259950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:57:40,875-Speed 4512.68 samples/sec Loss 0.8719 Epoch: 15 Global Step: 260000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:58:05,670-[lfw][260000]XNorm: 22.461671 Training: 2021-03-18 09:58:05,671-[lfw][260000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 09:58:05,671-[lfw][260000]Accuracy-Highest: 0.99817 Training: 2021-03-18 09:58:33,345-[cfp_fp][260000]XNorm: 20.904740 Training: 2021-03-18 09:58:33,345-[cfp_fp][260000]Accuracy-Flip: 0.98743+-0.00393 Training: 2021-03-18 09:58:33,345-[cfp_fp][260000]Accuracy-Highest: 0.98743 Training: 2021-03-18 09:58:57,210-[agedb_30][260000]XNorm: 22.570414 Training: 2021-03-18 09:58:57,211-[agedb_30][260000]Accuracy-Flip: 0.98233+-0.00638 Training: 2021-03-18 09:58:57,211-[agedb_30][260000]Accuracy-Highest: 0.98317 Training: 2021-03-18 09:59:08,261-Speed 585.91 samples/sec Loss 0.8851 Epoch: 15 Global Step: 260050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:59:20,077-Speed 4333.30 samples/sec Loss 0.8848 Epoch: 15 Global Step: 260100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:59:31,261-Speed 4578.42 samples/sec Loss 0.8687 Epoch: 15 Global Step: 260150 Fp16 Grad Scale: 16384 Required: 5 hours 
Training: 2021-03-18 09:59:42,492-Speed 4559.25 samples/sec Loss 0.8785 Epoch: 15 Global Step: 260200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 09:59:53,694-Speed 4571.25 samples/sec Loss 0.8611 Epoch: 15 Global Step: 260250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:00:05,101-Speed 4488.48 samples/sec Loss 0.8875 Epoch: 15 Global Step: 260300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:00:16,416-Speed 4525.43 samples/sec Loss 0.8906 Epoch: 15 Global Step: 260350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:00:27,688-Speed 4542.55 samples/sec Loss 0.8806 Epoch: 15 Global Step: 260400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:00:38,956-Speed 4544.18 samples/sec Loss 0.8694 Epoch: 15 Global Step: 260450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:00:51,094-Speed 4218.44 samples/sec Loss 0.8886 Epoch: 15 Global Step: 260500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:01:02,249-Speed 4589.88 samples/sec Loss 0.8676 Epoch: 15 Global Step: 260550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:01:13,472-Speed 4562.37 samples/sec Loss 0.8823 Epoch: 15 Global Step: 260600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:01:24,777-Speed 4529.33 samples/sec Loss 0.8789 Epoch: 15 Global Step: 260650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:01:36,275-Speed 4453.35 samples/sec Loss 0.8801 Epoch: 15 Global Step: 260700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:01:47,594-Speed 4523.52 samples/sec Loss 0.8751 Epoch: 15 Global Step: 260750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:01:58,682-Speed 4618.04 samples/sec Loss 0.8670 Epoch: 15 Global Step: 260800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:02:10,447-Speed 4352.07 samples/sec Loss 0.8813 Epoch: 15 Global Step: 260850 Fp16 Grad Scale: 16384 Required: 5 
hours Training: 2021-03-18 10:02:21,644-Speed 4572.88 samples/sec Loss 0.9075 Epoch: 15 Global Step: 260900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:02:32,925-Speed 4538.73 samples/sec Loss 0.8674 Epoch: 15 Global Step: 260950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:02:43,888-Speed 4670.65 samples/sec Loss 0.8590 Epoch: 15 Global Step: 261000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:02:55,244-Speed 4508.83 samples/sec Loss 0.8812 Epoch: 15 Global Step: 261050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:03:06,475-Speed 4559.08 samples/sec Loss 0.8823 Epoch: 15 Global Step: 261100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:03:18,035-Speed 4429.22 samples/sec Loss 0.8577 Epoch: 15 Global Step: 261150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:03:30,102-Speed 4243.44 samples/sec Loss 0.8744 Epoch: 15 Global Step: 261200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:03:41,232-Speed 4600.43 samples/sec Loss 0.8828 Epoch: 15 Global Step: 261250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:03:52,450-Speed 4564.06 samples/sec Loss 0.8845 Epoch: 15 Global Step: 261300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:04:03,817-Speed 4504.57 samples/sec Loss 0.8782 Epoch: 15 Global Step: 261350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:04:15,179-Speed 4506.89 samples/sec Loss 0.8775 Epoch: 15 Global Step: 261400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:04:26,688-Speed 4448.89 samples/sec Loss 0.8810 Epoch: 15 Global Step: 261450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:04:38,963-Speed 4170.95 samples/sec Loss 0.8684 Epoch: 15 Global Step: 261500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:04:51,413-Speed 4112.76 samples/sec Loss 0.8760 Epoch: 15 Global Step: 261550 Fp16 Grad Scale: 16384 Required: 
5 hours Training: 2021-03-18 10:05:02,965-Speed 4432.66 samples/sec Loss 0.8992 Epoch: 15 Global Step: 261600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:05:14,360-Speed 4493.29 samples/sec Loss 0.8640 Epoch: 15 Global Step: 261650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:05:25,618-Speed 4548.42 samples/sec Loss 0.9028 Epoch: 15 Global Step: 261700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:05:37,303-Speed 4381.86 samples/sec Loss 0.8809 Epoch: 15 Global Step: 261750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:05:48,374-Speed 4624.95 samples/sec Loss 0.8601 Epoch: 15 Global Step: 261800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:05:59,771-Speed 4492.33 samples/sec Loss 0.8812 Epoch: 15 Global Step: 261850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:06:10,850-Speed 4622.02 samples/sec Loss 0.8732 Epoch: 15 Global Step: 261900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:06:22,297-Speed 4473.05 samples/sec Loss 0.8651 Epoch: 15 Global Step: 261950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:06:33,718-Speed 4483.43 samples/sec Loss 0.8772 Epoch: 15 Global Step: 262000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:06:58,298-[lfw][262000]XNorm: 22.594443 Training: 2021-03-18 10:06:58,299-[lfw][262000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 10:06:58,299-[lfw][262000]Accuracy-Highest: 0.99817 Training: 2021-03-18 10:07:26,671-[cfp_fp][262000]XNorm: 20.963952 Training: 2021-03-18 10:07:26,671-[cfp_fp][262000]Accuracy-Flip: 0.98571+-0.00499 Training: 2021-03-18 10:07:26,671-[cfp_fp][262000]Accuracy-Highest: 0.98743 Training: 2021-03-18 10:07:50,532-[agedb_30][262000]XNorm: 22.716673 Training: 2021-03-18 10:07:50,533-[agedb_30][262000]Accuracy-Flip: 0.98300+-0.00609 Training: 2021-03-18 10:07:50,533-[agedb_30][262000]Accuracy-Highest: 0.98317 Training: 2021-03-18 
10:08:01,678-Speed 582.08 samples/sec Loss 0.8875 Epoch: 15 Global Step: 262050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:08:13,613-Speed 4290.15 samples/sec Loss 0.8729 Epoch: 15 Global Step: 262100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:08:25,427-Speed 4334.54 samples/sec Loss 0.8740 Epoch: 15 Global Step: 262150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:08:37,351-Speed 4293.79 samples/sec Loss 0.8676 Epoch: 15 Global Step: 262200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:08:48,575-Speed 4562.23 samples/sec Loss 0.8644 Epoch: 15 Global Step: 262250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:08:59,689-Speed 4607.06 samples/sec Loss 0.8814 Epoch: 15 Global Step: 262300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:09:10,985-Speed 4532.89 samples/sec Loss 0.8699 Epoch: 15 Global Step: 262350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:09:22,414-Speed 4480.18 samples/sec Loss 0.8858 Epoch: 15 Global Step: 262400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:09:33,598-Speed 4578.28 samples/sec Loss 0.8759 Epoch: 15 Global Step: 262450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:09:44,899-Speed 4530.72 samples/sec Loss 0.8881 Epoch: 15 Global Step: 262500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:09:56,490-Speed 4417.61 samples/sec Loss 0.8780 Epoch: 15 Global Step: 262550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:10:07,894-Speed 4489.72 samples/sec Loss 0.8808 Epoch: 15 Global Step: 262600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:10:19,116-Speed 4562.89 samples/sec Loss 0.8773 Epoch: 15 Global Step: 262650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:10:30,267-Speed 4591.55 samples/sec Loss 0.8818 Epoch: 15 Global Step: 262700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 
2021-03-18 10:10:41,494-Speed 4560.89 samples/sec Loss 0.8940 Epoch: 15 Global Step: 262750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:10:52,821-Speed 4520.47 samples/sec Loss 0.8853 Epoch: 15 Global Step: 262800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:11:04,061-Speed 4555.22 samples/sec Loss 0.8695 Epoch: 15 Global Step: 262850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:11:15,409-Speed 4512.04 samples/sec Loss 0.8807 Epoch: 15 Global Step: 262900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:11:26,759-Speed 4511.31 samples/sec Loss 0.8784 Epoch: 15 Global Step: 262950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:11:39,066-Speed 4160.57 samples/sec Loss 0.8543 Epoch: 15 Global Step: 263000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:11:50,155-Speed 4617.31 samples/sec Loss 0.8795 Epoch: 15 Global Step: 263050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:12:01,344-Speed 4576.23 samples/sec Loss 0.8793 Epoch: 15 Global Step: 263100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:12:12,453-Speed 4609.10 samples/sec Loss 0.8895 Epoch: 15 Global Step: 263150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:12:23,862-Speed 4488.20 samples/sec Loss 0.8794 Epoch: 15 Global Step: 263200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:12:35,092-Speed 4559.45 samples/sec Loss 0.8780 Epoch: 15 Global Step: 263250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:12:47,093-Speed 4266.44 samples/sec Loss 0.8864 Epoch: 15 Global Step: 263300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:12:58,412-Speed 4523.82 samples/sec Loss 0.8727 Epoch: 15 Global Step: 263350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:13:09,834-Speed 4483.03 samples/sec Loss 0.8850 Epoch: 15 Global Step: 263400 Fp16 Grad Scale: 16384 Required: 5 hours 
Training: 2021-03-18 10:13:21,298-Speed 4466.57 samples/sec Loss 0.8798 Epoch: 15 Global Step: 263450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:13:32,719-Speed 4483.06 samples/sec Loss 0.8832 Epoch: 15 Global Step: 263500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:13:43,887-Speed 4584.84 samples/sec Loss 0.8853 Epoch: 15 Global Step: 263550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:13:54,879-Speed 4658.25 samples/sec Loss 0.8791 Epoch: 15 Global Step: 263600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:14:06,732-Speed 4319.90 samples/sec Loss 0.8832 Epoch: 15 Global Step: 263650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:14:17,986-Speed 4549.81 samples/sec Loss 0.8839 Epoch: 15 Global Step: 263700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:14:29,543-Speed 4430.77 samples/sec Loss 0.8918 Epoch: 15 Global Step: 263750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:14:40,919-Speed 4500.87 samples/sec Loss 0.8803 Epoch: 15 Global Step: 263800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:14:52,315-Speed 4492.92 samples/sec Loss 0.8683 Epoch: 15 Global Step: 263850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:15:03,794-Speed 4460.84 samples/sec Loss 0.8710 Epoch: 15 Global Step: 263900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:15:15,995-Speed 4196.67 samples/sec Loss 0.8678 Epoch: 15 Global Step: 263950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:15:27,029-Speed 4640.28 samples/sec Loss 0.8736 Epoch: 15 Global Step: 264000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:15:51,171-[lfw][264000]XNorm: 22.636867 Training: 2021-03-18 10:15:51,172-[lfw][264000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-18 10:15:51,172-[lfw][264000]Accuracy-Highest: 0.99817 Training: 2021-03-18 10:16:18,920-[cfp_fp][264000]XNorm: 21.003574 
Training: 2021-03-18 10:16:18,920-[cfp_fp][264000]Accuracy-Flip: 0.98543+-0.00451 Training: 2021-03-18 10:16:18,920-[cfp_fp][264000]Accuracy-Highest: 0.98743 Training: 2021-03-18 10:16:42,719-[agedb_30][264000]XNorm: 22.729990 Training: 2021-03-18 10:16:42,720-[agedb_30][264000]Accuracy-Flip: 0.98333+-0.00683 Training: 2021-03-18 10:16:42,720-[agedb_30][264000]Accuracy-Highest: 0.98333 Training: 2021-03-18 10:16:53,780-Speed 590.20 samples/sec Loss 0.8766 Epoch: 15 Global Step: 264050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:17:05,004-Speed 4562.21 samples/sec Loss 0.8665 Epoch: 15 Global Step: 264100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:17:16,363-Speed 4507.86 samples/sec Loss 0.8745 Epoch: 15 Global Step: 264150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:17:27,388-Speed 4644.07 samples/sec Loss 0.8618 Epoch: 15 Global Step: 264200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:17:38,341-Speed 4675.23 samples/sec Loss 0.8944 Epoch: 15 Global Step: 264250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:17:49,742-Speed 4491.28 samples/sec Loss 0.8894 Epoch: 15 Global Step: 264300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:18:02,029-Speed 4167.24 samples/sec Loss 0.8957 Epoch: 15 Global Step: 264350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:18:14,316-Speed 4167.18 samples/sec Loss 0.8661 Epoch: 15 Global Step: 264400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:18:25,507-Speed 4575.42 samples/sec Loss 0.8817 Epoch: 15 Global Step: 264450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:18:36,514-Speed 4651.98 samples/sec Loss 0.8853 Epoch: 15 Global Step: 264500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:18:47,800-Speed 4536.90 samples/sec Loss 0.8837 Epoch: 15 Global Step: 264550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 
10:18:59,048-Speed 4552.18 samples/sec Loss 0.8818 Epoch: 15 Global Step: 264600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:19:10,472-Speed 4482.21 samples/sec Loss 0.8795 Epoch: 15 Global Step: 264650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:19:21,740-Speed 4544.17 samples/sec Loss 0.8621 Epoch: 15 Global Step: 264700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:19:33,248-Speed 4449.13 samples/sec Loss 0.8887 Epoch: 15 Global Step: 264750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:19:44,660-Speed 4486.95 samples/sec Loss 0.8801 Epoch: 15 Global Step: 264800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:19:55,749-Speed 4617.33 samples/sec Loss 0.8707 Epoch: 15 Global Step: 264850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:20:07,826-Speed 4239.75 samples/sec Loss 0.8883 Epoch: 15 Global Step: 264900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:20:19,309-Speed 4458.95 samples/sec Loss 0.8678 Epoch: 15 Global Step: 264950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:20:30,837-Speed 4441.71 samples/sec Loss 0.8771 Epoch: 15 Global Step: 265000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:20:42,959-Speed 4223.73 samples/sec Loss 0.8832 Epoch: 15 Global Step: 265050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:20:54,363-Speed 4490.21 samples/sec Loss 0.8794 Epoch: 15 Global Step: 265100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:21:05,875-Speed 4447.79 samples/sec Loss 0.8681 Epoch: 15 Global Step: 265150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:21:16,796-Speed 4688.42 samples/sec Loss 0.8797 Epoch: 15 Global Step: 265200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:21:27,963-Speed 4585.11 samples/sec Loss 0.8698 Epoch: 15 Global Step: 265250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 
2021-03-18 10:21:39,145-Speed 4578.98 samples/sec Loss 0.8856 Epoch: 15 Global Step: 265300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:21:50,328-Speed 4578.95 samples/sec Loss 0.8825 Epoch: 15 Global Step: 265350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:22:01,446-Speed 4605.46 samples/sec Loss 0.8709 Epoch: 15 Global Step: 265400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:22:12,598-Speed 4591.37 samples/sec Loss 0.8622 Epoch: 15 Global Step: 265450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:22:24,373-Speed 4348.22 samples/sec Loss 0.8722 Epoch: 15 Global Step: 265500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:22:35,669-Speed 4533.06 samples/sec Loss 0.8742 Epoch: 15 Global Step: 265550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:22:46,968-Speed 4531.39 samples/sec Loss 0.8778 Epoch: 15 Global Step: 265600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:22:58,072-Speed 4611.45 samples/sec Loss 0.8762 Epoch: 15 Global Step: 265650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:23:09,188-Speed 4606.10 samples/sec Loss 0.8907 Epoch: 15 Global Step: 265700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:23:20,727-Speed 4437.36 samples/sec Loss 0.8692 Epoch: 15 Global Step: 265750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:23:31,958-Speed 4559.24 samples/sec Loss 0.8593 Epoch: 15 Global Step: 265800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:23:44,281-Speed 4154.95 samples/sec Loss 0.8654 Epoch: 15 Global Step: 265850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:23:55,726-Speed 4474.02 samples/sec Loss 0.8829 Epoch: 15 Global Step: 265900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:24:06,695-Speed 4667.85 samples/sec Loss 0.8780 Epoch: 15 Global Step: 265950 Fp16 Grad Scale: 16384 Required: 5 hours 
Training: 2021-03-18 10:24:17,869-Speed 4582.76 samples/sec Loss 0.8639 Epoch: 15 Global Step: 266000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:24:42,906-[lfw][266000]XNorm: 22.439927 Training: 2021-03-18 10:24:42,906-[lfw][266000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-18 10:24:42,906-[lfw][266000]Accuracy-Highest: 0.99817 Training: 2021-03-18 10:25:10,427-[cfp_fp][266000]XNorm: 20.853824 Training: 2021-03-18 10:25:10,427-[cfp_fp][266000]Accuracy-Flip: 0.98686+-0.00460 Training: 2021-03-18 10:25:10,428-[cfp_fp][266000]Accuracy-Highest: 0.98743 Training: 2021-03-18 10:25:35,352-[agedb_30][266000]XNorm: 22.579311 Training: 2021-03-18 10:25:35,352-[agedb_30][266000]Accuracy-Flip: 0.98133+-0.00627 Training: 2021-03-18 10:25:35,352-[agedb_30][266000]Accuracy-Highest: 0.98333 Training: 2021-03-18 10:25:46,434-Speed 578.11 samples/sec Loss 0.8736 Epoch: 15 Global Step: 266050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:25:58,659-Speed 4188.19 samples/sec Loss 0.8811 Epoch: 15 Global Step: 266100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:26:10,058-Speed 4491.93 samples/sec Loss 0.8582 Epoch: 15 Global Step: 266150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:26:21,213-Speed 4590.57 samples/sec Loss 0.8552 Epoch: 15 Global Step: 266200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:26:32,502-Speed 4535.55 samples/sec Loss 0.8673 Epoch: 15 Global Step: 266250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:26:43,993-Speed 4455.98 samples/sec Loss 0.8684 Epoch: 15 Global Step: 266300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:26:55,096-Speed 4611.70 samples/sec Loss 0.8631 Epoch: 15 Global Step: 266350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:27:06,191-Speed 4614.74 samples/sec Loss 0.8691 Epoch: 15 Global Step: 266400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:27:18,111-Speed 
4295.54 samples/sec Loss 0.8788 Epoch: 15 Global Step: 266450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:27:29,234-Speed 4603.53 samples/sec Loss 0.8735 Epoch: 15 Global Step: 266500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:27:40,555-Speed 4522.78 samples/sec Loss 0.8853 Epoch: 15 Global Step: 266550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:27:51,586-Speed 4642.01 samples/sec Loss 0.8826 Epoch: 15 Global Step: 266600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:28:02,781-Speed 4573.86 samples/sec Loss 0.8890 Epoch: 15 Global Step: 266650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:28:14,089-Speed 4527.98 samples/sec Loss 0.8813 Epoch: 15 Global Step: 266700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:28:26,104-Speed 4261.58 samples/sec Loss 0.8722 Epoch: 15 Global Step: 266750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:28:36,778-Speed 4796.97 samples/sec Loss 0.8829 Epoch: 15 Global Step: 266800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:28:48,043-Speed 4545.43 samples/sec Loss 0.8823 Epoch: 15 Global Step: 266850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:28:59,601-Speed 4429.78 samples/sec Loss 0.8803 Epoch: 15 Global Step: 266900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:29:10,908-Speed 4528.56 samples/sec Loss 0.8694 Epoch: 15 Global Step: 266950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:29:22,431-Speed 4443.64 samples/sec Loss 0.8815 Epoch: 15 Global Step: 267000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:29:34,126-Speed 4378.35 samples/sec Loss 0.8800 Epoch: 15 Global Step: 267050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:29:58,467-Speed 2103.54 samples/sec Loss 0.8653 Epoch: 16 Global Step: 267100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 
10:30:09,500-Speed 4640.93 samples/sec Loss 0.8802 Epoch: 16 Global Step: 267150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:30:20,871-Speed 4503.06 samples/sec Loss 0.8615 Epoch: 16 Global Step: 267200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:30:32,182-Speed 4526.72 samples/sec Loss 0.8542 Epoch: 16 Global Step: 267250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:30:43,461-Speed 4539.82 samples/sec Loss 0.8483 Epoch: 16 Global Step: 267300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:30:56,944-Speed 3797.49 samples/sec Loss 0.8663 Epoch: 16 Global Step: 267350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:31:08,452-Speed 4449.31 samples/sec Loss 0.8488 Epoch: 16 Global Step: 267400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:31:19,582-Speed 4600.59 samples/sec Loss 0.8699 Epoch: 16 Global Step: 267450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:31:30,572-Speed 4659.41 samples/sec Loss 0.8487 Epoch: 16 Global Step: 267500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:31:42,009-Speed 4477.39 samples/sec Loss 0.8770 Epoch: 16 Global Step: 267550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:31:53,281-Speed 4542.69 samples/sec Loss 0.8577 Epoch: 16 Global Step: 267600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:32:04,692-Speed 4486.88 samples/sec Loss 0.8736 Epoch: 16 Global Step: 267650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:32:16,828-Speed 4219.31 samples/sec Loss 0.8750 Epoch: 16 Global Step: 267700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:32:28,048-Speed 4563.26 samples/sec Loss 0.8440 Epoch: 16 Global Step: 267750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:32:39,250-Speed 4571.31 samples/sec Loss 0.8683 Epoch: 16 Global Step: 267800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 
2021-03-18 10:32:50,794-Speed 4435.34 samples/sec Loss 0.8693 Epoch: 16 Global Step: 267850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:33:02,034-Speed 4555.70 samples/sec Loss 0.8539 Epoch: 16 Global Step: 267900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:33:14,541-Speed 4093.92 samples/sec Loss 0.8599 Epoch: 16 Global Step: 267950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:33:25,648-Speed 4609.86 samples/sec Loss 0.8623 Epoch: 16 Global Step: 268000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:33:50,231-[lfw][268000]XNorm: 22.631122 Training: 2021-03-18 10:33:50,231-[lfw][268000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 10:33:50,231-[lfw][268000]Accuracy-Highest: 0.99817 Training: 2021-03-18 10:34:18,668-[cfp_fp][268000]XNorm: 20.998433 Training: 2021-03-18 10:34:18,669-[cfp_fp][268000]Accuracy-Flip: 0.98571+-0.00557 Training: 2021-03-18 10:34:18,669-[cfp_fp][268000]Accuracy-Highest: 0.98743 Training: 2021-03-18 10:34:42,492-[agedb_30][268000]XNorm: 22.690036 Training: 2021-03-18 10:34:42,492-[agedb_30][268000]Accuracy-Flip: 0.98300+-0.00670 Training: 2021-03-18 10:34:42,492-[agedb_30][268000]Accuracy-Highest: 0.98333 Training: 2021-03-18 10:34:53,790-Speed 580.89 samples/sec Loss 0.8786 Epoch: 16 Global Step: 268050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:35:04,779-Speed 4659.80 samples/sec Loss 0.8609 Epoch: 16 Global Step: 268100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:35:16,191-Speed 4486.71 samples/sec Loss 0.8480 Epoch: 16 Global Step: 268150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:35:27,415-Speed 4561.97 samples/sec Loss 0.8549 Epoch: 16 Global Step: 268200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:35:38,975-Speed 4429.35 samples/sec Loss 0.8510 Epoch: 16 Global Step: 268250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:35:49,895-Speed 4689.16 
samples/sec Loss 0.8694 Epoch: 16 Global Step: 268300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:36:01,227-Speed 4518.55 samples/sec Loss 0.8645 Epoch: 16 Global Step: 268350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:36:12,362-Speed 4598.58 samples/sec Loss 0.8613 Epoch: 16 Global Step: 268400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:36:24,002-Speed 4398.89 samples/sec Loss 0.8600 Epoch: 16 Global Step: 268450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:36:35,111-Speed 4609.40 samples/sec Loss 0.8618 Epoch: 16 Global Step: 268500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:36:46,477-Speed 4504.73 samples/sec Loss 0.8552 Epoch: 16 Global Step: 268550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:36:57,724-Speed 4552.62 samples/sec Loss 0.8588 Epoch: 16 Global Step: 268600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:37:09,053-Speed 4519.82 samples/sec Loss 0.8528 Epoch: 16 Global Step: 268650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:37:20,351-Speed 4532.28 samples/sec Loss 0.8739 Epoch: 16 Global Step: 268700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:37:32,687-Speed 4150.64 samples/sec Loss 0.8512 Epoch: 16 Global Step: 268750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:37:43,990-Speed 4530.32 samples/sec Loss 0.8599 Epoch: 16 Global Step: 268800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:37:55,501-Speed 4448.14 samples/sec Loss 0.8539 Epoch: 16 Global Step: 268850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:38:06,876-Speed 4501.49 samples/sec Loss 0.8694 Epoch: 16 Global Step: 268900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:38:18,970-Speed 4233.69 samples/sec Loss 0.8640 Epoch: 16 Global Step: 268950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:38:30,488-Speed 
4445.68 samples/sec Loss 0.8505 Epoch: 16 Global Step: 269000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:38:41,827-Speed 4515.53 samples/sec Loss 0.8591 Epoch: 16 Global Step: 269050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:38:53,479-Speed 4394.45 samples/sec Loss 0.8739 Epoch: 16 Global Step: 269100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:39:04,902-Speed 4482.60 samples/sec Loss 0.8684 Epoch: 16 Global Step: 269150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:39:16,085-Speed 4578.57 samples/sec Loss 0.8789 Epoch: 16 Global Step: 269200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:39:27,967-Speed 4309.42 samples/sec Loss 0.8668 Epoch: 16 Global Step: 269250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:39:39,182-Speed 4565.57 samples/sec Loss 0.8521 Epoch: 16 Global Step: 269300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:39:50,189-Speed 4651.70 samples/sec Loss 0.8677 Epoch: 16 Global Step: 269350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:40:01,323-Speed 4598.80 samples/sec Loss 0.8512 Epoch: 16 Global Step: 269400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:40:12,432-Speed 4609.02 samples/sec Loss 0.8766 Epoch: 16 Global Step: 269450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:40:24,368-Speed 4289.62 samples/sec Loss 0.8713 Epoch: 16 Global Step: 269500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:40:35,531-Speed 4587.10 samples/sec Loss 0.8704 Epoch: 16 Global Step: 269550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:40:46,964-Speed 4478.33 samples/sec Loss 0.8836 Epoch: 16 Global Step: 269600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:40:58,295-Speed 4518.84 samples/sec Loss 0.8724 Epoch: 16 Global Step: 269650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 
10:41:09,822-Speed 4442.26 samples/sec Loss 0.8762 Epoch: 16 Global Step: 269700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:41:21,365-Speed 4435.77 samples/sec Loss 0.8705 Epoch: 16 Global Step: 269750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:41:32,817-Speed 4471.15 samples/sec Loss 0.8782 Epoch: 16 Global Step: 269800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:41:43,959-Speed 4595.52 samples/sec Loss 0.8689 Epoch: 16 Global Step: 269850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:41:55,304-Speed 4513.23 samples/sec Loss 0.8688 Epoch: 16 Global Step: 269900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:42:06,851-Speed 4434.46 samples/sec Loss 0.8628 Epoch: 16 Global Step: 269950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:42:18,010-Speed 4588.45 samples/sec Loss 0.8781 Epoch: 16 Global Step: 270000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:42:42,672-[lfw][270000]XNorm: 22.530820 Training: 2021-03-18 10:42:42,673-[lfw][270000]Accuracy-Flip: 0.99800+-0.00277 Training: 2021-03-18 10:42:42,673-[lfw][270000]Accuracy-Highest: 0.99817 Training: 2021-03-18 10:43:10,459-[cfp_fp][270000]XNorm: 20.956120 Training: 2021-03-18 10:43:10,459-[cfp_fp][270000]Accuracy-Flip: 0.98700+-0.00577 Training: 2021-03-18 10:43:10,459-[cfp_fp][270000]Accuracy-Highest: 0.98743 Training: 2021-03-18 10:43:35,331-[agedb_30][270000]XNorm: 22.627284 Training: 2021-03-18 10:43:35,331-[agedb_30][270000]Accuracy-Flip: 0.98233+-0.00633 Training: 2021-03-18 10:43:35,331-[agedb_30][270000]Accuracy-Highest: 0.98333 Training: 2021-03-18 10:43:46,537-Speed 578.35 samples/sec Loss 0.8622 Epoch: 16 Global Step: 270050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:43:57,606-Speed 4626.28 samples/sec Loss 0.8545 Epoch: 16 Global Step: 270100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:44:08,848-Speed 4554.65 samples/sec 
Loss 0.8484 Epoch: 16 Global Step: 270150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:44:21,055-Speed 4194.42 samples/sec Loss 0.8634 Epoch: 16 Global Step: 270200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:44:33,145-Speed 4235.30 samples/sec Loss 0.8738 Epoch: 16 Global Step: 270250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:44:44,536-Speed 4495.09 samples/sec Loss 0.8432 Epoch: 16 Global Step: 270300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:44:55,862-Speed 4520.84 samples/sec Loss 0.8505 Epoch: 16 Global Step: 270350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:45:07,105-Speed 4554.26 samples/sec Loss 0.8762 Epoch: 16 Global Step: 270400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:45:18,502-Speed 4492.63 samples/sec Loss 0.8612 Epoch: 16 Global Step: 270450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:45:29,713-Speed 4567.22 samples/sec Loss 0.8720 Epoch: 16 Global Step: 270500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:45:41,773-Speed 4245.56 samples/sec Loss 0.8593 Epoch: 16 Global Step: 270550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:45:52,793-Speed 4646.25 samples/sec Loss 0.8612 Epoch: 16 Global Step: 270600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:46:03,894-Speed 4612.53 samples/sec Loss 0.8546 Epoch: 16 Global Step: 270650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:46:15,248-Speed 4509.83 samples/sec Loss 0.8687 Epoch: 16 Global Step: 270700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:46:26,697-Speed 4472.43 samples/sec Loss 0.8583 Epoch: 16 Global Step: 270750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:46:38,956-Speed 4176.80 samples/sec Loss 0.8704 Epoch: 16 Global Step: 270800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:46:50,354-Speed 4492.42 
samples/sec Loss 0.8715 Epoch: 16 Global Step: 270850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:47:01,607-Speed 4550.01 samples/sec Loss 0.8540 Epoch: 16 Global Step: 270900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:47:12,912-Speed 4529.20 samples/sec Loss 0.8689 Epoch: 16 Global Step: 270950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:47:23,883-Speed 4667.27 samples/sec Loss 0.8534 Epoch: 16 Global Step: 271000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:47:35,246-Speed 4506.27 samples/sec Loss 0.8737 Epoch: 16 Global Step: 271050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:47:46,664-Speed 4484.47 samples/sec Loss 0.8696 Epoch: 16 Global Step: 271100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:47:58,111-Speed 4473.02 samples/sec Loss 0.8710 Epoch: 16 Global Step: 271150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:48:09,198-Speed 4618.49 samples/sec Loss 0.8660 Epoch: 16 Global Step: 271200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 10:48:20,500-Speed 4530.07 samples/sec Loss 0.8846 Epoch: 16 Global Step: 271250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:48:31,924-Speed 4482.13 samples/sec Loss 0.8541 Epoch: 16 Global Step: 271300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:48:43,163-Speed 4555.93 samples/sec Loss 0.8659 Epoch: 16 Global Step: 271350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:48:54,433-Speed 4543.49 samples/sec Loss 0.8527 Epoch: 16 Global Step: 271400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:49:05,375-Speed 4679.46 samples/sec Loss 0.8671 Epoch: 16 Global Step: 271450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:49:16,493-Speed 4605.55 samples/sec Loss 0.8603 Epoch: 16 Global Step: 271500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:49:28,309-Speed 
4333.42 samples/sec Loss 0.8600 Epoch: 16 Global Step: 271550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:49:39,765-Speed 4469.56 samples/sec Loss 0.8625 Epoch: 16 Global Step: 271600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:49:51,298-Speed 4439.54 samples/sec Loss 0.8570 Epoch: 16 Global Step: 271650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:50:02,907-Speed 4410.78 samples/sec Loss 0.8618 Epoch: 16 Global Step: 271700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:50:14,866-Speed 4281.45 samples/sec Loss 0.8718 Epoch: 16 Global Step: 271750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:50:25,881-Speed 4648.66 samples/sec Loss 0.8793 Epoch: 16 Global Step: 271800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:50:36,755-Speed 4708.65 samples/sec Loss 0.8801 Epoch: 16 Global Step: 271850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:50:47,994-Speed 4556.05 samples/sec Loss 0.8716 Epoch: 16 Global Step: 271900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:50:59,236-Speed 4554.58 samples/sec Loss 0.8593 Epoch: 16 Global Step: 271950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:51:10,652-Speed 4485.31 samples/sec Loss 0.8528 Epoch: 16 Global Step: 272000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:51:34,472-[lfw][272000]XNorm: 22.515867 Training: 2021-03-18 10:51:34,473-[lfw][272000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 10:51:34,473-[lfw][272000]Accuracy-Highest: 0.99817 Training: 2021-03-18 10:52:02,048-[cfp_fp][272000]XNorm: 20.984893 Training: 2021-03-18 10:52:02,048-[cfp_fp][272000]Accuracy-Flip: 0.98571+-0.00503 Training: 2021-03-18 10:52:02,048-[cfp_fp][272000]Accuracy-Highest: 0.98743 Training: 2021-03-18 10:52:26,311-[agedb_30][272000]XNorm: 22.629159 Training: 2021-03-18 10:52:26,312-[agedb_30][272000]Accuracy-Flip: 0.98283+-0.00650 Training: 
2021-03-18 10:52:26,312-[agedb_30][272000]Accuracy-Highest: 0.98333 Training: 2021-03-18 10:52:37,505-Speed 589.50 samples/sec Loss 0.8648 Epoch: 16 Global Step: 272050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:52:49,562-Speed 4246.90 samples/sec Loss 0.8702 Epoch: 16 Global Step: 272100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:53:00,702-Speed 4596.19 samples/sec Loss 0.8559 Epoch: 16 Global Step: 272150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:53:13,130-Speed 4120.05 samples/sec Loss 0.8627 Epoch: 16 Global Step: 272200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:53:24,134-Speed 4653.40 samples/sec Loss 0.8908 Epoch: 16 Global Step: 272250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:53:35,379-Speed 4553.22 samples/sec Loss 0.8475 Epoch: 16 Global Step: 272300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:53:46,472-Speed 4616.13 samples/sec Loss 0.8652 Epoch: 16 Global Step: 272350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:53:58,302-Speed 4327.86 samples/sec Loss 0.8710 Epoch: 16 Global Step: 272400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:54:09,575-Speed 4542.42 samples/sec Loss 0.8708 Epoch: 16 Global Step: 272450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:54:20,573-Speed 4655.58 samples/sec Loss 0.8444 Epoch: 16 Global Step: 272500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:54:31,841-Speed 4544.31 samples/sec Loss 0.8734 Epoch: 16 Global Step: 272550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:54:42,648-Speed 4737.90 samples/sec Loss 0.8645 Epoch: 16 Global Step: 272600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:54:53,954-Speed 4529.08 samples/sec Loss 0.8671 Epoch: 16 Global Step: 272650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:55:04,942-Speed 4659.85 samples/sec Loss 0.8655 
Epoch: 16 Global Step: 272700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:55:16,188-Speed 4553.24 samples/sec Loss 0.8556 Epoch: 16 Global Step: 272750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:55:27,389-Speed 4571.18 samples/sec Loss 0.8698 Epoch: 16 Global Step: 272800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:55:38,616-Speed 4560.65 samples/sec Loss 0.8800 Epoch: 16 Global Step: 272850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:55:49,926-Speed 4527.62 samples/sec Loss 0.8578 Epoch: 16 Global Step: 272900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:56:01,372-Speed 4473.28 samples/sec Loss 0.8593 Epoch: 16 Global Step: 272950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:56:12,413-Speed 4637.64 samples/sec Loss 0.8802 Epoch: 16 Global Step: 273000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:56:23,711-Speed 4532.15 samples/sec Loss 0.8697 Epoch: 16 Global Step: 273050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:56:35,966-Speed 4178.11 samples/sec Loss 0.8601 Epoch: 16 Global Step: 273100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:56:48,150-Speed 4202.49 samples/sec Loss 0.8444 Epoch: 16 Global Step: 273150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:56:59,361-Speed 4567.28 samples/sec Loss 0.8721 Epoch: 16 Global Step: 273200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:57:10,746-Speed 4497.34 samples/sec Loss 0.8603 Epoch: 16 Global Step: 273250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:57:22,037-Speed 4535.28 samples/sec Loss 0.8692 Epoch: 16 Global Step: 273300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:57:34,332-Speed 4164.43 samples/sec Loss 0.8532 Epoch: 16 Global Step: 273350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:57:45,326-Speed 4657.40 samples/sec Loss 
0.8679 Epoch: 16 Global Step: 273400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:57:56,309-Speed 4662.13 samples/sec Loss 0.8663 Epoch: 16 Global Step: 273450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:58:07,931-Speed 4405.42 samples/sec Loss 0.8755 Epoch: 16 Global Step: 273500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:58:19,097-Speed 4585.73 samples/sec Loss 0.8639 Epoch: 16 Global Step: 273550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:58:30,609-Speed 4447.77 samples/sec Loss 0.8619 Epoch: 16 Global Step: 273600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:58:42,056-Speed 4473.14 samples/sec Loss 0.8468 Epoch: 16 Global Step: 273650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:58:54,116-Speed 4245.84 samples/sec Loss 0.8631 Epoch: 16 Global Step: 273700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:59:05,397-Speed 4538.54 samples/sec Loss 0.8780 Epoch: 16 Global Step: 273750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:59:16,862-Speed 4466.20 samples/sec Loss 0.8675 Epoch: 16 Global Step: 273800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:59:27,897-Speed 4640.27 samples/sec Loss 0.8523 Epoch: 16 Global Step: 273850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:59:39,087-Speed 4575.64 samples/sec Loss 0.8656 Epoch: 16 Global Step: 273900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 10:59:50,193-Speed 4610.33 samples/sec Loss 0.8579 Epoch: 16 Global Step: 273950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:00:01,887-Speed 4378.83 samples/sec Loss 0.8648 Epoch: 16 Global Step: 274000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:00:26,507-[lfw][274000]XNorm: 22.563622 Training: 2021-03-18 11:00:26,507-[lfw][274000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 
11:00:26,507-[lfw][274000]Accuracy-Highest: 0.99817 Training: 2021-03-18 11:00:54,196-[cfp_fp][274000]XNorm: 20.976209 Training: 2021-03-18 11:00:54,196-[cfp_fp][274000]Accuracy-Flip: 0.98671+-0.00542 Training: 2021-03-18 11:00:54,196-[cfp_fp][274000]Accuracy-Highest: 0.98743 Training: 2021-03-18 11:01:18,350-[agedb_30][274000]XNorm: 22.714831 Training: 2021-03-18 11:01:18,350-[agedb_30][274000]Accuracy-Flip: 0.98333+-0.00628 Training: 2021-03-18 11:01:18,350-[agedb_30][274000]Accuracy-Highest: 0.98333 Training: 2021-03-18 11:01:29,231-Speed 586.19 samples/sec Loss 0.8724 Epoch: 16 Global Step: 274050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:01:40,657-Speed 4481.30 samples/sec Loss 0.8746 Epoch: 16 Global Step: 274100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:01:51,717-Speed 4629.35 samples/sec Loss 0.8702 Epoch: 16 Global Step: 274150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:02:03,147-Speed 4479.91 samples/sec Loss 0.8623 Epoch: 16 Global Step: 274200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:02:14,290-Speed 4595.33 samples/sec Loss 0.8758 Epoch: 16 Global Step: 274250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:02:25,686-Speed 4493.16 samples/sec Loss 0.8632 Epoch: 16 Global Step: 274300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:02:36,737-Speed 4633.24 samples/sec Loss 0.8678 Epoch: 16 Global Step: 274350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:02:48,105-Speed 4504.14 samples/sec Loss 0.8719 Epoch: 16 Global Step: 274400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:02:59,986-Speed 4309.63 samples/sec Loss 0.8681 Epoch: 16 Global Step: 274450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:03:12,037-Speed 4248.82 samples/sec Loss 0.8666 Epoch: 16 Global Step: 274500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:03:23,250-Speed 4566.53 samples/sec 
Loss 0.8470 Epoch: 16 Global Step: 274550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:03:34,466-Speed 4564.91 samples/sec Loss 0.8671 Epoch: 16 Global Step: 274600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:03:45,865-Speed 4492.27 samples/sec Loss 0.8558 Epoch: 16 Global Step: 274650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:03:57,030-Speed 4585.68 samples/sec Loss 0.8622 Epoch: 16 Global Step: 274700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:04:08,486-Speed 4469.64 samples/sec Loss 0.8635 Epoch: 16 Global Step: 274750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:04:20,566-Speed 4238.84 samples/sec Loss 0.8525 Epoch: 16 Global Step: 274800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:04:31,720-Speed 4590.55 samples/sec Loss 0.8759 Epoch: 16 Global Step: 274850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:04:42,898-Speed 4580.88 samples/sec Loss 0.8633 Epoch: 16 Global Step: 274900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:04:54,204-Speed 4528.76 samples/sec Loss 0.8724 Epoch: 16 Global Step: 274950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:05:05,338-Speed 4598.89 samples/sec Loss 0.8774 Epoch: 16 Global Step: 275000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:05:17,467-Speed 4221.52 samples/sec Loss 0.8793 Epoch: 16 Global Step: 275050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:05:28,656-Speed 4576.26 samples/sec Loss 0.8582 Epoch: 16 Global Step: 275100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:05:39,547-Speed 4701.44 samples/sec Loss 0.8726 Epoch: 16 Global Step: 275150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:05:50,539-Speed 4658.10 samples/sec Loss 0.8551 Epoch: 16 Global Step: 275200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:06:01,859-Speed 4523.54 
samples/sec Loss 0.8560 Epoch: 16 Global Step: 275250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:06:13,067-Speed 4568.34 samples/sec Loss 0.8581 Epoch: 16 Global Step: 275300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:06:24,419-Speed 4510.47 samples/sec Loss 0.8723 Epoch: 16 Global Step: 275350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:06:35,495-Speed 4622.92 samples/sec Loss 0.8534 Epoch: 16 Global Step: 275400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:06:47,085-Speed 4417.93 samples/sec Loss 0.8526 Epoch: 16 Global Step: 275450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:06:58,300-Speed 4565.91 samples/sec Loss 0.8732 Epoch: 16 Global Step: 275500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:07:09,946-Speed 4396.64 samples/sec Loss 0.8785 Epoch: 16 Global Step: 275550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:07:21,325-Speed 4499.84 samples/sec Loss 0.8604 Epoch: 16 Global Step: 275600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:07:32,721-Speed 4492.96 samples/sec Loss 0.8814 Epoch: 16 Global Step: 275650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:07:43,976-Speed 4549.42 samples/sec Loss 0.8872 Epoch: 16 Global Step: 275700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:07:55,004-Speed 4643.09 samples/sec Loss 0.8578 Epoch: 16 Global Step: 275750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:08:06,100-Speed 4614.81 samples/sec Loss 0.8780 Epoch: 16 Global Step: 275800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:08:17,301-Speed 4571.16 samples/sec Loss 0.8805 Epoch: 16 Global Step: 275850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:08:28,684-Speed 4498.01 samples/sec Loss 0.8607 Epoch: 16 Global Step: 275900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:08:39,912-Speed 
4560.73 samples/sec Loss 0.8620 Epoch: 16 Global Step: 275950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:08:52,253-Speed 4148.74 samples/sec Loss 0.8669 Epoch: 16 Global Step: 276000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:09:16,747-[lfw][276000]XNorm: 22.506871 Training: 2021-03-18 11:09:16,747-[lfw][276000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 11:09:16,747-[lfw][276000]Accuracy-Highest: 0.99817 Training: 2021-03-18 11:09:44,329-[cfp_fp][276000]XNorm: 20.900861 Training: 2021-03-18 11:09:44,329-[cfp_fp][276000]Accuracy-Flip: 0.98729+-0.00497 Training: 2021-03-18 11:09:44,329-[cfp_fp][276000]Accuracy-Highest: 0.98743 Training: 2021-03-18 11:10:08,014-[agedb_30][276000]XNorm: 22.577866 Training: 2021-03-18 11:10:08,014-[agedb_30][276000]Accuracy-Flip: 0.98217+-0.00683 Training: 2021-03-18 11:10:08,014-[agedb_30][276000]Accuracy-Highest: 0.98333 Training: 2021-03-18 11:10:19,282-Speed 588.32 samples/sec Loss 0.8671 Epoch: 16 Global Step: 276050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:10:30,469-Speed 4576.82 samples/sec Loss 0.8581 Epoch: 16 Global Step: 276100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:10:42,624-Speed 4212.57 samples/sec Loss 0.8718 Epoch: 16 Global Step: 276150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:10:54,887-Speed 4175.64 samples/sec Loss 0.8620 Epoch: 16 Global Step: 276200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:11:06,239-Speed 4510.30 samples/sec Loss 0.8556 Epoch: 16 Global Step: 276250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:11:17,369-Speed 4600.98 samples/sec Loss 0.8691 Epoch: 16 Global Step: 276300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:11:28,601-Speed 4558.58 samples/sec Loss 0.8636 Epoch: 16 Global Step: 276350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:11:40,063-Speed 4467.41 samples/sec Loss 0.8623 Epoch: 16 
Global Step: 276400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:11:50,994-Speed 4683.98 samples/sec Loss 0.8625 Epoch: 16 Global Step: 276450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:12:02,169-Speed 4582.21 samples/sec Loss 0.8498 Epoch: 16 Global Step: 276500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:12:13,554-Speed 4497.24 samples/sec Loss 0.8526 Epoch: 16 Global Step: 276550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:12:25,037-Speed 4458.91 samples/sec Loss 0.8621 Epoch: 16 Global Step: 276600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:12:36,736-Speed 4376.61 samples/sec Loss 0.8690 Epoch: 16 Global Step: 276650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:12:47,935-Speed 4572.48 samples/sec Loss 0.8627 Epoch: 16 Global Step: 276700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:12:58,848-Speed 4691.63 samples/sec Loss 0.8766 Epoch: 16 Global Step: 276750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:13:10,353-Speed 4450.77 samples/sec Loss 0.8726 Epoch: 16 Global Step: 276800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:13:21,649-Speed 4532.93 samples/sec Loss 0.8677 Epoch: 16 Global Step: 276850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:13:32,748-Speed 4613.04 samples/sec Loss 0.8726 Epoch: 16 Global Step: 276900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:13:43,954-Speed 4569.43 samples/sec Loss 0.8629 Epoch: 16 Global Step: 276950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:13:54,959-Speed 4652.76 samples/sec Loss 0.8476 Epoch: 16 Global Step: 277000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:14:06,134-Speed 4581.60 samples/sec Loss 0.8604 Epoch: 16 Global Step: 277050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:14:17,404-Speed 4543.45 samples/sec Loss 0.8529 Epoch: 
16 Global Step: 277100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:14:28,558-Speed 4590.68 samples/sec Loss 0.8646 Epoch: 16 Global Step: 277150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:14:40,990-Speed 4118.67 samples/sec Loss 0.8653 Epoch: 16 Global Step: 277200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:14:52,120-Speed 4600.55 samples/sec Loss 0.8631 Epoch: 16 Global Step: 277250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:15:04,059-Speed 4288.68 samples/sec Loss 0.8516 Epoch: 16 Global Step: 277300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:15:15,758-Speed 4376.70 samples/sec Loss 0.8604 Epoch: 16 Global Step: 277350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:15:27,025-Speed 4544.86 samples/sec Loss 0.8750 Epoch: 16 Global Step: 277400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:15:38,643-Speed 4407.24 samples/sec Loss 0.8637 Epoch: 16 Global Step: 277450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:15:49,653-Speed 4650.75 samples/sec Loss 0.8676 Epoch: 16 Global Step: 277500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:16:01,716-Speed 4244.54 samples/sec Loss 0.8740 Epoch: 16 Global Step: 277550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:16:12,657-Speed 4680.12 samples/sec Loss 0.8600 Epoch: 16 Global Step: 277600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:16:24,028-Speed 4503.06 samples/sec Loss 0.8776 Epoch: 16 Global Step: 277650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:16:35,125-Speed 4614.13 samples/sec Loss 0.8526 Epoch: 16 Global Step: 277700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:16:47,130-Speed 4264.95 samples/sec Loss 0.8654 Epoch: 16 Global Step: 277750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:16:58,490-Speed 4507.48 samples/sec Loss 0.8693 
Epoch: 16 Global Step: 277800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:17:09,886-Speed 4492.78 samples/sec Loss 0.8725 Epoch: 16 Global Step: 277850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:17:21,161-Speed 4541.51 samples/sec Loss 0.8791 Epoch: 16 Global Step: 277900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:17:32,278-Speed 4605.84 samples/sec Loss 0.8510 Epoch: 16 Global Step: 277950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:17:43,624-Speed 4513.38 samples/sec Loss 0.8752 Epoch: 16 Global Step: 278000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:18:07,592-[lfw][278000]XNorm: 22.523660 Training: 2021-03-18 11:18:07,592-[lfw][278000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 11:18:07,593-[lfw][278000]Accuracy-Highest: 0.99817 Training: 2021-03-18 11:18:35,109-[cfp_fp][278000]XNorm: 20.924043 Training: 2021-03-18 11:18:35,109-[cfp_fp][278000]Accuracy-Flip: 0.98786+-0.00492 Training: 2021-03-18 11:18:35,109-[cfp_fp][278000]Accuracy-Highest: 0.98786 Training: 2021-03-18 11:18:58,993-[agedb_30][278000]XNorm: 22.643106 Training: 2021-03-18 11:18:58,994-[agedb_30][278000]Accuracy-Flip: 0.98300+-0.00623 Training: 2021-03-18 11:18:58,994-[agedb_30][278000]Accuracy-Highest: 0.98333 Training: 2021-03-18 11:19:10,254-Speed 591.02 samples/sec Loss 0.8704 Epoch: 16 Global Step: 278050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:19:21,582-Speed 4519.97 samples/sec Loss 0.8776 Epoch: 16 Global Step: 278100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:19:33,278-Speed 4377.97 samples/sec Loss 0.8691 Epoch: 16 Global Step: 278150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:19:44,424-Speed 4593.50 samples/sec Loss 0.8709 Epoch: 16 Global Step: 278200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:19:55,700-Speed 4540.93 samples/sec Loss 0.8583 Epoch: 16 Global Step: 278250 Fp16 Grad 
Scale: 16384 Required: 4 hours Training: 2021-03-18 11:20:07,133-Speed 4478.62 samples/sec Loss 0.8588 Epoch: 16 Global Step: 278300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:20:18,120-Speed 4660.42 samples/sec Loss 0.8454 Epoch: 16 Global Step: 278350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:20:29,754-Speed 4401.08 samples/sec Loss 0.8651 Epoch: 16 Global Step: 278400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:20:40,858-Speed 4611.18 samples/sec Loss 0.8658 Epoch: 16 Global Step: 278450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:20:52,000-Speed 4595.67 samples/sec Loss 0.8575 Epoch: 16 Global Step: 278500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:21:03,130-Speed 4600.43 samples/sec Loss 0.8667 Epoch: 16 Global Step: 278550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:21:14,157-Speed 4643.59 samples/sec Loss 0.8646 Epoch: 16 Global Step: 278600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:21:25,517-Speed 4507.04 samples/sec Loss 0.8694 Epoch: 16 Global Step: 278650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:21:36,776-Speed 4547.89 samples/sec Loss 0.8670 Epoch: 16 Global Step: 278700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:21:48,697-Speed 4295.33 samples/sec Loss 0.8512 Epoch: 16 Global Step: 278750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:21:59,993-Speed 4532.77 samples/sec Loss 0.8682 Epoch: 16 Global Step: 278800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:22:11,011-Speed 4647.28 samples/sec Loss 0.8677 Epoch: 16 Global Step: 278850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:22:22,179-Speed 4584.68 samples/sec Loss 0.8542 Epoch: 16 Global Step: 278900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:22:34,065-Speed 4308.08 samples/sec Loss 0.8715 Epoch: 16 Global Step: 278950 Fp16 
Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:22:46,176-Speed 4227.66 samples/sec Loss 0.8726 Epoch: 16 Global Step: 279000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:22:57,611-Speed 4477.46 samples/sec Loss 0.8776 Epoch: 16 Global Step: 279050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:23:08,802-Speed 4575.40 samples/sec Loss 0.8635 Epoch: 16 Global Step: 279100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:23:19,997-Speed 4573.98 samples/sec Loss 0.8586 Epoch: 16 Global Step: 279150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:23:31,222-Speed 4561.45 samples/sec Loss 0.8589 Epoch: 16 Global Step: 279200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:23:42,786-Speed 4427.65 samples/sec Loss 0.8596 Epoch: 16 Global Step: 279250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:23:54,121-Speed 4517.31 samples/sec Loss 0.8689 Epoch: 16 Global Step: 279300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:24:05,331-Speed 4567.90 samples/sec Loss 0.8630 Epoch: 16 Global Step: 279350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:24:16,523-Speed 4574.88 samples/sec Loss 0.8599 Epoch: 16 Global Step: 279400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:24:28,196-Speed 4386.35 samples/sec Loss 0.8586 Epoch: 16 Global Step: 279450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:24:39,016-Speed 4732.45 samples/sec Loss 0.8720 Epoch: 16 Global Step: 279500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:24:51,354-Speed 4149.81 samples/sec Loss 0.8725 Epoch: 16 Global Step: 279550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:25:02,339-Speed 4661.09 samples/sec Loss 0.8598 Epoch: 16 Global Step: 279600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:25:13,788-Speed 4472.42 samples/sec Loss 0.8771 Epoch: 16 Global Step: 279650 
Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:25:24,866-Speed 4622.02 samples/sec Loss 0.8507 Epoch: 16 Global Step: 279700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:25:35,964-Speed 4614.08 samples/sec Loss 0.8704 Epoch: 16 Global Step: 279750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:25:47,298-Speed 4517.49 samples/sec Loss 0.8678 Epoch: 16 Global Step: 279800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:25:58,443-Speed 4594.33 samples/sec Loss 0.8812 Epoch: 16 Global Step: 279850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:26:09,396-Speed 4674.80 samples/sec Loss 0.8696 Epoch: 16 Global Step: 279900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:26:20,788-Speed 4494.73 samples/sec Loss 0.8714 Epoch: 16 Global Step: 279950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:26:31,918-Speed 4600.46 samples/sec Loss 0.8570 Epoch: 16 Global Step: 280000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:26:56,551-[lfw][280000]XNorm: 22.606756 Training: 2021-03-18 11:26:56,552-[lfw][280000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-18 11:26:56,552-[lfw][280000]Accuracy-Highest: 0.99817 Training: 2021-03-18 11:27:24,285-[cfp_fp][280000]XNorm: 20.954332 Training: 2021-03-18 11:27:24,285-[cfp_fp][280000]Accuracy-Flip: 0.98700+-0.00435 Training: 2021-03-18 11:27:24,285-[cfp_fp][280000]Accuracy-Highest: 0.98786 Training: 2021-03-18 11:27:48,405-[agedb_30][280000]XNorm: 22.696281 Training: 2021-03-18 11:27:48,405-[agedb_30][280000]Accuracy-Flip: 0.98167+-0.00675 Training: 2021-03-18 11:27:48,405-[agedb_30][280000]Accuracy-Highest: 0.98333 Training: 2021-03-18 11:27:59,815-Speed 582.51 samples/sec Loss 0.8690 Epoch: 16 Global Step: 280050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:28:11,732-Speed 4296.71 samples/sec Loss 0.8580 Epoch: 16 Global Step: 280100 Fp16 Grad Scale: 16384 Required: 4 hours 
Training: 2021-03-18 11:28:23,916-Speed 4202.28 samples/sec Loss 0.8645 Epoch: 16 Global Step: 280150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:28:35,513-Speed 4415.32 samples/sec Loss 0.8637 Epoch: 16 Global Step: 280200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:28:46,861-Speed 4512.14 samples/sec Loss 0.8557 Epoch: 16 Global Step: 280250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:28:57,993-Speed 4599.43 samples/sec Loss 0.8655 Epoch: 16 Global Step: 280300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:29:09,243-Speed 4551.46 samples/sec Loss 0.8491 Epoch: 16 Global Step: 280350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:29:21,060-Speed 4333.15 samples/sec Loss 0.8785 Epoch: 16 Global Step: 280400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:29:32,365-Speed 4529.02 samples/sec Loss 0.8633 Epoch: 16 Global Step: 280450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:29:43,573-Speed 4568.68 samples/sec Loss 0.8561 Epoch: 16 Global Step: 280500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:29:55,569-Speed 4268.20 samples/sec Loss 0.8653 Epoch: 16 Global Step: 280550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:30:06,512-Speed 4679.05 samples/sec Loss 0.8704 Epoch: 16 Global Step: 280600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:30:17,653-Speed 4595.99 samples/sec Loss 0.8472 Epoch: 16 Global Step: 280650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:30:29,044-Speed 4494.79 samples/sec Loss 0.8585 Epoch: 16 Global Step: 280700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:30:40,134-Speed 4617.20 samples/sec Loss 0.8545 Epoch: 16 Global Step: 280750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:30:51,653-Speed 4445.13 samples/sec Loss 0.8658 Epoch: 16 Global Step: 280800 Fp16 Grad Scale: 16384 Required: 4 
hours Training: 2021-03-18 11:31:03,080-Speed 4480.93 samples/sec Loss 0.8476 Epoch: 16 Global Step: 280850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:31:14,434-Speed 4509.60 samples/sec Loss 0.8674 Epoch: 16 Global Step: 280900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:31:25,746-Speed 4526.51 samples/sec Loss 0.8616 Epoch: 16 Global Step: 280950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:31:37,332-Speed 4419.51 samples/sec Loss 0.8695 Epoch: 16 Global Step: 281000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:31:48,535-Speed 4570.59 samples/sec Loss 0.8590 Epoch: 16 Global Step: 281050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:31:59,808-Speed 4541.89 samples/sec Loss 0.8636 Epoch: 16 Global Step: 281100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:32:10,908-Speed 4612.85 samples/sec Loss 0.8680 Epoch: 16 Global Step: 281150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:32:22,149-Speed 4554.86 samples/sec Loss 0.8536 Epoch: 16 Global Step: 281200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:32:33,395-Speed 4553.02 samples/sec Loss 0.8522 Epoch: 16 Global Step: 281250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:32:45,052-Speed 4392.65 samples/sec Loss 0.8647 Epoch: 16 Global Step: 281300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:32:56,444-Speed 4494.62 samples/sec Loss 0.8536 Epoch: 16 Global Step: 281350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:33:07,572-Speed 4601.46 samples/sec Loss 0.8721 Epoch: 16 Global Step: 281400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:33:19,089-Speed 4445.59 samples/sec Loss 0.8609 Epoch: 16 Global Step: 281450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:33:30,278-Speed 4576.35 samples/sec Loss 0.8647 Epoch: 16 Global Step: 281500 Fp16 Grad Scale: 16384 Required: 
4 hours Training: 2021-03-18 11:33:41,740-Speed 4467.46 samples/sec Loss 0.8533 Epoch: 16 Global Step: 281550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:33:53,922-Speed 4202.94 samples/sec Loss 0.8708 Epoch: 16 Global Step: 281600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:34:05,305-Speed 4498.40 samples/sec Loss 0.8545 Epoch: 16 Global Step: 281650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:34:16,526-Speed 4563.14 samples/sec Loss 0.8787 Epoch: 16 Global Step: 281700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:34:27,853-Speed 4520.27 samples/sec Loss 0.8624 Epoch: 16 Global Step: 281750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:34:39,788-Speed 4290.09 samples/sec Loss 0.8593 Epoch: 16 Global Step: 281800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:34:52,100-Speed 4158.73 samples/sec Loss 0.8622 Epoch: 16 Global Step: 281850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:35:03,442-Speed 4514.64 samples/sec Loss 0.8596 Epoch: 16 Global Step: 281900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:35:14,742-Speed 4531.16 samples/sec Loss 0.8603 Epoch: 16 Global Step: 281950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:35:25,996-Speed 4550.07 samples/sec Loss 0.8584 Epoch: 16 Global Step: 282000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:35:50,377-[lfw][282000]XNorm: 22.432992 Training: 2021-03-18 11:35:50,378-[lfw][282000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 11:35:50,378-[lfw][282000]Accuracy-Highest: 0.99817 Training: 2021-03-18 11:36:18,165-[cfp_fp][282000]XNorm: 20.886162 Training: 2021-03-18 11:36:18,165-[cfp_fp][282000]Accuracy-Flip: 0.98686+-0.00502 Training: 2021-03-18 11:36:18,165-[cfp_fp][282000]Accuracy-Highest: 0.98786 Training: 2021-03-18 11:36:41,981-[agedb_30][282000]XNorm: 22.573342 Training: 2021-03-18 
11:36:41,981-[agedb_30][282000]Accuracy-Flip: 0.98250+-0.00684 Training: 2021-03-18 11:36:41,981-[agedb_30][282000]Accuracy-Highest: 0.98333 Training: 2021-03-18 11:36:53,342-Speed 586.17 samples/sec Loss 0.8534 Epoch: 16 Global Step: 282050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:37:04,646-Speed 4529.63 samples/sec Loss 0.8703 Epoch: 16 Global Step: 282100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:37:16,039-Speed 4494.31 samples/sec Loss 0.8526 Epoch: 16 Global Step: 282150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:37:27,258-Speed 4564.03 samples/sec Loss 0.8548 Epoch: 16 Global Step: 282200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:37:38,571-Speed 4526.01 samples/sec Loss 0.8776 Epoch: 16 Global Step: 282250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:37:49,778-Speed 4568.85 samples/sec Loss 0.8603 Epoch: 16 Global Step: 282300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:38:01,309-Speed 4440.48 samples/sec Loss 0.8724 Epoch: 16 Global Step: 282350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:38:12,460-Speed 4591.57 samples/sec Loss 0.8681 Epoch: 16 Global Step: 282400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:38:24,466-Speed 4264.74 samples/sec Loss 0.8565 Epoch: 16 Global Step: 282450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:38:35,943-Speed 4461.60 samples/sec Loss 0.8447 Epoch: 16 Global Step: 282500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:38:47,484-Speed 4436.50 samples/sec Loss 0.8601 Epoch: 16 Global Step: 282550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:38:58,920-Speed 4477.40 samples/sec Loss 0.8735 Epoch: 16 Global Step: 282600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:39:10,449-Speed 4441.45 samples/sec Loss 0.8679 Epoch: 16 Global Step: 282650 Fp16 Grad Scale: 16384 Required: 4 hours 
Training: 2021-03-18 11:39:21,740-Speed 4534.79 samples/sec Loss 0.8630 Epoch: 16 Global Step: 282700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:39:33,072-Speed 4518.46 samples/sec Loss 0.8500 Epoch: 16 Global Step: 282750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:39:44,212-Speed 4596.54 samples/sec Loss 0.8681 Epoch: 16 Global Step: 282800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:39:55,321-Speed 4609.19 samples/sec Loss 0.8583 Epoch: 16 Global Step: 282850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:40:06,452-Speed 4599.88 samples/sec Loss 0.8639 Epoch: 16 Global Step: 282900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:40:18,852-Speed 4129.43 samples/sec Loss 0.8577 Epoch: 16 Global Step: 282950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:40:30,891-Speed 4252.97 samples/sec Loss 0.8600 Epoch: 16 Global Step: 283000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:40:41,897-Speed 4652.59 samples/sec Loss 0.8704 Epoch: 16 Global Step: 283050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:40:53,106-Speed 4568.04 samples/sec Loss 0.8592 Epoch: 16 Global Step: 283100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:41:04,290-Speed 4578.01 samples/sec Loss 0.8694 Epoch: 16 Global Step: 283150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:41:16,494-Speed 4195.85 samples/sec Loss 0.8776 Epoch: 16 Global Step: 283200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:41:27,694-Speed 4571.76 samples/sec Loss 0.8638 Epoch: 16 Global Step: 283250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:41:39,065-Speed 4502.70 samples/sec Loss 0.8613 Epoch: 16 Global Step: 283300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:41:50,309-Speed 4553.97 samples/sec Loss 0.8581 Epoch: 16 Global Step: 283350 Fp16 Grad Scale: 16384 Required: 4 
hours Training: 2021-03-18 11:42:02,123-Speed 4334.02 samples/sec Loss 0.8722 Epoch: 16 Global Step: 283400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:42:13,257-Speed 4598.77 samples/sec Loss 0.8657 Epoch: 16 Global Step: 283450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:42:24,787-Speed 4440.90 samples/sec Loss 0.8697 Epoch: 16 Global Step: 283500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:42:36,135-Speed 4512.17 samples/sec Loss 0.8755 Epoch: 16 Global Step: 283550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:42:47,439-Speed 4529.62 samples/sec Loss 0.8650 Epoch: 16 Global Step: 283600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:42:58,594-Speed 4590.16 samples/sec Loss 0.8488 Epoch: 16 Global Step: 283650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:43:09,976-Speed 4498.63 samples/sec Loss 0.8580 Epoch: 16 Global Step: 283700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:43:33,326-Speed 2192.80 samples/sec Loss 0.8619 Epoch: 17 Global Step: 283750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:43:44,612-Speed 4536.99 samples/sec Loss 0.8472 Epoch: 17 Global Step: 283800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:43:55,970-Speed 4507.86 samples/sec Loss 0.8432 Epoch: 17 Global Step: 283850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:44:07,198-Speed 4560.61 samples/sec Loss 0.8555 Epoch: 17 Global Step: 283900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:44:18,442-Speed 4553.83 samples/sec Loss 0.8539 Epoch: 17 Global Step: 283950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:44:29,775-Speed 4517.89 samples/sec Loss 0.8352 Epoch: 17 Global Step: 284000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:44:54,442-[lfw][284000]XNorm: 22.508358 Training: 2021-03-18 11:44:54,442-[lfw][284000]Accuracy-Flip: 
0.99800+-0.00296 Training: 2021-03-18 11:44:54,442-[lfw][284000]Accuracy-Highest: 0.99817 Training: 2021-03-18 11:45:22,218-[cfp_fp][284000]XNorm: 20.924119 Training: 2021-03-18 11:45:22,219-[cfp_fp][284000]Accuracy-Flip: 0.98729+-0.00458 Training: 2021-03-18 11:45:22,219-[cfp_fp][284000]Accuracy-Highest: 0.98786 Training: 2021-03-18 11:45:46,092-[agedb_30][284000]XNorm: 22.639929 Training: 2021-03-18 11:45:46,092-[agedb_30][284000]Accuracy-Flip: 0.98167+-0.00675 Training: 2021-03-18 11:45:46,092-[agedb_30][284000]Accuracy-Highest: 0.98333 Training: 2021-03-18 11:45:57,350-Speed 584.65 samples/sec Loss 0.8411 Epoch: 17 Global Step: 284050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:46:08,596-Speed 4553.00 samples/sec Loss 0.8602 Epoch: 17 Global Step: 284100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:46:19,734-Speed 4597.22 samples/sec Loss 0.8442 Epoch: 17 Global Step: 284150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:46:30,730-Speed 4656.33 samples/sec Loss 0.8492 Epoch: 17 Global Step: 284200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:46:42,155-Speed 4481.89 samples/sec Loss 0.8485 Epoch: 17 Global Step: 284250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:46:53,451-Speed 4532.97 samples/sec Loss 0.8393 Epoch: 17 Global Step: 284300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:47:04,606-Speed 4590.10 samples/sec Loss 0.8502 Epoch: 17 Global Step: 284350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:47:16,709-Speed 4230.55 samples/sec Loss 0.8424 Epoch: 17 Global Step: 284400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:47:27,947-Speed 4556.43 samples/sec Loss 0.8470 Epoch: 17 Global Step: 284450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:47:38,846-Speed 4697.66 samples/sec Loss 0.8376 Epoch: 17 Global Step: 284500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 
11:47:49,911-Speed 4627.86 samples/sec Loss 0.8510 Epoch: 17 Global Step: 284550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:48:00,989-Speed 4622.07 samples/sec Loss 0.8474 Epoch: 17 Global Step: 284600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:48:14,211-Speed 3872.45 samples/sec Loss 0.8411 Epoch: 17 Global Step: 284650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:48:25,513-Speed 4530.60 samples/sec Loss 0.8407 Epoch: 17 Global Step: 284700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:48:36,834-Speed 4522.93 samples/sec Loss 0.8515 Epoch: 17 Global Step: 284750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:48:48,074-Speed 4555.48 samples/sec Loss 0.8492 Epoch: 17 Global Step: 284800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:48:59,468-Speed 4493.97 samples/sec Loss 0.8326 Epoch: 17 Global Step: 284850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:49:10,816-Speed 4512.20 samples/sec Loss 0.8319 Epoch: 17 Global Step: 284900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:49:22,031-Speed 4565.45 samples/sec Loss 0.8458 Epoch: 17 Global Step: 284950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:49:33,109-Speed 4622.12 samples/sec Loss 0.8679 Epoch: 17 Global Step: 285000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:49:44,476-Speed 4504.51 samples/sec Loss 0.8446 Epoch: 17 Global Step: 285050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:49:55,573-Speed 4614.16 samples/sec Loss 0.8426 Epoch: 17 Global Step: 285100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:50:07,022-Speed 4472.50 samples/sec Loss 0.8525 Epoch: 17 Global Step: 285150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 11:50:18,179-Speed 4589.57 samples/sec Loss 0.8611 Epoch: 17 Global Step: 285200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 
2021-03-18 11:50:29,034-Speed 4716.98 samples/sec Loss 0.8401 Epoch: 17 Global Step: 285250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:50:41,135-Speed 4231.38 samples/sec Loss 0.8590 Epoch: 17 Global Step: 285300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:50:52,469-Speed 4517.53 samples/sec Loss 0.8674 Epoch: 17 Global Step: 285350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:51:03,495-Speed 4643.83 samples/sec Loss 0.8272 Epoch: 17 Global Step: 285400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:51:14,414-Speed 4689.51 samples/sec Loss 0.8439 Epoch: 17 Global Step: 285450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:51:25,456-Speed 4637.06 samples/sec Loss 0.8451 Epoch: 17 Global Step: 285500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:51:36,924-Speed 4464.84 samples/sec Loss 0.8408 Epoch: 17 Global Step: 285550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:51:48,434-Speed 4448.56 samples/sec Loss 0.8414 Epoch: 17 Global Step: 285600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:51:59,898-Speed 4466.42 samples/sec Loss 0.8346 Epoch: 17 Global Step: 285650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:52:11,074-Speed 4581.80 samples/sec Loss 0.8333 Epoch: 17 Global Step: 285700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:52:22,064-Speed 4658.97 samples/sec Loss 0.8488 Epoch: 17 Global Step: 285750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:52:34,147-Speed 4237.52 samples/sec Loss 0.8551 Epoch: 17 Global Step: 285800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:52:46,103-Speed 4282.65 samples/sec Loss 0.8307 Epoch: 17 Global Step: 285850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:52:57,038-Speed 4682.55 samples/sec Loss 0.8632 Epoch: 17 Global Step: 285900 Fp16 Grad Scale: 16384 Required: 3 hours 
Training: 2021-03-18 11:53:08,008-Speed 4667.51 samples/sec Loss 0.8578 Epoch: 17 Global Step: 285950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:53:19,305-Speed 4532.83 samples/sec Loss 0.8330 Epoch: 17 Global Step: 286000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:53:43,721-[lfw][286000]XNorm: 22.526878 Training: 2021-03-18 11:53:43,721-[lfw][286000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 11:53:43,721-[lfw][286000]Accuracy-Highest: 0.99817 Training: 2021-03-18 11:54:11,240-[cfp_fp][286000]XNorm: 20.965243 Training: 2021-03-18 11:54:11,241-[cfp_fp][286000]Accuracy-Flip: 0.98586+-0.00480 Training: 2021-03-18 11:54:11,241-[cfp_fp][286000]Accuracy-Highest: 0.98786 Training: 2021-03-18 11:54:35,322-[agedb_30][286000]XNorm: 22.674527 Training: 2021-03-18 11:54:35,322-[agedb_30][286000]Accuracy-Flip: 0.98217+-0.00654 Training: 2021-03-18 11:54:35,322-[agedb_30][286000]Accuracy-Highest: 0.98333 Training: 2021-03-18 11:54:47,434-Speed 580.97 samples/sec Loss 0.8543 Epoch: 17 Global Step: 286050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:54:58,610-Speed 4581.56 samples/sec Loss 0.8405 Epoch: 17 Global Step: 286100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:55:10,496-Speed 4307.64 samples/sec Loss 0.8428 Epoch: 17 Global Step: 286150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:55:21,611-Speed 4606.94 samples/sec Loss 0.8420 Epoch: 17 Global Step: 286200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:55:33,033-Speed 4482.82 samples/sec Loss 0.8523 Epoch: 17 Global Step: 286250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:55:44,260-Speed 4560.51 samples/sec Loss 0.8479 Epoch: 17 Global Step: 286300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:55:55,371-Speed 4608.38 samples/sec Loss 0.8474 Epoch: 17 Global Step: 286350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:56:07,003-Speed 
4401.89 samples/sec Loss 0.8421 Epoch: 17 Global Step: 286400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:56:18,429-Speed 4481.07 samples/sec Loss 0.8500 Epoch: 17 Global Step: 286450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:56:29,444-Speed 4648.63 samples/sec Loss 0.8447 Epoch: 17 Global Step: 286500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:56:40,838-Speed 4494.11 samples/sec Loss 0.8343 Epoch: 17 Global Step: 286550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:56:51,912-Speed 4623.47 samples/sec Loss 0.8432 Epoch: 17 Global Step: 286600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:57:03,055-Speed 4595.08 samples/sec Loss 0.8385 Epoch: 17 Global Step: 286650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:57:14,512-Speed 4469.34 samples/sec Loss 0.8383 Epoch: 17 Global Step: 286700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:57:25,833-Speed 4522.72 samples/sec Loss 0.8496 Epoch: 17 Global Step: 286750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:57:37,306-Speed 4462.59 samples/sec Loss 0.8587 Epoch: 17 Global Step: 286800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:57:48,595-Speed 4535.83 samples/sec Loss 0.8524 Epoch: 17 Global Step: 286850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:57:59,860-Speed 4545.30 samples/sec Loss 0.8322 Epoch: 17 Global Step: 286900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:58:11,131-Speed 4543.02 samples/sec Loss 0.8342 Epoch: 17 Global Step: 286950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:58:22,604-Speed 4462.86 samples/sec Loss 0.8512 Epoch: 17 Global Step: 287000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:58:33,954-Speed 4511.16 samples/sec Loss 0.8404 Epoch: 17 Global Step: 287050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 
11:58:45,268-Speed 4525.80 samples/sec Loss 0.8526 Epoch: 17 Global Step: 287100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:58:57,262-Speed 4268.88 samples/sec Loss 0.8432 Epoch: 17 Global Step: 287150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:59:08,591-Speed 4519.97 samples/sec Loss 0.8508 Epoch: 17 Global Step: 287200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:59:19,866-Speed 4541.44 samples/sec Loss 0.8527 Epoch: 17 Global Step: 287250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:59:31,051-Speed 4577.46 samples/sec Loss 0.8523 Epoch: 17 Global Step: 287300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:59:42,330-Speed 4540.09 samples/sec Loss 0.8355 Epoch: 17 Global Step: 287350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 11:59:53,737-Speed 4488.68 samples/sec Loss 0.8440 Epoch: 17 Global Step: 287400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:00:05,134-Speed 4492.79 samples/sec Loss 0.8392 Epoch: 17 Global Step: 287450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:00:17,545-Speed 4125.40 samples/sec Loss 0.8426 Epoch: 17 Global Step: 287500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:00:29,331-Speed 4344.54 samples/sec Loss 0.8423 Epoch: 17 Global Step: 287550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:00:40,681-Speed 4511.23 samples/sec Loss 0.8325 Epoch: 17 Global Step: 287600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:00:52,174-Speed 4455.33 samples/sec Loss 0.8352 Epoch: 17 Global Step: 287650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:01:03,267-Speed 4615.64 samples/sec Loss 0.8501 Epoch: 17 Global Step: 287700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:01:14,826-Speed 4429.75 samples/sec Loss 0.8506 Epoch: 17 Global Step: 287750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 
2021-03-18 12:01:25,757-Speed 4684.04 samples/sec Loss 0.8465 Epoch: 17 Global Step: 287800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:01:37,505-Speed 4358.71 samples/sec Loss 0.8498 Epoch: 17 Global Step: 287850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:01:48,625-Speed 4604.49 samples/sec Loss 0.8396 Epoch: 17 Global Step: 287900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:01:59,682-Speed 4630.54 samples/sec Loss 0.8409 Epoch: 17 Global Step: 287950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:02:10,889-Speed 4569.13 samples/sec Loss 0.8472 Epoch: 17 Global Step: 288000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:02:35,086-[lfw][288000]XNorm: 22.424462 Training: 2021-03-18 12:02:35,086-[lfw][288000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 12:02:35,086-[lfw][288000]Accuracy-Highest: 0.99817 Training: 2021-03-18 12:03:02,261-[cfp_fp][288000]XNorm: 20.884104 Training: 2021-03-18 12:03:02,261-[cfp_fp][288000]Accuracy-Flip: 0.98586+-0.00509 Training: 2021-03-18 12:03:02,261-[cfp_fp][288000]Accuracy-Highest: 0.98786 Training: 2021-03-18 12:03:25,518-[agedb_30][288000]XNorm: 22.582234 Training: 2021-03-18 12:03:25,519-[agedb_30][288000]Accuracy-Flip: 0.98200+-0.00670 Training: 2021-03-18 12:03:25,519-[agedb_30][288000]Accuracy-Highest: 0.98333 Training: 2021-03-18 12:03:36,774-Speed 596.15 samples/sec Loss 0.8292 Epoch: 17 Global Step: 288050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:03:48,066-Speed 4534.14 samples/sec Loss 0.8382 Epoch: 17 Global Step: 288100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:03:59,945-Speed 4310.34 samples/sec Loss 0.8337 Epoch: 17 Global Step: 288150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:04:11,147-Speed 4571.05 samples/sec Loss 0.8395 Epoch: 17 Global Step: 288200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:04:22,595-Speed 4473.07 
samples/sec Loss 0.8398 Epoch: 17 Global Step: 288250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:04:33,767-Speed 4583.43 samples/sec Loss 0.8544 Epoch: 17 Global Step: 288300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:04:45,205-Speed 4476.54 samples/sec Loss 0.8122 Epoch: 17 Global Step: 288350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:04:56,323-Speed 4605.63 samples/sec Loss 0.8452 Epoch: 17 Global Step: 288400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:05:07,669-Speed 4512.64 samples/sec Loss 0.8471 Epoch: 17 Global Step: 288450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:05:18,895-Speed 4561.14 samples/sec Loss 0.8466 Epoch: 17 Global Step: 288500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:05:30,845-Speed 4284.92 samples/sec Loss 0.8438 Epoch: 17 Global Step: 288550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:05:42,032-Speed 4577.11 samples/sec Loss 0.8617 Epoch: 17 Global Step: 288600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:05:54,137-Speed 4229.77 samples/sec Loss 0.8522 Epoch: 17 Global Step: 288650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:06:05,649-Speed 4448.00 samples/sec Loss 0.8426 Epoch: 17 Global Step: 288700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:06:16,620-Speed 4666.88 samples/sec Loss 0.8443 Epoch: 17 Global Step: 288750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:06:27,693-Speed 4624.06 samples/sec Loss 0.8442 Epoch: 17 Global Step: 288800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:06:39,490-Speed 4340.45 samples/sec Loss 0.8364 Epoch: 17 Global Step: 288850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:06:50,484-Speed 4657.73 samples/sec Loss 0.8320 Epoch: 17 Global Step: 288900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:07:02,442-Speed 
4281.92 samples/sec Loss 0.8287 Epoch: 17 Global Step: 288950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:07:13,739-Speed 4532.52 samples/sec Loss 0.8463 Epoch: 17 Global Step: 289000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:07:24,532-Speed 4744.00 samples/sec Loss 0.8318 Epoch: 17 Global Step: 289050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:07:35,480-Speed 4677.03 samples/sec Loss 0.8601 Epoch: 17 Global Step: 289100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:07:46,553-Speed 4624.34 samples/sec Loss 0.8503 Epoch: 17 Global Step: 289150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:07:57,657-Speed 4611.29 samples/sec Loss 0.8246 Epoch: 17 Global Step: 289200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:08:09,157-Speed 4452.33 samples/sec Loss 0.8404 Epoch: 17 Global Step: 289250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:08:20,326-Speed 4584.67 samples/sec Loss 0.8367 Epoch: 17 Global Step: 289300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:08:31,497-Speed 4583.50 samples/sec Loss 0.8534 Epoch: 17 Global Step: 289350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:08:42,566-Speed 4625.76 samples/sec Loss 0.8377 Epoch: 17 Global Step: 289400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:08:53,680-Speed 4607.17 samples/sec Loss 0.8596 Epoch: 17 Global Step: 289450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:09:04,646-Speed 4669.43 samples/sec Loss 0.8256 Epoch: 17 Global Step: 289500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:09:15,668-Speed 4645.56 samples/sec Loss 0.8540 Epoch: 17 Global Step: 289550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:09:27,058-Speed 4495.42 samples/sec Loss 0.8341 Epoch: 17 Global Step: 289600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 
12:09:38,342-Speed 4537.70 samples/sec Loss 0.8458 Epoch: 17 Global Step: 289650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:09:49,404-Speed 4628.51 samples/sec Loss 0.8458 Epoch: 17 Global Step: 289700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:10:00,490-Speed 4618.84 samples/sec Loss 0.8510 Epoch: 17 Global Step: 289750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:10:11,868-Speed 4499.95 samples/sec Loss 0.8502 Epoch: 17 Global Step: 289800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:10:23,293-Speed 4481.97 samples/sec Loss 0.8456 Epoch: 17 Global Step: 289850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:10:34,554-Speed 4547.00 samples/sec Loss 0.8369 Epoch: 17 Global Step: 289900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:10:46,015-Speed 4467.41 samples/sec Loss 0.8502 Epoch: 17 Global Step: 289950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:10:57,247-Speed 4558.64 samples/sec Loss 0.8417 Epoch: 17 Global Step: 290000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:11:21,558-[lfw][290000]XNorm: 22.466254 Training: 2021-03-18 12:11:21,559-[lfw][290000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 12:11:21,559-[lfw][290000]Accuracy-Highest: 0.99817 Training: 2021-03-18 12:11:51,165-[cfp_fp][290000]XNorm: 20.902265 Training: 2021-03-18 12:11:51,166-[cfp_fp][290000]Accuracy-Flip: 0.98586+-0.00505 Training: 2021-03-18 12:11:51,166-[cfp_fp][290000]Accuracy-Highest: 0.98786 Training: 2021-03-18 12:12:16,538-[agedb_30][290000]XNorm: 22.563154 Training: 2021-03-18 12:12:16,538-[agedb_30][290000]Accuracy-Flip: 0.98133+-0.00618 Training: 2021-03-18 12:12:16,538-[agedb_30][290000]Accuracy-Highest: 0.98333 Training: 2021-03-18 12:12:27,743-Speed 565.77 samples/sec Loss 0.8401 Epoch: 17 Global Step: 290050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:12:39,562-Speed 4332.40 samples/sec 
Loss 0.8370 Epoch: 17 Global Step: 290100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:12:50,619-Speed 4630.97 samples/sec Loss 0.8532 Epoch: 17 Global Step: 290150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:13:01,609-Speed 4659.69 samples/sec Loss 0.8405 Epoch: 17 Global Step: 290200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:13:12,903-Speed 4534.15 samples/sec Loss 0.8672 Epoch: 17 Global Step: 290250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:13:24,146-Speed 4554.26 samples/sec Loss 0.8453 Epoch: 17 Global Step: 290300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:13:36,192-Speed 4250.60 samples/sec Loss 0.8412 Epoch: 17 Global Step: 290350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:13:48,025-Speed 4327.02 samples/sec Loss 0.8380 Epoch: 17 Global Step: 290400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:13:58,997-Speed 4667.04 samples/sec Loss 0.8501 Epoch: 17 Global Step: 290450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:14:10,444-Speed 4473.20 samples/sec Loss 0.8428 Epoch: 17 Global Step: 290500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:14:21,797-Speed 4510.00 samples/sec Loss 0.8629 Epoch: 17 Global Step: 290550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:14:33,292-Speed 4454.46 samples/sec Loss 0.8412 Epoch: 17 Global Step: 290600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:14:44,448-Speed 4589.72 samples/sec Loss 0.8428 Epoch: 17 Global Step: 290650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:14:55,594-Speed 4593.75 samples/sec Loss 0.8215 Epoch: 17 Global Step: 290700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:15:06,827-Speed 4558.27 samples/sec Loss 0.8439 Epoch: 17 Global Step: 290750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:15:18,204-Speed 4500.83 
samples/sec Loss 0.8399 Epoch: 17 Global Step: 290800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:15:29,308-Speed 4611.48 samples/sec Loss 0.8605 Epoch: 17 Global Step: 290850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:15:40,626-Speed 4524.02 samples/sec Loss 0.8328 Epoch: 17 Global Step: 290900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:15:52,765-Speed 4218.13 samples/sec Loss 0.8501 Epoch: 17 Global Step: 290950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:16:03,740-Speed 4665.39 samples/sec Loss 0.8354 Epoch: 17 Global Step: 291000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:16:14,927-Speed 4576.84 samples/sec Loss 0.8370 Epoch: 17 Global Step: 291050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:16:26,027-Speed 4613.14 samples/sec Loss 0.8475 Epoch: 17 Global Step: 291100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:16:37,204-Speed 4581.24 samples/sec Loss 0.8265 Epoch: 17 Global Step: 291150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:16:48,639-Speed 4477.66 samples/sec Loss 0.8394 Epoch: 17 Global Step: 291200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:17:00,102-Speed 4466.81 samples/sec Loss 0.8368 Epoch: 17 Global Step: 291250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:17:12,155-Speed 4248.16 samples/sec Loss 0.8404 Epoch: 17 Global Step: 291300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:17:23,564-Speed 4487.88 samples/sec Loss 0.8328 Epoch: 17 Global Step: 291350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:17:34,806-Speed 4554.78 samples/sec Loss 0.8449 Epoch: 17 Global Step: 291400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:17:46,904-Speed 4232.29 samples/sec Loss 0.8368 Epoch: 17 Global Step: 291450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:17:58,217-Speed 
4526.09 samples/sec Loss 0.8612 Epoch: 17 Global Step: 291500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:18:09,266-Speed 4633.94 samples/sec Loss 0.8462 Epoch: 17 Global Step: 291550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:18:21,187-Speed 4295.26 samples/sec Loss 0.8524 Epoch: 17 Global Step: 291600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:18:32,419-Speed 4558.67 samples/sec Loss 0.8402 Epoch: 17 Global Step: 291650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:18:44,292-Speed 4312.56 samples/sec Loss 0.8296 Epoch: 17 Global Step: 291700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:18:55,450-Speed 4588.87 samples/sec Loss 0.8549 Epoch: 17 Global Step: 291750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:19:06,638-Speed 4576.48 samples/sec Loss 0.8224 Epoch: 17 Global Step: 291800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:19:18,062-Speed 4482.17 samples/sec Loss 0.8429 Epoch: 17 Global Step: 291850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:19:29,340-Speed 4540.05 samples/sec Loss 0.8606 Epoch: 17 Global Step: 291900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:19:40,701-Speed 4507.03 samples/sec Loss 0.8308 Epoch: 17 Global Step: 291950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:19:51,659-Speed 4672.60 samples/sec Loss 0.8306 Epoch: 17 Global Step: 292000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:20:15,956-[lfw][292000]XNorm: 22.499940 Training: 2021-03-18 12:20:15,956-[lfw][292000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 12:20:15,956-[lfw][292000]Accuracy-Highest: 0.99817 Training: 2021-03-18 12:20:43,631-[cfp_fp][292000]XNorm: 20.899732 Training: 2021-03-18 12:20:43,632-[cfp_fp][292000]Accuracy-Flip: 0.98614+-0.00491 Training: 2021-03-18 12:20:43,632-[cfp_fp][292000]Accuracy-Highest: 0.98786 Training: 2021-03-18 
12:21:06,984-[agedb_30][292000]XNorm: 22.626598 Training: 2021-03-18 12:21:06,985-[agedb_30][292000]Accuracy-Flip: 0.98183+-0.00689 Training: 2021-03-18 12:21:06,985-[agedb_30][292000]Accuracy-Highest: 0.98333 Training: 2021-03-18 12:21:18,065-Speed 592.56 samples/sec Loss 0.8395 Epoch: 17 Global Step: 292050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:21:28,768-Speed 4783.83 samples/sec Loss 0.8600 Epoch: 17 Global Step: 292100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:21:39,939-Speed 4584.03 samples/sec Loss 0.8505 Epoch: 17 Global Step: 292150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:21:51,202-Speed 4546.69 samples/sec Loss 0.8385 Epoch: 17 Global Step: 292200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:22:02,730-Speed 4441.54 samples/sec Loss 0.8450 Epoch: 17 Global Step: 292250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:22:13,714-Speed 4661.66 samples/sec Loss 0.8554 Epoch: 17 Global Step: 292300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:22:24,984-Speed 4543.58 samples/sec Loss 0.8378 Epoch: 17 Global Step: 292350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:22:36,180-Speed 4573.37 samples/sec Loss 0.8565 Epoch: 17 Global Step: 292400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:22:47,721-Speed 4436.44 samples/sec Loss 0.8267 Epoch: 17 Global Step: 292450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:22:58,709-Speed 4659.81 samples/sec Loss 0.8462 Epoch: 17 Global Step: 292500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:23:09,606-Speed 4698.87 samples/sec Loss 0.8340 Epoch: 17 Global Step: 292550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:23:20,737-Speed 4600.12 samples/sec Loss 0.8653 Epoch: 17 Global Step: 292600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:23:31,930-Speed 4574.46 samples/sec Loss 0.8553 
Epoch: 17 Global Step: 292650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:23:43,109-Speed 4580.35 samples/sec Loss 0.8504 Epoch: 17 Global Step: 292700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:23:54,129-Speed 4646.30 samples/sec Loss 0.8457 Epoch: 17 Global Step: 292750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:24:05,238-Speed 4609.29 samples/sec Loss 0.8324 Epoch: 17 Global Step: 292800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:24:16,483-Speed 4553.27 samples/sec Loss 0.8505 Epoch: 17 Global Step: 292850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:24:27,954-Speed 4463.84 samples/sec Loss 0.8424 Epoch: 17 Global Step: 292900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:24:39,086-Speed 4600.00 samples/sec Loss 0.8435 Epoch: 17 Global Step: 292950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:24:51,209-Speed 4223.41 samples/sec Loss 0.8477 Epoch: 17 Global Step: 293000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:25:02,500-Speed 4534.82 samples/sec Loss 0.8445 Epoch: 17 Global Step: 293050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:25:13,901-Speed 4491.12 samples/sec Loss 0.8402 Epoch: 17 Global Step: 293100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:25:25,993-Speed 4234.51 samples/sec Loss 0.8501 Epoch: 17 Global Step: 293150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:25:37,494-Speed 4452.00 samples/sec Loss 0.8466 Epoch: 17 Global Step: 293200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:25:49,871-Speed 4137.00 samples/sec Loss 0.8300 Epoch: 17 Global Step: 293250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:26:00,987-Speed 4606.19 samples/sec Loss 0.8444 Epoch: 17 Global Step: 293300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:26:12,218-Speed 4559.34 samples/sec Loss 
0.8466 Epoch: 17 Global Step: 293350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:26:23,491-Speed 4542.02 samples/sec Loss 0.8396 Epoch: 17 Global Step: 293400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:26:34,655-Speed 4586.27 samples/sec Loss 0.8388 Epoch: 17 Global Step: 293450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:26:46,091-Speed 4477.51 samples/sec Loss 0.8415 Epoch: 17 Global Step: 293500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:26:57,327-Speed 4557.27 samples/sec Loss 0.8587 Epoch: 17 Global Step: 293550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:27:08,340-Speed 4649.26 samples/sec Loss 0.8450 Epoch: 17 Global Step: 293600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:27:19,674-Speed 4517.69 samples/sec Loss 0.8607 Epoch: 17 Global Step: 293650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:27:30,729-Speed 4631.68 samples/sec Loss 0.8194 Epoch: 17 Global Step: 293700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:27:41,724-Speed 4656.91 samples/sec Loss 0.8423 Epoch: 17 Global Step: 293750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:27:53,042-Speed 4523.89 samples/sec Loss 0.8420 Epoch: 17 Global Step: 293800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:28:04,281-Speed 4555.84 samples/sec Loss 0.8290 Epoch: 17 Global Step: 293850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:28:16,282-Speed 4266.62 samples/sec Loss 0.8481 Epoch: 17 Global Step: 293900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:28:27,673-Speed 4494.89 samples/sec Loss 0.8362 Epoch: 17 Global Step: 293950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:28:38,779-Speed 4610.60 samples/sec Loss 0.8505 Epoch: 17 Global Step: 294000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:29:03,095-[lfw][294000]XNorm: 
22.447373 Training: 2021-03-18 12:29:03,095-[lfw][294000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 12:29:03,095-[lfw][294000]Accuracy-Highest: 0.99817 Training: 2021-03-18 12:29:32,560-[cfp_fp][294000]XNorm: 20.883495 Training: 2021-03-18 12:29:32,561-[cfp_fp][294000]Accuracy-Flip: 0.98643+-0.00520 Training: 2021-03-18 12:29:32,561-[cfp_fp][294000]Accuracy-Highest: 0.98786 Training: 2021-03-18 12:29:56,396-[agedb_30][294000]XNorm: 22.573426 Training: 2021-03-18 12:29:56,397-[agedb_30][294000]Accuracy-Flip: 0.98233+-0.00655 Training: 2021-03-18 12:29:56,397-[agedb_30][294000]Accuracy-Highest: 0.98333 Training: 2021-03-18 12:30:08,253-Speed 572.23 samples/sec Loss 0.8355 Epoch: 17 Global Step: 294050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:30:20,946-Speed 4034.16 samples/sec Loss 0.8374 Epoch: 17 Global Step: 294100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:30:32,848-Speed 4301.72 samples/sec Loss 0.8408 Epoch: 17 Global Step: 294150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:30:44,725-Speed 4311.37 samples/sec Loss 0.8508 Epoch: 17 Global Step: 294200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:30:56,573-Speed 4321.55 samples/sec Loss 0.8608 Epoch: 17 Global Step: 294250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:31:08,460-Speed 4307.47 samples/sec Loss 0.8340 Epoch: 17 Global Step: 294300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:31:20,691-Speed 4186.37 samples/sec Loss 0.8306 Epoch: 17 Global Step: 294350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:31:35,582-Speed 3438.53 samples/sec Loss 0.8675 Epoch: 17 Global Step: 294400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:31:47,183-Speed 4413.88 samples/sec Loss 0.8299 Epoch: 17 Global Step: 294450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:31:58,881-Speed 4377.07 samples/sec Loss 0.8452 Epoch: 17 Global Step: 
294500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:32:10,708-Speed 4329.41 samples/sec Loss 0.8523 Epoch: 17 Global Step: 294550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:32:22,285-Speed 4422.89 samples/sec Loss 0.8290 Epoch: 17 Global Step: 294600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:32:34,095-Speed 4335.53 samples/sec Loss 0.8290 Epoch: 17 Global Step: 294650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:32:45,792-Speed 4377.24 samples/sec Loss 0.8325 Epoch: 17 Global Step: 294700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:32:57,359-Speed 4426.64 samples/sec Loss 0.8420 Epoch: 17 Global Step: 294750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:33:09,251-Speed 4305.46 samples/sec Loss 0.8559 Epoch: 17 Global Step: 294800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:33:20,956-Speed 4374.83 samples/sec Loss 0.8346 Epoch: 17 Global Step: 294850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:33:32,790-Speed 4326.52 samples/sec Loss 0.8424 Epoch: 17 Global Step: 294900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:33:44,414-Speed 4404.93 samples/sec Loss 0.8454 Epoch: 17 Global Step: 294950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:33:56,044-Speed 4402.97 samples/sec Loss 0.8287 Epoch: 17 Global Step: 295000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:34:07,919-Speed 4311.69 samples/sec Loss 0.8396 Epoch: 17 Global Step: 295050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:34:19,446-Speed 4442.42 samples/sec Loss 0.8460 Epoch: 17 Global Step: 295100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:34:30,822-Speed 4500.64 samples/sec Loss 0.8246 Epoch: 17 Global Step: 295150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:34:42,370-Speed 4434.05 samples/sec Loss 0.8427 Epoch: 17 Global 
Step: 295200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:34:53,698-Speed 4520.14 samples/sec Loss 0.8362 Epoch: 17 Global Step: 295250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:35:05,409-Speed 4372.31 samples/sec Loss 0.8378 Epoch: 17 Global Step: 295300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:35:17,033-Speed 4404.91 samples/sec Loss 0.8480 Epoch: 17 Global Step: 295350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:35:28,195-Speed 4587.05 samples/sec Loss 0.8383 Epoch: 17 Global Step: 295400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:35:39,599-Speed 4490.18 samples/sec Loss 0.8417 Epoch: 17 Global Step: 295450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:35:51,239-Speed 4398.98 samples/sec Loss 0.8303 Epoch: 17 Global Step: 295500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:36:02,668-Speed 4479.92 samples/sec Loss 0.8303 Epoch: 17 Global Step: 295550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:36:13,854-Speed 4577.62 samples/sec Loss 0.8313 Epoch: 17 Global Step: 295600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:36:25,643-Speed 4343.04 samples/sec Loss 0.8412 Epoch: 17 Global Step: 295650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:36:36,876-Speed 4558.36 samples/sec Loss 0.8433 Epoch: 17 Global Step: 295700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:36:48,274-Speed 4492.36 samples/sec Loss 0.8435 Epoch: 17 Global Step: 295750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:36:59,690-Speed 4485.04 samples/sec Loss 0.8311 Epoch: 17 Global Step: 295800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:37:11,928-Speed 4184.03 samples/sec Loss 0.8381 Epoch: 17 Global Step: 295850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:37:23,275-Speed 4512.56 samples/sec Loss 0.8380 Epoch: 17 
Global Step: 295900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:37:35,447-Speed 4206.41 samples/sec Loss 0.8352 Epoch: 17 Global Step: 295950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:37:46,598-Speed 4591.85 samples/sec Loss 0.8329 Epoch: 17 Global Step: 296000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:38:10,876-[lfw][296000]XNorm: 22.521994 Training: 2021-03-18 12:38:10,876-[lfw][296000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 12:38:10,876-[lfw][296000]Accuracy-Highest: 0.99817 Training: 2021-03-18 12:38:38,586-[cfp_fp][296000]XNorm: 20.963166 Training: 2021-03-18 12:38:38,586-[cfp_fp][296000]Accuracy-Flip: 0.98571+-0.00491 Training: 2021-03-18 12:38:38,586-[cfp_fp][296000]Accuracy-Highest: 0.98786 Training: 2021-03-18 12:39:03,291-[agedb_30][296000]XNorm: 22.657544 Training: 2021-03-18 12:39:03,291-[agedb_30][296000]Accuracy-Flip: 0.98250+-0.00684 Training: 2021-03-18 12:39:03,291-[agedb_30][296000]Accuracy-Highest: 0.98333 Training: 2021-03-18 12:39:14,350-Speed 583.46 samples/sec Loss 0.8524 Epoch: 17 Global Step: 296050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:39:25,651-Speed 4531.02 samples/sec Loss 0.8280 Epoch: 17 Global Step: 296100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:39:37,947-Speed 4164.26 samples/sec Loss 0.8446 Epoch: 17 Global Step: 296150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:39:48,880-Speed 4683.59 samples/sec Loss 0.8181 Epoch: 17 Global Step: 296200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:40:00,190-Speed 4527.36 samples/sec Loss 0.8275 Epoch: 17 Global Step: 296250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:40:11,456-Speed 4544.67 samples/sec Loss 0.8619 Epoch: 17 Global Step: 296300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:40:22,627-Speed 4583.83 samples/sec Loss 0.8445 Epoch: 17 Global Step: 296350 Fp16 Grad Scale: 
16384 Required: 3 hours Training: 2021-03-18 12:40:33,656-Speed 4642.74 samples/sec Loss 0.8362 Epoch: 17 Global Step: 296400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:40:44,876-Speed 4563.64 samples/sec Loss 0.8296 Epoch: 17 Global Step: 296450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:40:56,108-Speed 4558.37 samples/sec Loss 0.8488 Epoch: 17 Global Step: 296500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:41:07,167-Speed 4630.08 samples/sec Loss 0.8345 Epoch: 17 Global Step: 296550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:41:18,291-Speed 4602.79 samples/sec Loss 0.8343 Epoch: 17 Global Step: 296600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:41:29,329-Speed 4639.03 samples/sec Loss 0.8477 Epoch: 17 Global Step: 296650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:41:40,316-Speed 4660.49 samples/sec Loss 0.8493 Epoch: 17 Global Step: 296700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:41:52,233-Speed 4296.48 samples/sec Loss 0.8607 Epoch: 17 Global Step: 296750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:42:03,507-Speed 4541.89 samples/sec Loss 0.8492 Epoch: 17 Global Step: 296800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:42:14,793-Speed 4537.23 samples/sec Loss 0.8388 Epoch: 17 Global Step: 296850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:42:27,129-Speed 4150.59 samples/sec Loss 0.8476 Epoch: 17 Global Step: 296900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:42:38,551-Speed 4482.73 samples/sec Loss 0.8474 Epoch: 17 Global Step: 296950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:42:49,817-Speed 4544.98 samples/sec Loss 0.8365 Epoch: 17 Global Step: 297000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:43:01,913-Speed 4232.85 samples/sec Loss 0.8409 Epoch: 17 Global Step: 297050 Fp16 Grad 
Scale: 16384 Required: 3 hours Training: 2021-03-18 12:43:13,226-Speed 4526.20 samples/sec Loss 0.8464 Epoch: 17 Global Step: 297100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:43:25,395-Speed 4207.61 samples/sec Loss 0.8356 Epoch: 17 Global Step: 297150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:43:36,547-Speed 4591.39 samples/sec Loss 0.8520 Epoch: 17 Global Step: 297200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:43:48,499-Speed 4284.12 samples/sec Loss 0.8458 Epoch: 17 Global Step: 297250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:43:59,644-Speed 4594.15 samples/sec Loss 0.8452 Epoch: 17 Global Step: 297300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:44:10,784-Speed 4596.24 samples/sec Loss 0.8410 Epoch: 17 Global Step: 297350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:44:22,030-Speed 4553.29 samples/sec Loss 0.8498 Epoch: 17 Global Step: 297400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:44:33,176-Speed 4593.52 samples/sec Loss 0.8374 Epoch: 17 Global Step: 297450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:44:44,177-Speed 4654.43 samples/sec Loss 0.8404 Epoch: 17 Global Step: 297500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:44:55,530-Speed 4509.97 samples/sec Loss 0.8454 Epoch: 17 Global Step: 297550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:45:07,053-Speed 4443.88 samples/sec Loss 0.8350 Epoch: 17 Global Step: 297600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:45:17,937-Speed 4704.50 samples/sec Loss 0.8560 Epoch: 17 Global Step: 297650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:45:29,406-Speed 4464.31 samples/sec Loss 0.8538 Epoch: 17 Global Step: 297700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:45:40,370-Speed 4670.24 samples/sec Loss 0.8200 Epoch: 17 Global Step: 297750 Fp16 
Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:45:51,525-Speed 4590.25 samples/sec Loss 0.8623 Epoch: 17 Global Step: 297800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:46:02,707-Speed 4578.78 samples/sec Loss 0.8414 Epoch: 17 Global Step: 297850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:46:13,961-Speed 4549.91 samples/sec Loss 0.8490 Epoch: 17 Global Step: 297900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:46:25,435-Speed 4462.72 samples/sec Loss 0.8327 Epoch: 17 Global Step: 297950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:46:36,455-Speed 4646.17 samples/sec Loss 0.8661 Epoch: 17 Global Step: 298000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:47:00,779-[lfw][298000]XNorm: 22.455888 Training: 2021-03-18 12:47:00,779-[lfw][298000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 12:47:00,779-[lfw][298000]Accuracy-Highest: 0.99817 Training: 2021-03-18 12:47:28,336-[cfp_fp][298000]XNorm: 20.897781 Training: 2021-03-18 12:47:28,336-[cfp_fp][298000]Accuracy-Flip: 0.98657+-0.00487 Training: 2021-03-18 12:47:28,336-[cfp_fp][298000]Accuracy-Highest: 0.98786 Training: 2021-03-18 12:47:52,191-[agedb_30][298000]XNorm: 22.600898 Training: 2021-03-18 12:47:52,192-[agedb_30][298000]Accuracy-Flip: 0.98283+-0.00606 Training: 2021-03-18 12:47:52,192-[agedb_30][298000]Accuracy-Highest: 0.98333 Training: 2021-03-18 12:48:03,489-Speed 588.28 samples/sec Loss 0.8404 Epoch: 17 Global Step: 298050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:48:14,806-Speed 4524.33 samples/sec Loss 0.8374 Epoch: 17 Global Step: 298100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:48:26,045-Speed 4555.95 samples/sec Loss 0.8329 Epoch: 17 Global Step: 298150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:48:37,293-Speed 4552.50 samples/sec Loss 0.8333 Epoch: 17 Global Step: 298200 Fp16 Grad Scale: 16384 Required: 3 hours 
Training: 2021-03-18 12:48:48,785-Speed 4455.49 samples/sec Loss 0.8398 Epoch: 17 Global Step: 298250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:48:59,849-Speed 4627.63 samples/sec Loss 0.8335 Epoch: 17 Global Step: 298300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:49:10,784-Speed 4682.52 samples/sec Loss 0.8486 Epoch: 17 Global Step: 298350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:49:22,218-Speed 4478.58 samples/sec Loss 0.8413 Epoch: 17 Global Step: 298400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:49:33,496-Speed 4540.12 samples/sec Loss 0.8436 Epoch: 17 Global Step: 298450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:49:44,762-Speed 4544.92 samples/sec Loss 0.8326 Epoch: 17 Global Step: 298500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:49:56,153-Speed 4495.00 samples/sec Loss 0.8449 Epoch: 17 Global Step: 298550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:50:07,213-Speed 4629.45 samples/sec Loss 0.8412 Epoch: 17 Global Step: 298600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:50:18,396-Speed 4578.59 samples/sec Loss 0.8515 Epoch: 17 Global Step: 298650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:50:30,756-Speed 4142.80 samples/sec Loss 0.8380 Epoch: 17 Global Step: 298700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:50:42,668-Speed 4298.40 samples/sec Loss 0.8313 Epoch: 17 Global Step: 298750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:50:53,983-Speed 4525.18 samples/sec Loss 0.8629 Epoch: 17 Global Step: 298800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:51:05,154-Speed 4583.78 samples/sec Loss 0.8405 Epoch: 17 Global Step: 298850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:51:16,637-Speed 4459.19 samples/sec Loss 0.8409 Epoch: 17 Global Step: 298900 Fp16 Grad Scale: 16384 Required: 3 
hours Training: 2021-03-18 12:51:28,027-Speed 4495.28 samples/sec Loss 0.8407 Epoch: 17 Global Step: 298950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:51:39,732-Speed 4374.47 samples/sec Loss 0.8385 Epoch: 17 Global Step: 299000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:51:51,894-Speed 4210.39 samples/sec Loss 0.8437 Epoch: 17 Global Step: 299050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:52:03,077-Speed 4578.66 samples/sec Loss 0.8268 Epoch: 17 Global Step: 299100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 12:52:14,088-Speed 4650.05 samples/sec Loss 0.8417 Epoch: 17 Global Step: 299150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:52:25,740-Speed 4394.76 samples/sec Loss 0.8386 Epoch: 17 Global Step: 299200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:52:36,794-Speed 4632.19 samples/sec Loss 0.8409 Epoch: 17 Global Step: 299250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:52:47,755-Speed 4671.29 samples/sec Loss 0.8301 Epoch: 17 Global Step: 299300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:52:58,947-Speed 4575.07 samples/sec Loss 0.8407 Epoch: 17 Global Step: 299350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:53:10,306-Speed 4507.60 samples/sec Loss 0.8378 Epoch: 17 Global Step: 299400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:53:21,342-Speed 4639.75 samples/sec Loss 0.8487 Epoch: 17 Global Step: 299450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:53:32,390-Speed 4634.63 samples/sec Loss 0.8504 Epoch: 17 Global Step: 299500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:53:43,564-Speed 4582.39 samples/sec Loss 0.8563 Epoch: 17 Global Step: 299550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:53:55,278-Speed 4371.08 samples/sec Loss 0.8305 Epoch: 17 Global Step: 299600 Fp16 Grad Scale: 16384 Required: 
2 hours Training: 2021-03-18 12:54:07,370-Speed 4234.35 samples/sec Loss 0.8469 Epoch: 17 Global Step: 299650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:54:18,718-Speed 4512.30 samples/sec Loss 0.8475 Epoch: 17 Global Step: 299700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:54:29,790-Speed 4624.24 samples/sec Loss 0.8398 Epoch: 17 Global Step: 299750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:54:41,022-Speed 4559.02 samples/sec Loss 0.8323 Epoch: 17 Global Step: 299800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:54:52,416-Speed 4493.79 samples/sec Loss 0.8407 Epoch: 17 Global Step: 299850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:55:04,457-Speed 4252.26 samples/sec Loss 0.8456 Epoch: 17 Global Step: 299900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:55:15,458-Speed 4654.72 samples/sec Loss 0.8361 Epoch: 17 Global Step: 299950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:55:27,654-Speed 4198.08 samples/sec Loss 0.8282 Epoch: 17 Global Step: 300000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:55:51,484-[lfw][300000]XNorm: 22.473893 Training: 2021-03-18 12:55:51,484-[lfw][300000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 12:55:51,484-[lfw][300000]Accuracy-Highest: 0.99817 Training: 2021-03-18 12:56:18,925-[cfp_fp][300000]XNorm: 20.944630 Training: 2021-03-18 12:56:18,925-[cfp_fp][300000]Accuracy-Flip: 0.98700+-0.00476 Training: 2021-03-18 12:56:18,925-[cfp_fp][300000]Accuracy-Highest: 0.98786 Training: 2021-03-18 12:56:43,202-[agedb_30][300000]XNorm: 22.609070 Training: 2021-03-18 12:56:43,203-[agedb_30][300000]Accuracy-Flip: 0.98150+-0.00612 Training: 2021-03-18 12:56:43,203-[agedb_30][300000]Accuracy-Highest: 0.98333 Training: 2021-03-18 12:56:54,699-Speed 588.21 samples/sec Loss 0.8451 Epoch: 17 Global Step: 300050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
12:57:06,798-Speed 4231.96 samples/sec Loss 0.8708 Epoch: 17 Global Step: 300100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:57:18,479-Speed 4383.61 samples/sec Loss 0.8465 Epoch: 17 Global Step: 300150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:57:29,498-Speed 4646.75 samples/sec Loss 0.8410 Epoch: 17 Global Step: 300200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:57:40,717-Speed 4563.99 samples/sec Loss 0.8518 Epoch: 17 Global Step: 300250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:57:51,777-Speed 4629.76 samples/sec Loss 0.8549 Epoch: 17 Global Step: 300300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:58:02,954-Speed 4581.15 samples/sec Loss 0.8530 Epoch: 17 Global Step: 300350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:58:13,820-Speed 4711.86 samples/sec Loss 0.8192 Epoch: 17 Global Step: 300400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:58:39,807-Speed 1970.29 samples/sec Loss 0.8507 Epoch: 18 Global Step: 300450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:58:51,418-Speed 4410.03 samples/sec Loss 0.8361 Epoch: 18 Global Step: 300500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:59:02,608-Speed 4575.69 samples/sec Loss 0.8645 Epoch: 18 Global Step: 300550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:59:14,098-Speed 4456.35 samples/sec Loss 0.8420 Epoch: 18 Global Step: 300600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:59:25,358-Speed 4547.59 samples/sec Loss 0.8431 Epoch: 18 Global Step: 300650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:59:36,444-Speed 4618.87 samples/sec Loss 0.8282 Epoch: 18 Global Step: 300700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 12:59:47,459-Speed 4648.16 samples/sec Loss 0.8375 Epoch: 18 Global Step: 300750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-18 12:59:58,661-Speed 4571.19 samples/sec Loss 0.8411 Epoch: 18 Global Step: 300800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:00:09,669-Speed 4651.36 samples/sec Loss 0.8461 Epoch: 18 Global Step: 300850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:00:20,725-Speed 4631.10 samples/sec Loss 0.8446 Epoch: 18 Global Step: 300900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:00:31,660-Speed 4682.50 samples/sec Loss 0.8503 Epoch: 18 Global Step: 300950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:00:42,555-Speed 4700.04 samples/sec Loss 0.8285 Epoch: 18 Global Step: 301000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:00:53,645-Speed 4616.94 samples/sec Loss 0.8382 Epoch: 18 Global Step: 301050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:01:04,632-Speed 4660.33 samples/sec Loss 0.8427 Epoch: 18 Global Step: 301100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:01:15,497-Speed 4712.74 samples/sec Loss 0.8402 Epoch: 18 Global Step: 301150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:01:26,460-Speed 4670.17 samples/sec Loss 0.8423 Epoch: 18 Global Step: 301200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:01:37,417-Speed 4673.02 samples/sec Loss 0.8389 Epoch: 18 Global Step: 301250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:01:48,397-Speed 4663.36 samples/sec Loss 0.8421 Epoch: 18 Global Step: 301300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:01:59,251-Speed 4717.74 samples/sec Loss 0.8460 Epoch: 18 Global Step: 301350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:02:10,145-Speed 4700.20 samples/sec Loss 0.8384 Epoch: 18 Global Step: 301400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:02:20,997-Speed 4718.30 samples/sec Loss 0.8484 Epoch: 18 Global Step: 301450 Fp16 Grad Scale: 16384 Required: 2 hours 
Training: 2021-03-18 13:02:33,100-Speed 4230.59 samples/sec Loss 0.8343 Epoch: 18 Global Step: 301500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:02:44,087-Speed 4660.57 samples/sec Loss 0.8384 Epoch: 18 Global Step: 301550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:02:55,098-Speed 4650.03 samples/sec Loss 0.8375 Epoch: 18 Global Step: 301600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:03:06,974-Speed 4311.34 samples/sec Loss 0.8480 Epoch: 18 Global Step: 301650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:03:17,767-Speed 4744.06 samples/sec Loss 0.8327 Epoch: 18 Global Step: 301700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:03:28,869-Speed 4612.27 samples/sec Loss 0.8327 Epoch: 18 Global Step: 301750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:03:39,825-Speed 4673.35 samples/sec Loss 0.8480 Epoch: 18 Global Step: 301800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:03:50,878-Speed 4632.79 samples/sec Loss 0.8531 Epoch: 18 Global Step: 301850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:04:01,850-Speed 4666.71 samples/sec Loss 0.8350 Epoch: 18 Global Step: 301900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:04:12,658-Speed 4737.40 samples/sec Loss 0.8403 Epoch: 18 Global Step: 301950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:04:24,452-Speed 4341.45 samples/sec Loss 0.8373 Epoch: 18 Global Step: 302000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:04:48,352-[lfw][302000]XNorm: 22.434025 Training: 2021-03-18 13:04:48,353-[lfw][302000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 13:04:48,353-[lfw][302000]Accuracy-Highest: 0.99817 Training: 2021-03-18 13:05:15,918-[cfp_fp][302000]XNorm: 20.915834 Training: 2021-03-18 13:05:15,918-[cfp_fp][302000]Accuracy-Flip: 0.98614+-0.00470 Training: 2021-03-18 
13:05:15,918-[cfp_fp][302000]Accuracy-Highest: 0.98786 Training: 2021-03-18 13:05:40,178-[agedb_30][302000]XNorm: 22.609732 Training: 2021-03-18 13:05:40,178-[agedb_30][302000]Accuracy-Flip: 0.98217+-0.00671 Training: 2021-03-18 13:05:40,178-[agedb_30][302000]Accuracy-Highest: 0.98333 Training: 2021-03-18 13:05:51,027-Speed 591.40 samples/sec Loss 0.8414 Epoch: 18 Global Step: 302050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:06:02,269-Speed 4554.53 samples/sec Loss 0.8414 Epoch: 18 Global Step: 302100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:06:12,950-Speed 4793.74 samples/sec Loss 0.8335 Epoch: 18 Global Step: 302150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:06:24,111-Speed 4587.63 samples/sec Loss 0.8460 Epoch: 18 Global Step: 302200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:06:35,145-Speed 4640.78 samples/sec Loss 0.8219 Epoch: 18 Global Step: 302250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:06:45,932-Speed 4746.31 samples/sec Loss 0.8463 Epoch: 18 Global Step: 302300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:06:56,841-Speed 4693.87 samples/sec Loss 0.8414 Epoch: 18 Global Step: 302350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:07:07,830-Speed 4659.27 samples/sec Loss 0.8309 Epoch: 18 Global Step: 302400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:07:18,693-Speed 4713.63 samples/sec Loss 0.8543 Epoch: 18 Global Step: 302450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:07:29,423-Speed 4772.25 samples/sec Loss 0.8501 Epoch: 18 Global Step: 302500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:07:42,770-Speed 3836.19 samples/sec Loss 0.8365 Epoch: 18 Global Step: 302550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:07:53,697-Speed 4685.60 samples/sec Loss 0.8334 Epoch: 18 Global Step: 302600 Fp16 Grad Scale: 16384 Required: 2 
hours Training: 2021-03-18 13:08:04,960-Speed 4546.47 samples/sec Loss 0.8376 Epoch: 18 Global Step: 302650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:08:16,019-Speed 4629.97 samples/sec Loss 0.8326 Epoch: 18 Global Step: 302700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:08:27,025-Speed 4652.39 samples/sec Loss 0.8314 Epoch: 18 Global Step: 302750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:08:38,886-Speed 4316.74 samples/sec Loss 0.8384 Epoch: 18 Global Step: 302800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:08:50,936-Speed 4249.14 samples/sec Loss 0.8415 Epoch: 18 Global Step: 302850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:09:01,874-Speed 4681.33 samples/sec Loss 0.8382 Epoch: 18 Global Step: 302900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:09:12,906-Speed 4641.31 samples/sec Loss 0.8366 Epoch: 18 Global Step: 302950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:09:24,057-Speed 4591.92 samples/sec Loss 0.8227 Epoch: 18 Global Step: 303000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:09:35,770-Speed 4371.56 samples/sec Loss 0.8299 Epoch: 18 Global Step: 303050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:09:46,814-Speed 4636.54 samples/sec Loss 0.8378 Epoch: 18 Global Step: 303100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:09:57,743-Speed 4685.00 samples/sec Loss 0.8299 Epoch: 18 Global Step: 303150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:10:08,609-Speed 4712.33 samples/sec Loss 0.8232 Epoch: 18 Global Step: 303200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:10:19,339-Speed 4771.86 samples/sec Loss 0.8265 Epoch: 18 Global Step: 303250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:10:30,259-Speed 4688.97 samples/sec Loss 0.8553 Epoch: 18 Global Step: 303300 Fp16 Grad Scale: 16384 Required: 
2 hours Training: 2021-03-18 13:10:41,232-Speed 4666.17 samples/sec Loss 0.8309 Epoch: 18 Global Step: 303350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:10:52,327-Speed 4615.10 samples/sec Loss 0.8293 Epoch: 18 Global Step: 303400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:11:03,102-Speed 4751.94 samples/sec Loss 0.8403 Epoch: 18 Global Step: 303450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:11:13,905-Speed 4739.64 samples/sec Loss 0.8419 Epoch: 18 Global Step: 303500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:11:25,020-Speed 4606.82 samples/sec Loss 0.8366 Epoch: 18 Global Step: 303550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:11:35,893-Speed 4709.25 samples/sec Loss 0.8436 Epoch: 18 Global Step: 303600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:11:46,728-Speed 4725.64 samples/sec Loss 0.8199 Epoch: 18 Global Step: 303650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:11:57,753-Speed 4644.47 samples/sec Loss 0.8483 Epoch: 18 Global Step: 303700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:12:08,574-Speed 4731.70 samples/sec Loss 0.8594 Epoch: 18 Global Step: 303750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:12:19,318-Speed 4765.42 samples/sec Loss 0.8482 Epoch: 18 Global Step: 303800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:12:30,243-Speed 4686.89 samples/sec Loss 0.8409 Epoch: 18 Global Step: 303850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:12:41,186-Speed 4679.28 samples/sec Loss 0.8397 Epoch: 18 Global Step: 303900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:12:52,351-Speed 4585.93 samples/sec Loss 0.8314 Epoch: 18 Global Step: 303950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:13:03,296-Speed 4678.59 samples/sec Loss 0.8355 Epoch: 18 Global Step: 304000 Fp16 Grad Scale: 16384 
Required: 2 hours Training: 2021-03-18 13:13:27,527-[lfw][304000]XNorm: 22.484119 Training: 2021-03-18 13:13:27,527-[lfw][304000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 13:13:27,527-[lfw][304000]Accuracy-Highest: 0.99817 Training: 2021-03-18 13:13:55,124-[cfp_fp][304000]XNorm: 20.925044 Training: 2021-03-18 13:13:55,124-[cfp_fp][304000]Accuracy-Flip: 0.98729+-0.00454 Training: 2021-03-18 13:13:55,124-[cfp_fp][304000]Accuracy-Highest: 0.98786 Training: 2021-03-18 13:14:18,876-[agedb_30][304000]XNorm: 22.628365 Training: 2021-03-18 13:14:18,876-[agedb_30][304000]Accuracy-Flip: 0.98233+-0.00624 Training: 2021-03-18 13:14:18,876-[agedb_30][304000]Accuracy-Highest: 0.98333 Training: 2021-03-18 13:14:29,798-Speed 591.90 samples/sec Loss 0.8305 Epoch: 18 Global Step: 304050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:14:40,814-Speed 4647.80 samples/sec Loss 0.8322 Epoch: 18 Global Step: 304100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:14:51,682-Speed 4711.34 samples/sec Loss 0.8402 Epoch: 18 Global Step: 304150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:15:02,880-Speed 4572.91 samples/sec Loss 0.8361 Epoch: 18 Global Step: 304200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:15:13,869-Speed 4659.45 samples/sec Loss 0.8275 Epoch: 18 Global Step: 304250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:15:25,721-Speed 4320.14 samples/sec Loss 0.8433 Epoch: 18 Global Step: 304300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:15:36,613-Speed 4701.34 samples/sec Loss 0.8383 Epoch: 18 Global Step: 304350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:15:47,711-Speed 4613.56 samples/sec Loss 0.8377 Epoch: 18 Global Step: 304400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:15:58,573-Speed 4713.85 samples/sec Loss 0.8420 Epoch: 18 Global Step: 304450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
13:16:09,471-Speed 4698.55 samples/sec Loss 0.8411 Epoch: 18 Global Step: 304500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:16:20,180-Speed 4781.38 samples/sec Loss 0.8470 Epoch: 18 Global Step: 304550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:16:30,973-Speed 4744.16 samples/sec Loss 0.8584 Epoch: 18 Global Step: 304600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:16:42,880-Speed 4299.99 samples/sec Loss 0.8377 Epoch: 18 Global Step: 304650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:16:53,895-Speed 4648.84 samples/sec Loss 0.8426 Epoch: 18 Global Step: 304700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:17:05,027-Speed 4599.64 samples/sec Loss 0.8370 Epoch: 18 Global Step: 304750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:17:15,834-Speed 4738.08 samples/sec Loss 0.8491 Epoch: 18 Global Step: 304800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:17:26,750-Speed 4690.57 samples/sec Loss 0.8532 Epoch: 18 Global Step: 304850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:17:37,490-Speed 4767.46 samples/sec Loss 0.8532 Epoch: 18 Global Step: 304900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:17:49,173-Speed 4382.93 samples/sec Loss 0.8322 Epoch: 18 Global Step: 304950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:17:59,846-Speed 4797.30 samples/sec Loss 0.8493 Epoch: 18 Global Step: 305000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:18:10,831-Speed 4661.36 samples/sec Loss 0.8340 Epoch: 18 Global Step: 305050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:18:21,797-Speed 4669.38 samples/sec Loss 0.8442 Epoch: 18 Global Step: 305100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:18:32,910-Speed 4607.49 samples/sec Loss 0.8463 Epoch: 18 Global Step: 305150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-18 13:18:43,919-Speed 4650.85 samples/sec Loss 0.8417 Epoch: 18 Global Step: 305200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:18:54,921-Speed 4653.99 samples/sec Loss 0.8346 Epoch: 18 Global Step: 305250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:19:05,899-Speed 4664.18 samples/sec Loss 0.8322 Epoch: 18 Global Step: 305300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:19:16,721-Speed 4731.33 samples/sec Loss 0.8103 Epoch: 18 Global Step: 305350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:19:28,555-Speed 4326.61 samples/sec Loss 0.8394 Epoch: 18 Global Step: 305400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:19:39,754-Speed 4572.05 samples/sec Loss 0.8330 Epoch: 18 Global Step: 305450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:19:51,837-Speed 4237.57 samples/sec Loss 0.8429 Epoch: 18 Global Step: 305500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:20:03,184-Speed 4512.24 samples/sec Loss 0.8400 Epoch: 18 Global Step: 305550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:20:14,239-Speed 4631.75 samples/sec Loss 0.8492 Epoch: 18 Global Step: 305600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:20:25,261-Speed 4645.37 samples/sec Loss 0.8407 Epoch: 18 Global Step: 305650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:20:36,868-Speed 4411.16 samples/sec Loss 0.8332 Epoch: 18 Global Step: 305700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:20:48,023-Speed 4590.01 samples/sec Loss 0.8459 Epoch: 18 Global Step: 305750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:20:59,742-Speed 4369.31 samples/sec Loss 0.8575 Epoch: 18 Global Step: 305800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:21:11,446-Speed 4374.66 samples/sec Loss 0.8331 Epoch: 18 Global Step: 305850 Fp16 Grad Scale: 16384 Required: 2 hours 
Training: 2021-03-18 13:21:22,489-Speed 4636.62 samples/sec Loss 0.8348 Epoch: 18 Global Step: 305900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:21:33,322-Speed 4726.74 samples/sec Loss 0.8432 Epoch: 18 Global Step: 305950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:21:44,142-Speed 4732.38 samples/sec Loss 0.8342 Epoch: 18 Global Step: 306000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:22:07,905-[lfw][306000]XNorm: 22.449848 Training: 2021-03-18 13:22:07,905-[lfw][306000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 13:22:07,905-[lfw][306000]Accuracy-Highest: 0.99817 Training: 2021-03-18 13:22:35,362-[cfp_fp][306000]XNorm: 20.888537 Training: 2021-03-18 13:22:35,363-[cfp_fp][306000]Accuracy-Flip: 0.98629+-0.00405 Training: 2021-03-18 13:22:35,363-[cfp_fp][306000]Accuracy-Highest: 0.98786 Training: 2021-03-18 13:22:59,155-[agedb_30][306000]XNorm: 22.572944 Training: 2021-03-18 13:22:59,155-[agedb_30][306000]Accuracy-Flip: 0.98317+-0.00673 Training: 2021-03-18 13:22:59,156-[agedb_30][306000]Accuracy-Highest: 0.98333 Training: 2021-03-18 13:23:10,219-Speed 594.82 samples/sec Loss 0.8477 Epoch: 18 Global Step: 306050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:23:21,216-Speed 4656.03 samples/sec Loss 0.8406 Epoch: 18 Global Step: 306100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:23:32,203-Speed 4660.30 samples/sec Loss 0.8389 Epoch: 18 Global Step: 306150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:23:43,047-Speed 4722.16 samples/sec Loss 0.8408 Epoch: 18 Global Step: 306200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:23:53,891-Speed 4721.59 samples/sec Loss 0.8351 Epoch: 18 Global Step: 306250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:24:05,006-Speed 4606.85 samples/sec Loss 0.8413 Epoch: 18 Global Step: 306300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:24:15,932-Speed 
4686.26 samples/sec Loss 0.8345 Epoch: 18 Global Step: 306350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:24:26,809-Speed 4707.34 samples/sec Loss 0.8525 Epoch: 18 Global Step: 306400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:24:37,752-Speed 4678.77 samples/sec Loss 0.8439 Epoch: 18 Global Step: 306450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:24:48,546-Speed 4743.79 samples/sec Loss 0.8308 Epoch: 18 Global Step: 306500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:24:59,763-Speed 4565.09 samples/sec Loss 0.8303 Epoch: 18 Global Step: 306550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:25:10,650-Speed 4702.86 samples/sec Loss 0.8287 Epoch: 18 Global Step: 306600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:25:21,590-Speed 4680.63 samples/sec Loss 0.8443 Epoch: 18 Global Step: 306650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:25:32,539-Speed 4676.58 samples/sec Loss 0.8264 Epoch: 18 Global Step: 306700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:25:43,583-Speed 4636.19 samples/sec Loss 0.8414 Epoch: 18 Global Step: 306750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:25:54,525-Speed 4679.25 samples/sec Loss 0.8429 Epoch: 18 Global Step: 306800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:26:05,413-Speed 4703.00 samples/sec Loss 0.8108 Epoch: 18 Global Step: 306850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:26:16,228-Speed 4734.36 samples/sec Loss 0.8597 Epoch: 18 Global Step: 306900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:26:27,182-Speed 4674.36 samples/sec Loss 0.8471 Epoch: 18 Global Step: 306950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:26:38,205-Speed 4645.15 samples/sec Loss 0.8497 Epoch: 18 Global Step: 307000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
13:26:49,128-Speed 4687.63 samples/sec Loss 0.8467 Epoch: 18 Global Step: 307050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:26:59,714-Speed 4837.21 samples/sec Loss 0.8411 Epoch: 18 Global Step: 307100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:27:10,630-Speed 4690.41 samples/sec Loss 0.8333 Epoch: 18 Global Step: 307150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:27:22,039-Speed 4487.82 samples/sec Loss 0.8433 Epoch: 18 Global Step: 307200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:27:33,102-Speed 4628.51 samples/sec Loss 0.8418 Epoch: 18 Global Step: 307250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:27:44,047-Speed 4678.24 samples/sec Loss 0.8439 Epoch: 18 Global Step: 307300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:27:54,656-Speed 4826.32 samples/sec Loss 0.8531 Epoch: 18 Global Step: 307350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:28:05,809-Speed 4590.75 samples/sec Loss 0.8350 Epoch: 18 Global Step: 307400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:28:16,707-Speed 4698.57 samples/sec Loss 0.8218 Epoch: 18 Global Step: 307450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:28:27,996-Speed 4535.75 samples/sec Loss 0.8322 Epoch: 18 Global Step: 307500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:28:39,111-Speed 4606.65 samples/sec Loss 0.8421 Epoch: 18 Global Step: 307550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:28:50,801-Speed 4379.90 samples/sec Loss 0.8498 Epoch: 18 Global Step: 307600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:29:01,679-Speed 4707.01 samples/sec Loss 0.8447 Epoch: 18 Global Step: 307650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:29:12,818-Speed 4596.77 samples/sec Loss 0.8453 Epoch: 18 Global Step: 307700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-18 13:29:23,811-Speed 4658.05 samples/sec Loss 0.8367 Epoch: 18 Global Step: 307750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:29:34,818-Speed 4652.02 samples/sec Loss 0.8368 Epoch: 18 Global Step: 307800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:29:45,616-Speed 4742.07 samples/sec Loss 0.8324 Epoch: 18 Global Step: 307850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:29:57,718-Speed 4230.91 samples/sec Loss 0.8428 Epoch: 18 Global Step: 307900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:30:08,863-Speed 4594.17 samples/sec Loss 0.8517 Epoch: 18 Global Step: 307950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:30:19,790-Speed 4685.61 samples/sec Loss 0.8549 Epoch: 18 Global Step: 308000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:30:44,317-[lfw][308000]XNorm: 22.453800 Training: 2021-03-18 13:30:44,317-[lfw][308000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 13:30:44,317-[lfw][308000]Accuracy-Highest: 0.99817 Training: 2021-03-18 13:31:11,832-[cfp_fp][308000]XNorm: 20.898813 Training: 2021-03-18 13:31:11,832-[cfp_fp][308000]Accuracy-Flip: 0.98686+-0.00442 Training: 2021-03-18 13:31:11,832-[cfp_fp][308000]Accuracy-Highest: 0.98786 Training: 2021-03-18 13:31:35,643-[agedb_30][308000]XNorm: 22.602295 Training: 2021-03-18 13:31:35,643-[agedb_30][308000]Accuracy-Flip: 0.98283+-0.00679 Training: 2021-03-18 13:31:35,643-[agedb_30][308000]Accuracy-Highest: 0.98333 Training: 2021-03-18 13:31:46,338-Speed 591.58 samples/sec Loss 0.8385 Epoch: 18 Global Step: 308050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:31:57,178-Speed 4723.60 samples/sec Loss 0.8422 Epoch: 18 Global Step: 308100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:32:08,012-Speed 4726.32 samples/sec Loss 0.8334 Epoch: 18 Global Step: 308150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:32:19,069-Speed 4631.17 
samples/sec Loss 0.8332 Epoch: 18 Global Step: 308200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:32:29,947-Speed 4707.03 samples/sec Loss 0.8379 Epoch: 18 Global Step: 308250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:32:41,757-Speed 4335.41 samples/sec Loss 0.8354 Epoch: 18 Global Step: 308300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:32:53,366-Speed 4410.55 samples/sec Loss 0.8342 Epoch: 18 Global Step: 308350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:33:04,229-Speed 4713.92 samples/sec Loss 0.8383 Epoch: 18 Global Step: 308400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:33:15,289-Speed 4629.38 samples/sec Loss 0.8453 Epoch: 18 Global Step: 308450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:33:25,988-Speed 4785.84 samples/sec Loss 0.8302 Epoch: 18 Global Step: 308500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:33:37,077-Speed 4617.25 samples/sec Loss 0.8328 Epoch: 18 Global Step: 308550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:33:49,512-Speed 4117.92 samples/sec Loss 0.8465 Epoch: 18 Global Step: 308600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:34:01,068-Speed 4430.83 samples/sec Loss 0.8353 Epoch: 18 Global Step: 308650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:34:12,002-Speed 4682.95 samples/sec Loss 0.8387 Epoch: 18 Global Step: 308700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:34:23,256-Speed 4549.65 samples/sec Loss 0.8208 Epoch: 18 Global Step: 308750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:34:34,109-Speed 4717.91 samples/sec Loss 0.8193 Epoch: 18 Global Step: 308800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:34:45,183-Speed 4623.54 samples/sec Loss 0.8222 Epoch: 18 Global Step: 308850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:34:56,263-Speed 
4621.45 samples/sec Loss 0.8433 Epoch: 18 Global Step: 308900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:35:07,215-Speed 4675.22 samples/sec Loss 0.8226 Epoch: 18 Global Step: 308950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:35:18,236-Speed 4646.17 samples/sec Loss 0.8362 Epoch: 18 Global Step: 309000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:35:29,169-Speed 4683.37 samples/sec Loss 0.8485 Epoch: 18 Global Step: 309050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:35:39,959-Speed 4745.21 samples/sec Loss 0.8312 Epoch: 18 Global Step: 309100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:35:50,755-Speed 4742.71 samples/sec Loss 0.8551 Epoch: 18 Global Step: 309150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:36:01,340-Speed 4837.62 samples/sec Loss 0.8292 Epoch: 18 Global Step: 309200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:36:12,230-Speed 4701.70 samples/sec Loss 0.8486 Epoch: 18 Global Step: 309250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:36:23,131-Speed 4697.29 samples/sec Loss 0.8300 Epoch: 18 Global Step: 309300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:36:33,945-Speed 4735.04 samples/sec Loss 0.8550 Epoch: 18 Global Step: 309350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:36:45,068-Speed 4603.04 samples/sec Loss 0.8513 Epoch: 18 Global Step: 309400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:36:56,083-Speed 4648.77 samples/sec Loss 0.8310 Epoch: 18 Global Step: 309450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:37:06,940-Speed 4716.25 samples/sec Loss 0.8316 Epoch: 18 Global Step: 309500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:37:18,036-Speed 4614.10 samples/sec Loss 0.8376 Epoch: 18 Global Step: 309550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
13:37:28,982-Speed 4677.85 samples/sec Loss 0.8365 Epoch: 18 Global Step: 309600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:37:39,908-Speed 4686.22 samples/sec Loss 0.8457 Epoch: 18 Global Step: 309650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:37:50,692-Speed 4748.02 samples/sec Loss 0.8319 Epoch: 18 Global Step: 309700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:38:01,591-Speed 4698.31 samples/sec Loss 0.8404 Epoch: 18 Global Step: 309750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:38:12,410-Speed 4732.85 samples/sec Loss 0.8428 Epoch: 18 Global Step: 309800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:38:23,403-Speed 4657.77 samples/sec Loss 0.8330 Epoch: 18 Global Step: 309850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:38:34,249-Speed 4720.95 samples/sec Loss 0.8304 Epoch: 18 Global Step: 309900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:38:45,396-Speed 4593.40 samples/sec Loss 0.8427 Epoch: 18 Global Step: 309950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:38:57,131-Speed 4363.12 samples/sec Loss 0.8333 Epoch: 18 Global Step: 310000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:39:21,436-[lfw][310000]XNorm: 22.503928 Training: 2021-03-18 13:39:21,437-[lfw][310000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 13:39:21,437-[lfw][310000]Accuracy-Highest: 0.99817 Training: 2021-03-18 13:39:49,129-[cfp_fp][310000]XNorm: 20.949774 Training: 2021-03-18 13:39:49,129-[cfp_fp][310000]Accuracy-Flip: 0.98643+-0.00492 Training: 2021-03-18 13:39:49,129-[cfp_fp][310000]Accuracy-Highest: 0.98786 Training: 2021-03-18 13:40:13,023-[agedb_30][310000]XNorm: 22.637087 Training: 2021-03-18 13:40:13,023-[agedb_30][310000]Accuracy-Flip: 0.98200+-0.00632 Training: 2021-03-18 13:40:13,023-[agedb_30][310000]Accuracy-Highest: 0.98333 Training: 2021-03-18 13:40:23,991-Speed 589.46 samples/sec 
Loss 0.8306 Epoch: 18 Global Step: 310050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:40:35,022-Speed 4641.66 samples/sec Loss 0.8483 Epoch: 18 Global Step: 310100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:40:45,995-Speed 4666.58 samples/sec Loss 0.8467 Epoch: 18 Global Step: 310150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:40:56,791-Speed 4742.72 samples/sec Loss 0.8318 Epoch: 18 Global Step: 310200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:41:07,709-Speed 4689.72 samples/sec Loss 0.8425 Epoch: 18 Global Step: 310250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:41:18,633-Speed 4687.09 samples/sec Loss 0.8511 Epoch: 18 Global Step: 310300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:41:29,480-Speed 4720.14 samples/sec Loss 0.8131 Epoch: 18 Global Step: 310350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:41:40,308-Speed 4728.81 samples/sec Loss 0.8301 Epoch: 18 Global Step: 310400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:41:51,349-Speed 4637.59 samples/sec Loss 0.8361 Epoch: 18 Global Step: 310450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:42:02,145-Speed 4742.71 samples/sec Loss 0.8399 Epoch: 18 Global Step: 310500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:42:14,032-Speed 4307.74 samples/sec Loss 0.8519 Epoch: 18 Global Step: 310550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:42:25,282-Speed 4551.33 samples/sec Loss 0.8556 Epoch: 18 Global Step: 310600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:42:36,214-Speed 4683.57 samples/sec Loss 0.8404 Epoch: 18 Global Step: 310650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:42:47,386-Speed 4583.34 samples/sec Loss 0.8475 Epoch: 18 Global Step: 310700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:42:58,181-Speed 4742.90 
samples/sec Loss 0.8477 Epoch: 18 Global Step: 310750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:43:09,080-Speed 4698.18 samples/sec Loss 0.8509 Epoch: 18 Global Step: 310800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:43:20,007-Speed 4686.11 samples/sec Loss 0.8484 Epoch: 18 Global Step: 310850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:43:31,932-Speed 4293.55 samples/sec Loss 0.8445 Epoch: 18 Global Step: 310900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:43:42,756-Speed 4730.68 samples/sec Loss 0.8491 Epoch: 18 Global Step: 310950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:43:53,731-Speed 4665.27 samples/sec Loss 0.8426 Epoch: 18 Global Step: 311000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:44:04,575-Speed 4721.83 samples/sec Loss 0.8372 Epoch: 18 Global Step: 311050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:44:16,251-Speed 4385.30 samples/sec Loss 0.8592 Epoch: 18 Global Step: 311100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:44:27,854-Speed 4412.76 samples/sec Loss 0.8532 Epoch: 18 Global Step: 311150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:44:38,868-Speed 4648.58 samples/sec Loss 0.8390 Epoch: 18 Global Step: 311200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:44:49,941-Speed 4624.11 samples/sec Loss 0.8370 Epoch: 18 Global Step: 311250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:45:00,667-Speed 4774.05 samples/sec Loss 0.8408 Epoch: 18 Global Step: 311300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:45:11,567-Speed 4697.65 samples/sec Loss 0.8361 Epoch: 18 Global Step: 311350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:45:23,124-Speed 4430.43 samples/sec Loss 0.8438 Epoch: 18 Global Step: 311400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:45:34,131-Speed 
4651.72 samples/sec Loss 0.8448 Epoch: 18 Global Step: 311450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:45:46,537-Speed 4127.24 samples/sec Loss 0.8351 Epoch: 18 Global Step: 311500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:45:57,439-Speed 4696.62 samples/sec Loss 0.8336 Epoch: 18 Global Step: 311550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:46:08,478-Speed 4638.30 samples/sec Loss 0.8426 Epoch: 18 Global Step: 311600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:46:19,321-Speed 4722.52 samples/sec Loss 0.8294 Epoch: 18 Global Step: 311650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:46:30,507-Speed 4577.26 samples/sec Loss 0.8423 Epoch: 18 Global Step: 311700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:46:41,256-Speed 4763.56 samples/sec Loss 0.8370 Epoch: 18 Global Step: 311750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:46:52,265-Speed 4651.35 samples/sec Loss 0.8447 Epoch: 18 Global Step: 311800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:47:03,231-Speed 4669.14 samples/sec Loss 0.8365 Epoch: 18 Global Step: 311850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:47:14,243-Speed 4649.92 samples/sec Loss 0.8330 Epoch: 18 Global Step: 311900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:47:25,020-Speed 4751.23 samples/sec Loss 0.8406 Epoch: 18 Global Step: 311950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:47:35,891-Speed 4710.05 samples/sec Loss 0.8418 Epoch: 18 Global Step: 312000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:48:00,021-[lfw][312000]XNorm: 22.508023 Training: 2021-03-18 13:48:00,021-[lfw][312000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 13:48:00,021-[lfw][312000]Accuracy-Highest: 0.99817 Training: 2021-03-18 13:48:27,698-[cfp_fp][312000]XNorm: 20.919118 Training: 2021-03-18 
13:48:27,698-[cfp_fp][312000]Accuracy-Flip: 0.98700+-0.00463 Training: 2021-03-18 13:48:27,698-[cfp_fp][312000]Accuracy-Highest: 0.98786 Training: 2021-03-18 13:48:51,332-[agedb_30][312000]XNorm: 22.636371 Training: 2021-03-18 13:48:51,332-[agedb_30][312000]Accuracy-Flip: 0.98217+-0.00667 Training: 2021-03-18 13:48:51,332-[agedb_30][312000]Accuracy-Highest: 0.98333 Training: 2021-03-18 13:49:02,182-Speed 593.34 samples/sec Loss 0.8383 Epoch: 18 Global Step: 312050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:49:13,001-Speed 4732.63 samples/sec Loss 0.8314 Epoch: 18 Global Step: 312100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:49:23,535-Speed 4860.85 samples/sec Loss 0.8248 Epoch: 18 Global Step: 312150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:49:34,620-Speed 4619.09 samples/sec Loss 0.8464 Epoch: 18 Global Step: 312200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:49:45,372-Speed 4762.09 samples/sec Loss 0.8285 Epoch: 18 Global Step: 312250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:49:56,200-Speed 4729.05 samples/sec Loss 0.8335 Epoch: 18 Global Step: 312300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:50:06,973-Speed 4752.71 samples/sec Loss 0.8417 Epoch: 18 Global Step: 312350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:50:17,897-Speed 4687.25 samples/sec Loss 0.8433 Epoch: 18 Global Step: 312400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:50:28,903-Speed 4652.54 samples/sec Loss 0.8300 Epoch: 18 Global Step: 312450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:50:39,799-Speed 4699.32 samples/sec Loss 0.8448 Epoch: 18 Global Step: 312500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:50:50,898-Speed 4613.07 samples/sec Loss 0.8590 Epoch: 18 Global Step: 312550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:51:01,791-Speed 4700.59 samples/sec 
Loss 0.8330 Epoch: 18 Global Step: 312600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:51:12,826-Speed 4640.32 samples/sec Loss 0.8316 Epoch: 18 Global Step: 312650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:51:23,684-Speed 4715.61 samples/sec Loss 0.8420 Epoch: 18 Global Step: 312700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:51:34,508-Speed 4730.38 samples/sec Loss 0.8538 Epoch: 18 Global Step: 312750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:51:45,337-Speed 4728.57 samples/sec Loss 0.8388 Epoch: 18 Global Step: 312800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:51:57,187-Speed 4320.82 samples/sec Loss 0.8474 Epoch: 18 Global Step: 312850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:52:08,110-Speed 4687.89 samples/sec Loss 0.8283 Epoch: 18 Global Step: 312900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:52:19,228-Speed 4605.35 samples/sec Loss 0.8402 Epoch: 18 Global Step: 312950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:52:30,394-Speed 4585.53 samples/sec Loss 0.8374 Epoch: 18 Global Step: 313000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 13:52:41,503-Speed 4609.08 samples/sec Loss 0.8364 Epoch: 18 Global Step: 313050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:52:52,571-Speed 4626.02 samples/sec Loss 0.8463 Epoch: 18 Global Step: 313100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:53:03,437-Speed 4712.26 samples/sec Loss 0.8257 Epoch: 18 Global Step: 313150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:53:14,308-Speed 4710.15 samples/sec Loss 0.8276 Epoch: 18 Global Step: 313200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:53:25,395-Speed 4618.23 samples/sec Loss 0.8374 Epoch: 18 Global Step: 313250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:53:36,301-Speed 4694.98 
samples/sec Loss 0.8395 Epoch: 18 Global Step: 313300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:53:47,136-Speed 4725.79 samples/sec Loss 0.8420 Epoch: 18 Global Step: 313350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:53:58,077-Speed 4679.59 samples/sec Loss 0.8352 Epoch: 18 Global Step: 313400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:54:09,067-Speed 4659.26 samples/sec Loss 0.8499 Epoch: 18 Global Step: 313450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:54:20,961-Speed 4304.93 samples/sec Loss 0.8343 Epoch: 18 Global Step: 313500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:54:31,806-Speed 4721.56 samples/sec Loss 0.8196 Epoch: 18 Global Step: 313550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:54:42,879-Speed 4624.12 samples/sec Loss 0.8537 Epoch: 18 Global Step: 313600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:54:54,148-Speed 4543.61 samples/sec Loss 0.8392 Epoch: 18 Global Step: 313650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:55:05,341-Speed 4574.74 samples/sec Loss 0.8399 Epoch: 18 Global Step: 313700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:55:16,254-Speed 4691.69 samples/sec Loss 0.8371 Epoch: 18 Global Step: 313750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:55:27,895-Speed 4398.42 samples/sec Loss 0.8514 Epoch: 18 Global Step: 313800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:55:38,761-Speed 4712.35 samples/sec Loss 0.8503 Epoch: 18 Global Step: 313850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:55:49,725-Speed 4670.20 samples/sec Loss 0.8361 Epoch: 18 Global Step: 313900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:56:00,587-Speed 4714.15 samples/sec Loss 0.8330 Epoch: 18 Global Step: 313950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:56:12,196-Speed 
4410.77 samples/sec Loss 0.8471 Epoch: 18 Global Step: 314000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:56:35,928-[lfw][314000]XNorm: 22.391852 Training: 2021-03-18 13:56:35,929-[lfw][314000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 13:56:35,929-[lfw][314000]Accuracy-Highest: 0.99817 Training: 2021-03-18 13:57:03,390-[cfp_fp][314000]XNorm: 20.887784 Training: 2021-03-18 13:57:03,391-[cfp_fp][314000]Accuracy-Flip: 0.98743+-0.00473 Training: 2021-03-18 13:57:03,391-[cfp_fp][314000]Accuracy-Highest: 0.98786 Training: 2021-03-18 13:57:27,068-[agedb_30][314000]XNorm: 22.541606 Training: 2021-03-18 13:57:27,069-[agedb_30][314000]Accuracy-Flip: 0.98267+-0.00700 Training: 2021-03-18 13:57:27,069-[agedb_30][314000]Accuracy-Highest: 0.98333 Training: 2021-03-18 13:57:38,833-Speed 590.97 samples/sec Loss 0.8350 Epoch: 18 Global Step: 314050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:57:49,789-Speed 4673.23 samples/sec Loss 0.8388 Epoch: 18 Global Step: 314100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:58:00,969-Speed 4579.82 samples/sec Loss 0.8317 Epoch: 18 Global Step: 314150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:58:11,854-Speed 4704.22 samples/sec Loss 0.8468 Epoch: 18 Global Step: 314200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:58:23,023-Speed 4584.32 samples/sec Loss 0.8499 Epoch: 18 Global Step: 314250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:58:33,979-Speed 4673.51 samples/sec Loss 0.8389 Epoch: 18 Global Step: 314300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:58:47,184-Speed 3877.63 samples/sec Loss 0.8512 Epoch: 18 Global Step: 314350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:58:58,242-Speed 4630.19 samples/sec Loss 0.8492 Epoch: 18 Global Step: 314400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:59:09,265-Speed 4644.97 samples/sec Loss 0.8484 Epoch: 18 
Global Step: 314450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:59:20,293-Speed 4643.05 samples/sec Loss 0.8477 Epoch: 18 Global Step: 314500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:59:31,507-Speed 4565.96 samples/sec Loss 0.8415 Epoch: 18 Global Step: 314550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:59:42,434-Speed 4685.68 samples/sec Loss 0.8488 Epoch: 18 Global Step: 314600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 13:59:53,503-Speed 4626.37 samples/sec Loss 0.8226 Epoch: 18 Global Step: 314650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:00:04,555-Speed 4632.57 samples/sec Loss 0.8386 Epoch: 18 Global Step: 314700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:00:15,573-Speed 4647.34 samples/sec Loss 0.8410 Epoch: 18 Global Step: 314750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:00:26,026-Speed 4898.59 samples/sec Loss 0.8451 Epoch: 18 Global Step: 314800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:00:36,951-Speed 4686.67 samples/sec Loss 0.8464 Epoch: 18 Global Step: 314850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:00:48,155-Speed 4569.97 samples/sec Loss 0.8465 Epoch: 18 Global Step: 314900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:00:58,975-Speed 4732.42 samples/sec Loss 0.8395 Epoch: 18 Global Step: 314950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:01:09,892-Speed 4690.09 samples/sec Loss 0.8480 Epoch: 18 Global Step: 315000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:01:20,838-Speed 4677.98 samples/sec Loss 0.8427 Epoch: 18 Global Step: 315050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:01:31,892-Speed 4632.16 samples/sec Loss 0.8335 Epoch: 18 Global Step: 315100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:01:42,714-Speed 4731.39 samples/sec Loss 0.8451 Epoch: 
18 Global Step: 315150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:01:53,713-Speed 4655.23 samples/sec Loss 0.8474 Epoch: 18 Global Step: 315200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:02:04,801-Speed 4617.85 samples/sec Loss 0.8411 Epoch: 18 Global Step: 315250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:02:15,563-Speed 4757.80 samples/sec Loss 0.8457 Epoch: 18 Global Step: 315300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:02:26,520-Speed 4672.83 samples/sec Loss 0.8428 Epoch: 18 Global Step: 315350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:02:37,556-Speed 4639.82 samples/sec Loss 0.8446 Epoch: 18 Global Step: 315400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:02:48,739-Speed 4578.28 samples/sec Loss 0.8395 Epoch: 18 Global Step: 315450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:02:59,774-Speed 4640.47 samples/sec Loss 0.8414 Epoch: 18 Global Step: 315500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:03:10,839-Speed 4627.34 samples/sec Loss 0.8425 Epoch: 18 Global Step: 315550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:03:22,418-Speed 4422.12 samples/sec Loss 0.8483 Epoch: 18 Global Step: 315600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:03:33,480-Speed 4628.75 samples/sec Loss 0.8387 Epoch: 18 Global Step: 315650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:03:44,270-Speed 4745.36 samples/sec Loss 0.8549 Epoch: 18 Global Step: 315700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:03:55,189-Speed 4689.42 samples/sec Loss 0.8312 Epoch: 18 Global Step: 315750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:04:06,316-Speed 4601.55 samples/sec Loss 0.8452 Epoch: 18 Global Step: 315800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:04:17,350-Speed 4640.69 samples/sec Loss 0.8387 
Epoch: 18 Global Step: 315850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:04:28,301-Speed 4675.66 samples/sec Loss 0.8315 Epoch: 18 Global Step: 315900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:04:39,504-Speed 4570.26 samples/sec Loss 0.8307 Epoch: 18 Global Step: 315950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:04:50,749-Speed 4553.37 samples/sec Loss 0.8423 Epoch: 18 Global Step: 316000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:05:14,401-[lfw][316000]XNorm: 22.470521 Training: 2021-03-18 14:05:14,401-[lfw][316000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 14:05:14,401-[lfw][316000]Accuracy-Highest: 0.99817 Training: 2021-03-18 14:05:41,804-[cfp_fp][316000]XNorm: 20.897322 Training: 2021-03-18 14:05:41,804-[cfp_fp][316000]Accuracy-Flip: 0.98586+-0.00458 Training: 2021-03-18 14:05:41,804-[cfp_fp][316000]Accuracy-Highest: 0.98786 Training: 2021-03-18 14:06:05,461-[agedb_30][316000]XNorm: 22.593478 Training: 2021-03-18 14:06:05,461-[agedb_30][316000]Accuracy-Flip: 0.98150+-0.00681 Training: 2021-03-18 14:06:05,461-[agedb_30][316000]Accuracy-Highest: 0.98333 Training: 2021-03-18 14:06:16,459-Speed 597.37 samples/sec Loss 0.8220 Epoch: 18 Global Step: 316050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:06:27,262-Speed 4739.89 samples/sec Loss 0.8587 Epoch: 18 Global Step: 316100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:06:38,105-Speed 4721.98 samples/sec Loss 0.8381 Epoch: 18 Global Step: 316150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:06:49,327-Speed 4562.99 samples/sec Loss 0.8620 Epoch: 18 Global Step: 316200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:07:00,560-Speed 4558.15 samples/sec Loss 0.8237 Epoch: 18 Global Step: 316250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:07:11,646-Speed 4618.71 samples/sec Loss 0.8402 Epoch: 18 Global Step: 316300 Fp16 Grad 
Scale: 16384 Required: 1 hours Training: 2021-03-18 14:07:23,474-Speed 4328.69 samples/sec Loss 0.8373 Epoch: 18 Global Step: 316350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:07:34,400-Speed 4686.48 samples/sec Loss 0.8467 Epoch: 18 Global Step: 316400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:07:45,361-Speed 4671.48 samples/sec Loss 0.8344 Epoch: 18 Global Step: 316450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:07:56,538-Speed 4580.95 samples/sec Loss 0.8450 Epoch: 18 Global Step: 316500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:08:07,424-Speed 4703.47 samples/sec Loss 0.8521 Epoch: 18 Global Step: 316550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:08:18,484-Speed 4629.77 samples/sec Loss 0.8334 Epoch: 18 Global Step: 316600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:08:29,502-Speed 4647.27 samples/sec Loss 0.8612 Epoch: 18 Global Step: 316650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:08:41,088-Speed 4419.65 samples/sec Loss 0.8245 Epoch: 18 Global Step: 316700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:08:52,040-Speed 4675.08 samples/sec Loss 0.8314 Epoch: 18 Global Step: 316750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:09:02,708-Speed 4799.63 samples/sec Loss 0.8213 Epoch: 18 Global Step: 316800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:09:13,818-Speed 4608.40 samples/sec Loss 0.8287 Epoch: 18 Global Step: 316850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:09:25,514-Speed 4377.87 samples/sec Loss 0.8533 Epoch: 18 Global Step: 316900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:09:37,254-Speed 4361.49 samples/sec Loss 0.8459 Epoch: 18 Global Step: 316950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:09:48,157-Speed 4696.11 samples/sec Loss 0.8228 Epoch: 18 Global Step: 317000 Fp16 
Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:09:59,370-Speed 4566.41 samples/sec Loss 0.8387 Epoch: 18 Global Step: 317050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:10:10,254-Speed 4704.25 samples/sec Loss 0.8447 Epoch: 18 Global Step: 317100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:10:35,478-Speed 2029.85 samples/sec Loss 0.8417 Epoch: 19 Global Step: 317150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:10:46,732-Speed 4550.27 samples/sec Loss 0.8461 Epoch: 19 Global Step: 317200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:10:58,504-Speed 4349.40 samples/sec Loss 0.8289 Epoch: 19 Global Step: 317250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:11:09,833-Speed 4520.08 samples/sec Loss 0.8423 Epoch: 19 Global Step: 317300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:11:20,809-Speed 4664.88 samples/sec Loss 0.8373 Epoch: 19 Global Step: 317350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:11:31,845-Speed 4639.42 samples/sec Loss 0.8402 Epoch: 19 Global Step: 317400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:11:42,873-Speed 4643.29 samples/sec Loss 0.8467 Epoch: 19 Global Step: 317450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:11:53,918-Speed 4635.64 samples/sec Loss 0.8451 Epoch: 19 Global Step: 317500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:12:05,093-Speed 4582.06 samples/sec Loss 0.8455 Epoch: 19 Global Step: 317550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:12:16,130-Speed 4639.37 samples/sec Loss 0.8470 Epoch: 19 Global Step: 317600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:12:27,351-Speed 4562.93 samples/sec Loss 0.8453 Epoch: 19 Global Step: 317650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:12:38,558-Speed 4568.86 samples/sec Loss 0.8261 Epoch: 19 Global Step: 317700 
Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:12:49,748-Speed 4576.16 samples/sec Loss 0.8410 Epoch: 19 Global Step: 317750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:13:00,755-Speed 4651.62 samples/sec Loss 0.8480 Epoch: 19 Global Step: 317800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:13:11,905-Speed 4592.25 samples/sec Loss 0.8579 Epoch: 19 Global Step: 317850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:13:22,966-Speed 4629.27 samples/sec Loss 0.8381 Epoch: 19 Global Step: 317900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:13:33,999-Speed 4641.09 samples/sec Loss 0.8340 Epoch: 19 Global Step: 317950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:13:45,006-Speed 4651.88 samples/sec Loss 0.8507 Epoch: 19 Global Step: 318000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:14:08,845-[lfw][318000]XNorm: 22.500957 Training: 2021-03-18 14:14:08,846-[lfw][318000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 14:14:08,846-[lfw][318000]Accuracy-Highest: 0.99817 Training: 2021-03-18 14:14:36,446-[cfp_fp][318000]XNorm: 20.973652 Training: 2021-03-18 14:14:36,446-[cfp_fp][318000]Accuracy-Flip: 0.98729+-0.00497 Training: 2021-03-18 14:14:36,446-[cfp_fp][318000]Accuracy-Highest: 0.98786 Training: 2021-03-18 14:15:00,340-[agedb_30][318000]XNorm: 22.621710 Training: 2021-03-18 14:15:00,340-[agedb_30][318000]Accuracy-Flip: 0.98217+-0.00683 Training: 2021-03-18 14:15:00,340-[agedb_30][318000]Accuracy-Highest: 0.98333 Training: 2021-03-18 14:15:11,662-Speed 590.85 samples/sec Loss 0.8427 Epoch: 19 Global Step: 318050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:15:22,563-Speed 4697.17 samples/sec Loss 0.8396 Epoch: 19 Global Step: 318100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:15:33,760-Speed 4572.70 samples/sec Loss 0.8309 Epoch: 19 Global Step: 318150 Fp16 Grad Scale: 16384 Required: 1 hours 
Training: 2021-03-18 14:15:44,826-Speed 4627.02 samples/sec Loss 0.8263 Epoch: 19 Global Step: 318200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:15:55,840-Speed 4649.04 samples/sec Loss 0.8381 Epoch: 19 Global Step: 318250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:16:06,773-Speed 4683.59 samples/sec Loss 0.8322 Epoch: 19 Global Step: 318300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:16:17,686-Speed 4691.88 samples/sec Loss 0.8328 Epoch: 19 Global Step: 318350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:16:28,936-Speed 4551.36 samples/sec Loss 0.8378 Epoch: 19 Global Step: 318400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:16:40,624-Speed 4380.94 samples/sec Loss 0.8330 Epoch: 19 Global Step: 318450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:16:51,501-Speed 4707.08 samples/sec Loss 0.8385 Epoch: 19 Global Step: 318500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:17:02,247-Speed 4765.00 samples/sec Loss 0.8635 Epoch: 19 Global Step: 318550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:17:13,096-Speed 4719.68 samples/sec Loss 0.8250 Epoch: 19 Global Step: 318600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:17:24,048-Speed 4675.34 samples/sec Loss 0.8477 Epoch: 19 Global Step: 318650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:17:35,265-Speed 4564.63 samples/sec Loss 0.8391 Epoch: 19 Global Step: 318700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:17:46,039-Speed 4752.28 samples/sec Loss 0.8542 Epoch: 19 Global Step: 318750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:17:56,975-Speed 4682.31 samples/sec Loss 0.8357 Epoch: 19 Global Step: 318800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:18:08,013-Speed 4638.68 samples/sec Loss 0.8396 Epoch: 19 Global Step: 318850 Fp16 Grad Scale: 16384 Required: 1 
hours Training: 2021-03-18 14:18:19,012-Speed 4655.27 samples/sec Loss 0.8405 Epoch: 19 Global Step: 318900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:18:30,022-Speed 4650.62 samples/sec Loss 0.8323 Epoch: 19 Global Step: 318950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:18:41,076-Speed 4631.92 samples/sec Loss 0.8359 Epoch: 19 Global Step: 319000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:18:51,960-Speed 4704.39 samples/sec Loss 0.8436 Epoch: 19 Global Step: 319050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:19:02,916-Speed 4673.41 samples/sec Loss 0.8411 Epoch: 19 Global Step: 319100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:19:14,058-Speed 4595.59 samples/sec Loss 0.8191 Epoch: 19 Global Step: 319150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:19:25,879-Speed 4331.47 samples/sec Loss 0.8397 Epoch: 19 Global Step: 319200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:19:36,984-Speed 4611.12 samples/sec Loss 0.8384 Epoch: 19 Global Step: 319250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:19:47,690-Speed 4782.44 samples/sec Loss 0.8506 Epoch: 19 Global Step: 319300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:19:58,485-Speed 4743.29 samples/sec Loss 0.8492 Epoch: 19 Global Step: 319350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:20:09,391-Speed 4695.20 samples/sec Loss 0.8378 Epoch: 19 Global Step: 319400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:20:20,267-Speed 4707.73 samples/sec Loss 0.8368 Epoch: 19 Global Step: 319450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:20:31,308-Speed 4637.49 samples/sec Loss 0.8421 Epoch: 19 Global Step: 319500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:20:42,285-Speed 4664.71 samples/sec Loss 0.8413 Epoch: 19 Global Step: 319550 Fp16 Grad Scale: 16384 Required: 
1 hours Training: 2021-03-18 14:20:53,201-Speed 4690.51 samples/sec Loss 0.8280 Epoch: 19 Global Step: 319600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:21:04,160-Speed 4672.33 samples/sec Loss 0.8308 Epoch: 19 Global Step: 319650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:21:15,911-Speed 4357.26 samples/sec Loss 0.8514 Epoch: 19 Global Step: 319700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:21:27,963-Speed 4248.51 samples/sec Loss 0.8338 Epoch: 19 Global Step: 319750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:21:39,629-Speed 4389.16 samples/sec Loss 0.8256 Epoch: 19 Global Step: 319800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:21:50,385-Speed 4760.53 samples/sec Loss 0.8343 Epoch: 19 Global Step: 319850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:22:01,476-Speed 4616.84 samples/sec Loss 0.8381 Epoch: 19 Global Step: 319900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:22:13,457-Speed 4273.73 samples/sec Loss 0.8315 Epoch: 19 Global Step: 319950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:22:24,433-Speed 4664.89 samples/sec Loss 0.8493 Epoch: 19 Global Step: 320000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:22:48,396-[lfw][320000]XNorm: 22.475986 Training: 2021-03-18 14:22:48,396-[lfw][320000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 14:22:48,396-[lfw][320000]Accuracy-Highest: 0.99817 Training: 2021-03-18 14:23:15,808-[cfp_fp][320000]XNorm: 20.961902 Training: 2021-03-18 14:23:15,809-[cfp_fp][320000]Accuracy-Flip: 0.98714+-0.00491 Training: 2021-03-18 14:23:15,809-[cfp_fp][320000]Accuracy-Highest: 0.98786 Training: 2021-03-18 14:23:39,474-[agedb_30][320000]XNorm: 22.602001 Training: 2021-03-18 14:23:39,475-[agedb_30][320000]Accuracy-Flip: 0.98267+-0.00680 Training: 2021-03-18 14:23:39,475-[agedb_30][320000]Accuracy-Highest: 0.98333 Training: 2021-03-18 
14:23:50,499-Speed 594.90 samples/sec Loss 0.8361 Epoch: 19 Global Step: 320050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:24:02,359-Speed 4317.06 samples/sec Loss 0.8324 Epoch: 19 Global Step: 320100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:24:13,271-Speed 4692.49 samples/sec Loss 0.8233 Epoch: 19 Global Step: 320150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:24:25,019-Speed 4358.59 samples/sec Loss 0.8496 Epoch: 19 Global Step: 320200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:24:35,927-Speed 4694.00 samples/sec Loss 0.8519 Epoch: 19 Global Step: 320250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:24:46,953-Speed 4643.82 samples/sec Loss 0.8304 Epoch: 19 Global Step: 320300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:24:57,918-Speed 4669.85 samples/sec Loss 0.8283 Epoch: 19 Global Step: 320350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:25:08,952-Speed 4640.38 samples/sec Loss 0.8282 Epoch: 19 Global Step: 320400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:25:20,098-Speed 4593.75 samples/sec Loss 0.8254 Epoch: 19 Global Step: 320450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:25:31,160-Speed 4628.76 samples/sec Loss 0.8465 Epoch: 19 Global Step: 320500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:25:42,406-Speed 4552.99 samples/sec Loss 0.8255 Epoch: 19 Global Step: 320550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:25:53,448-Speed 4637.11 samples/sec Loss 0.8262 Epoch: 19 Global Step: 320600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:26:04,213-Speed 4756.51 samples/sec Loss 0.8394 Epoch: 19 Global Step: 320650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:26:15,385-Speed 4583.03 samples/sec Loss 0.8400 Epoch: 19 Global Step: 320700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 
2021-03-18 14:26:26,453-Speed 4626.16 samples/sec Loss 0.8368 Epoch: 19 Global Step: 320750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:26:37,800-Speed 4512.72 samples/sec Loss 0.8361 Epoch: 19 Global Step: 320800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:26:48,740-Speed 4680.07 samples/sec Loss 0.8284 Epoch: 19 Global Step: 320850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:26:59,764-Speed 4644.68 samples/sec Loss 0.8343 Epoch: 19 Global Step: 320900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:27:10,722-Speed 4672.74 samples/sec Loss 0.8434 Epoch: 19 Global Step: 320950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:27:21,698-Speed 4665.14 samples/sec Loss 0.8392 Epoch: 19 Global Step: 321000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:27:32,735-Speed 4639.00 samples/sec Loss 0.8409 Epoch: 19 Global Step: 321050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:27:43,651-Speed 4690.86 samples/sec Loss 0.8355 Epoch: 19 Global Step: 321100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:27:54,905-Speed 4549.91 samples/sec Loss 0.8324 Epoch: 19 Global Step: 321150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:28:05,976-Speed 4624.72 samples/sec Loss 0.8238 Epoch: 19 Global Step: 321200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:28:18,075-Speed 4232.22 samples/sec Loss 0.8416 Epoch: 19 Global Step: 321250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:28:29,161-Speed 4618.58 samples/sec Loss 0.8324 Epoch: 19 Global Step: 321300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:28:40,172-Speed 4650.37 samples/sec Loss 0.8184 Epoch: 19 Global Step: 321350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:28:51,150-Speed 4663.93 samples/sec Loss 0.8393 Epoch: 19 Global Step: 321400 Fp16 Grad Scale: 16384 Required: 1 hours 
Training: 2021-03-18 14:29:02,129-Speed 4663.78 samples/sec Loss 0.8383 Epoch: 19 Global Step: 321450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:29:12,837-Speed 4782.12 samples/sec Loss 0.8346 Epoch: 19 Global Step: 321500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:29:23,841-Speed 4652.80 samples/sec Loss 0.8338 Epoch: 19 Global Step: 321550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:29:34,963-Speed 4603.78 samples/sec Loss 0.8378 Epoch: 19 Global Step: 321600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:29:45,769-Speed 4738.21 samples/sec Loss 0.8162 Epoch: 19 Global Step: 321650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:29:56,542-Speed 4752.97 samples/sec Loss 0.8350 Epoch: 19 Global Step: 321700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:30:07,739-Speed 4572.85 samples/sec Loss 0.8185 Epoch: 19 Global Step: 321750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:30:18,583-Speed 4722.21 samples/sec Loss 0.8326 Epoch: 19 Global Step: 321800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:30:29,575-Speed 4657.84 samples/sec Loss 0.8514 Epoch: 19 Global Step: 321850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:30:40,608-Speed 4640.92 samples/sec Loss 0.8362 Epoch: 19 Global Step: 321900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:30:51,434-Speed 4729.97 samples/sec Loss 0.8407 Epoch: 19 Global Step: 321950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:31:02,360-Speed 4686.06 samples/sec Loss 0.8417 Epoch: 19 Global Step: 322000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:31:26,346-[lfw][322000]XNorm: 22.400128 Training: 2021-03-18 14:31:26,346-[lfw][322000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 14:31:26,346-[lfw][322000]Accuracy-Highest: 0.99817 Training: 2021-03-18 14:31:53,902-[cfp_fp][322000]XNorm: 20.847641 
Training: 2021-03-18 14:31:53,902-[cfp_fp][322000]Accuracy-Flip: 0.98600+-0.00510 Training: 2021-03-18 14:31:53,903-[cfp_fp][322000]Accuracy-Highest: 0.98786 Training: 2021-03-18 14:32:17,639-[agedb_30][322000]XNorm: 22.537530 Training: 2021-03-18 14:32:17,640-[agedb_30][322000]Accuracy-Flip: 0.98167+-0.00703 Training: 2021-03-18 14:32:17,643-[agedb_30][322000]Accuracy-Highest: 0.98333 Training: 2021-03-18 14:32:28,522-Speed 594.24 samples/sec Loss 0.8359 Epoch: 19 Global Step: 322050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:32:39,595-Speed 4623.99 samples/sec Loss 0.8422 Epoch: 19 Global Step: 322100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:32:51,488-Speed 4305.03 samples/sec Loss 0.8282 Epoch: 19 Global Step: 322150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:33:02,340-Speed 4718.57 samples/sec Loss 0.8445 Epoch: 19 Global Step: 322200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:33:13,372-Speed 4640.91 samples/sec Loss 0.8452 Epoch: 19 Global Step: 322250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:33:24,318-Speed 4677.68 samples/sec Loss 0.8534 Epoch: 19 Global Step: 322300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:33:35,325-Speed 4651.87 samples/sec Loss 0.8310 Epoch: 19 Global Step: 322350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:33:46,472-Speed 4593.69 samples/sec Loss 0.8268 Epoch: 19 Global Step: 322400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:33:57,473-Speed 4654.35 samples/sec Loss 0.8351 Epoch: 19 Global Step: 322450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:34:08,369-Speed 4699.30 samples/sec Loss 0.8374 Epoch: 19 Global Step: 322500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:34:19,339-Speed 4667.19 samples/sec Loss 0.8355 Epoch: 19 Global Step: 322550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 
14:34:31,154-Speed 4333.82 samples/sec Loss 0.8305 Epoch: 19 Global Step: 322600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:34:42,060-Speed 4695.26 samples/sec Loss 0.8495 Epoch: 19 Global Step: 322650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:34:53,741-Speed 4383.39 samples/sec Loss 0.8356 Epoch: 19 Global Step: 322700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:35:04,725-Speed 4661.40 samples/sec Loss 0.8403 Epoch: 19 Global Step: 322750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:35:17,297-Speed 4072.88 samples/sec Loss 0.8494 Epoch: 19 Global Step: 322800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:35:28,321-Speed 4644.43 samples/sec Loss 0.8561 Epoch: 19 Global Step: 322850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:35:39,219-Speed 4698.71 samples/sec Loss 0.8317 Epoch: 19 Global Step: 322900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:35:50,033-Speed 4734.49 samples/sec Loss 0.8291 Epoch: 19 Global Step: 322950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:36:01,707-Speed 4386.43 samples/sec Loss 0.8571 Epoch: 19 Global Step: 323000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:36:12,997-Speed 4534.90 samples/sec Loss 0.8231 Epoch: 19 Global Step: 323050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:36:24,082-Speed 4619.12 samples/sec Loss 0.8228 Epoch: 19 Global Step: 323100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:36:35,876-Speed 4341.57 samples/sec Loss 0.8537 Epoch: 19 Global Step: 323150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:36:46,920-Speed 4636.12 samples/sec Loss 0.8454 Epoch: 19 Global Step: 323200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:36:57,836-Speed 4690.80 samples/sec Loss 0.8402 Epoch: 19 Global Step: 323250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 
2021-03-18 14:37:09,223-Speed 4496.74 samples/sec Loss 0.8409 Epoch: 19 Global Step: 323300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:37:20,255-Speed 4641.41 samples/sec Loss 0.8323 Epoch: 19 Global Step: 323350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:37:31,547-Speed 4534.33 samples/sec Loss 0.8487 Epoch: 19 Global Step: 323400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:37:42,588-Speed 4637.41 samples/sec Loss 0.8250 Epoch: 19 Global Step: 323450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:37:53,433-Speed 4721.62 samples/sec Loss 0.8413 Epoch: 19 Global Step: 323500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:38:04,417-Speed 4661.54 samples/sec Loss 0.8376 Epoch: 19 Global Step: 323550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:38:15,206-Speed 4745.99 samples/sec Loss 0.8211 Epoch: 19 Global Step: 323600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:38:26,142-Speed 4681.74 samples/sec Loss 0.8463 Epoch: 19 Global Step: 323650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:38:37,277-Speed 4598.43 samples/sec Loss 0.8290 Epoch: 19 Global Step: 323700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:38:48,190-Speed 4692.23 samples/sec Loss 0.8422 Epoch: 19 Global Step: 323750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:38:58,997-Speed 4737.67 samples/sec Loss 0.8177 Epoch: 19 Global Step: 323800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:39:10,032-Speed 4640.38 samples/sec Loss 0.8315 Epoch: 19 Global Step: 323850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:39:20,957-Speed 4686.67 samples/sec Loss 0.8314 Epoch: 19 Global Step: 323900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:39:31,867-Speed 4692.98 samples/sec Loss 0.8396 Epoch: 19 Global Step: 323950 Fp16 Grad Scale: 16384 Required: 1 hours 
Training: 2021-03-18 14:39:42,783-Speed 4691.04 samples/sec Loss 0.8298 Epoch: 19 Global Step: 324000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:40:06,841-[lfw][324000]XNorm: 22.538613 Training: 2021-03-18 14:40:06,842-[lfw][324000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 14:40:06,842-[lfw][324000]Accuracy-Highest: 0.99817 Training: 2021-03-18 14:40:34,393-[cfp_fp][324000]XNorm: 20.975023 Training: 2021-03-18 14:40:34,393-[cfp_fp][324000]Accuracy-Flip: 0.98657+-0.00439 Training: 2021-03-18 14:40:34,393-[cfp_fp][324000]Accuracy-Highest: 0.98786 Training: 2021-03-18 14:40:58,102-[agedb_30][324000]XNorm: 22.644398 Training: 2021-03-18 14:40:58,102-[agedb_30][324000]Accuracy-Flip: 0.98183+-0.00630 Training: 2021-03-18 14:40:58,102-[agedb_30][324000]Accuracy-Highest: 0.98333 Training: 2021-03-18 14:41:09,748-Speed 588.75 samples/sec Loss 0.8408 Epoch: 19 Global Step: 324050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:41:20,554-Speed 4738.69 samples/sec Loss 0.8445 Epoch: 19 Global Step: 324100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:41:31,755-Speed 4571.22 samples/sec Loss 0.8319 Epoch: 19 Global Step: 324150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:41:42,612-Speed 4715.96 samples/sec Loss 0.8589 Epoch: 19 Global Step: 324200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:41:53,661-Speed 4634.26 samples/sec Loss 0.8402 Epoch: 19 Global Step: 324250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:42:04,844-Speed 4579.07 samples/sec Loss 0.8282 Epoch: 19 Global Step: 324300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:42:15,882-Speed 4638.87 samples/sec Loss 0.8436 Epoch: 19 Global Step: 324350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:42:26,882-Speed 4654.52 samples/sec Loss 0.8258 Epoch: 19 Global Step: 324400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:42:37,943-Speed 
4629.44 samples/sec Loss 0.8360 Epoch: 19 Global Step: 324450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:42:48,773-Speed 4727.98 samples/sec Loss 0.8430 Epoch: 19 Global Step: 324500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:42:59,688-Speed 4691.14 samples/sec Loss 0.8201 Epoch: 19 Global Step: 324550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:43:10,655-Speed 4668.59 samples/sec Loss 0.8376 Epoch: 19 Global Step: 324600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:43:21,530-Speed 4708.28 samples/sec Loss 0.8605 Epoch: 19 Global Step: 324650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:43:32,542-Speed 4649.83 samples/sec Loss 0.8512 Epoch: 19 Global Step: 324700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:43:43,666-Speed 4602.71 samples/sec Loss 0.8336 Epoch: 19 Global Step: 324750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:43:54,494-Speed 4728.73 samples/sec Loss 0.8411 Epoch: 19 Global Step: 324800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:44:05,389-Speed 4699.90 samples/sec Loss 0.8419 Epoch: 19 Global Step: 324850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:44:16,239-Speed 4719.23 samples/sec Loss 0.8318 Epoch: 19 Global Step: 324900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:44:27,177-Speed 4681.15 samples/sec Loss 0.8499 Epoch: 19 Global Step: 324950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:44:38,139-Speed 4670.71 samples/sec Loss 0.8449 Epoch: 19 Global Step: 325000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:44:49,101-Speed 4671.24 samples/sec Loss 0.8333 Epoch: 19 Global Step: 325050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:45:00,890-Speed 4343.09 samples/sec Loss 0.8452 Epoch: 19 Global Step: 325100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 
14:45:11,911-Speed 4646.04 samples/sec Loss 0.8506 Epoch: 19 Global Step: 325150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:45:22,585-Speed 4796.89 samples/sec Loss 0.8359 Epoch: 19 Global Step: 325200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:45:33,595-Speed 4650.73 samples/sec Loss 0.8370 Epoch: 19 Global Step: 325250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:45:44,544-Speed 4676.58 samples/sec Loss 0.8407 Epoch: 19 Global Step: 325300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:45:55,459-Speed 4690.66 samples/sec Loss 0.8300 Epoch: 19 Global Step: 325350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:46:06,351-Speed 4701.11 samples/sec Loss 0.8480 Epoch: 19 Global Step: 325400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:46:18,315-Speed 4279.72 samples/sec Loss 0.8189 Epoch: 19 Global Step: 325450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:46:29,011-Speed 4787.15 samples/sec Loss 0.8459 Epoch: 19 Global Step: 325500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:46:39,750-Speed 4768.02 samples/sec Loss 0.8323 Epoch: 19 Global Step: 325550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:46:51,569-Speed 4332.28 samples/sec Loss 0.8526 Epoch: 19 Global Step: 325600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:47:04,077-Speed 4093.50 samples/sec Loss 0.8381 Epoch: 19 Global Step: 325650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:47:14,705-Speed 4818.05 samples/sec Loss 0.8511 Epoch: 19 Global Step: 325700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:47:25,680-Speed 4665.32 samples/sec Loss 0.8342 Epoch: 19 Global Step: 325750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:47:36,609-Speed 4684.84 samples/sec Loss 0.8400 Epoch: 19 Global Step: 325800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 
2021-03-18 14:47:48,444-Speed 4326.43 samples/sec Loss 0.8520 Epoch: 19 Global Step: 325850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:47:59,364-Speed 4688.82 samples/sec Loss 0.8355 Epoch: 19 Global Step: 325900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:48:10,131-Speed 4755.85 samples/sec Loss 0.8345 Epoch: 19 Global Step: 325950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:48:21,961-Speed 4328.26 samples/sec Loss 0.8320 Epoch: 19 Global Step: 326000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:48:45,885-[lfw][326000]XNorm: 22.474494 Training: 2021-03-18 14:48:45,886-[lfw][326000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 14:48:45,886-[lfw][326000]Accuracy-Highest: 0.99817 Training: 2021-03-18 14:49:13,296-[cfp_fp][326000]XNorm: 20.924658 Training: 2021-03-18 14:49:13,296-[cfp_fp][326000]Accuracy-Flip: 0.98671+-0.00474 Training: 2021-03-18 14:49:13,296-[cfp_fp][326000]Accuracy-Highest: 0.98786 Training: 2021-03-18 14:49:36,986-[agedb_30][326000]XNorm: 22.610732 Training: 2021-03-18 14:49:36,987-[agedb_30][326000]Accuracy-Flip: 0.98250+-0.00668 Training: 2021-03-18 14:49:36,987-[agedb_30][326000]Accuracy-Highest: 0.98333 Training: 2021-03-18 14:49:47,855-Speed 596.08 samples/sec Loss 0.8522 Epoch: 19 Global Step: 326050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:49:58,885-Speed 4642.32 samples/sec Loss 0.8524 Epoch: 19 Global Step: 326100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:50:09,908-Speed 4645.17 samples/sec Loss 0.8348 Epoch: 19 Global Step: 326150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:50:20,974-Speed 4626.94 samples/sec Loss 0.8387 Epoch: 19 Global Step: 326200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:50:32,147-Speed 4583.05 samples/sec Loss 0.8370 Epoch: 19 Global Step: 326250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:50:43,075-Speed 4685.60 
samples/sec Loss 0.8392 Epoch: 19 Global Step: 326300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:50:54,067-Speed 4658.28 samples/sec Loss 0.8270 Epoch: 19 Global Step: 326350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:51:05,119-Speed 4632.84 samples/sec Loss 0.8281 Epoch: 19 Global Step: 326400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:51:15,862-Speed 4766.42 samples/sec Loss 0.8432 Epoch: 19 Global Step: 326450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:51:26,707-Speed 4721.10 samples/sec Loss 0.8166 Epoch: 19 Global Step: 326500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:51:37,576-Speed 4711.22 samples/sec Loss 0.8335 Epoch: 19 Global Step: 326550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:51:48,593-Speed 4647.56 samples/sec Loss 0.8477 Epoch: 19 Global Step: 326600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:51:59,749-Speed 4589.50 samples/sec Loss 0.8428 Epoch: 19 Global Step: 326650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:52:10,534-Speed 4747.55 samples/sec Loss 0.8319 Epoch: 19 Global Step: 326700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:52:21,503-Speed 4668.07 samples/sec Loss 0.8330 Epoch: 19 Global Step: 326750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:52:32,390-Speed 4703.09 samples/sec Loss 0.8399 Epoch: 19 Global Step: 326800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:52:43,274-Speed 4704.59 samples/sec Loss 0.8329 Epoch: 19 Global Step: 326850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 14:52:54,971-Speed 4377.45 samples/sec Loss 0.8354 Epoch: 19 Global Step: 326900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:53:06,202-Speed 4558.89 samples/sec Loss 0.8433 Epoch: 19 Global Step: 326950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:53:17,054-Speed 
4718.40 samples/sec Loss 0.8395 Epoch: 19 Global Step: 327000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:53:27,989-Speed 4682.39 samples/sec Loss 0.8359 Epoch: 19 Global Step: 327050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:53:39,155-Speed 4585.71 samples/sec Loss 0.8214 Epoch: 19 Global Step: 327100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:53:49,934-Speed 4750.24 samples/sec Loss 0.8568 Epoch: 19 Global Step: 327150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:54:00,977-Speed 4636.56 samples/sec Loss 0.8627 Epoch: 19 Global Step: 327200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:54:11,992-Speed 4648.59 samples/sec Loss 0.8483 Epoch: 19 Global Step: 327250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:54:22,990-Speed 4655.61 samples/sec Loss 0.8379 Epoch: 19 Global Step: 327300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:54:33,908-Speed 4689.96 samples/sec Loss 0.8407 Epoch: 19 Global Step: 327350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:54:45,008-Speed 4612.73 samples/sec Loss 0.8352 Epoch: 19 Global Step: 327400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:54:55,701-Speed 4788.53 samples/sec Loss 0.8592 Epoch: 19 Global Step: 327450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:55:06,605-Speed 4695.81 samples/sec Loss 0.8431 Epoch: 19 Global Step: 327500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:55:17,514-Speed 4693.65 samples/sec Loss 0.8501 Epoch: 19 Global Step: 327550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:55:28,340-Speed 4729.73 samples/sec Loss 0.8508 Epoch: 19 Global Step: 327600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:55:39,388-Speed 4634.49 samples/sec Loss 0.8293 Epoch: 19 Global Step: 327650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 
14:55:50,383-Speed 4656.76 samples/sec Loss 0.8437 Epoch: 19 Global Step: 327700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:56:01,424-Speed 4637.48 samples/sec Loss 0.8387 Epoch: 19 Global Step: 327750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:56:12,088-Speed 4801.72 samples/sec Loss 0.8508 Epoch: 19 Global Step: 327800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:56:22,878-Speed 4745.67 samples/sec Loss 0.8370 Epoch: 19 Global Step: 327850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:56:33,838-Speed 4671.48 samples/sec Loss 0.8455 Epoch: 19 Global Step: 327900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:56:44,660-Speed 4731.50 samples/sec Loss 0.8330 Epoch: 19 Global Step: 327950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:56:56,354-Speed 4378.37 samples/sec Loss 0.8476 Epoch: 19 Global Step: 328000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:57:20,612-[lfw][328000]XNorm: 22.491892 Training: 2021-03-18 14:57:20,613-[lfw][328000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 14:57:20,613-[lfw][328000]Accuracy-Highest: 0.99817 Training: 2021-03-18 14:57:48,225-[cfp_fp][328000]XNorm: 20.908390 Training: 2021-03-18 14:57:48,226-[cfp_fp][328000]Accuracy-Flip: 0.98757+-0.00483 Training: 2021-03-18 14:57:48,226-[cfp_fp][328000]Accuracy-Highest: 0.98786 Training: 2021-03-18 14:58:11,958-[agedb_30][328000]XNorm: 22.607904 Training: 2021-03-18 14:58:11,958-[agedb_30][328000]Accuracy-Flip: 0.98167+-0.00675 Training: 2021-03-18 14:58:11,958-[agedb_30][328000]Accuracy-Highest: 0.98333 Training: 2021-03-18 14:58:22,721-Speed 592.83 samples/sec Loss 0.8477 Epoch: 19 Global Step: 328050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:58:33,718-Speed 4655.99 samples/sec Loss 0.8343 Epoch: 19 Global Step: 328100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:58:44,636-Speed 4689.58 samples/sec 
Loss 0.8323 Epoch: 19 Global Step: 328150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:58:55,801-Speed 4586.05 samples/sec Loss 0.8280 Epoch: 19 Global Step: 328200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:59:06,709-Speed 4694.43 samples/sec Loss 0.8241 Epoch: 19 Global Step: 328250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:59:18,616-Speed 4300.06 samples/sec Loss 0.8283 Epoch: 19 Global Step: 328300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:59:29,563-Speed 4677.30 samples/sec Loss 0.8531 Epoch: 19 Global Step: 328350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:59:40,456-Speed 4700.66 samples/sec Loss 0.8476 Epoch: 19 Global Step: 328400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 14:59:53,231-Speed 4008.10 samples/sec Loss 0.8191 Epoch: 19 Global Step: 328450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:00:04,281-Speed 4633.79 samples/sec Loss 0.8307 Epoch: 19 Global Step: 328500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:00:15,966-Speed 4381.77 samples/sec Loss 0.8382 Epoch: 19 Global Step: 328550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:00:26,942-Speed 4665.08 samples/sec Loss 0.8463 Epoch: 19 Global Step: 328600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:00:38,592-Speed 4395.36 samples/sec Loss 0.8303 Epoch: 19 Global Step: 328650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:00:49,597-Speed 4652.62 samples/sec Loss 0.8436 Epoch: 19 Global Step: 328700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:01:00,533-Speed 4682.19 samples/sec Loss 0.8419 Epoch: 19 Global Step: 328750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:01:11,569-Speed 4639.71 samples/sec Loss 0.8290 Epoch: 19 Global Step: 328800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:01:22,260-Speed 4789.13 
samples/sec Loss 0.8336 Epoch: 19 Global Step: 328850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:01:34,166-Speed 4300.49 samples/sec Loss 0.8351 Epoch: 19 Global Step: 328900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:01:45,041-Speed 4708.60 samples/sec Loss 0.8350 Epoch: 19 Global Step: 328950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:01:55,815-Speed 4752.24 samples/sec Loss 0.8310 Epoch: 19 Global Step: 329000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:02:07,003-Speed 4576.93 samples/sec Loss 0.8442 Epoch: 19 Global Step: 329050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:02:17,968-Speed 4669.31 samples/sec Loss 0.8374 Epoch: 19 Global Step: 329100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:02:28,755-Speed 4746.93 samples/sec Loss 0.8305 Epoch: 19 Global Step: 329150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:02:39,907-Speed 4591.44 samples/sec Loss 0.8359 Epoch: 19 Global Step: 329200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:02:51,019-Speed 4607.72 samples/sec Loss 0.8408 Epoch: 19 Global Step: 329250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:03:01,814-Speed 4743.29 samples/sec Loss 0.8478 Epoch: 19 Global Step: 329300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:03:12,919-Speed 4610.77 samples/sec Loss 0.8328 Epoch: 19 Global Step: 329350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:03:23,639-Speed 4776.45 samples/sec Loss 0.8358 Epoch: 19 Global Step: 329400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:03:34,702-Speed 4628.50 samples/sec Loss 0.8427 Epoch: 19 Global Step: 329450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:03:45,858-Speed 4589.78 samples/sec Loss 0.8354 Epoch: 19 Global Step: 329500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:03:56,648-Speed 
4745.19 samples/sec Loss 0.8439 Epoch: 19 Global Step: 329550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:04:07,407-Speed 4759.32 samples/sec Loss 0.8371 Epoch: 19 Global Step: 329600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:04:18,350-Speed 4678.64 samples/sec Loss 0.8321 Epoch: 19 Global Step: 329650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:04:29,216-Speed 4712.43 samples/sec Loss 0.8287 Epoch: 19 Global Step: 329700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:04:40,863-Speed 4396.39 samples/sec Loss 0.8388 Epoch: 19 Global Step: 329750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:04:51,629-Speed 4755.69 samples/sec Loss 0.8373 Epoch: 19 Global Step: 329800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:05:02,658-Speed 4642.80 samples/sec Loss 0.8323 Epoch: 19 Global Step: 329850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:05:13,870-Speed 4566.55 samples/sec Loss 0.8396 Epoch: 19 Global Step: 329900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:05:25,098-Speed 4560.59 samples/sec Loss 0.8359 Epoch: 19 Global Step: 329950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:05:35,897-Speed 4741.40 samples/sec Loss 0.8398 Epoch: 19 Global Step: 330000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:05:59,664-[lfw][330000]XNorm: 22.511422 Training: 2021-03-18 15:05:59,664-[lfw][330000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 15:05:59,664-[lfw][330000]Accuracy-Highest: 0.99817 Training: 2021-03-18 15:06:27,174-[cfp_fp][330000]XNorm: 20.927429 Training: 2021-03-18 15:06:27,174-[cfp_fp][330000]Accuracy-Flip: 0.98571+-0.00469 Training: 2021-03-18 15:06:27,174-[cfp_fp][330000]Accuracy-Highest: 0.98786 Training: 2021-03-18 15:06:50,891-[agedb_30][330000]XNorm: 22.624713 Training: 2021-03-18 15:06:50,891-[agedb_30][330000]Accuracy-Flip: 0.98267+-0.00700 Training: 
2021-03-18 15:06:50,891-[agedb_30][330000]Accuracy-Highest: 0.98333 Training: 2021-03-18 15:07:01,682-Speed 596.85 samples/sec Loss 0.8378 Epoch: 19 Global Step: 330050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:07:12,484-Speed 4740.07 samples/sec Loss 0.8290 Epoch: 19 Global Step: 330100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:07:23,481-Speed 4656.12 samples/sec Loss 0.8420 Epoch: 19 Global Step: 330150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:07:34,321-Speed 4723.69 samples/sec Loss 0.8323 Epoch: 19 Global Step: 330200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:07:45,371-Speed 4633.57 samples/sec Loss 0.8369 Epoch: 19 Global Step: 330250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:07:56,163-Speed 4744.93 samples/sec Loss 0.8371 Epoch: 19 Global Step: 330300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:08:07,079-Speed 4690.34 samples/sec Loss 0.8474 Epoch: 19 Global Step: 330350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:08:18,124-Speed 4636.18 samples/sec Loss 0.8292 Epoch: 19 Global Step: 330400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:08:28,891-Speed 4755.15 samples/sec Loss 0.8505 Epoch: 19 Global Step: 330450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:08:40,165-Speed 4542.01 samples/sec Loss 0.8344 Epoch: 19 Global Step: 330500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:08:51,305-Speed 4596.21 samples/sec Loss 0.8440 Epoch: 19 Global Step: 330550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:09:02,307-Speed 4653.76 samples/sec Loss 0.8386 Epoch: 19 Global Step: 330600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:09:13,297-Speed 4659.30 samples/sec Loss 0.8313 Epoch: 19 Global Step: 330650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:09:24,090-Speed 4743.85 samples/sec Loss 0.8276 
Epoch: 19 Global Step: 330700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:09:35,209-Speed 4605.16 samples/sec Loss 0.8470 Epoch: 19 Global Step: 330750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:09:46,261-Speed 4632.65 samples/sec Loss 0.8217 Epoch: 19 Global Step: 330800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:09:57,882-Speed 4406.04 samples/sec Loss 0.8415 Epoch: 19 Global Step: 330850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:10:08,583-Speed 4784.93 samples/sec Loss 0.8380 Epoch: 19 Global Step: 330900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:10:19,709-Speed 4602.43 samples/sec Loss 0.8518 Epoch: 19 Global Step: 330950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:10:30,408-Speed 4785.53 samples/sec Loss 0.8490 Epoch: 19 Global Step: 331000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:10:41,181-Speed 4753.06 samples/sec Loss 0.8378 Epoch: 19 Global Step: 331050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:10:52,139-Speed 4672.61 samples/sec Loss 0.8490 Epoch: 19 Global Step: 331100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:11:03,089-Speed 4676.27 samples/sec Loss 0.8385 Epoch: 19 Global Step: 331150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:11:13,967-Speed 4707.12 samples/sec Loss 0.8291 Epoch: 19 Global Step: 331200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:11:25,839-Speed 4312.73 samples/sec Loss 0.8303 Epoch: 19 Global Step: 331250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:11:37,204-Speed 4505.31 samples/sec Loss 0.8337 Epoch: 19 Global Step: 331300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:11:49,009-Speed 4337.52 samples/sec Loss 0.8408 Epoch: 19 Global Step: 331350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:12:00,561-Speed 4432.11 samples/sec Loss 
0.8405 Epoch: 19 Global Step: 331400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:12:12,442-Speed 4309.87 samples/sec Loss 0.8425 Epoch: 19 Global Step: 331450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:12:23,358-Speed 4690.64 samples/sec Loss 0.8513 Epoch: 19 Global Step: 331500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:12:35,018-Speed 4391.46 samples/sec Loss 0.8213 Epoch: 19 Global Step: 331550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:12:45,918-Speed 4697.20 samples/sec Loss 0.8405 Epoch: 19 Global Step: 331600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:12:56,700-Speed 4749.35 samples/sec Loss 0.8245 Epoch: 19 Global Step: 331650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:13:07,803-Speed 4611.32 samples/sec Loss 0.8532 Epoch: 19 Global Step: 331700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:13:18,740-Speed 4681.74 samples/sec Loss 0.8266 Epoch: 19 Global Step: 331750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:13:29,529-Speed 4745.68 samples/sec Loss 0.8552 Epoch: 19 Global Step: 331800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:13:40,517-Speed 4660.09 samples/sec Loss 0.8386 Epoch: 19 Global Step: 331850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:13:51,903-Speed 4496.99 samples/sec Loss 0.8362 Epoch: 19 Global Step: 331900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:14:02,442-Speed 4858.23 samples/sec Loss 0.8336 Epoch: 19 Global Step: 331950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:14:13,244-Speed 4740.21 samples/sec Loss 0.8188 Epoch: 19 Global Step: 332000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:14:37,538-[lfw][332000]XNorm: 22.378174 Training: 2021-03-18 15:14:37,538-[lfw][332000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-18 
15:14:37,538-[lfw][332000]Accuracy-Highest: 0.99817 Training: 2021-03-18 15:15:05,131-[cfp_fp][332000]XNorm: 20.826706 Training: 2021-03-18 15:15:05,131-[cfp_fp][332000]Accuracy-Flip: 0.98729+-0.00521 Training: 2021-03-18 15:15:05,131-[cfp_fp][332000]Accuracy-Highest: 0.98786 Training: 2021-03-18 15:15:28,886-[agedb_30][332000]XNorm: 22.492539 Training: 2021-03-18 15:15:28,886-[agedb_30][332000]Accuracy-Flip: 0.98250+-0.00680 Training: 2021-03-18 15:15:28,886-[agedb_30][332000]Accuracy-Highest: 0.98333 Training: 2021-03-18 15:15:39,724-Speed 592.05 samples/sec Loss 0.8405 Epoch: 19 Global Step: 332050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:15:50,539-Speed 4734.26 samples/sec Loss 0.8288 Epoch: 19 Global Step: 332100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:16:01,815-Speed 4541.29 samples/sec Loss 0.8349 Epoch: 19 Global Step: 332150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:16:12,638-Speed 4730.65 samples/sec Loss 0.8506 Epoch: 19 Global Step: 332200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:16:23,552-Speed 4691.44 samples/sec Loss 0.8379 Epoch: 19 Global Step: 332250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:16:34,490-Speed 4681.51 samples/sec Loss 0.8297 Epoch: 19 Global Step: 332300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:16:45,362-Speed 4709.41 samples/sec Loss 0.8442 Epoch: 19 Global Step: 332350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:16:56,215-Speed 4717.87 samples/sec Loss 0.8441 Epoch: 19 Global Step: 332400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:17:06,997-Speed 4748.66 samples/sec Loss 0.8359 Epoch: 19 Global Step: 332450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:17:18,064-Speed 4626.74 samples/sec Loss 0.8328 Epoch: 19 Global Step: 332500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:17:29,679-Speed 4408.33 samples/sec 
Loss 0.8285 Epoch: 19 Global Step: 332550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:17:40,799-Speed 4604.71 samples/sec Loss 0.8257 Epoch: 19 Global Step: 332600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:17:51,602-Speed 4739.60 samples/sec Loss 0.8291 Epoch: 19 Global Step: 332650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:18:02,504-Speed 4696.80 samples/sec Loss 0.8521 Epoch: 19 Global Step: 332700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:18:13,481-Speed 4664.59 samples/sec Loss 0.8299 Epoch: 19 Global Step: 332750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:18:24,369-Speed 4702.79 samples/sec Loss 0.8449 Epoch: 19 Global Step: 332800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:18:35,155-Speed 4747.18 samples/sec Loss 0.8320 Epoch: 19 Global Step: 332850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:18:46,141-Speed 4661.05 samples/sec Loss 0.8397 Epoch: 19 Global Step: 332900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:18:57,119-Speed 4663.90 samples/sec Loss 0.8413 Epoch: 19 Global Step: 332950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:19:08,002-Speed 4704.96 samples/sec Loss 0.8351 Epoch: 19 Global Step: 333000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:19:18,943-Speed 4680.00 samples/sec Loss 0.8333 Epoch: 19 Global Step: 333050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:19:30,180-Speed 4556.65 samples/sec Loss 0.8406 Epoch: 19 Global Step: 333100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:19:41,189-Speed 4650.77 samples/sec Loss 0.8266 Epoch: 19 Global Step: 333150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:19:52,113-Speed 4687.29 samples/sec Loss 0.8546 Epoch: 19 Global Step: 333200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:20:03,116-Speed 4653.70 
samples/sec Loss 0.8597 Epoch: 19 Global Step: 333250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:20:13,884-Speed 4755.11 samples/sec Loss 0.8348 Epoch: 19 Global Step: 333300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:20:24,729-Speed 4721.47 samples/sec Loss 0.8450 Epoch: 19 Global Step: 333350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:20:35,945-Speed 4565.00 samples/sec Loss 0.8245 Epoch: 19 Global Step: 333400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:20:47,101-Speed 4589.86 samples/sec Loss 0.8472 Epoch: 19 Global Step: 333450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:20:58,053-Speed 4675.41 samples/sec Loss 0.8397 Epoch: 19 Global Step: 333500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:21:09,220-Speed 4585.29 samples/sec Loss 0.8590 Epoch: 19 Global Step: 333550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:21:19,998-Speed 4750.61 samples/sec Loss 0.8388 Epoch: 19 Global Step: 333600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:21:30,905-Speed 4694.23 samples/sec Loss 0.8466 Epoch: 19 Global Step: 333650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:21:41,938-Speed 4641.11 samples/sec Loss 0.8214 Epoch: 19 Global Step: 333700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:21:52,658-Speed 4776.57 samples/sec Loss 0.8441 Epoch: 19 Global Step: 333750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 15:22:03,370-Speed 4779.76 samples/sec Loss 0.8442 Epoch: 19 Global Step: 333800 Fp16 Grad Scale: 16384 Required: 0 hours