Training: 2021-03-14 21:49:01,065-rank_id: 0 Training: 2021-03-14 21:49:25,242-softmax weight init successfully! Training: 2021-03-14 21:49:25,242-softmax weight mom init successfully! Training: 2021-03-14 21:49:25,245-Total Step is: 333821 Training: 2021-03-14 21:50:07,345-Reducer buckets have been rebuilt in this iteration. Training: 2021-03-14 21:50:35,088-Speed 4105.24 samples/sec Loss 46.9402 Epoch: 0 Global Step: 100 Fp16 Grad Scale: 256 Required: 30 hours Training: 2021-03-14 21:50:51,387-Speed 3141.38 samples/sec Loss 46.0908 Epoch: 0 Global Step: 150 Fp16 Grad Scale: 256 Required: 30 hours Training: 2021-03-14 21:51:02,912-Speed 4442.96 samples/sec Loss 45.1257 Epoch: 0 Global Step: 200 Fp16 Grad Scale: 512 Required: 28 hours Training: 2021-03-14 21:51:14,830-Speed 4295.93 samples/sec Loss 43.7563 Epoch: 0 Global Step: 250 Fp16 Grad Scale: 512 Required: 27 hours Training: 2021-03-14 21:51:26,605-Speed 4348.38 samples/sec Loss 42.6347 Epoch: 0 Global Step: 300 Fp16 Grad Scale: 1024 Required: 26 hours Training: 2021-03-14 21:51:39,074-Speed 4106.30 samples/sec Loss 41.1149 Epoch: 0 Global Step: 350 Fp16 Grad Scale: 1024 Required: 25 hours Training: 2021-03-14 21:51:55,013-Speed 3212.42 samples/sec Loss 40.3078 Epoch: 0 Global Step: 400 Fp16 Grad Scale: 2048 Required: 26 hours Training: 2021-03-14 21:52:06,593-Speed 4421.63 samples/sec Loss 39.5733 Epoch: 0 Global Step: 450 Fp16 Grad Scale: 2048 Required: 25 hours Training: 2021-03-14 21:52:18,288-Speed 4378.09 samples/sec Loss 38.9473 Epoch: 0 Global Step: 500 Fp16 Grad Scale: 4096 Required: 25 hours Training: 2021-03-14 21:52:29,813-Speed 4442.50 samples/sec Loss 38.5894 Epoch: 0 Global Step: 550 Fp16 Grad Scale: 4096 Required: 25 hours Training: 2021-03-14 21:52:41,337-Speed 4443.31 samples/sec Loss 38.2535 Epoch: 0 Global Step: 600 Fp16 Grad Scale: 8192 Required: 24 hours Training: 2021-03-14 21:52:53,887-Speed 4079.84 samples/sec Loss 37.9597 Epoch: 0 Global Step: 650 Fp16 Grad Scale: 8192 Required: 24 hours Training: 2021-03-14 21:53:05,425-Speed 4437.64 samples/sec Loss 37.7011 Epoch: 0 Global Step: 700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 21:53:17,055-Speed 4402.58 samples/sec Loss 37.3634 Epoch: 0 Global Step: 750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 21:53:28,626-Speed 4424.86 samples/sec Loss 37.0828 Epoch: 0 Global Step: 800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 21:53:40,234-Speed 4410.72 samples/sec Loss 36.8450 Epoch: 0 Global Step: 850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 21:53:51,735-Speed 4452.12 samples/sec Loss 36.5827 Epoch: 0 Global Step: 900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 21:54:03,278-Speed 4435.85 samples/sec Loss 36.3103 Epoch: 0 Global Step: 950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:54:14,653-Speed 4501.39 samples/sec Loss 36.0483 Epoch: 0 Global Step: 1000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:54:26,408-Speed 4355.53 samples/sec Loss 35.7620 Epoch: 0 Global Step: 1050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:54:37,917-Speed 4448.95 samples/sec Loss 35.4236 Epoch: 0 Global Step: 1100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:54:49,511-Speed 4416.14 samples/sec Loss 35.1475 Epoch: 0 Global Step: 1150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:55:01,012-Speed 4452.12 samples/sec Loss 34.9056 Epoch: 0 Global Step: 1200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:55:12,764-Speed 4357.08 samples/sec Loss 34.5845 Epoch: 0 Global Step: 1250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:55:24,248-Speed 4458.36 samples/sec Loss 34.3024 Epoch: 0 Global Step: 1300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:55:36,116-Speed 4314.40 samples/sec Loss 33.9949 Epoch: 0 Global Step: 1350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:55:47,714-Speed 4414.60 samples/sec Loss 33.6711 Epoch: 0 Global Step: 1400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:55:59,377-Speed 4390.22 samples/sec Loss 33.4009 Epoch: 0 Global Step: 1450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:56:11,007-Speed 4402.57 samples/sec Loss 33.0303 Epoch: 0 Global Step: 1500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:56:24,250-Speed 3866.31 samples/sec Loss 32.7610 Epoch: 0 Global Step: 1550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:56:39,212-Speed 3422.07 samples/sec Loss 32.4523 Epoch: 0 Global Step: 1600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:56:51,598-Speed 4133.89 samples/sec Loss 32.0828 Epoch: 0 Global Step: 1650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:57:03,085-Speed 4457.27 samples/sec Loss 31.6988 Epoch: 0 Global Step: 1700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:57:14,768-Speed 4382.65 samples/sec Loss 31.4211 Epoch: 0 Global Step: 1750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:57:26,532-Speed 4352.37 samples/sec Loss 31.0368 Epoch: 0 Global Step: 1800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:57:38,336-Speed 4337.96 samples/sec Loss 30.7352 Epoch: 0 Global Step: 1850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:57:50,105-Speed 4350.44 samples/sec Loss 30.3967 Epoch: 0 Global Step: 1900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:58:01,879-Speed 4348.83 samples/sec Loss 30.0589 Epoch: 0 Global Step: 1950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:58:13,420-Speed 4436.31 samples/sec Loss 29.7628 Epoch: 0 Global Step: 2000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-14 21:58:47,351-[lfw][2000]XNorm: 23.666658 Training: 2021-03-14 21:58:47,351-[lfw][2000]Accuracy-Flip: 0.95600+-0.00923 Training: 2021-03-14 21:58:47,352-[lfw][2000]Accuracy-Highest: 0.95600 Training: 2021-03-14 21:59:26,146-[cfp_fp][2000]XNorm: 21.440910 Training: 2021-03-14 21:59:26,147-[cfp_fp][2000]Accuracy-Flip: 0.77900+-0.01483 Training: 2021-03-14 21:59:26,147-[cfp_fp][2000]Accuracy-Highest: 0.77900 Training: 2021-03-14 21:59:59,279-[agedb_30][2000]XNorm: 22.786848 Training: 2021-03-14 21:59:59,279-[agedb_30][2000]Accuracy-Flip: 0.77867+-0.02038 Training: 2021-03-14 21:59:59,279-[agedb_30][2000]Accuracy-Highest: 0.77867 Training: 2021-03-14 22:00:11,066-Speed 435.21 samples/sec Loss 29.4135 Epoch: 0 Global Step: 2050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-14 22:00:23,586-Speed 4089.73 samples/sec Loss 29.0506 Epoch: 0 Global Step: 2100 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-14 22:00:35,230-Speed 4397.33 samples/sec Loss 28.7308 Epoch: 0 Global Step: 2150 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-14 22:00:46,848-Speed 4407.10 samples/sec Loss 28.4281 Epoch: 0 Global Step: 2200 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-14 22:00:58,385-Speed 4438.01 samples/sec Loss 28.0651 Epoch: 0 Global Step: 2250 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-14 22:01:10,207-Speed 4331.03 samples/sec Loss 27.7229 Epoch: 0 Global Step: 2300 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-14 22:01:21,756-Speed 4433.54 samples/sec Loss 27.3793 Epoch: 0 Global Step: 2350 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-14 22:01:33,467-Speed 4371.94 samples/sec Loss 27.0299 Epoch: 0 Global Step: 2400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:01:45,048-Speed 4421.53 samples/sec Loss 26.6991 Epoch: 0 Global Step: 2450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:01:56,496-Speed 4472.33 samples/sec Loss 26.4198 Epoch: 0 Global Step: 2500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:02:08,156-Speed 4391.44 samples/sec Loss 26.1538 Epoch: 0 Global Step: 2550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:02:19,954-Speed 4339.97 samples/sec Loss 25.8938 Epoch: 0 Global Step: 2600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:02:31,529-Speed 4423.40 samples/sec Loss 25.4207 Epoch: 0 Global Step: 2650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:02:43,266-Speed 4362.31 samples/sec Loss 25.2163 Epoch: 0 Global Step: 2700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:02:54,814-Speed 4433.90 samples/sec Loss 24.9470 Epoch: 0 Global Step: 2750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:03:06,527-Speed 4371.27 samples/sec Loss 24.6767 Epoch: 0 Global Step: 2800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:03:18,283-Speed 4355.67 samples/sec Loss 24.3035 Epoch: 0 Global Step: 2850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:03:29,781-Speed 4453.06 samples/sec Loss 24.0643 Epoch: 0 Global Step: 2900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:03:41,616-Speed 4326.34 samples/sec Loss 23.6893 Epoch: 0 Global Step: 2950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:03:53,261-Speed 4396.82 samples/sec Loss 23.3461 Epoch: 0 Global Step: 3000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:04:05,106-Speed 4322.92 samples/sec Loss 23.1870 Epoch: 0 Global Step: 3050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:04:16,695-Speed 4418.08 samples/sec Loss 22.8596 Epoch: 0 Global Step: 3100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:04:28,515-Speed 4331.70 samples/sec Loss 22.6227 Epoch: 0 Global Step: 3150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:04:40,349-Speed 4326.51 samples/sec Loss 22.3342 Epoch: 0 Global Step: 3200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:04:52,621-Speed 4172.61 samples/sec Loss 22.1885 Epoch: 0 Global Step: 3250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:05:04,142-Speed 4444.09 samples/sec Loss 21.8639 Epoch: 0 Global Step: 3300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:05:17,608-Speed 3802.24 samples/sec Loss 21.5979 Epoch: 0 Global Step: 3350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:05:29,158-Speed 4433.31 samples/sec Loss 21.3788 Epoch: 0 Global Step: 3400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:05:40,698-Speed 4436.84 samples/sec Loss 21.0219 Epoch: 0 Global Step: 3450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:05:54,344-Speed 3752.06 samples/sec Loss 20.8582 Epoch: 0 Global Step: 3500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:06:07,689-Speed 3836.87 samples/sec Loss 20.6734 Epoch: 0 Global Step: 3550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:06:19,373-Speed 4381.99 samples/sec Loss 20.3973 Epoch: 0 Global Step: 3600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:06:31,216-Speed 4323.48 samples/sec Loss 20.2522 Epoch: 0 Global Step: 3650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:06:42,841-Speed 4404.80 samples/sec Loss 19.9988 Epoch: 0 Global Step: 3700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:06:54,418-Speed 4422.75 samples/sec Loss 19.8128 Epoch: 0 Global Step: 3750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:07:05,849-Speed 4479.23 samples/sec Loss 19.7238 Epoch: 0 Global Step: 3800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:07:18,482-Speed 4052.85 samples/sec Loss 19.5215 Epoch: 0 Global Step: 3850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:07:30,101-Speed 4406.81 samples/sec Loss 19.2661 Epoch: 0 Global Step: 3900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:07:41,827-Speed 4366.51 samples/sec Loss 18.9888 Epoch: 0 Global Step: 3950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:07:53,342-Speed 4446.57 samples/sec Loss 18.8155 Epoch: 0 Global Step: 4000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:08:23,476-[lfw][4000]XNorm: 23.491670 Training: 2021-03-14 22:08:23,477-[lfw][4000]Accuracy-Flip: 0.98400+-0.00512 Training: 2021-03-14 22:08:23,477-[lfw][4000]Accuracy-Highest: 0.98400 Training: 2021-03-14 22:08:58,606-[cfp_fp][4000]XNorm: 20.943978 Training: 2021-03-14 22:08:58,606-[cfp_fp][4000]Accuracy-Flip: 0.86743+-0.01338 Training: 2021-03-14 22:08:58,606-[cfp_fp][4000]Accuracy-Highest: 0.86743 Training: 2021-03-14 22:09:28,818-[agedb_30][4000]XNorm: 22.980985 Training: 2021-03-14 22:09:28,818-[agedb_30][4000]Accuracy-Flip: 0.88533+-0.01955 Training: 2021-03-14 22:09:28,818-[agedb_30][4000]Accuracy-Highest: 0.88533 Training: 2021-03-14 22:09:40,489-Speed 477.85 samples/sec Loss 18.6604 Epoch: 0 Global Step: 4050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-14 22:09:52,091-Speed 4413.08 samples/sec Loss 18.3553 Epoch: 0 Global Step: 4100 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-14 22:10:03,670-Speed 4422.01 samples/sec Loss 18.2748 Epoch: 0 Global Step: 4150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:10:15,302-Speed 4401.67 samples/sec Loss 18.0551 Epoch: 0 Global Step: 4200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:10:27,162-Speed 4317.33 samples/sec Loss 17.9147 Epoch: 0 Global Step: 4250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:10:38,759-Speed 4415.02 samples/sec Loss 17.8733 Epoch: 0 Global Step: 4300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:10:50,274-Speed 4446.73 samples/sec Loss 17.6037 Epoch: 0 Global Step: 4350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:11:01,561-Speed 4536.34 samples/sec Loss 17.4294 Epoch: 0 Global Step: 4400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:11:13,457-Speed 4304.04 samples/sec Loss 17.3173 Epoch: 0 Global Step: 4450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:11:24,980-Speed 4443.63 samples/sec Loss 17.1452 Epoch: 0 Global Step: 4500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:11:36,832-Speed 4320.00 samples/sec Loss 17.0064 Epoch: 0 Global Step: 4550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:11:48,600-Speed 4351.02 samples/sec Loss 16.8215 Epoch: 0 Global Step: 4600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:12:00,446-Speed 4322.15 samples/sec Loss 16.6545 Epoch: 0 Global Step: 4650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:12:11,869-Speed 4482.39 samples/sec Loss 16.5880 Epoch: 0 Global Step: 4700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:12:23,532-Speed 4390.11 samples/sec Loss 16.4550 Epoch: 0 Global Step: 4750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:12:35,048-Speed 4446.40 samples/sec Loss 16.2438 Epoch: 0 Global Step: 4800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:12:46,621-Speed 4424.18 samples/sec Loss 16.0467 Epoch: 0 Global Step: 4850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:12:58,211-Speed 4417.67 samples/sec Loss 16.0366 Epoch: 0 Global Step: 4900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:13:10,062-Speed 4320.52 samples/sec Loss 16.0087 Epoch: 0 Global Step: 4950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:13:21,604-Speed 4436.38 samples/sec Loss 15.7011 Epoch: 0 Global Step: 5000 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:13:33,345-Speed 4360.89 samples/sec Loss 15.7178 Epoch: 0 Global Step: 5050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:13:44,909-Speed 4427.65 samples/sec Loss 15.5176 Epoch: 0 Global Step: 5100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:13:56,657-Speed 4358.30 samples/sec Loss 15.4009 Epoch: 0 Global Step: 5150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:14:08,932-Speed 4171.16 samples/sec Loss 15.3256 Epoch: 0 Global Step: 5200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:14:21,460-Speed 4087.20 samples/sec Loss 15.1859 Epoch: 0 Global Step: 5250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:14:32,750-Speed 4535.14 samples/sec Loss 15.0338 Epoch: 0 Global Step: 5300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:14:44,523-Speed 4348.86 samples/sec Loss 14.8872 Epoch: 0 Global Step: 5350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:14:56,121-Speed 4414.99 samples/sec Loss 14.7598 Epoch: 0 Global Step: 5400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:15:07,631-Speed 4448.24 samples/sec Loss 14.7595 Epoch: 0 Global Step: 5450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:15:21,259-Speed 3757.22 samples/sec Loss 14.6324 Epoch: 0 Global Step: 5500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:15:34,026-Speed 4010.54 samples/sec Loss 14.5215 Epoch: 0 Global Step: 5550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:15:46,352-Speed 4153.99 samples/sec Loss 14.4064 Epoch: 0 Global Step: 5600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:15:57,878-Speed 4442.47 samples/sec Loss 14.2640 Epoch: 0 Global Step: 5650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:16:09,519-Speed 4398.19 samples/sec Loss 14.2812 Epoch: 0 Global Step: 5700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:16:22,310-Speed 4003.07 samples/sec Loss 14.1104 Epoch: 0 Global Step: 5750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:16:34,117-Speed 4336.40 samples/sec Loss 14.0799 Epoch: 0 Global Step: 5800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:16:45,763-Speed 4396.66 samples/sec Loss 13.9548 Epoch: 0 Global Step: 5850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:16:58,254-Speed 4098.95 samples/sec Loss 13.9364 Epoch: 0 Global Step: 5900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:17:09,989-Speed 4363.31 samples/sec Loss 13.7556 Epoch: 0 Global Step: 5950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:17:21,849-Speed 4317.28 samples/sec Loss 13.6731 Epoch: 0 Global Step: 6000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:17:54,568-[lfw][6000]XNorm: 23.556458 Training: 2021-03-14 22:17:54,568-[lfw][6000]Accuracy-Flip: 0.99050+-0.00538 Training: 2021-03-14 22:17:54,568-[lfw][6000]Accuracy-Highest: 0.99050 Training: 2021-03-14 22:18:30,860-[cfp_fp][6000]XNorm: 21.159959 Training: 2021-03-14 22:18:30,861-[cfp_fp][6000]Accuracy-Flip: 0.91314+-0.00903 Training: 2021-03-14 22:18:30,861-[cfp_fp][6000]Accuracy-Highest: 0.91314 Training: 2021-03-14 22:19:01,177-[agedb_30][6000]XNorm: 23.019043 Training: 2021-03-14 22:19:01,177-[agedb_30][6000]Accuracy-Flip: 0.91300+-0.01648 Training: 2021-03-14 22:19:01,177-[agedb_30][6000]Accuracy-Highest: 0.91300 Training: 2021-03-14 22:19:12,892-Speed 461.08 samples/sec Loss 13.5900 Epoch: 0 Global Step: 6050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:19:24,829-Speed 4289.38 samples/sec Loss 13.5387 Epoch: 0 Global Step: 6100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:19:36,544-Speed 4370.52 samples/sec Loss 13.4088 Epoch: 0 Global Step: 6150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:19:47,811-Speed 4544.51 samples/sec Loss 13.2958 Epoch: 0 Global Step: 6200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:19:59,693-Speed 4309.42 samples/sec Loss 13.3066 Epoch: 0 Global Step: 6250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:20:11,034-Speed 4514.41 samples/sec Loss 13.1100 Epoch: 0 Global Step: 6300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:20:22,611-Speed 4423.01 samples/sec Loss 13.1225 Epoch: 0 Global Step: 6350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:20:34,028-Speed 4484.46 samples/sec Loss 12.9992 Epoch: 0 Global Step: 6400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:20:45,622-Speed 4416.51 samples/sec Loss 12.9198 Epoch: 0 Global Step: 6450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:20:57,188-Speed 4426.95 samples/sec Loss 12.8589 Epoch: 0 Global Step: 6500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:21:08,848-Speed 4391.01 samples/sec Loss 12.7578 Epoch: 0 Global Step: 6550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:21:20,436-Speed 4418.77 samples/sec Loss 12.6731 Epoch: 0 Global Step: 6600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:21:32,033-Speed 4415.02 samples/sec Loss 12.5921 Epoch: 0 Global Step: 6650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:21:43,870-Speed 4325.80 samples/sec Loss 12.6122 Epoch: 0 Global Step: 6700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:21:55,551-Speed 4383.04 samples/sec Loss 12.4992 Epoch: 0 Global Step: 6750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:22:07,309-Speed 4354.94 samples/sec Loss 12.3261 Epoch: 0 Global Step: 6800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:22:18,972-Speed 4390.03 samples/sec Loss 12.3199 Epoch: 0 Global Step: 6850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:22:30,596-Speed 4404.73 samples/sec Loss 12.2930 Epoch: 0 Global Step: 6900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:22:42,254-Speed 4392.12 samples/sec Loss 12.1732 Epoch: 0 Global Step: 6950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:22:54,130-Speed 4311.43 samples/sec Loss 12.2493 Epoch: 0 Global Step: 7000 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:23:05,796-Speed 4388.76 samples/sec Loss 12.1309 Epoch: 0 Global Step: 7050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:23:17,361-Speed 4427.52 samples/sec Loss 12.0696 Epoch: 0 Global Step: 7100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:23:29,027-Speed 4388.97 samples/sec Loss 11.9744 Epoch: 0 Global Step: 7150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:23:41,437-Speed 4126.05 samples/sec Loss 11.8803 Epoch: 0 Global Step: 7200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:23:53,004-Speed 4426.50 samples/sec Loss 11.8622 Epoch: 0 Global Step: 7250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:24:04,905-Speed 4302.31 samples/sec Loss 11.8296 Epoch: 0 Global Step: 7300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:24:17,269-Speed 4140.99 samples/sec Loss 11.7534 Epoch: 0 Global Step: 7350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:24:28,892-Speed 4405.30 samples/sec Loss 11.7301 Epoch: 0 Global Step: 7400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:24:40,620-Speed 4365.84 samples/sec Loss 11.6375 Epoch: 0 Global Step: 7450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:24:52,354-Speed 4363.74 samples/sec Loss 11.6374 Epoch: 0 Global Step: 7500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:25:04,124-Speed 4350.20 samples/sec Loss 11.4875 Epoch: 0 Global Step: 7550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:25:16,604-Speed 4102.72 samples/sec Loss 11.5328 Epoch: 0 Global Step: 7600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:25:28,341-Speed 4362.20 samples/sec Loss 11.3429 Epoch: 0 Global Step: 7650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:25:40,934-Speed 4066.13 samples/sec Loss 11.3790 Epoch: 0 Global Step: 7700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:25:53,481-Speed 4080.82 samples/sec Loss 11.3894 Epoch: 0 Global Step: 7750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:26:05,243-Speed 4353.19 samples/sec Loss 11.2957 Epoch: 0 Global Step: 7800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:26:17,554-Speed 4159.01 samples/sec Loss 11.1670 Epoch: 0 Global Step: 7850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:26:30,075-Speed 4089.15 samples/sec Loss 11.2330 Epoch: 0 Global Step: 7900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:26:41,691-Speed 4407.87 samples/sec Loss 11.1012 Epoch: 0 Global Step: 7950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:26:54,348-Speed 4045.33 samples/sec Loss 11.0318 Epoch: 0 Global Step: 8000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:27:24,660-[lfw][8000]XNorm: 22.530201 Training: 2021-03-14 22:27:24,661-[lfw][8000]Accuracy-Flip: 0.99367+-0.00364 Training: 2021-03-14 22:27:24,661-[lfw][8000]Accuracy-Highest: 0.99367 Training: 2021-03-14 22:27:59,780-[cfp_fp][8000]XNorm: 19.564681 Training: 2021-03-14 22:27:59,780-[cfp_fp][8000]Accuracy-Flip: 0.90871+-0.01126 Training: 2021-03-14 22:27:59,780-[cfp_fp][8000]Accuracy-Highest: 0.91314 Training: 2021-03-14 22:28:29,840-[agedb_30][8000]XNorm: 21.599426 Training: 2021-03-14 22:28:29,840-[agedb_30][8000]Accuracy-Flip: 0.93267+-0.01294 Training: 2021-03-14 22:28:29,840-[agedb_30][8000]Accuracy-Highest: 0.93267 Training: 2021-03-14 22:28:41,333-Speed 478.58 samples/sec Loss 10.9242 Epoch: 0 Global Step: 8050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:28:52,918-Speed 4419.48 samples/sec Loss 11.0650 Epoch: 0 Global Step: 8100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:29:04,683-Speed 4352.27 samples/sec Loss 11.0065 Epoch: 0 Global Step: 8150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:29:16,651-Speed 4277.96 samples/sec Loss 10.8969 Epoch: 0 Global Step: 8200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:29:28,129-Speed 4460.92 samples/sec Loss 11.0097 Epoch: 0 Global Step: 8250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:29:39,796-Speed 4388.65 samples/sec Loss 10.8401 Epoch: 0 Global Step: 8300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:29:51,419-Speed 4405.33 samples/sec Loss 10.7716 Epoch: 0 Global Step: 8350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:30:02,937-Speed 4445.47 samples/sec Loss 10.7482 Epoch: 0 Global Step: 8400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:30:14,422-Speed 4458.04 samples/sec Loss 10.7057 Epoch: 0 Global Step: 8450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:30:26,022-Speed 4413.94 samples/sec Loss 10.7186 Epoch: 0 Global Step: 8500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:30:37,733-Speed 4372.14 samples/sec Loss 10.6024 Epoch: 0 Global Step: 8550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:30:49,837-Speed 4230.38 samples/sec Loss 10.5944 Epoch: 0 Global Step: 8600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:31:01,702-Speed 4315.10 samples/sec Loss 10.5923 Epoch: 0 Global Step: 8650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:31:13,383-Speed 4383.31 samples/sec Loss 10.5308 Epoch: 0 Global Step: 8700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:31:25,164-Speed 4346.18 samples/sec Loss 10.4745 Epoch: 0 Global Step: 8750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:31:36,806-Speed 4398.23 samples/sec Loss 10.4386 Epoch: 0 Global Step: 8800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:31:48,582-Speed 4348.07 samples/sec Loss 10.3977 Epoch: 0 Global Step: 8850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:32:00,399-Speed 4332.70 samples/sec Loss 10.3394 Epoch: 0 Global Step: 8900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:32:11,966-Speed 4426.57 samples/sec Loss 10.4327 Epoch: 0 Global Step: 8950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:32:23,821-Speed 4319.00 samples/sec Loss 10.3666 Epoch: 0 Global Step: 9000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:32:35,469-Speed 4396.02 samples/sec Loss 10.3138 Epoch: 0 Global Step: 9050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:32:47,195-Speed 4366.42 samples/sec Loss 10.3088 Epoch: 0 Global Step: 9100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:32:58,599-Speed 4489.95 samples/sec Loss 10.1784 Epoch: 0 Global Step: 9150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:33:10,251-Speed 4394.25 samples/sec Loss 10.2576 Epoch: 0 Global Step: 9200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:33:21,750-Speed 4452.50 samples/sec Loss 10.1863 Epoch: 0 Global Step: 9250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:33:33,257-Speed 4449.79 samples/sec Loss 10.1538 Epoch: 0 Global Step: 9300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:33:45,569-Speed 4158.88 samples/sec Loss 10.0909 Epoch: 0 Global Step: 9350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:33:58,049-Speed 4102.65 samples/sec Loss 10.1437 Epoch: 0 Global Step: 9400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:34:09,405-Speed 4508.80 samples/sec Loss 10.0061 Epoch: 0 Global Step: 9450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:34:21,085-Speed 4383.77 samples/sec Loss 10.0175 Epoch: 0 Global Step: 9500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:34:32,930-Speed 4322.69 samples/sec Loss 9.9887 Epoch: 0 Global Step: 9550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:34:44,823-Speed 4305.26 samples/sec Loss 9.9478 Epoch: 0 Global Step: 9600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:34:56,424-Speed 4413.35 samples/sec Loss 9.9170 Epoch: 0 Global Step: 9650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:35:08,003-Speed 4422.06 samples/sec Loss 9.8771 Epoch: 0 Global Step: 9700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:35:20,429-Speed 4120.69 samples/sec Loss 9.9253 Epoch: 0 Global Step: 9750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:35:32,229-Speed 4339.05 samples/sec Loss 9.8558 Epoch: 0 Global Step: 9800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:35:43,649-Speed 4483.66 samples/sec Loss 9.8165 Epoch: 0 Global Step: 9850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:35:56,699-Speed 3923.38 samples/sec Loss 9.7872 Epoch: 0 Global Step: 9900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:36:08,232-Speed 4439.67 samples/sec Loss 9.7304 Epoch: 0 Global Step: 9950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:36:19,808-Speed 4422.99 samples/sec Loss 9.7051 Epoch: 0 Global Step: 10000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:36:49,974-[lfw][10000]XNorm: 21.414020 Training: 2021-03-14 22:36:49,974-[lfw][10000]Accuracy-Flip: 0.99217+-0.00489 Training: 2021-03-14 22:36:49,974-[lfw][10000]Accuracy-Highest: 0.99367 Training: 2021-03-14 22:37:25,082-[cfp_fp][10000]XNorm: 18.805220 Training: 2021-03-14 22:37:25,082-[cfp_fp][10000]Accuracy-Flip: 0.92900+-0.01217 Training: 2021-03-14 22:37:25,082-[cfp_fp][10000]Accuracy-Highest: 0.92900 Training: 2021-03-14 22:37:55,348-[agedb_30][10000]XNorm: 21.069462 Training: 2021-03-14 22:37:55,348-[agedb_30][10000]Accuracy-Flip: 0.92917+-0.01535 Training: 2021-03-14 22:37:55,348-[agedb_30][10000]Accuracy-Highest: 0.93267 Training: 2021-03-14 22:38:07,693-Speed 474.58 samples/sec Loss 9.6779 Epoch: 0 Global Step: 10050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:38:19,319-Speed 4404.08 samples/sec Loss 9.6959 Epoch: 0 Global Step: 10100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:38:31,065-Speed 4359.03 samples/sec Loss 9.6720 Epoch: 0 Global Step: 10150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:38:42,922-Speed 4318.28 samples/sec Loss 9.5911 Epoch: 0 Global Step: 10200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:38:55,556-Speed 4052.79 samples/sec Loss 9.6159 Epoch: 0 Global Step: 10250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:39:07,310-Speed 4356.14 samples/sec Loss 9.5895 Epoch: 0 Global Step: 10300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:39:19,872-Speed 4075.74 samples/sec Loss 9.5243 Epoch: 0 Global Step: 10350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:39:31,495-Speed 4405.26 samples/sec Loss 9.5360 Epoch: 0 Global Step: 10400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:39:43,061-Speed 4427.11 samples/sec Loss 9.4974 Epoch: 0 Global Step: 10450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:39:54,582-Speed 4444.07 samples/sec Loss 9.4932 Epoch: 0 Global Step: 10500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:40:06,203-Speed 4405.93 samples/sec Loss 9.4510 Epoch: 0 Global Step: 10550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:40:18,040-Speed 4325.55 samples/sec Loss 9.4869 Epoch: 0 Global Step: 10600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:40:29,659-Speed 4407.10 samples/sec Loss 9.3677 Epoch: 0 Global Step: 10650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:40:41,263-Speed 4412.37 samples/sec Loss 9.3536 Epoch: 0 Global Step: 10700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:40:52,830-Speed 4426.32 samples/sec Loss 9.3634 Epoch: 0 Global Step: 10750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:41:04,570-Speed 4361.40 samples/sec Loss 9.3416 Epoch: 0 Global Step: 10800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:41:16,254-Speed 4382.18 samples/sec Loss 9.3181 Epoch: 0 Global Step: 10850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:41:28,123-Speed 4314.06 samples/sec Loss 9.2761 Epoch: 0 Global Step: 10900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:41:39,976-Speed 4319.85 samples/sec Loss 9.2925 Epoch: 0 Global Step: 10950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:41:51,361-Speed 4497.12 samples/sec Loss 9.2216 Epoch: 0 Global Step: 11000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:42:03,270-Speed 4299.55 samples/sec Loss 9.3199 Epoch: 0 Global Step: 11050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:42:14,792-Speed 4443.75 samples/sec Loss 9.1445 Epoch: 0 Global Step: 11100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:42:26,629-Speed 4325.80 samples/sec Loss 9.2297 Epoch: 0 Global Step: 11150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:42:38,282-Speed 4393.63 samples/sec Loss 9.2572 Epoch: 0 Global Step: 11200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:42:50,104-Speed 4331.17 samples/sec Loss 9.1663 Epoch: 0 Global Step: 11250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:43:01,708-Speed 4412.32 samples/sec Loss 9.1535 Epoch: 0 Global Step: 11300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:43:13,377-Speed 4387.93 samples/sec Loss 9.1581 Epoch: 0 Global Step: 11350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:43:24,602-Speed 4561.57 samples/sec Loss 9.1304 Epoch: 0 Global Step: 11400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:43:36,684-Speed 4237.80 samples/sec Loss 9.0542 Epoch: 0 Global Step: 11450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:43:48,853-Speed 4207.82 samples/sec Loss 9.1114 Epoch: 0 Global Step: 11500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:44:01,255-Speed 4128.25 samples/sec Loss 9.0648 Epoch: 0 Global Step: 11550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:44:12,820-Speed 4427.51 samples/sec Loss 9.0816 Epoch: 0 Global Step: 11600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:44:24,426-Speed 4411.67 samples/sec Loss 8.9962 Epoch: 0 Global Step: 11650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:44:36,005-Speed 4421.89 samples/sec Loss 9.0678 Epoch: 0 Global Step: 11700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:44:47,667-Speed 4390.60 samples/sec Loss 9.0241 Epoch: 0 Global Step: 11750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:44:59,194-Speed 4441.79 samples/sec Loss 9.0097 Epoch: 0 Global Step: 11800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:45:10,950-Speed 4355.68 samples/sec Loss 8.9133 Epoch: 0 Global Step: 11850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:45:22,614-Speed 4389.53 samples/sec Loss 8.8657 Epoch: 0 Global Step: 11900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:45:35,348-Speed 4021.03 samples/sec Loss 8.9659 Epoch: 0 Global Step: 11950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:45:47,318-Speed 4277.51 samples/sec Loss 8.8919 Epoch: 0 Global Step: 12000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:46:17,447-[lfw][12000]XNorm: 21.350232 Training: 2021-03-14 22:46:17,447-[lfw][12000]Accuracy-Flip: 0.99417+-0.00389 Training: 2021-03-14 22:46:17,448-[lfw][12000]Accuracy-Highest: 0.99417 Training: 2021-03-14 22:46:52,612-[cfp_fp][12000]XNorm: 18.579650 Training: 2021-03-14 22:46:52,612-[cfp_fp][12000]Accuracy-Flip: 0.91743+-0.01583 Training: 2021-03-14 22:46:52,612-[cfp_fp][12000]Accuracy-Highest: 0.92900 Training: 2021-03-14 22:47:22,712-[agedb_30][12000]XNorm: 20.561791 Training: 2021-03-14 22:47:22,713-[agedb_30][12000]Accuracy-Flip: 0.94317+-0.01212 Training: 2021-03-14 22:47:22,713-[agedb_30][12000]Accuracy-Highest: 0.94317 Training: 2021-03-14 22:47:34,416-Speed 478.07 samples/sec Loss 8.8769 Epoch: 0 Global Step: 12050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:47:46,111-Speed 4378.15 samples/sec Loss 8.8847 Epoch: 0 Global Step: 12100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:47:59,395-Speed 3854.41 samples/sec Loss 8.8345 Epoch: 0 Global Step: 12150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:48:12,015-Speed 4057.10 samples/sec Loss 8.8648 Epoch: 0 Global Step: 12200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:48:23,853-Speed 4325.38 samples/sec Loss 8.8841 Epoch: 0 Global Step: 12250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:48:35,445-Speed 4416.80 samples/sec Loss 8.8079 Epoch: 0 Global Step: 12300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-14 22:48:47,223-Speed 4347.30 samples/sec Loss 8.8182 Epoch: 0 Global Step: 12350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:48:58,856-Speed 4401.54 samples/sec Loss 8.7359 Epoch: 0 Global Step: 12400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:49:10,620-Speed 4352.52 samples/sec Loss 8.7657 Epoch: 0 Global Step: 12450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:49:21,969-Speed 4511.49 samples/sec Loss 8.7642 Epoch: 0 Global Step: 12500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:49:34,426-Speed 4110.22 samples/sec Loss 8.7154 Epoch: 0 Global Step: 12550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:49:45,875-Speed 4472.31 samples/sec Loss 8.6854 Epoch: 0 Global Step: 12600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:49:57,553-Speed 4384.57 samples/sec Loss 8.6711 Epoch: 0 Global Step: 12650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:50:09,112-Speed 4429.63 samples/sec Loss 8.6620 Epoch: 0 Global Step: 12700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:50:20,885-Speed 4348.91 samples/sec Loss 8.7384 Epoch: 0 Global Step: 12750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:50:32,588-Speed 4375.27 samples/sec Loss 8.7240 Epoch: 0 Global Step: 12800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:50:45,212-Speed 4055.78 samples/sec Loss 8.6624 Epoch: 0 Global Step: 12850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:50:56,750-Speed 4437.83 samples/sec Loss 8.6743 Epoch: 0 Global Step: 12900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:51:08,532-Speed 4345.87 samples/sec Loss 8.6746 Epoch: 0 Global Step: 12950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:51:20,237-Speed 4374.43 samples/sec Loss 8.6089 Epoch: 0 Global Step: 13000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:51:31,744-Speed 4449.45 samples/sec Loss 8.5841 Epoch: 0 Global Step: 13050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:51:43,448-Speed 4374.97 samples/sec Loss 8.5346 Epoch: 0 Global Step: 13100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:51:55,252-Speed 4337.44 samples/sec Loss 8.5994 Epoch: 0 Global Step: 13150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:52:07,106-Speed 4319.36 samples/sec Loss 8.4918 Epoch: 0 Global Step: 13200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:52:19,040-Speed 4290.48 samples/sec Loss 8.5655 Epoch: 0 Global Step: 13250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:52:30,631-Speed 4417.55 samples/sec Loss 8.5473 Epoch: 0 Global Step: 13300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:52:42,241-Speed 4409.95 samples/sec Loss 8.5589 Epoch: 0 Global Step: 13350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:52:54,164-Speed 4294.48 samples/sec Loss 8.4723 Epoch: 0 Global Step: 13400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:53:05,656-Speed 4455.30 samples/sec Loss 8.4416 Epoch: 0 Global Step: 13450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:53:17,387-Speed 4365.01 samples/sec Loss 8.4610 Epoch: 0 Global Step: 13500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:53:29,011-Speed 4404.73 samples/sec Loss 8.4478 Epoch: 0 Global Step: 13550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:53:42,308-Speed 3850.73 samples/sec Loss 8.4963 Epoch: 0 Global Step: 13600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:53:54,050-Speed 4360.51 samples/sec Loss 8.4231 Epoch: 0 Global Step: 13650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:54:05,827-Speed 4347.61 samples/sec Loss 8.4400 Epoch: 0 Global Step: 13700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:54:17,504-Speed 4384.76 samples/sec Loss 8.4466 Epoch: 0 Global Step: 13750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:54:29,346-Speed 4323.68 samples/sec Loss 8.4691 Epoch: 0 Global Step: 13800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:54:41,052-Speed 4374.23 samples/sec Loss 8.4256 Epoch: 0 Global Step: 13850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:54:52,628-Speed 4422.89 samples/sec Loss 8.4580 Epoch: 0 Global Step: 13900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:55:04,174-Speed 4434.64 samples/sec Loss 8.3928 Epoch: 0 Global Step: 13950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:55:16,057-Speed 4309.12 samples/sec Loss 8.2898 Epoch: 0 Global Step: 14000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:55:46,235-[lfw][14000]XNorm: 19.349468 Training: 2021-03-14 22:55:46,235-[lfw][14000]Accuracy-Flip: 0.99450+-0.00325 Training: 2021-03-14 22:55:46,235-[lfw][14000]Accuracy-Highest: 0.99450 Training: 2021-03-14 22:56:21,126-[cfp_fp][14000]XNorm: 16.728640 Training: 2021-03-14 22:56:21,126-[cfp_fp][14000]Accuracy-Flip: 0.92143+-0.00986 Training: 2021-03-14 22:56:21,126-[cfp_fp][14000]Accuracy-Highest: 0.92900 Training: 2021-03-14 22:56:51,283-[agedb_30][14000]XNorm: 18.840586 Training: 2021-03-14 22:56:51,283-[agedb_30][14000]Accuracy-Flip: 0.93817+-0.01603 Training: 2021-03-14 22:56:51,283-[agedb_30][14000]Accuracy-Highest: 0.94317 Training: 2021-03-14 22:57:02,870-Speed 479.34 samples/sec Loss 8.3928 Epoch: 0 Global Step: 14050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:57:14,599-Speed 4365.47 samples/sec Loss 8.3248 Epoch: 0 Global Step: 14100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:57:26,120-Speed 4444.13 samples/sec Loss 8.3839 Epoch: 0 Global Step: 14150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:57:37,929-Speed 4335.92 samples/sec Loss 8.3778 Epoch: 0 Global Step: 14200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:57:49,597-Speed 4388.18 samples/sec Loss 8.2943 Epoch: 0 Global Step: 14250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:58:02,281-Speed 4036.68 samples/sec Loss 8.2297 Epoch: 0 Global Step: 14300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:58:14,672-Speed 4132.10 samples/sec Loss 8.2843 Epoch: 0 Global Step: 14350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:58:26,596-Speed 4294.10 samples/sec Loss 8.3543 Epoch: 0 Global Step: 14400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:58:38,410-Speed 4334.12 samples/sec Loss 8.2615 Epoch: 0 Global Step: 14450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:58:51,666-Speed 3862.40 samples/sec Loss 8.3415 Epoch: 0 Global Step: 14500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:59:03,085-Speed 4484.19 samples/sec Loss 8.2906 Epoch: 0 Global Step: 14550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:59:15,061-Speed 4275.30 samples/sec Loss 8.2235 Epoch: 0 Global Step: 14600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:59:26,808-Speed 4358.60 samples/sec Loss 8.2398 Epoch: 0 Global Step: 14650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:59:38,625-Speed 4333.17 samples/sec Loss 8.2328 Epoch: 0 Global Step: 14700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 22:59:50,098-Speed 4462.85 samples/sec Loss 8.2541 Epoch: 0 Global Step: 14750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:00:01,981-Speed 4308.71 samples/sec Loss 8.2186 Epoch: 0 Global Step: 14800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:00:13,632-Speed 4394.69 samples/sec Loss 8.2261 Epoch: 0 Global Step: 14850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:00:25,187-Speed 4431.07 samples/sec Loss 8.2668 Epoch: 0 Global Step: 14900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:00:36,804-Speed 4407.67 samples/sec Loss 8.1644 Epoch: 0 Global Step: 14950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:00:49,531-Speed 4022.96 samples/sec Loss 8.1888 Epoch: 0 Global Step: 15000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:01:01,068-Speed 4438.26 samples/sec Loss 8.2183 Epoch: 0 Global Step: 15050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:01:12,923-Speed 4319.01 samples/sec Loss 8.1899 Epoch: 0 Global Step: 15100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:01:24,496-Speed 4424.39 samples/sec Loss 8.0994 Epoch: 0 Global Step: 15150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:01:36,335-Speed 4324.86 samples/sec Loss 8.0848 Epoch: 0 Global Step: 15200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:01:48,612-Speed 4170.49 samples/sec Loss 8.1181 Epoch: 0 Global Step: 15250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:02:00,383-Speed 4349.89 samples/sec Loss 8.0904 Epoch: 0 Global Step: 15300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:02:11,994-Speed 4409.78 samples/sec Loss 8.1307 Epoch: 0 Global Step: 15350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:02:23,668-Speed 4386.04 samples/sec Loss 8.0765 Epoch: 0 Global Step: 15400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:02:35,317-Speed 4395.15 samples/sec Loss 8.0732 Epoch: 0 Global Step: 15450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:02:46,905-Speed 4418.91 samples/sec Loss 8.0736 Epoch: 0 Global Step: 15500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:02:58,722-Speed 4332.73 samples/sec Loss 8.0035 Epoch: 0 Global Step: 15550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:03:10,594-Speed 4312.86 samples/sec Loss 8.0598 Epoch: 0 Global Step: 15600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:03:22,160-Speed 4426.99 samples/sec Loss 8.0646 Epoch: 0 Global Step: 15650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:03:33,916-Speed 4355.25 samples/sec Loss 8.0469 Epoch: 0 Global Step: 15700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:03:46,407-Speed 4099.07 samples/sec Loss 8.0286 Epoch: 0 Global Step: 15750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:03:58,228-Speed 4331.68 samples/sec Loss 8.0009 Epoch: 0 Global Step: 15800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:04:10,525-Speed 4163.70 samples/sec Loss 7.9812 Epoch: 0 Global Step: 15850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:04:22,241-Speed 4370.20 samples/sec Loss 8.0265 Epoch: 0 Global Step: 15900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:04:34,052-Speed 4335.27 samples/sec Loss 7.9927 Epoch: 0 Global Step: 15950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:04:45,934-Speed 4308.99 samples/sec Loss 7.9936 Epoch: 0 Global Step: 16000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:05:16,315-[lfw][16000]XNorm: 22.260423 Training: 2021-03-14 23:05:16,315-[lfw][16000]Accuracy-Flip: 0.99500+-0.00333 Training: 2021-03-14 23:05:16,315-[lfw][16000]Accuracy-Highest: 0.99500 Training: 2021-03-14 23:05:51,572-[cfp_fp][16000]XNorm: 19.103679 Training: 2021-03-14 23:05:51,573-[cfp_fp][16000]Accuracy-Flip: 0.92943+-0.01229 Training: 2021-03-14 23:05:51,573-[cfp_fp][16000]Accuracy-Highest: 0.92943 Training: 2021-03-14 23:06:21,934-[agedb_30][16000]XNorm: 21.822217 Training: 2021-03-14 23:06:21,935-[agedb_30][16000]Accuracy-Flip: 0.94983+-0.01168 Training: 2021-03-14 23:06:21,935-[agedb_30][16000]Accuracy-Highest: 0.94983 Training: 2021-03-14 23:06:33,575-Speed 475.66 samples/sec Loss 7.9672 Epoch: 0 Global Step: 16050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:06:45,489-Speed 4297.69 samples/sec Loss 7.9844 Epoch: 0 Global Step: 16100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:06:57,206-Speed 4370.00 samples/sec Loss 7.9810 Epoch: 0 Global Step: 16150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:07:08,956-Speed 4357.46 samples/sec Loss 7.9699 Epoch: 0 Global Step: 16200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:07:20,668-Speed 4371.95 samples/sec Loss 7.9628 Epoch: 0 Global Step: 16250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:07:32,420-Speed 4357.01 samples/sec Loss 7.9589 Epoch: 0 Global Step: 16300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:07:44,088-Speed 4388.09 samples/sec Loss 7.9085 Epoch: 0 Global Step: 16350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:07:55,850-Speed 4353.31 samples/sec Loss 7.8804 Epoch: 0 Global Step: 16400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:08:07,638-Speed 4343.59 samples/sec Loss 7.9179 Epoch: 0 Global Step: 16450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:08:19,057-Speed 4483.68 samples/sec Loss 7.9685 Epoch: 0 Global Step: 16500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:08:31,723-Speed 4042.53 samples/sec Loss 7.9591 Epoch: 0 Global Step: 16550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:08:44,115-Speed 4131.71 samples/sec Loss 7.8783 Epoch: 0 Global Step: 16600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:08:55,361-Speed 4552.99 samples/sec Loss 7.9029 Epoch: 0 Global Step: 16650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:09:11,909-Speed 3094.26 samples/sec Loss 7.7799 Epoch: 1 Global Step: 16700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:09:23,754-Speed 4322.73 samples/sec Loss 7.1654 Epoch: 1 Global Step: 16750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:09:37,186-Speed 3811.96 samples/sec Loss 7.1752 Epoch: 1 Global Step: 16800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:09:48,920-Speed 4363.50 samples/sec Loss 7.1873 Epoch: 1 Global Step: 16850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:10:01,690-Speed 4009.66 samples/sec Loss 7.2244 Epoch: 1 Global Step: 16900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:10:13,439-Speed 4357.89 samples/sec Loss 7.2306 Epoch: 1 Global Step: 16950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:10:25,254-Speed 4333.68 samples/sec Loss 7.2428 Epoch: 1 Global Step: 17000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:10:37,191-Speed 4289.30 samples/sec Loss 7.2228 Epoch: 1 Global Step: 17050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:10:48,992-Speed 4338.80 samples/sec Loss 7.2510 Epoch: 1 Global Step: 17100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:11:00,665-Speed 4386.45 samples/sec Loss 7.2990 Epoch: 1 Global Step: 17150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:11:12,646-Speed 4273.70 samples/sec Loss 7.2869 Epoch: 1 Global Step: 17200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:11:24,328-Speed 4382.94 samples/sec Loss 7.2666 Epoch: 1 Global Step: 17250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:11:36,272-Speed 4286.77 samples/sec Loss 7.3232 Epoch: 1 Global Step: 17300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:11:48,092-Speed 4332.06 samples/sec Loss 7.4036 Epoch: 1 Global Step: 17350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:12:00,069-Speed 4274.86 samples/sec Loss 7.3329 Epoch: 1 Global Step: 17400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:12:12,703-Speed 4052.75 samples/sec Loss 7.3029 Epoch: 1 Global Step: 17450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:12:24,486-Speed 4345.57 samples/sec Loss 7.4132 Epoch: 1 Global Step: 17500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:12:36,184-Speed 4376.82 samples/sec Loss 7.3756 Epoch: 1 Global Step: 17550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:12:47,837-Speed 4393.90 samples/sec Loss 7.3435 Epoch: 1 Global Step: 17600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:13:00,314-Speed 4103.74 samples/sec Loss 7.3616 Epoch: 1 Global Step: 17650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:13:12,084-Speed 4350.37 samples/sec Loss 7.3646 Epoch: 1 Global Step: 17700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:13:23,825-Speed 4360.76 samples/sec Loss 7.3221 Epoch: 1 Global Step: 17750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:13:35,529-Speed 4374.64 samples/sec Loss 7.3212 Epoch: 1 Global Step: 17800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:13:47,110-Speed 4421.18 samples/sec Loss 7.3930 Epoch: 1 Global Step: 17850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:13:58,846-Speed 4362.87 samples/sec Loss 7.3586 Epoch: 1 Global Step: 17900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:14:10,492-Speed 4396.47 samples/sec Loss 7.3341 Epoch: 1 Global Step: 17950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:14:23,662-Speed 3887.76 samples/sec Loss 7.3299 Epoch: 1 Global Step: 18000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:14:53,875-[lfw][18000]XNorm: 23.164828 Training: 2021-03-14 23:14:53,876-[lfw][18000]Accuracy-Flip: 0.99517+-0.00263 Training: 2021-03-14 23:14:53,876-[lfw][18000]Accuracy-Highest: 0.99517 Training: 2021-03-14 23:15:28,973-[cfp_fp][18000]XNorm: 19.788065 Training: 2021-03-14 23:15:28,973-[cfp_fp][18000]Accuracy-Flip: 0.93514+-0.01268 Training: 2021-03-14 23:15:28,973-[cfp_fp][18000]Accuracy-Highest: 0.93514 Training: 2021-03-14 23:15:59,240-[agedb_30][18000]XNorm: 22.557457 Training: 2021-03-14 23:15:59,240-[agedb_30][18000]Accuracy-Flip: 0.94383+-0.00966 Training: 2021-03-14 23:15:59,240-[agedb_30][18000]Accuracy-Highest: 0.94983 Training: 2021-03-14 23:16:10,772-Speed 478.02 samples/sec Loss 7.3823 Epoch: 1 Global Step: 18050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:16:22,431-Speed 4391.47 samples/sec Loss 7.4013 Epoch: 1 Global Step: 18100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:16:34,069-Speed 4399.64 samples/sec Loss 7.4394 Epoch: 1 Global Step: 18150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:16:45,824-Speed 4355.87 samples/sec Loss 7.3434 Epoch: 1 Global Step: 18200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:16:57,291-Speed 4465.24 samples/sec Loss 7.3974 Epoch: 1 Global Step: 18250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:17:08,874-Speed 4420.35 samples/sec Loss 7.3635 Epoch: 1 Global Step: 18300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:17:20,415-Speed 4436.27 samples/sec Loss 7.3840 Epoch: 1 Global Step: 18350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:17:32,108-Speed 4378.91 samples/sec Loss 7.4097 Epoch: 1 Global Step: 18400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:17:43,832-Speed 4367.41 samples/sec Loss 7.3497 Epoch: 1 Global Step: 18450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:17:55,471-Speed 4399.11 samples/sec Loss 7.3903 Epoch: 1 Global Step: 18500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:18:07,014-Speed 4435.77 samples/sec Loss 7.3891 Epoch: 1 Global Step: 18550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:18:18,627-Speed 4408.93 samples/sec Loss 7.4403 Epoch: 1 Global Step: 18600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:18:30,254-Speed 4404.01 samples/sec Loss 7.4343 Epoch: 1 Global Step: 18650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:18:41,898-Speed 4397.34 samples/sec Loss 7.3817 Epoch: 1 Global Step: 18700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:18:53,798-Speed 4302.72 samples/sec Loss 7.3783 Epoch: 1 Global Step: 18750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:19:06,607-Speed 3997.14 samples/sec Loss 7.3800 Epoch: 1 Global Step: 18800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:19:18,388-Speed 4346.15 samples/sec Loss 7.3654 Epoch: 1 Global Step: 18850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:19:31,221-Speed 3989.87 samples/sec Loss 7.4602 Epoch: 1 Global Step: 18900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:19:42,954-Speed 4363.79 samples/sec Loss 7.4020 Epoch: 1 Global Step: 18950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:19:54,549-Speed 4416.20 samples/sec Loss 7.3817 Epoch: 1 Global Step: 19000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:20:06,870-Speed 4155.57 samples/sec Loss 7.4125 Epoch: 1 Global Step: 19050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:20:18,696-Speed 4329.45 samples/sec Loss 7.3931 Epoch: 1 Global Step: 19100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:20:30,594-Speed 4303.62 samples/sec Loss 7.3600 Epoch: 1 Global Step: 19150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:20:42,084-Speed 4456.10 samples/sec Loss 7.4315 Epoch: 1 Global Step: 19200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:20:54,700-Speed 4058.52 samples/sec Loss 7.4510 Epoch: 1 Global Step: 19250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:21:06,440-Speed 4361.45 samples/sec Loss 7.4219 Epoch: 1 Global Step: 19300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:21:18,435-Speed 4268.65 samples/sec Loss 7.4174 Epoch: 1 Global Step: 19350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:21:29,995-Speed 4429.30 samples/sec Loss 7.3825 Epoch: 1 Global Step: 19400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:21:41,754-Speed 4353.99 samples/sec Loss 7.3543 Epoch: 1 Global Step: 19450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:21:53,192-Speed 4476.57 samples/sec Loss 7.3835 Epoch: 1 Global Step: 19500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:22:04,860-Speed 4388.36 samples/sec Loss 7.4231 Epoch: 1 Global Step: 19550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:22:16,180-Speed 4523.17 samples/sec Loss 7.3751 Epoch: 1 Global Step: 19600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:22:28,094-Speed 4297.40 samples/sec Loss 7.3996 Epoch: 1 Global Step: 19650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:22:39,897-Speed 4338.32 samples/sec Loss 7.4145 Epoch: 1 Global Step: 19700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:22:51,714-Speed 4332.74 samples/sec Loss 7.4091 Epoch: 1 Global Step: 19750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:23:03,376-Speed 4390.53 samples/sec Loss 7.4291 Epoch: 1 Global Step: 19800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:23:15,040-Speed 4389.88 samples/sec Loss 7.4134 Epoch: 1 Global Step: 19850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:23:26,843-Speed 4337.93 samples/sec Loss 7.4027 Epoch: 1 Global Step: 19900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:23:39,308-Speed 4107.77 samples/sec Loss 7.3635 Epoch: 1 Global Step: 19950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:23:50,902-Speed 4416.02 samples/sec Loss 7.3554 Epoch: 1 Global Step: 20000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:24:21,122-[lfw][20000]XNorm: 22.858925 Training: 2021-03-14 23:24:21,123-[lfw][20000]Accuracy-Flip: 0.99550+-0.00308 Training: 2021-03-14 23:24:21,123-[lfw][20000]Accuracy-Highest: 0.99550 Training: 2021-03-14 23:24:56,412-[cfp_fp][20000]XNorm: 19.792704 Training: 2021-03-14 23:24:56,412-[cfp_fp][20000]Accuracy-Flip: 0.92600+-0.01238 Training: 2021-03-14 23:24:56,412-[cfp_fp][20000]Accuracy-Highest: 0.93514 Training: 2021-03-14 23:25:26,785-[agedb_30][20000]XNorm: 22.435877 Training: 2021-03-14 23:25:26,785-[agedb_30][20000]Accuracy-Flip: 0.94683+-0.01262 Training: 2021-03-14 23:25:26,785-[agedb_30][20000]Accuracy-Highest: 0.94983 Training: 2021-03-14 23:25:38,444-Speed 476.09 samples/sec Loss 7.3770 Epoch: 1 Global Step: 20050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:25:50,088-Speed 4397.25 samples/sec Loss 7.3608 Epoch: 1 Global Step: 20100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:26:03,663-Speed 3771.81 samples/sec Loss 7.3878 Epoch: 1 Global Step: 20150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:26:16,256-Speed 4065.89 samples/sec Loss 7.4247 Epoch: 1 Global Step: 20200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:26:27,817-Speed 4429.09 samples/sec Loss 7.3936 Epoch: 1 Global Step: 20250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:26:39,342-Speed 4442.68 samples/sec Loss 7.3818 Epoch: 1 Global Step: 20300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:26:51,246-Speed 4301.03 samples/sec Loss 7.3647 Epoch: 1 Global Step: 20350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:27:02,655-Speed 4488.09 samples/sec Loss 7.3795 Epoch: 1 Global Step: 20400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:27:14,548-Speed 4305.22 samples/sec Loss 7.4098 Epoch: 1 Global Step: 20450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:27:26,391-Speed 4323.52 samples/sec Loss 7.4407 Epoch: 1 Global Step: 20500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:27:38,328-Speed 4289.17 samples/sec Loss 7.3884 Epoch: 1 Global Step: 20550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:27:50,061-Speed 4363.91 samples/sec Loss 7.4260 Epoch: 1 Global Step: 20600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:28:02,222-Speed 4210.30 samples/sec Loss 7.4188 Epoch: 1 Global Step: 20650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:28:13,607-Speed 4497.42 samples/sec Loss 7.3699 Epoch: 1 Global Step: 20700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:28:25,115-Speed 4449.36 samples/sec Loss 7.3758 Epoch: 1 Global Step: 20750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:28:36,693-Speed 4422.41 samples/sec Loss 7.4123 Epoch: 1 Global Step: 20800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:28:48,718-Speed 4257.83 samples/sec Loss 7.3885 Epoch: 1 Global Step: 20850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:29:00,268-Speed 4432.95 samples/sec Loss 7.4035 Epoch: 1 Global Step: 20900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:29:12,038-Speed 4350.39 samples/sec Loss 7.3560 Epoch: 1 Global Step: 20950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:29:23,705-Speed 4388.74 samples/sec Loss 7.3539 Epoch: 1 Global Step: 21000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:29:35,334-Speed 4402.78 samples/sec Loss 7.3969 Epoch: 1 Global Step: 21050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:29:47,754-Speed 4122.66 samples/sec Loss 7.2911 Epoch: 1 Global Step: 21100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:29:59,875-Speed 4224.39 samples/sec Loss 7.3403 Epoch: 1 Global Step: 21150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:30:11,450-Speed 4423.26 samples/sec Loss 7.3073 Epoch: 1 Global Step: 21200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:30:24,069-Speed 4057.74 samples/sec Loss 7.3121 Epoch: 1 Global Step: 21250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:30:35,631-Speed 4428.44 samples/sec Loss 7.3692 Epoch: 1 Global Step: 21300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:30:47,373-Speed 4360.57 samples/sec Loss 7.3454 Epoch: 1 Global Step: 21350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:30:59,881-Speed 4093.54 samples/sec Loss 7.3062 Epoch: 1 Global Step: 21400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:31:11,486-Speed 4412.04 samples/sec Loss 7.2881 Epoch: 1 Global Step: 21450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:31:23,169-Speed 4382.40 samples/sec Loss 7.3408 Epoch: 1 Global Step: 21500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:31:34,967-Speed 4340.26 samples/sec Loss 7.3178 Epoch: 1 Global Step: 21550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:31:47,261-Speed 4164.63 samples/sec Loss 7.3001 Epoch: 1 Global Step: 21600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:31:59,215-Speed 4283.28 samples/sec Loss 7.3408 Epoch: 1 Global Step: 21650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:32:10,816-Speed 4413.44 samples/sec Loss 7.3033 Epoch: 1 Global Step: 21700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:32:22,683-Speed 4314.64 samples/sec Loss 7.3491 Epoch: 1 Global Step: 21750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:32:34,143-Speed 4468.04 samples/sec Loss 7.3322 Epoch: 1 Global Step: 21800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:32:45,866-Speed 4367.54 samples/sec Loss 7.3206 Epoch: 1 Global Step: 21850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:32:57,606-Speed 4361.39 samples/sec Loss 7.3550 Epoch: 1 Global Step: 21900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:33:09,597-Speed 4270.01 samples/sec Loss 7.3398 Epoch: 1 Global Step: 21950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:33:21,284-Speed 4381.07 samples/sec Loss 7.3075 Epoch: 1 Global Step: 22000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:33:51,592-[lfw][22000]XNorm: 22.762181 Training: 2021-03-14 23:33:51,593-[lfw][22000]Accuracy-Flip: 0.99450+-0.00366 Training: 2021-03-14 23:33:51,593-[lfw][22000]Accuracy-Highest: 0.99550 Training: 2021-03-14 23:34:26,801-[cfp_fp][22000]XNorm: 19.121887 Training: 2021-03-14 23:34:26,801-[cfp_fp][22000]Accuracy-Flip: 0.93686+-0.01255 Training: 2021-03-14 23:34:26,802-[cfp_fp][22000]Accuracy-Highest: 0.93686 Training: 2021-03-14 23:34:57,105-[agedb_30][22000]XNorm: 21.513588 Training: 2021-03-14 23:34:57,106-[agedb_30][22000]Accuracy-Flip: 0.94983+-0.00902 Training: 2021-03-14 23:34:57,106-[agedb_30][22000]Accuracy-Highest: 0.94983 Training: 2021-03-14 23:35:08,824-Speed 476.10 samples/sec Loss 7.3600 Epoch: 1 Global Step: 22050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:35:20,459-Speed 4400.68 samples/sec Loss 7.3603 Epoch: 1 Global Step: 22100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:35:32,375-Speed 4297.00 samples/sec Loss 7.3929 Epoch: 1 Global Step: 22150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:35:43,951-Speed 4422.99 samples/sec Loss 7.2898 Epoch: 1 Global Step: 22200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:35:55,549-Speed 4414.78 samples/sec Loss 7.2966 Epoch: 1 Global Step: 22250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:36:07,560-Speed 4262.89 samples/sec Loss 7.2733 Epoch: 1 Global Step: 22300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:36:20,834-Speed 3857.49 samples/sec Loss 7.2811 Epoch: 1 Global Step: 22350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:36:32,525-Speed 4379.58 samples/sec Loss 7.2676 Epoch: 1 Global Step: 22400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:36:45,067-Speed 4082.48 samples/sec Loss 7.2559 Epoch: 1 Global Step: 22450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:36:56,532-Speed 4465.77 samples/sec Loss 7.2815 Epoch: 1 Global Step: 22500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:37:08,307-Speed 4348.40 samples/sec Loss 7.2897 Epoch: 1 Global Step: 22550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:37:20,977-Speed 4041.10 samples/sec Loss 7.2793 Epoch: 1 Global Step: 22600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:37:32,771-Speed 4341.38 samples/sec Loss 7.3063 Epoch: 1 Global Step: 22650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:37:44,292-Speed 4444.57 samples/sec Loss 7.2728 Epoch: 1 Global Step: 22700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:37:56,055-Speed 4352.73 samples/sec Loss 7.2286 Epoch: 1 Global Step: 22750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:38:07,928-Speed 4312.55 samples/sec Loss 7.2281 Epoch: 1 Global Step: 22800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:38:19,664-Speed 4362.59 samples/sec Loss 7.3145 Epoch: 1 Global Step: 22850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:38:31,236-Speed 4424.72 samples/sec Loss 7.2556 Epoch: 1 Global Step: 22900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:38:42,736-Speed 4452.15 samples/sec Loss 7.3013 Epoch: 1 Global Step: 22950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:38:54,644-Speed 4300.09 samples/sec Loss 7.2664 Epoch: 1 Global Step: 23000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:39:06,284-Speed 4398.55 samples/sec Loss 7.3028 Epoch: 1 Global Step: 23050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:39:18,068-Speed 4345.17 samples/sec Loss 7.2636 Epoch: 1 Global Step: 23100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:39:29,550-Speed 4459.54 samples/sec Loss 7.2614 Epoch: 1 Global Step: 23150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:39:41,267-Speed 4369.84 samples/sec Loss 7.2528 Epoch: 1 Global Step: 23200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:39:52,984-Speed 4369.76 samples/sec Loss 7.1927 Epoch: 1 Global Step: 23250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:40:05,496-Speed 4092.13 samples/sec Loss 7.2191 Epoch: 1 Global Step: 23300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:40:17,111-Speed 4408.33 samples/sec Loss 7.2595 Epoch: 1 Global Step: 23350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:40:29,054-Speed 4287.31 samples/sec Loss 7.2908 Epoch: 1 Global Step: 23400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:40:40,616-Speed 4428.52 samples/sec Loss 7.2155 Epoch: 1 Global Step: 23450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:40:52,399-Speed 4345.22 samples/sec Loss 7.2768 Epoch: 1 Global Step: 23500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:41:05,106-Speed 4029.44 samples/sec Loss 7.2581 Epoch: 1 Global Step: 23550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:41:16,895-Speed 4343.39 samples/sec Loss 7.2652 Epoch: 1 Global Step: 23600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:41:28,752-Speed 4318.13 samples/sec Loss 7.2216 Epoch: 1 Global Step: 23650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:41:40,545-Speed 4341.62 samples/sec Loss 7.1646 Epoch: 1 Global Step: 23700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:41:52,121-Speed 4423.21 samples/sec Loss 7.2281 Epoch: 1 Global Step: 23750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:42:04,744-Speed 4056.19 samples/sec Loss 7.2517 Epoch: 1 Global Step: 23800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:42:16,338-Speed 4416.47 samples/sec Loss 7.2506 Epoch: 1 Global Step: 23850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:42:28,042-Speed 4374.82 samples/sec Loss 7.2396 Epoch: 1 Global Step: 23900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:42:40,593-Speed 4079.46 samples/sec Loss 7.2089 Epoch: 1 Global Step: 23950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:42:52,558-Speed 4279.18 samples/sec Loss 7.2225 Epoch: 1 Global Step: 24000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:43:22,803-[lfw][24000]XNorm: 22.605283 Training: 2021-03-14 23:43:22,804-[lfw][24000]Accuracy-Flip: 0.99583+-0.00261 Training: 2021-03-14 23:43:22,804-[lfw][24000]Accuracy-Highest: 0.99583 Training: 2021-03-14 23:43:57,793-[cfp_fp][24000]XNorm: 19.346349 Training: 2021-03-14 23:43:57,793-[cfp_fp][24000]Accuracy-Flip: 0.93314+-0.01353 Training: 2021-03-14 23:43:57,793-[cfp_fp][24000]Accuracy-Highest: 0.93686 Training: 2021-03-14 23:44:27,801-[agedb_30][24000]XNorm: 21.874742 Training: 2021-03-14 23:44:27,802-[agedb_30][24000]Accuracy-Flip: 0.95567+-0.01101 Training: 2021-03-14 23:44:27,802-[agedb_30][24000]Accuracy-Highest: 0.95567 Training: 2021-03-14 23:44:39,394-Speed 479.24 samples/sec Loss 7.1971 Epoch: 1 Global Step: 24050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:44:51,153-Speed 4354.30 samples/sec Loss 7.2306 Epoch: 1 Global Step: 24100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:45:02,814-Speed 4391.13 samples/sec Loss 7.2218 Epoch: 1 Global Step: 24150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:45:14,729-Speed 4297.06 samples/sec Loss 7.1465 Epoch: 1 Global Step: 24200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:45:26,209-Speed 4460.19 samples/sec Loss 7.1390 Epoch: 1 Global Step: 24250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:45:37,945-Speed 4362.88 samples/sec Loss 7.1968 Epoch: 1 Global Step: 24300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:45:49,466-Speed 4444.32 samples/sec Loss 7.2192 Epoch: 1 Global Step: 24350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:46:01,326-Speed 4317.04 samples/sec Loss 7.2350 Epoch: 1 Global Step: 24400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-14 23:46:12,744-Speed 4484.23 samples/sec Loss 7.2282 Epoch: 1 Global Step: 24450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:46:24,693-Speed 4285.05 samples/sec Loss 7.2053 Epoch: 1 Global Step: 24500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:46:37,911-Speed 3873.67 samples/sec Loss 7.2678 Epoch: 1 Global Step: 24550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:46:49,674-Speed 4352.77 samples/sec Loss 7.2466 Epoch: 1 Global Step: 24600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:47:01,500-Speed 4329.88 samples/sec Loss 7.1794 Epoch: 1 Global Step: 24650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:47:13,289-Speed 4343.01 samples/sec Loss 7.2041 Epoch: 1 Global Step: 24700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:47:25,140-Speed 4320.52 samples/sec Loss 7.1969 Epoch: 1 Global Step: 24750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:47:36,864-Speed 4367.11 samples/sec Loss 7.1422 Epoch: 1 Global Step: 24800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:47:48,232-Speed 4504.13 samples/sec Loss 7.1977 Epoch: 1 Global Step: 24850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:47:59,877-Speed 4397.09 samples/sec Loss 7.1814 Epoch: 1 Global Step: 24900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:48:11,591-Speed 4370.91 samples/sec Loss 7.1971 Epoch: 1 Global Step: 24950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:48:24,232-Speed 4050.46 samples/sec Loss 7.2324 Epoch: 1 Global Step: 25000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:48:35,692-Speed 4467.82 samples/sec Loss 7.1819 Epoch: 1 Global Step: 25050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:48:47,632-Speed 4288.21 samples/sec Loss 7.1622 Epoch: 1 Global Step: 25100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:49:00,084-Speed 4112.12 samples/sec Loss 7.1044 Epoch: 1 Global Step: 25150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:49:12,124-Speed 4252.59 samples/sec Loss 7.1460 Epoch: 1 Global Step: 25200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:49:23,514-Speed 4495.50 samples/sec Loss 7.1483 Epoch: 1 Global Step: 25250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:49:35,338-Speed 4330.08 samples/sec Loss 7.1658 Epoch: 1 Global Step: 25300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:49:46,963-Speed 4404.77 samples/sec Loss 7.1447 Epoch: 1 Global Step: 25350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:49:59,054-Speed 4234.51 samples/sec Loss 7.1323 Epoch: 1 Global Step: 25400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:50:10,841-Speed 4343.89 samples/sec Loss 7.1945 Epoch: 1 Global Step: 25450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:50:22,634-Speed 4341.82 samples/sec Loss 7.2041 Epoch: 1 Global Step: 25500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:50:34,387-Speed 4356.52 samples/sec Loss 7.1632 Epoch: 1 Global Step: 25550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:50:46,909-Speed 4089.02 samples/sec Loss 7.1596 Epoch: 1 Global Step: 25600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:50:58,528-Speed 4406.72 samples/sec Loss 7.1339 Epoch: 1 Global Step: 25650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:51:10,269-Speed 4361.09 samples/sec Loss 7.1351 Epoch: 1 Global Step: 25700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:51:22,014-Speed 4359.47 samples/sec Loss 7.1636 Epoch: 1 Global Step: 25750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:51:33,769-Speed 4355.65 samples/sec Loss 7.1263 Epoch: 1 Global Step: 25800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:51:46,266-Speed 4097.20 samples/sec Loss 7.1623 Epoch: 1 Global Step: 25850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:51:58,100-Speed 4326.67 samples/sec Loss 7.1362 Epoch: 1 Global Step: 25900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:52:09,632-Speed 4440.07 samples/sec Loss 7.1195 Epoch: 1 Global Step: 25950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:52:21,473-Speed 4324.21 samples/sec Loss 7.0828 Epoch: 1 Global Step: 26000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:52:51,628-[lfw][26000]XNorm: 21.692931 Training: 2021-03-14 23:52:51,628-[lfw][26000]Accuracy-Flip: 0.99533+-0.00306 Training: 2021-03-14 23:52:51,628-[lfw][26000]Accuracy-Highest: 0.99583 Training: 2021-03-14 23:53:26,887-[cfp_fp][26000]XNorm: 18.189684 Training: 2021-03-14 23:53:26,888-[cfp_fp][26000]Accuracy-Flip: 0.93700+-0.00920 Training: 2021-03-14 23:53:26,889-[cfp_fp][26000]Accuracy-Highest: 0.93700 Training: 2021-03-14 23:53:57,026-[agedb_30][26000]XNorm: 20.792297 Training: 2021-03-14 23:53:57,026-[agedb_30][26000]Accuracy-Flip: 0.95750+-0.01109 Training: 2021-03-14 23:53:57,026-[agedb_30][26000]Accuracy-Highest: 0.95750 Training: 2021-03-14 23:54:08,690-Speed 477.54 samples/sec Loss 7.1203 Epoch: 1 Global Step: 26050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:54:20,545-Speed 4318.86 samples/sec Loss 7.1523 Epoch: 1 Global Step: 26100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:54:32,281-Speed 4362.79 samples/sec Loss 7.1153 Epoch: 1 Global Step: 26150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:54:44,876-Speed 4065.19 samples/sec Loss 7.1629 Epoch: 1 Global Step: 26200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:54:56,358-Speed 4459.44 samples/sec Loss 7.1566 Epoch: 1 Global Step: 26250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:55:09,181-Speed 3992.93 samples/sec Loss 7.1848 Epoch: 1 Global Step: 26300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:55:20,766-Speed 4419.76 samples/sec Loss 7.0982 Epoch: 1 Global Step: 26350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:55:32,612-Speed 4322.15 samples/sec Loss 7.0824 Epoch: 1 Global Step: 26400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:55:44,109-Speed 4453.74 samples/sec Loss 7.0795 Epoch: 1 Global Step: 26450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:55:55,953-Speed 4323.08 samples/sec Loss 7.1432 Epoch: 1 Global Step: 26500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:56:07,621-Speed 4388.03 samples/sec Loss 7.0916 Epoch: 1 Global Step: 26550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:56:19,099-Speed 4461.04 samples/sec Loss 7.1269 Epoch: 1 Global Step: 26600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:56:30,867-Speed 4350.77 samples/sec Loss 7.1091 Epoch: 1 Global Step: 26650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:56:42,553-Speed 4381.47 samples/sec Loss 7.1731 Epoch: 1 Global Step: 26700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:56:54,332-Speed 4347.03 samples/sec Loss 7.1090 Epoch: 1 Global Step: 26750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:57:07,854-Speed 3786.62 samples/sec Loss 7.1603 Epoch: 1 Global Step: 26800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:57:19,684-Speed 4328.23 samples/sec Loss 7.1325 Epoch: 1 Global Step: 26850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:57:31,237-Speed 4431.90 samples/sec Loss 7.0966 Epoch: 1 Global Step: 26900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:57:42,968-Speed 4364.59 samples/sec Loss 7.0410 Epoch: 1 Global Step: 26950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:57:54,810-Speed 4323.82 samples/sec Loss 7.1664 Epoch: 1 Global Step: 27000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:58:06,638-Speed 4328.92 samples/sec Loss 7.0843 Epoch: 1 Global Step: 27050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:58:18,386-Speed 4358.08 samples/sec Loss 7.0798 Epoch: 1 Global Step: 27100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:58:30,212-Speed 4329.74 samples/sec Loss 7.0935 Epoch: 1 Global Step: 27150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:58:41,989-Speed 4347.52 samples/sec Loss 7.1103 Epoch: 1 Global Step: 27200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:58:53,787-Speed 4339.95 samples/sec Loss 7.0777 Epoch: 1 Global Step: 27250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:59:05,723-Speed 4289.58 samples/sec Loss 7.1001 Epoch: 1 Global Step: 27300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:59:17,743-Speed 4259.93 samples/sec Loss 7.1084 Epoch: 1 Global Step: 27350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:59:29,776-Speed 4254.99 samples/sec Loss 7.0921 Epoch: 1 Global Step: 27400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:59:41,715-Speed 4288.74 samples/sec Loss 7.0852 Epoch: 1 Global Step: 27450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-14 23:59:53,625-Speed 4298.96 samples/sec Loss 7.1064 Epoch: 1 Global Step: 27500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:00:06,296-Speed 4041.05 samples/sec Loss 7.1630 Epoch: 1 Global Step: 27550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:00:17,945-Speed 4395.17 samples/sec Loss 7.1033 Epoch: 1 Global Step: 27600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:00:29,721-Speed 4348.12 samples/sec Loss 7.0494 Epoch: 1 Global Step: 27650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:00:41,341-Speed 4406.38 samples/sec Loss 7.0498 Epoch: 1 Global Step: 27700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:00:53,185-Speed 4323.00 samples/sec Loss 7.0235 Epoch: 1 Global Step: 27750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:01:05,767-Speed 4069.44 samples/sec Loss 7.0750 Epoch: 1 Global Step: 27800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:01:18,493-Speed 4023.40 samples/sec Loss 7.0697 Epoch: 1 Global Step: 27850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:01:30,037-Speed 4435.40 samples/sec Loss 7.0800 Epoch: 1 Global Step: 27900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:01:41,745-Speed 4373.38 samples/sec Loss 7.0529 Epoch: 1 Global Step: 27950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:01:53,693-Speed 4285.11 samples/sec Loss 7.0600 Epoch: 1 Global Step: 28000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:02:24,252-[lfw][28000]XNorm: 22.467694 Training: 2021-03-15 00:02:24,252-[lfw][28000]Accuracy-Flip: 0.99617+-0.00299 Training: 2021-03-15 00:02:24,252-[lfw][28000]Accuracy-Highest: 0.99617 Training: 2021-03-15 00:02:59,159-[cfp_fp][28000]XNorm: 19.090706 Training: 2021-03-15 00:02:59,159-[cfp_fp][28000]Accuracy-Flip: 0.92371+-0.01044 Training: 2021-03-15 00:02:59,159-[cfp_fp][28000]Accuracy-Highest: 0.93700 Training: 2021-03-15 00:03:29,321-[agedb_30][28000]XNorm: 21.609084 Training: 2021-03-15 00:03:29,322-[agedb_30][28000]Accuracy-Flip: 0.95283+-0.00806 Training: 2021-03-15 00:03:29,322-[agedb_30][28000]Accuracy-Highest: 0.95750 Training: 2021-03-15 00:03:41,020-Speed 477.05 samples/sec Loss 7.0471 Epoch: 1 Global Step: 28050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:03:52,558-Speed 4437.74 samples/sec Loss 7.0306 Epoch: 1 Global Step: 28100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:04:04,367-Speed 4335.70 samples/sec Loss 7.0082 Epoch: 1 Global Step: 28150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:04:16,083-Speed 4370.53 samples/sec Loss 7.0115 Epoch: 1 Global Step: 28200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:04:27,734-Speed 4394.60 samples/sec Loss 7.0326 Epoch: 1 Global Step: 28250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:04:40,348-Speed 4059.25 samples/sec Loss 7.0302 Epoch: 1 Global Step: 28300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:04:52,187-Speed 4324.66 samples/sec Loss 7.0460 Epoch: 1 Global Step: 28350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:05:03,811-Speed 4404.86 samples/sec Loss 6.9969 Epoch: 1 Global Step: 28400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:05:15,531-Speed 4368.74 samples/sec Loss 7.0533 Epoch: 1 Global Step: 28450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:05:27,206-Speed 4385.79 samples/sec Loss 6.9984 Epoch: 1 Global Step: 28500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:05:39,068-Speed 4316.44 samples/sec Loss 7.0785 Epoch: 1 Global Step: 28550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:05:50,637-Speed 4425.96 samples/sec Loss 7.0156 Epoch: 1 Global Step: 28600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:06:04,083-Speed 3807.96 samples/sec Loss 7.0829 Epoch: 1 Global Step: 28650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:06:15,687-Speed 4412.33 samples/sec Loss 7.0237 Epoch: 1 Global Step: 28700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:06:27,384-Speed 4377.37 samples/sec Loss 7.0005 Epoch: 1 Global Step: 28750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:06:39,062-Speed 4384.44 samples/sec Loss 7.0361 Epoch: 1 Global Step: 28800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:06:51,069-Speed 4264.55 samples/sec Loss 6.9575 Epoch: 1 Global Step: 28850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:07:03,626-Speed 4077.37 samples/sec Loss 7.0177 Epoch: 1 Global Step: 28900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:07:15,502-Speed 4311.35 samples/sec Loss 7.0311 Epoch: 1 Global Step: 28950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:07:27,196-Speed 4378.73 samples/sec Loss 6.9682 Epoch: 1 Global Step: 29000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:07:39,778-Speed 4069.22 samples/sec Loss 7.0236 Epoch: 1 Global Step: 29050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:07:51,073-Speed 4533.24 samples/sec Loss 6.9927 Epoch: 1 Global Step: 29100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:08:02,794-Speed 4368.66 samples/sec Loss 7.0034 Epoch: 1 Global Step: 29150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:08:14,367-Speed 4423.92 samples/sec Loss 7.0407 Epoch: 1 Global Step: 29200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:08:26,150-Speed 4345.73 samples/sec Loss 7.0328 Epoch: 1 Global Step: 29250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:08:37,778-Speed 4403.14 samples/sec Loss 7.0399 Epoch: 1 Global Step: 29300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:08:49,726-Speed 4285.53 samples/sec Loss 7.0259 Epoch: 1 Global Step: 29350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:09:01,495-Speed 4350.51 samples/sec Loss 7.0117 Epoch: 1 Global Step: 29400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:09:13,267-Speed 4349.58 samples/sec Loss 6.9564 Epoch: 1 Global Step: 29450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:09:24,904-Speed 4399.85 samples/sec Loss 7.0466 Epoch: 1 Global Step: 29500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:09:36,774-Speed 4313.37 samples/sec Loss 6.9629 Epoch: 1 Global Step: 29550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:09:48,442-Speed 4388.27 samples/sec Loss 6.9664 Epoch: 1 Global Step: 29600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:10:00,178-Speed 4363.15 samples/sec Loss 7.0204 Epoch: 1 Global Step: 29650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:10:11,752-Speed 4423.63 samples/sec Loss 6.9637 Epoch: 1 Global Step: 29700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:10:23,548-Speed 4340.77 samples/sec Loss 6.9918 Epoch: 1 Global Step: 29750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:10:35,414-Speed 4315.12 samples/sec Loss 7.0079 Epoch: 1 Global Step: 29800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:10:47,380-Speed 4278.70 samples/sec Loss 6.9905 Epoch: 1 Global Step: 29850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:10:59,088-Speed 4373.41 samples/sec Loss 6.9677 Epoch: 1 Global Step: 29900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:11:10,671-Speed 4420.42 samples/sec Loss 6.9839 Epoch: 1 Global Step: 29950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:11:23,327-Speed 4045.61 samples/sec Loss 7.0058 Epoch: 1 Global Step: 30000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:11:53,609-[lfw][30000]XNorm: 21.697741 Training: 2021-03-15 00:11:53,609-[lfw][30000]Accuracy-Flip: 0.99550+-0.00279 Training: 2021-03-15 00:11:53,609-[lfw][30000]Accuracy-Highest: 0.99617 Training: 2021-03-15 00:12:28,747-[cfp_fp][30000]XNorm: 18.029360 Training: 2021-03-15 00:12:28,748-[cfp_fp][30000]Accuracy-Flip: 0.93114+-0.00868 Training: 2021-03-15 00:12:28,748-[cfp_fp][30000]Accuracy-Highest: 0.93700 Training: 2021-03-15 00:12:59,087-[agedb_30][30000]XNorm: 20.499394 Training: 2021-03-15 00:12:59,087-[agedb_30][30000]Accuracy-Flip: 0.94767+-0.01070 Training: 2021-03-15 00:12:59,087-[agedb_30][30000]Accuracy-Highest: 0.95750 Training: 2021-03-15 00:13:10,774-Speed 476.52 samples/sec Loss 6.9676 Epoch: 1 Global Step: 30050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:13:23,335-Speed 4076.22 samples/sec Loss 7.0018 Epoch: 1 Global Step: 30100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:13:35,253-Speed 4296.07 samples/sec Loss 7.0111 Epoch: 1 Global Step: 30150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:13:47,039-Speed 4344.49 samples/sec Loss 6.9338 Epoch: 1 Global Step: 30200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:13:58,955-Speed 4296.81 samples/sec Loss 7.0036 Epoch: 1 Global Step: 30250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:14:10,649-Speed 4378.40 samples/sec Loss 6.9846 Epoch: 1 Global Step: 30300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:14:23,374-Speed 4023.80 samples/sec Loss 6.9609 Epoch: 1 Global Step: 30350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:14:34,952-Speed 4422.60 samples/sec Loss 6.9259 Epoch: 1 Global Step: 30400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:14:46,753-Speed 4338.82 samples/sec Loss 6.9833 Epoch: 1 Global Step: 30450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:14:58,400-Speed 4396.03 samples/sec Loss 6.9919 Epoch: 1 Global Step: 30500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:15:10,354-Speed 4283.36 samples/sec Loss 6.9228 Epoch: 1 Global Step: 30550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:15:22,806-Speed 4111.95 samples/sec Loss 6.9468 Epoch: 1 Global Step: 30600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:15:34,458-Speed 4394.28 samples/sec Loss 6.9119 Epoch: 1 Global Step: 30650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:15:46,184-Speed 4366.25 samples/sec Loss 6.9723 Epoch: 1 Global Step: 30700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:15:57,906-Speed 4368.08 samples/sec Loss 6.9480 Epoch: 1 Global Step: 30750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:16:09,608-Speed 4375.64 samples/sec Loss 7.0323 Epoch: 1 Global Step: 30800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:16:21,509-Speed 4302.20 samples/sec Loss 6.9448 Epoch: 1 Global Step: 30850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:16:33,225-Speed 4370.47 samples/sec Loss 6.9668 Epoch: 1 Global Step: 30900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:16:45,100-Speed 4311.65 samples/sec Loss 6.9636 Epoch: 1 Global Step: 30950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:16:56,889-Speed 4343.10 samples/sec Loss 6.9161 Epoch: 1 Global Step: 31000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:17:10,424-Speed 3783.13 samples/sec Loss 6.9710 Epoch: 1 Global Step: 31050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:17:22,281-Speed 4318.39 samples/sec Loss 6.9172 Epoch: 1 Global Step: 31100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:17:34,148-Speed 4314.71 samples/sec Loss 6.8726 Epoch: 1 Global Step: 31150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:17:46,626-Speed 4103.19 samples/sec Loss 6.9200 Epoch: 1 Global Step: 31200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:17:58,195-Speed 4425.92 samples/sec Loss 6.9579 Epoch: 1 Global Step: 31250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:18:09,590-Speed 4493.18 samples/sec Loss 6.9858 Epoch: 1 Global Step: 31300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:18:22,137-Speed 4081.00 samples/sec Loss 6.9451 Epoch: 1 Global Step: 31350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:18:33,655-Speed 4445.19 samples/sec Loss 6.9397 Epoch: 1 Global Step: 31400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:18:45,471-Speed 4333.42 samples/sec Loss 6.9318 Epoch: 1 Global Step: 31450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:18:56,946-Speed 4461.97 samples/sec Loss 6.9558 Epoch: 1 Global Step: 31500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:19:08,472-Speed 4442.68 samples/sec Loss 6.9190 Epoch: 1 Global Step: 31550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:19:20,236-Speed 4352.11 samples/sec Loss 6.9277 Epoch: 1 Global Step: 31600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:19:32,064-Speed 4328.87 samples/sec Loss 6.9224 Epoch: 1 Global Step: 31650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:19:44,052-Speed 4271.22 samples/sec Loss 6.8917 Epoch: 1 Global Step: 31700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:19:55,684-Speed 4401.97 samples/sec Loss 6.9334 Epoch: 1 Global Step: 31750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:20:07,475-Speed 4342.37 samples/sec Loss 6.8845 Epoch: 1 Global Step: 31800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:20:19,359-Speed 4308.38 samples/sec Loss 6.9119 Epoch: 1 Global Step: 31850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:20:31,122-Speed 4352.98 samples/sec Loss 6.9630 Epoch: 1 Global Step: 31900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:20:42,885-Speed 4352.86 samples/sec Loss 6.9664 Epoch: 1 Global Step: 31950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:20:54,545-Speed 4391.28 samples/sec Loss 6.9001 Epoch: 1 Global Step: 32000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:21:24,738-[lfw][32000]XNorm: 23.479725 Training: 2021-03-15 00:21:24,738-[lfw][32000]Accuracy-Flip: 0.99583+-0.00367 Training: 2021-03-15 00:21:24,738-[lfw][32000]Accuracy-Highest: 0.99617 Training: 2021-03-15 00:21:59,807-[cfp_fp][32000]XNorm: 20.244856 Training: 2021-03-15 00:21:59,807-[cfp_fp][32000]Accuracy-Flip: 0.92929+-0.01023 Training: 2021-03-15 00:21:59,807-[cfp_fp][32000]Accuracy-Highest: 0.93700 Training: 2021-03-15 00:22:30,195-[agedb_30][32000]XNorm: 22.935223 Training: 2021-03-15 00:22:30,196-[agedb_30][32000]Accuracy-Flip: 0.95133+-0.01021 Training: 2021-03-15 00:22:30,196-[agedb_30][32000]Accuracy-Highest: 0.95750 Training: 2021-03-15 00:22:42,037-Speed 476.31 samples/sec Loss 6.9375 Epoch: 1 Global Step: 32050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:22:53,798-Speed 4353.75 samples/sec Loss 6.9516 Epoch: 1 Global Step: 32100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:23:05,590-Speed 4341.82 samples/sec Loss 6.9279 Epoch: 1 Global Step: 32150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:23:17,489-Speed 4303.02 samples/sec Loss 6.8567 Epoch: 1 Global Step: 32200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:23:29,055-Speed 4427.01 samples/sec Loss 6.8910 Epoch: 1 Global Step: 32250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:23:41,006-Speed 4284.57 samples/sec Loss 6.9037 Epoch: 1 Global Step: 32300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:23:52,521-Speed 4446.35 samples/sec Loss 6.8559 Epoch: 1 Global Step: 32350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:24:05,132-Speed 4060.23 samples/sec Loss 6.9108 Epoch: 1 Global Step: 32400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:24:16,759-Speed 4403.69 samples/sec Loss 6.9035 Epoch: 1 Global Step: 32450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:24:28,575-Speed 4333.28 samples/sec Loss 6.9669 Epoch: 1 Global Step: 32500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:24:40,142-Speed 4426.35 samples/sec Loss 6.9086 Epoch: 1 Global Step: 32550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:24:52,814-Speed 4040.69 samples/sec Loss 6.8601 Epoch: 1 Global Step: 32600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:25:04,489-Speed 4385.49 samples/sec Loss 6.8377 Epoch: 1 Global Step: 32650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:25:16,402-Speed 4298.24 samples/sec Loss 6.9149 Epoch: 1 Global Step: 32700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:25:28,039-Speed 4399.84 samples/sec Loss 6.9230 Epoch: 1 Global Step: 32750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:25:39,757-Speed 4369.52 samples/sec Loss 6.8647 Epoch: 1 Global Step: 32800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:25:51,252-Speed 4454.39 samples/sec Loss 6.8705 Epoch: 1 Global Step: 32850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:26:04,001-Speed 4016.00 samples/sec Loss 6.8512 Epoch: 1 Global Step: 32900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:26:16,307-Speed 4160.93 samples/sec Loss 6.8369 Epoch: 1 Global Step: 32950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:26:28,278-Speed 4277.01 samples/sec Loss 6.8678 Epoch: 1 Global Step: 33000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:26:39,899-Speed 4407.04 samples/sec Loss 6.8801 Epoch: 1 Global Step: 33050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:26:51,659-Speed 4354.25 samples/sec Loss 6.7761 Epoch: 1 Global Step: 33100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:27:03,430-Speed 4349.82 samples/sec Loss 6.8630 Epoch: 1 Global Step: 33150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:27:15,532-Speed 4230.69 samples/sec Loss 6.9457 Epoch: 1 Global Step: 33200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:27:27,060-Speed 4441.48 samples/sec Loss 6.9032 Epoch: 1 Global Step: 33250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:27:38,949-Speed 4306.76 samples/sec Loss 6.9277 Epoch: 1 Global Step: 33300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:27:51,403-Speed 4111.34 samples/sec Loss 6.8612 Epoch: 1 Global Step: 33350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:28:16,314-Speed 2055.41 samples/sec Loss 6.6160 Epoch: 2 Global Step: 33400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:28:29,045-Speed 4021.80 samples/sec Loss 6.1797 Epoch: 2 Global Step: 33450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:28:41,261-Speed 4191.43 samples/sec Loss 6.2366 Epoch: 2 Global Step: 33500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:28:53,278-Speed 4260.86 samples/sec Loss 6.2896 Epoch: 2 Global Step: 33550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:29:04,882-Speed 4412.34 samples/sec Loss 6.2576 Epoch: 2 Global Step: 33600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:29:17,629-Speed 4017.00 samples/sec Loss 6.2931 Epoch: 2 Global Step: 33650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:29:29,430-Speed 4338.80 samples/sec Loss 6.2461 Epoch: 2 Global Step: 33700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:29:41,250-Speed 4331.88 samples/sec Loss 6.2800 Epoch: 2 Global Step: 33750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:29:52,696-Speed 4473.21 samples/sec Loss 6.3090 Epoch: 2 Global Step: 33800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:30:04,436-Speed 4361.39 samples/sec Loss 6.3506 Epoch: 2 Global Step: 33850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:30:15,984-Speed 4434.09 samples/sec Loss 6.3697 Epoch: 2 Global Step: 33900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:30:27,594-Speed 4410.05 samples/sec Loss 6.4083 Epoch: 2 Global Step: 33950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:30:39,019-Speed 4481.58 samples/sec Loss 6.3702 Epoch: 2 Global Step: 34000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:31:09,158-[lfw][34000]XNorm: 23.223269 Training: 2021-03-15 00:31:09,159-[lfw][34000]Accuracy-Flip: 0.99583+-0.00214 Training: 2021-03-15 00:31:09,159-[lfw][34000]Accuracy-Highest: 0.99617 Training: 2021-03-15 00:31:44,316-[cfp_fp][34000]XNorm: 19.646843 Training: 2021-03-15 00:31:44,317-[cfp_fp][34000]Accuracy-Flip: 0.91486+-0.00967 Training: 2021-03-15 00:31:44,317-[cfp_fp][34000]Accuracy-Highest: 0.93700 Training: 2021-03-15 00:32:14,591-[agedb_30][34000]XNorm: 22.171036 Training: 2021-03-15 00:32:14,591-[agedb_30][34000]Accuracy-Flip: 0.95233+-0.01160 Training: 2021-03-15 00:32:14,591-[agedb_30][34000]Accuracy-Highest: 0.95750 Training: 2021-03-15 00:32:26,078-Speed 478.24 samples/sec Loss 6.4138 Epoch: 2 Global Step: 34050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:32:37,689-Speed 4409.89 samples/sec Loss 6.3898 Epoch: 2 Global Step: 34100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:32:49,187-Speed 4452.95 samples/sec Loss 6.3360 Epoch: 2 Global Step: 34150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:33:00,583-Speed 4493.37 samples/sec Loss 6.4328 Epoch: 2 Global Step: 34200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:33:12,076-Speed 4455.03 samples/sec Loss 6.3999 Epoch: 2 Global Step: 34250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:33:23,846-Speed 4350.15 samples/sec Loss 6.4177 Epoch: 2 Global Step: 34300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:33:35,399-Speed 4431.82 samples/sec Loss 6.4327 Epoch: 2 Global Step: 34350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:33:47,021-Speed 4405.63 samples/sec Loss 6.4024 Epoch: 2 Global Step: 34400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:33:58,496-Speed 4462.03 samples/sec Loss 6.4977 Epoch: 2 Global Step: 34450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:34:09,964-Speed 4465.08 samples/sec Loss 6.5365 Epoch: 2 Global Step: 34500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:34:21,401-Speed 4476.61 samples/sec Loss 6.5194 Epoch: 2 Global Step: 34550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:34:32,758-Speed 4508.66 samples/sec Loss 6.5219 Epoch: 2 Global Step: 34600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:34:44,555-Speed 4340.07 samples/sec Loss 6.4935 Epoch: 2 Global Step: 34650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:34:56,865-Speed 4159.62 samples/sec Loss 6.4899 Epoch: 2 Global Step: 34700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:35:08,207-Speed 4514.45 samples/sec Loss 6.5521 Epoch: 2 Global Step: 34750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:35:19,562-Speed 4509.20 samples/sec Loss 6.5205 Epoch: 2 Global Step: 34800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:35:31,815-Speed 4178.66 samples/sec Loss 6.5399 Epoch: 2 Global Step: 34850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:35:43,160-Speed 4513.42 samples/sec Loss 6.5705 Epoch: 2 Global Step: 34900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:35:54,559-Speed 4491.63 samples/sec Loss 6.5048 Epoch: 2 Global Step: 34950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:36:05,867-Speed 4528.35 samples/sec Loss 6.5477 Epoch: 2 Global Step: 35000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:36:17,217-Speed 4511.06 samples/sec Loss 6.5206 Epoch: 2 Global Step: 35050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:36:28,709-Speed 4455.80 samples/sec Loss 6.5761 Epoch: 2 Global Step: 35100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:36:40,054-Speed 4512.91 samples/sec Loss 6.5873 Epoch: 2 Global Step: 35150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:36:51,618-Speed 4427.93 samples/sec Loss 6.5594 Epoch: 2 Global Step: 35200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:37:02,926-Speed 4528.19 samples/sec Loss 6.4853 Epoch: 2 Global Step: 35250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:37:15,563-Speed 4051.74 samples/sec Loss 6.5602 Epoch: 2 Global Step: 35300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:37:27,068-Speed 4450.51 samples/sec Loss 6.5843 Epoch: 2 Global Step: 35350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:37:38,597-Speed 4440.95 samples/sec Loss 6.6065 Epoch: 2 Global Step: 35400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:37:50,122-Speed 4442.95 samples/sec Loss 6.6198 Epoch: 2 Global Step: 35450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:38:01,588-Speed 4465.26 samples/sec Loss 6.6232 Epoch: 2 Global Step: 35500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:38:14,242-Speed 4046.63 samples/sec Loss 6.6667 Epoch: 2 Global Step: 35550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:38:25,834-Speed 4417.19 samples/sec Loss 6.6658 Epoch: 2 Global Step: 35600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:38:37,254-Speed 4483.47 samples/sec Loss 6.5998 Epoch: 2 Global Step: 35650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:38:48,861-Speed 4411.25 samples/sec Loss 6.6487 Epoch: 2 Global Step: 35700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:39:00,346-Speed 4458.36 samples/sec Loss 6.5964 Epoch: 2 Global Step: 35750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:39:12,444-Speed 4232.17 samples/sec Loss 6.5892 Epoch: 2 Global Step: 35800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:39:23,928-Speed 4458.67 samples/sec Loss 6.5882 Epoch: 2 Global Step: 35850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:39:36,237-Speed 4160.00 samples/sec Loss 6.6305 Epoch: 2 Global Step: 35900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:39:48,399-Speed 4209.83 samples/sec Loss 6.6424 Epoch: 2 Global Step: 35950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:40:00,017-Speed 4407.35 samples/sec Loss 6.6744 Epoch: 2 Global Step: 36000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:40:30,236-[lfw][36000]XNorm: 20.570616 Training: 2021-03-15 00:40:30,236-[lfw][36000]Accuracy-Flip: 0.99500+-0.00236 Training: 2021-03-15 00:40:30,236-[lfw][36000]Accuracy-Highest: 0.99617 Training: 2021-03-15 00:41:05,201-[cfp_fp][36000]XNorm: 17.999655 Training: 2021-03-15 00:41:05,202-[cfp_fp][36000]Accuracy-Flip: 0.92800+-0.01520 Training: 2021-03-15 00:41:05,202-[cfp_fp][36000]Accuracy-Highest: 0.93700 Training: 2021-03-15 00:41:35,313-[agedb_30][36000]XNorm: 20.378568 Training: 2021-03-15 00:41:35,313-[agedb_30][36000]Accuracy-Flip: 0.95233+-0.01274 Training: 2021-03-15 00:41:35,313-[agedb_30][36000]Accuracy-Highest: 0.95750 Training: 2021-03-15 00:41:46,701-Speed 479.92 samples/sec Loss 6.6073 Epoch: 2 Global Step: 36050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:41:58,992-Speed 4165.83 samples/sec Loss 6.6474 Epoch: 2 Global Step: 36100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:42:10,462-Speed 4463.95 samples/sec Loss 6.6359 Epoch: 2 Global Step: 36150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:42:22,134-Speed 4386.58 samples/sec Loss 6.7012 Epoch: 2 Global Step: 36200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:42:33,740-Speed 4411.78 samples/sec Loss 6.7025 Epoch: 2 Global Step: 36250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:42:45,285-Speed 4435.18 samples/sec Loss 6.6794 Epoch: 2 Global Step: 36300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:42:56,767-Speed 4459.49 samples/sec Loss 6.6519 Epoch: 2 Global Step: 36350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:43:08,358-Speed 4417.23 samples/sec Loss 6.6330 Epoch: 2 Global Step: 36400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:43:20,014-Speed 4392.93 samples/sec Loss 6.6598 Epoch: 2 Global Step: 36450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:43:31,396-Speed 4498.69 samples/sec Loss 6.6756 Epoch: 2 Global Step: 36500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:43:42,960-Speed 4427.70 samples/sec Loss 6.6149 Epoch: 2 Global Step: 36550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:43:54,488-Speed 4441.66 samples/sec Loss 6.6703 Epoch: 2 Global Step: 36600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:44:06,065-Speed 4422.58 samples/sec Loss 6.7147 Epoch: 2 Global Step: 36650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:44:17,334-Speed 4543.64 samples/sec Loss 6.6973 Epoch: 2 Global Step: 36700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:44:28,779-Speed 4473.72 samples/sec Loss 6.7330 Epoch: 2 Global Step: 36750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-15 00:44:40,169-Speed 4495.39 samples/sec Loss 6.6937 Epoch: 2 Global Step: 36800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:44:51,729-Speed 4429.49 samples/sec Loss 6.6738 Epoch: 2 Global Step: 36850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:45:03,338-Speed 4410.40 samples/sec Loss 6.6778 Epoch: 2 Global Step: 36900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:45:14,827-Speed 4456.88 samples/sec Loss 6.6994 Epoch: 2 Global Step: 36950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:45:26,318-Speed 4455.77 samples/sec Loss 6.6364 Epoch: 2 Global Step: 37000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:45:38,768-Speed 4112.50 samples/sec Loss 6.6529 Epoch: 2 Global Step: 37050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:45:50,226-Speed 4468.98 samples/sec Loss 6.6580 Epoch: 2 Global Step: 37100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:46:02,902-Speed 4039.32 samples/sec Loss 6.6522 Epoch: 2 Global Step: 37150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:46:14,226-Speed 4521.31 samples/sec Loss 6.7501 Epoch: 2 Global Step: 37200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:46:25,636-Speed 4487.66 samples/sec Loss 6.6698 Epoch: 2 Global Step: 37250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:46:37,090-Speed 4470.36 samples/sec Loss 6.6985 Epoch: 2 Global Step: 37300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:46:48,518-Speed 4480.35 samples/sec Loss 6.6713 Epoch: 2 Global Step: 37350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:47:00,095-Speed 4422.82 samples/sec Loss 6.7041 Epoch: 2 Global Step: 37400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:47:11,649-Speed 4431.59 samples/sec Loss 6.6873 Epoch: 2 Global Step: 37450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:47:22,939-Speed 4534.87 samples/sec Loss 6.7393 Epoch: 2 Global Step: 37500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:47:34,564-Speed 4404.73 samples/sec Loss 6.6885 Epoch: 2 Global Step: 37550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:47:46,011-Speed 4473.06 samples/sec Loss 6.7107 Epoch: 2 Global Step: 37600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:47:57,264-Speed 4550.28 samples/sec Loss 6.6815 Epoch: 2 Global Step: 37650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:48:09,597-Speed 4151.39 samples/sec Loss 6.6731 Epoch: 2 Global Step: 37700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:48:20,863-Speed 4545.14 samples/sec Loss 6.6914 Epoch: 2 Global Step: 37750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:48:32,300-Speed 4476.82 samples/sec Loss 6.7279 Epoch: 2 Global Step: 37800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:48:43,629-Speed 4519.67 samples/sec Loss 6.6584 Epoch: 2 Global Step: 37850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:48:55,057-Speed 4480.57 samples/sec Loss 6.7106 Epoch: 2 Global Step: 37900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:49:06,619-Speed 4428.55 samples/sec Loss 6.6735 Epoch: 2 Global Step: 37950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:49:18,145-Speed 4442.51 samples/sec Loss 6.6943 Epoch: 2 Global Step: 38000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:49:48,353-[lfw][38000]XNorm: 22.817098 Training: 2021-03-15 00:49:48,353-[lfw][38000]Accuracy-Flip: 0.99517+-0.00320 Training: 2021-03-15 00:49:48,353-[lfw][38000]Accuracy-Highest: 0.99617 Training: 2021-03-15 00:50:23,212-[cfp_fp][38000]XNorm: 20.055417 Training: 2021-03-15 00:50:23,212-[cfp_fp][38000]Accuracy-Flip: 0.93643+-0.00698 Training: 2021-03-15 00:50:23,213-[cfp_fp][38000]Accuracy-Highest: 0.93700 Training: 2021-03-15 00:50:53,363-[agedb_30][38000]XNorm: 22.155794 Training: 2021-03-15 00:50:53,364-[agedb_30][38000]Accuracy-Flip: 0.94950+-0.01019 Training: 2021-03-15 00:50:53,364-[agedb_30][38000]Accuracy-Highest: 0.95750 Training: 2021-03-15 00:51:04,715-Speed 480.44 samples/sec Loss 6.6643 Epoch: 2 Global Step: 38050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:51:16,179-Speed 4466.03 samples/sec Loss 6.7380 Epoch: 2 Global Step: 38100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:51:28,963-Speed 4005.19 samples/sec Loss 6.6832 Epoch: 2 Global Step: 38150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:51:40,396-Speed 4478.81 samples/sec Loss 6.7429 Epoch: 2 Global Step: 38200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:51:51,693-Speed 4532.12 samples/sec Loss 6.7047 Epoch: 2 Global Step: 38250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:52:03,865-Speed 4206.77 samples/sec Loss 6.7242 Epoch: 2 Global Step: 38300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:52:16,190-Speed 4154.27 samples/sec Loss 6.7390 Epoch: 2 Global Step: 38350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:52:28,431-Speed 4182.76 samples/sec Loss 6.7370 Epoch: 2 Global Step: 38400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:52:39,693-Speed 4546.63 samples/sec Loss 6.7663 Epoch: 2 Global Step: 38450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:52:51,097-Speed 4489.78 samples/sec Loss 6.7138 Epoch: 2 Global Step: 38500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:53:03,083-Speed 4271.82 samples/sec Loss 6.7123 Epoch: 2 Global Step: 38550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:53:14,397-Speed 4525.51 samples/sec Loss 6.7074 Epoch: 2 Global Step: 38600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:53:25,941-Speed 4435.57 samples/sec Loss 6.6982 Epoch: 2 Global Step: 38650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:53:37,472-Speed 4440.23 samples/sec Loss 6.7732 Epoch: 2 Global Step: 38700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:53:48,791-Speed 4523.81 samples/sec Loss 6.7558 Epoch: 2 Global Step: 38750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:54:00,277-Speed 4457.77 samples/sec Loss 6.7403 Epoch: 2 Global Step: 38800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:54:11,855-Speed 4422.39 samples/sec Loss 6.7056 Epoch: 2 Global Step: 38850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:54:23,176-Speed 4522.72 samples/sec Loss 6.6428 Epoch: 2 Global Step: 38900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:54:34,608-Speed 4478.77 samples/sec Loss 6.6441 Epoch: 2 Global Step: 38950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:54:45,943-Speed 4517.39 samples/sec Loss 6.7332 Epoch: 2 Global Step: 39000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:54:57,477-Speed 4439.04 samples/sec Loss 6.7336 Epoch: 2 Global Step: 39050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:55:08,913-Speed 4477.63 samples/sec Loss 6.7557 Epoch: 2 Global Step: 39100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:55:20,278-Speed 4505.02 samples/sec Loss 6.6858 Epoch: 2 Global Step: 39150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:55:31,559-Speed 4538.87 samples/sec Loss 6.7435 Epoch: 2 Global Step: 39200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:55:43,017-Speed 4468.57 samples/sec Loss 6.7769 Epoch: 2 Global Step: 39250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:55:54,308-Speed 4534.78 samples/sec Loss 6.7416 Epoch: 2 Global Step: 39300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:56:05,735-Speed 4481.08 samples/sec Loss 6.7553 Epoch: 2 Global Step: 39350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:56:17,367-Speed 4401.76 samples/sec Loss 6.7089 Epoch: 2 Global Step: 39400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:56:29,444-Speed 4239.82 samples/sec Loss 6.7551 Epoch: 2 Global Step: 39450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:56:40,803-Speed 4507.71 samples/sec Loss 6.6993 Epoch: 2 Global Step: 39500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:56:52,222-Speed 4483.81 samples/sec Loss 6.6823 Epoch: 2 Global Step: 39550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:57:04,524-Speed 4162.23 samples/sec Loss 6.7217 Epoch: 2 Global Step: 39600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:57:15,938-Speed 4486.09 samples/sec Loss 6.7355 Epoch: 2 Global Step: 39650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:57:27,457-Speed 4444.94 samples/sec Loss 6.6511 Epoch: 2 Global Step: 39700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:57:38,754-Speed 4532.49 samples/sec Loss 6.6919 Epoch: 2 Global Step: 39750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:57:49,999-Speed 4553.35 samples/sec Loss 6.6716 Epoch: 2 Global Step: 39800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:58:01,252-Speed 4550.03 samples/sec Loss 6.7257 Epoch: 2 Global Step: 39850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:58:12,493-Speed 4554.85 samples/sec Loss 6.7443 Epoch: 2 Global Step: 39900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:58:24,140-Speed 4396.46 samples/sec Loss 6.7707 Epoch: 2 Global Step: 39950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:58:35,493-Speed 4509.80 samples/sec Loss 6.7172 Epoch: 2 Global Step: 40000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 00:59:05,812-[lfw][40000]XNorm: 22.154830 Training: 2021-03-15 00:59:05,812-[lfw][40000]Accuracy-Flip: 0.99500+-0.00289 Training: 2021-03-15 00:59:05,812-[lfw][40000]Accuracy-Highest: 0.99617 Training: 2021-03-15 00:59:41,025-[cfp_fp][40000]XNorm: 18.652096 Training: 2021-03-15 00:59:41,025-[cfp_fp][40000]Accuracy-Flip: 0.92543+-0.01250 Training: 2021-03-15 00:59:41,025-[cfp_fp][40000]Accuracy-Highest: 0.93700 Training: 2021-03-15 01:00:11,519-[agedb_30][40000]XNorm: 21.494603 Training: 2021-03-15 01:00:11,519-[agedb_30][40000]Accuracy-Flip: 0.95067+-0.00907 Training: 2021-03-15 01:00:11,520-[agedb_30][40000]Accuracy-Highest: 0.95750 Training: 2021-03-15 01:00:22,792-Speed 477.18 samples/sec Loss 6.7375 Epoch: 2 Global Step: 40050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:00:34,217-Speed 4481.60 samples/sec Loss 6.7467 Epoch: 2 Global Step: 40100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:00:46,299-Speed 4237.85 samples/sec Loss 6.6918 Epoch: 2 Global Step: 40150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:00:57,671-Speed 4502.37 samples/sec Loss 6.6861 Epoch: 2 Global Step: 40200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:01:08,907-Speed 4557.21 samples/sec Loss 6.7089 Epoch: 2 Global Step: 40250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:01:20,450-Speed 4435.79 samples/sec Loss 6.6695 Epoch: 2 Global Step: 40300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:01:31,950-Speed 4452.09 samples/sec Loss 6.7001 Epoch: 2 Global Step: 40350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:01:43,271-Speed 4522.88 samples/sec Loss 6.7330 Epoch: 2 Global Step: 40400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:01:54,741-Speed 4464.02 samples/sec Loss 6.7136 Epoch: 2 Global Step: 40450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:02:05,990-Speed 4551.96 samples/sec Loss 6.7491 Epoch: 2 Global Step: 40500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:02:17,429-Speed 4475.91 samples/sec Loss 6.7223 Epoch: 2 Global Step: 40550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:02:28,669-Speed 4555.33 samples/sec Loss 6.7373 Epoch: 2 Global Step: 40600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:02:39,925-Speed 4549.06 samples/sec Loss 6.7670 Epoch: 2 Global Step: 40650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:02:52,117-Speed 4199.79 samples/sec Loss 6.7244 Epoch: 2 Global Step: 40700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:03:03,739-Speed 4405.62 samples/sec Loss 6.7795 Epoch: 2 Global Step: 40750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:03:16,936-Speed 3879.68 samples/sec Loss 6.6796 Epoch: 2 Global Step: 40800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:03:29,316-Speed 4135.83 samples/sec Loss 6.7528 Epoch: 2 Global Step: 40850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:03:40,721-Speed 4489.67 samples/sec Loss 6.7082 Epoch: 2 Global Step: 40900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:03:51,909-Speed 4576.58 samples/sec Loss 6.7541 Epoch: 2 Global Step: 40950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:04:03,364-Speed 4469.63 samples/sec Loss 6.7344 Epoch: 2 Global Step: 41000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:04:14,602-Speed 4556.58 samples/sec Loss 6.7074 Epoch: 2 Global Step: 41050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:04:26,890-Speed 4166.79 samples/sec Loss 6.7116 Epoch: 2 Global Step: 41100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:04:38,421-Speed 4440.52 samples/sec Loss 6.6800 Epoch: 2 Global Step: 41150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:04:49,672-Speed 4550.64 samples/sec Loss 6.7027 Epoch: 2 Global Step: 41200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:05:00,994-Speed 4522.69 samples/sec Loss 6.7426 Epoch: 2 Global Step: 41250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:05:12,270-Speed 4540.97 samples/sec Loss 6.7440 Epoch: 2 Global Step: 41300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:05:23,747-Speed 4461.34 samples/sec Loss 6.7095 Epoch: 2 Global Step: 41350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:05:35,203-Speed 4469.20 samples/sec Loss 6.7548 Epoch: 2 Global Step: 41400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:05:46,540-Speed 4516.53 samples/sec Loss 6.7046 Epoch: 2 Global Step: 41450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:05:57,922-Speed 4498.47 samples/sec Loss 6.6938 Epoch: 2 Global Step: 41500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:06:09,180-Speed 4548.30 samples/sec Loss 6.7260 Epoch: 2 Global Step: 41550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:06:20,676-Speed 4453.94 samples/sec Loss 6.7262 Epoch: 2 Global Step: 41600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:06:32,093-Speed 4484.80 samples/sec Loss 6.6891 Epoch: 2 Global Step: 41650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:06:43,448-Speed 4509.16 samples/sec Loss 6.7552 Epoch: 2 Global Step: 41700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:06:54,921-Speed 4462.95 samples/sec Loss 6.7090 Epoch: 2 Global Step: 41750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:07:06,527-Speed 4411.69 samples/sec Loss 6.6781 Epoch: 2 Global Step: 41800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:07:18,886-Speed 4142.85 samples/sec Loss 6.6933 Epoch: 2 Global Step: 41850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:07:30,415-Speed 4441.43 samples/sec Loss 6.7474 Epoch: 2 Global Step: 41900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:07:42,019-Speed 4412.64 samples/sec Loss 6.7302 Epoch: 2 Global Step: 41950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:07:53,363-Speed 4513.91 samples/sec Loss 6.7123 Epoch: 2 Global Step: 42000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:08:23,748-[lfw][42000]XNorm: 22.260112 Training: 2021-03-15 01:08:23,748-[lfw][42000]Accuracy-Flip: 0.99567+-0.00300 Training: 2021-03-15 01:08:23,748-[lfw][42000]Accuracy-Highest: 0.99617 Training: 2021-03-15 01:08:58,954-[cfp_fp][42000]XNorm: 19.033072 Training: 2021-03-15 01:08:58,955-[cfp_fp][42000]Accuracy-Flip: 0.93786+-0.01272 Training: 2021-03-15 01:08:58,955-[cfp_fp][42000]Accuracy-Highest: 0.93786 Training: 2021-03-15 01:09:29,349-[agedb_30][42000]XNorm: 21.351666 Training: 2021-03-15 01:09:29,350-[agedb_30][42000]Accuracy-Flip: 0.95167+-0.00904 Training: 2021-03-15 01:09:29,350-[agedb_30][42000]Accuracy-Highest: 0.95750 Training: 2021-03-15 01:09:40,664-Speed 477.16 samples/sec Loss 6.6863 Epoch: 2 Global Step: 42050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:09:51,950-Speed 4536.83 samples/sec Loss 6.7562 Epoch: 2 Global Step: 42100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:10:04,209-Speed 4176.55 samples/sec Loss 6.7501 Epoch: 2 Global Step: 42150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:10:15,562-Speed 4510.30 samples/sec Loss 6.7016 Epoch: 2 Global Step: 42200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:10:26,910-Speed 4512.15 samples/sec Loss 6.6736 Epoch: 2 Global Step: 42250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:10:38,284-Speed 4501.72 samples/sec Loss 6.7036 Epoch: 2 Global Step: 42300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:10:49,785-Speed 4452.03 samples/sec Loss 6.6647 Epoch: 2 Global Step: 42350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:11:01,177-Speed 4494.55 samples/sec Loss 6.7530 Epoch: 2 Global Step: 42400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:11:12,608-Speed 4479.06 samples/sec Loss 6.6931 Epoch: 2 Global Step: 42450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:11:24,160-Speed 4432.42 samples/sec Loss 6.6792 Epoch: 2 Global Step: 42500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:11:35,600-Speed 4475.61 samples/sec Loss 6.6838 Epoch: 2 Global Step: 42550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:11:47,311-Speed 4372.16 samples/sec Loss 6.7151 Epoch: 2 Global Step: 42600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:11:59,780-Speed 4106.51 samples/sec Loss 6.7616 Epoch: 2 Global Step: 42650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:12:11,298-Speed 4445.23 samples/sec Loss 6.7379 Epoch: 2 Global Step: 42700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:12:22,780-Speed 4459.53 samples/sec Loss 6.6442 Epoch: 2 Global Step: 42750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:12:34,126-Speed 4512.90 samples/sec Loss 6.7285 Epoch: 2 Global Step: 42800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:12:45,640-Speed 4446.99 samples/sec Loss 6.6708 Epoch: 2 Global Step: 42850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:12:57,115-Speed 4462.20 samples/sec Loss 6.6824 Epoch: 2 Global Step: 42900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:13:08,646-Speed 4440.32 samples/sec Loss 6.7075 Epoch: 2 Global Step: 42950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:13:19,931-Speed 4537.32 samples/sec Loss 6.7266 Epoch: 2 Global Step: 43000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:13:31,321-Speed 4495.53 samples/sec Loss 6.7772 Epoch: 2 Global Step: 43050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:13:42,652-Speed 4518.50 samples/sec Loss 6.6758 Epoch: 2 Global Step: 43100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:13:53,976-Speed 4521.54 samples/sec Loss 6.6779 Epoch: 2 Global Step: 43150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:14:06,130-Speed 4212.90 samples/sec Loss 6.6789 Epoch: 2 Global Step: 43200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:14:18,373-Speed 4182.18 samples/sec Loss 6.7157 Epoch: 2 Global Step: 43250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:14:31,538-Speed 3889.35 samples/sec Loss 6.6736 Epoch: 2 Global Step: 43300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:14:43,040-Speed 4451.56 samples/sec Loss 6.7317 Epoch: 2 Global Step: 43350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:14:54,332-Speed 4534.47 samples/sec Loss 6.7425 Epoch: 2 Global Step: 43400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:15:05,687-Speed 4509.14 samples/sec Loss 6.6712 Epoch: 2 Global Step: 43450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:15:17,375-Speed 4380.80 samples/sec Loss 6.7187 Epoch: 2 Global Step: 43500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:15:28,567-Speed 4575.05 samples/sec Loss 6.7276 Epoch: 2 Global Step: 43550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:15:40,948-Speed 4135.54 samples/sec Loss 6.6802 Epoch: 2 Global Step: 43600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:15:52,474-Speed 4442.26 samples/sec Loss 6.7245 Epoch: 2 Global Step: 43650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:16:03,817-Speed 4514.27 samples/sec Loss 6.7491 Epoch: 2 Global Step: 43700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:16:15,168-Speed 4510.83 samples/sec Loss 6.6686 Epoch: 2 Global Step: 43750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:16:26,559-Speed 4495.04 samples/sec Loss 6.6942 Epoch: 2 Global Step: 43800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:16:37,920-Speed 4506.73 samples/sec Loss 6.7054 Epoch: 2 Global Step: 43850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:16:49,381-Speed 4467.54 samples/sec Loss 6.7041 Epoch: 2 Global Step: 43900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:17:00,962-Speed 4421.28 samples/sec Loss 6.6784 Epoch: 2 Global Step: 43950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:17:12,312-Speed 4511.26 samples/sec Loss 6.6241 Epoch: 2 Global Step: 44000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:17:42,732-[lfw][44000]XNorm: 22.897952 Training: 2021-03-15 01:17:42,732-[lfw][44000]Accuracy-Flip: 0.99650+-0.00263 Training: 2021-03-15 01:17:42,733-[lfw][44000]Accuracy-Highest: 0.99650 Training: 2021-03-15 01:18:17,857-[cfp_fp][44000]XNorm: 19.490823 Training: 2021-03-15 01:18:17,857-[cfp_fp][44000]Accuracy-Flip: 0.93857+-0.00894 Training: 2021-03-15 01:18:17,857-[cfp_fp][44000]Accuracy-Highest: 0.93857 Training: 2021-03-15 01:18:48,094-[agedb_30][44000]XNorm: 21.792187 Training: 2021-03-15 01:18:48,095-[agedb_30][44000]Accuracy-Flip: 0.96083+-0.01216 Training: 2021-03-15 01:18:48,095-[agedb_30][44000]Accuracy-Highest: 0.96083 Training: 2021-03-15 01:18:59,501-Speed 477.66 samples/sec Loss 6.6795 Epoch: 2 Global Step: 44050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:19:10,913-Speed 4486.73 samples/sec Loss 6.7185 Epoch: 2 Global Step: 44100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:19:23,057-Speed 4216.25 samples/sec Loss 6.7046 Epoch: 2 Global Step: 44150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:19:34,461-Speed 4489.94 samples/sec Loss 6.6442 Epoch: 2 Global Step: 44200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:19:45,555-Speed 4615.22 samples/sec Loss 6.6686 Epoch: 2 Global Step: 44250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:19:57,061-Speed 4449.89 samples/sec Loss 6.7126 Epoch: 2 Global Step: 44300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:20:08,376-Speed 4525.28 samples/sec Loss 6.7436 Epoch: 2 Global Step: 44350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:20:19,599-Speed 4562.51 samples/sec Loss 6.6922 Epoch: 2 Global Step: 44400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:20:30,881-Speed 4538.46 samples/sec Loss 6.7226 Epoch: 2 Global Step: 44450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:20:42,225-Speed 4513.49 samples/sec Loss 6.6535 Epoch: 2 Global Step: 44500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:20:53,672-Speed 4472.99 samples/sec Loss 6.6582 Epoch: 2 Global Step: 44550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:21:04,751-Speed 4621.60 samples/sec Loss 6.6719 Epoch: 2 Global Step: 44600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:21:16,860-Speed 4228.70 samples/sec Loss 6.6691 Epoch: 2 Global Step: 44650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:21:28,194-Speed 4517.54 samples/sec Loss 6.7078 Epoch: 2 Global Step: 44700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:21:39,494-Speed 4531.27 samples/sec Loss 6.6568 Epoch: 2 Global Step: 44750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:21:50,849-Speed 4509.36 samples/sec Loss 6.6078 Epoch: 2 Global Step: 44800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:22:02,294-Speed 4473.68 samples/sec Loss 6.7068 Epoch: 2 Global Step: 44850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:22:13,809-Speed 4446.45 samples/sec Loss 6.6579 Epoch: 2 Global Step: 44900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:22:25,172-Speed 4506.11 samples/sec Loss 6.6719 Epoch: 2 Global Step: 44950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:22:36,503-Speed 4519.14 samples/sec Loss 6.7335 Epoch: 2 Global Step: 45000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:22:47,706-Speed 4570.38 samples/sec Loss 6.7057 Epoch: 2 Global Step: 45050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:22:58,983-Speed 4540.36 samples/sec Loss 6.6452 Epoch: 2 Global Step: 45100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:23:11,048-Speed 4243.89 samples/sec Loss 6.6623 Epoch: 2 Global Step: 45150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:23:22,248-Speed 4571.81 samples/sec Loss 6.6889 Epoch: 2 Global Step: 45200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:23:33,775-Speed 4442.14 samples/sec Loss 6.6972 Epoch: 2 Global Step: 45250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:23:45,066-Speed 4534.51 samples/sec Loss 6.6658 Epoch: 2 Global Step: 45300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:23:56,371-Speed 4529.18 samples/sec Loss 6.6653 Epoch: 2 Global Step: 45350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:24:07,780-Speed 4488.18 samples/sec Loss 6.6838 Epoch: 2 Global Step: 45400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:24:19,188-Speed 4488.01 samples/sec Loss 6.6469 Epoch: 2 Global Step: 45450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:24:30,679-Speed 4456.31 samples/sec Loss 6.6847 Epoch: 2 Global Step: 45500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:24:41,973-Speed 4533.58 samples/sec Loss 6.7398 Epoch: 2 Global Step: 45550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:24:53,393-Speed 4483.34 samples/sec Loss 6.6943 Epoch: 2 Global Step: 45600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:25:05,455-Speed 4244.92 samples/sec Loss 6.6513 Epoch: 2 Global Step: 45650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:25:16,744-Speed 4535.90 samples/sec Loss 6.7070 Epoch: 2 Global Step: 45700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:25:28,079-Speed 4517.13 samples/sec Loss 6.6443 Epoch: 2 Global Step: 45750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:25:40,227-Speed 4214.85 samples/sec Loss 6.5963 Epoch: 2 Global Step: 45800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:25:53,211-Speed 3943.34 samples/sec Loss 6.6499 Epoch: 2 Global Step: 45850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:26:04,697-Speed 4458.06 samples/sec Loss 6.6672 Epoch: 2 Global Step: 45900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:26:16,040-Speed 4514.08 samples/sec Loss 6.7393 Epoch: 2 Global Step: 45950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:26:27,453-Speed 4486.57 samples/sec Loss 6.7006 Epoch: 2 Global Step: 46000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:26:57,771-[lfw][46000]XNorm: 22.914506 Training: 2021-03-15 01:26:57,771-[lfw][46000]Accuracy-Flip: 0.99517+-0.00329 Training: 2021-03-15 01:26:57,771-[lfw][46000]Accuracy-Highest: 0.99650 Training: 2021-03-15 01:27:32,982-[cfp_fp][46000]XNorm: 19.014507 Training: 2021-03-15 01:27:32,982-[cfp_fp][46000]Accuracy-Flip: 0.92786+-0.01041 Training: 2021-03-15 01:27:32,982-[cfp_fp][46000]Accuracy-Highest: 0.93857 Training: 2021-03-15 01:28:03,249-[agedb_30][46000]XNorm: 22.407216 Training: 2021-03-15 01:28:03,250-[agedb_30][46000]Accuracy-Flip: 0.95500+-0.01041 Training: 2021-03-15 01:28:03,250-[agedb_30][46000]Accuracy-Highest: 0.96083 Training: 2021-03-15 01:28:14,752-Speed 477.17 samples/sec Loss 6.6068 Epoch: 2 Global Step: 46050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:28:26,087-Speed 4517.12 samples/sec Loss 6.6467 Epoch: 2 Global Step: 46100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:28:38,220-Speed 4220.29 samples/sec Loss 6.6218 Epoch: 2 Global Step: 46150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:28:49,682-Speed 4467.22 samples/sec Loss 6.6873 Epoch: 2 Global Step: 46200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:29:01,174-Speed 4455.64 samples/sec Loss 6.6778 Epoch: 2 Global Step: 46250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:29:12,624-Speed 4471.85 samples/sec Loss 6.6753 Epoch: 2 Global Step: 46300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:29:24,108-Speed 4458.83 samples/sec Loss 6.6462 Epoch: 2 Global Step: 46350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:29:35,583-Speed 4462.02 samples/sec Loss 6.6063 Epoch: 2 Global Step: 46400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:29:46,747-Speed 4586.71 samples/sec Loss 6.6514 Epoch: 2 Global Step: 46450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:29:58,226-Speed 4460.58 samples/sec Loss 6.6345 Epoch: 2 Global Step: 46500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:30:09,518-Speed 4534.25 samples/sec Loss 6.6674 Epoch: 2 Global Step: 46550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:30:20,679-Speed 4587.54 samples/sec Loss 6.6455 Epoch: 2 Global Step: 46600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:30:32,005-Speed 4520.95 samples/sec Loss 6.6619 Epoch: 2 Global Step: 46650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:30:44,418-Speed 4124.88 samples/sec Loss 6.6835 Epoch: 2 Global Step: 46700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:30:55,758-Speed 4515.44 samples/sec Loss 6.6259 Epoch: 2 Global Step: 46750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:31:07,021-Speed 4546.11 samples/sec Loss 6.7076 Epoch: 2 Global Step: 46800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:31:18,301-Speed 4539.16 samples/sec Loss 6.6665 Epoch: 2 Global Step: 46850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:31:29,846-Speed 4435.16 samples/sec Loss 6.5896 Epoch: 2 Global Step: 46900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:31:41,236-Speed 4495.47 samples/sec Loss 6.6127 Epoch: 2 Global Step: 46950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:31:52,670-Speed 4478.15 samples/sec Loss 6.6413 Epoch: 2 Global Step: 47000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:32:04,131-Speed 4467.23 samples/sec Loss 6.6982 Epoch: 2 Global Step: 47050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:32:16,333-Speed 4196.34 samples/sec Loss 6.6503 Epoch: 2 Global Step: 47100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:32:27,679-Speed 4512.68 samples/sec Loss 6.6417 Epoch: 2 Global Step: 47150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:32:39,087-Speed 4488.66 samples/sec Loss 6.6513 Epoch: 2 Global Step: 47200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:32:50,611-Speed 4442.98 samples/sec Loss 6.6909 Epoch: 2 Global Step: 47250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:33:02,026-Speed 4485.63 samples/sec Loss 6.6784 Epoch: 2 Global Step: 47300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:33:13,367-Speed 4514.79 samples/sec Loss 6.6895 Epoch: 2 Global Step: 47350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:33:24,811-Speed 4474.23 samples/sec Loss 6.6692 Epoch: 2 Global Step: 47400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:33:36,486-Speed 4385.58 samples/sec Loss 6.6315 Epoch: 2 Global Step: 47450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:33:47,933-Speed 4473.14 samples/sec Loss 6.6511 Epoch: 2 Global Step: 47500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:33:59,512-Speed 4421.70 samples/sec Loss 6.6674 Epoch: 2 Global Step: 47550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:34:10,912-Speed 4491.56 samples/sec Loss 6.6704 Epoch: 2 Global Step: 47600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:34:23,296-Speed 4134.76 samples/sec Loss 6.6957 Epoch: 2 Global Step: 47650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:34:34,752-Speed 4469.47 samples/sec Loss 6.6657 Epoch: 2 Global Step: 47700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:34:46,199-Speed 4472.98 samples/sec Loss 6.6616 Epoch: 2 Global Step: 47750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:34:57,448-Speed 4551.72 samples/sec Loss 6.6950 Epoch: 2 Global Step: 47800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:35:08,737-Speed 4535.85 samples/sec Loss 6.6875 Epoch: 2 Global Step: 47850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:35:20,212-Speed 4461.85 samples/sec Loss 6.6784 Epoch: 2 Global Step: 47900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:35:31,637-Speed 4481.60 samples/sec Loss 6.6762 Epoch: 2 Global Step: 47950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:35:43,080-Speed 4474.68 samples/sec Loss 6.6133 Epoch: 2 Global Step: 48000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:36:13,216-[lfw][48000]XNorm: 22.881096 Training: 2021-03-15 01:36:13,216-[lfw][48000]Accuracy-Flip: 0.99467+-0.00267 Training: 2021-03-15 01:36:13,216-[lfw][48000]Accuracy-Highest: 0.99650 Training: 2021-03-15 01:36:48,268-[cfp_fp][48000]XNorm: 18.981893 Training: 2021-03-15 01:36:48,269-[cfp_fp][48000]Accuracy-Flip: 0.93600+-0.00971 Training: 2021-03-15 01:36:48,269-[cfp_fp][48000]Accuracy-Highest: 0.93857 Training: 2021-03-15 01:37:18,400-[agedb_30][48000]XNorm: 22.124089 Training: 2021-03-15 01:37:18,401-[agedb_30][48000]Accuracy-Flip: 0.95517+-0.00883 Training: 2021-03-15 01:37:18,401-[agedb_30][48000]Accuracy-Highest: 0.96083 Training: 2021-03-15 01:37:29,898-Speed 479.32 samples/sec Loss 6.5774 Epoch: 2 Global Step: 48050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:37:41,341-Speed 4474.57 samples/sec Loss 6.6146 Epoch: 2 Global Step: 48100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:37:52,765-Speed 4481.74 samples/sec Loss 6.6525 Epoch: 2 Global Step: 48150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:38:04,985-Speed 4190.44 samples/sec Loss 6.6752 Epoch: 2 Global Step: 48200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:38:18,016-Speed 3929.36 samples/sec Loss 6.6300 Epoch: 2 Global Step: 48250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:38:29,404-Speed 4495.93 samples/sec Loss 6.6613 Epoch: 2 Global Step: 48300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:38:41,836-Speed 4118.58 samples/sec Loss 6.6094 Epoch: 2 Global Step: 48350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-15 01:38:53,420-Speed 4420.12 samples/sec Loss 6.6214 Epoch: 2 Global Step: 48400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:39:04,698-Speed 4540.26 samples/sec Loss 6.6035 Epoch: 2 Global Step: 48450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:39:15,959-Speed 4546.82 samples/sec Loss 6.6246 Epoch: 2 Global Step: 48500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:39:27,605-Speed 4396.70 samples/sec Loss 6.6149 Epoch: 2 Global Step: 48550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:39:39,308-Speed 4375.01 samples/sec Loss 6.6167 Epoch: 2 Global Step: 48600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:39:51,459-Speed 4214.00 samples/sec Loss 6.6511 Epoch: 2 Global Step: 48650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:40:02,867-Speed 4488.31 samples/sec Loss 6.6062 Epoch: 2 Global Step: 48700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:40:14,197-Speed 4519.26 samples/sec Loss 6.6501 Epoch: 2 Global Step: 48750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:40:25,530-Speed 4517.80 samples/sec Loss 6.5944 Epoch: 2 Global Step: 48800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:40:36,897-Speed 4504.54 samples/sec Loss 6.6432 Epoch: 2 Global Step: 48850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:40:48,385-Speed 4457.20 samples/sec Loss 6.6366 Epoch: 2 Global Step: 48900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:40:59,727-Speed 4514.27 samples/sec Loss 6.6075 Epoch: 2 Global Step: 48950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:41:11,044-Speed 4524.38 samples/sec Loss 6.6197 Epoch: 2 Global Step: 49000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:41:22,429-Speed 4497.57 samples/sec Loss 6.6056 Epoch: 2 Global Step: 49050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:41:33,817-Speed 4495.85 samples/sec Loss 6.6312 Epoch: 2 Global Step: 49100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:41:46,017-Speed 4196.97 samples/sec Loss 6.6568 Epoch: 2 Global Step: 49150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:41:57,598-Speed 4421.40 samples/sec Loss 6.6407 Epoch: 2 Global Step: 49200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:42:09,007-Speed 4487.74 samples/sec Loss 6.6028 Epoch: 2 Global Step: 49250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:42:20,444-Speed 4476.84 samples/sec Loss 6.6573 Epoch: 2 Global Step: 49300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:42:32,053-Speed 4410.73 samples/sec Loss 6.6166 Epoch: 2 Global Step: 49350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:42:43,419-Speed 4504.77 samples/sec Loss 6.6593 Epoch: 2 Global Step: 49400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:42:54,915-Speed 4454.04 samples/sec Loss 6.6486 Epoch: 2 Global Step: 49450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:43:06,320-Speed 4489.85 samples/sec Loss 6.6076 Epoch: 2 Global Step: 49500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:43:18,551-Speed 4186.13 samples/sec Loss 6.6561 Epoch: 2 Global Step: 49550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:43:29,868-Speed 4524.27 samples/sec Loss 6.6446 Epoch: 2 Global Step: 49600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:43:41,275-Speed 4488.87 samples/sec Loss 6.6685 Epoch: 2 Global Step: 49650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:43:52,606-Speed 4518.72 samples/sec Loss 6.6496 Epoch: 2 Global Step: 49700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:44:04,164-Speed 4430.06 samples/sec Loss 6.6786 Epoch: 2 Global Step: 49750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:44:15,792-Speed 4403.38 samples/sec Loss 6.6522 Epoch: 2 Global Step: 49800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:44:27,251-Speed 4468.35 samples/sec Loss 6.6108 Epoch: 2 Global Step: 49850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:44:38,554-Speed 4530.18 samples/sec Loss 6.5940 Epoch: 2 Global Step: 49900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:44:49,934-Speed 4499.11 samples/sec Loss 6.6030 Epoch: 2 Global Step: 49950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:45:01,364-Speed 4479.87 samples/sec Loss 6.6291 Epoch: 2 Global Step: 50000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:45:31,435-[lfw][50000]XNorm: 23.320279 Training: 2021-03-15 01:45:31,436-[lfw][50000]Accuracy-Flip: 0.99583+-0.00327 Training: 2021-03-15 01:45:31,436-[lfw][50000]Accuracy-Highest: 0.99650 Training: 2021-03-15 01:46:06,390-[cfp_fp][50000]XNorm: 20.119425 Training: 2021-03-15 01:46:06,390-[cfp_fp][50000]Accuracy-Flip: 0.92843+-0.01093 Training: 2021-03-15 01:46:06,392-[cfp_fp][50000]Accuracy-Highest: 0.93857 Training: 2021-03-15 01:46:36,475-[agedb_30][50000]XNorm: 22.347532 Training: 2021-03-15 01:46:36,475-[agedb_30][50000]Accuracy-Flip: 0.95200+-0.01087 Training: 2021-03-15 01:46:36,477-[agedb_30][50000]Accuracy-Highest: 0.96083 Training: 2021-03-15 01:46:47,885-Speed 480.66 samples/sec Loss 6.6844 Epoch: 2 Global Step: 50050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:47:12,769-Speed 2057.57 samples/sec Loss 6.2389 Epoch: 3 Global Step: 50100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:47:25,748-Speed 3945.23 samples/sec Loss 5.9183 Epoch: 3 Global Step: 50150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:47:37,301-Speed 4431.99 samples/sec Loss 5.9489 Epoch: 3 Global Step: 50200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:47:48,678-Speed 4500.61 samples/sec Loss 5.9715 Epoch: 3 Global Step: 50250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:48:00,034-Speed 4509.29 samples/sec Loss 5.9643 Epoch: 3 Global Step: 50300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:48:11,621-Speed 4418.75 samples/sec Loss 6.0014 Epoch: 3 Global Step: 50350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:48:23,049-Speed 4480.56 samples/sec Loss 6.0500 Epoch: 3 Global Step: 50400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:48:34,529-Speed 4460.18 samples/sec Loss 6.0379 Epoch: 3 Global Step: 50450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:48:46,166-Speed 4400.13 samples/sec Loss 6.0671 Epoch: 3 Global Step: 50500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:48:57,716-Speed 4432.80 samples/sec Loss 6.0976 Epoch: 3 Global Step: 50550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:49:09,151-Speed 4477.73 samples/sec Loss 6.1134 Epoch: 3 Global Step: 50600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:49:21,263-Speed 4227.48 samples/sec Loss 6.1419 Epoch: 3 Global Step: 50650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:49:33,927-Speed 4043.31 samples/sec Loss 6.0505 Epoch: 3 Global Step: 50700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:49:45,178-Speed 4551.04 samples/sec Loss 6.1251 Epoch: 3 Global Step: 50750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:49:58,397-Speed 3873.31 samples/sec Loss 6.1269 Epoch: 3 Global Step: 50800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:50:09,847-Speed 4471.83 samples/sec Loss 6.2183 Epoch: 3 Global Step: 50850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:50:21,170-Speed 4521.97 samples/sec Loss 6.2205 Epoch: 3 Global Step: 50900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:50:32,428-Speed 4547.79 samples/sec Loss 6.1690 Epoch: 3 Global Step: 50950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:50:43,871-Speed 4474.72 samples/sec Loss 6.2108 Epoch: 3 Global Step: 51000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:50:55,205-Speed 4517.77 samples/sec Loss 6.2185 Epoch: 3 Global Step: 51050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:51:07,497-Speed 4165.23 samples/sec Loss 6.2162 Epoch: 3 Global Step: 51100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:51:18,986-Speed 4456.95 samples/sec Loss 6.2302 Epoch: 3 Global Step: 51150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:51:30,395-Speed 4487.58 samples/sec Loss 6.2368 Epoch: 3 Global Step: 51200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:51:41,928-Speed 4439.86 samples/sec Loss 6.2610 Epoch: 3 Global Step: 51250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:51:53,230-Speed 4530.65 samples/sec Loss 6.2251 Epoch: 3 Global Step: 51300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:52:04,712-Speed 4459.30 samples/sec Loss 6.2469 Epoch: 3 Global Step: 51350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:52:16,265-Speed 4431.81 samples/sec Loss 6.2870 Epoch: 3 Global Step: 51400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:52:27,900-Speed 4400.85 samples/sec Loss 6.2902 Epoch: 3 Global Step: 51450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:52:39,453-Speed 4431.73 samples/sec Loss 6.3223 Epoch: 3 Global Step: 51500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:52:51,068-Speed 4408.30 samples/sec Loss 6.2786 Epoch: 3 Global Step: 51550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:53:03,605-Speed 4084.32 samples/sec Loss 6.3185 Epoch: 3 Global Step: 51600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:53:14,877-Speed 4542.21 samples/sec Loss 6.2959 Epoch: 3 Global Step: 51650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:53:26,337-Speed 4468.03 samples/sec Loss 6.2735 Epoch: 3 Global Step: 51700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:53:37,732-Speed 4493.40 samples/sec Loss 6.2778 Epoch: 3 Global Step: 51750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:53:49,322-Speed 4418.04 samples/sec Loss 6.2854 Epoch: 3 Global Step: 51800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:54:00,843-Speed 4444.34 samples/sec Loss 6.3482 Epoch: 3 Global Step: 51850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:54:12,215-Speed 4502.30 samples/sec Loss 6.3642 Epoch: 3 Global Step: 51900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:54:23,530-Speed 4525.19 samples/sec Loss 6.3767 Epoch: 3 Global Step: 51950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:54:35,035-Speed 4450.61 samples/sec Loss 6.3169 Epoch: 3 Global Step: 52000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:55:05,089-[lfw][52000]XNorm: 22.031309 Training: 2021-03-15 01:55:05,089-[lfw][52000]Accuracy-Flip: 0.99633+-0.00296 Training: 2021-03-15 01:55:05,089-[lfw][52000]Accuracy-Highest: 0.99650 Training: 2021-03-15 01:55:40,141-[cfp_fp][52000]XNorm: 18.503157 Training: 2021-03-15 01:55:40,141-[cfp_fp][52000]Accuracy-Flip: 0.93700+-0.00799 Training: 2021-03-15 01:55:40,142-[cfp_fp][52000]Accuracy-Highest: 0.93857 Training: 2021-03-15 01:56:10,422-[agedb_30][52000]XNorm: 21.467481 Training: 2021-03-15 01:56:10,422-[agedb_30][52000]Accuracy-Flip: 0.95333+-0.01378 Training: 2021-03-15 01:56:10,422-[agedb_30][52000]Accuracy-Highest: 0.96083 Training: 2021-03-15 01:56:22,729-Speed 475.42 samples/sec Loss 6.3724 Epoch: 3 Global Step: 52050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:56:34,103-Speed 4501.80 samples/sec Loss 6.3364 Epoch: 3 Global Step: 52100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:56:45,749-Speed 4396.28 samples/sec Loss 6.3421 Epoch: 3 Global Step: 52150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:56:57,363-Speed 4408.62 samples/sec Loss 6.3732 Epoch: 3 Global Step: 52200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:57:08,669-Speed 4528.98 samples/sec Loss 6.3300 Epoch: 3 Global Step: 52250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:57:20,175-Speed 4450.19 samples/sec Loss 6.3319 Epoch: 3 Global Step: 52300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:57:31,452-Speed 4540.13 samples/sec Loss 6.4012 Epoch: 3 Global Step: 52350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:57:43,022-Speed 4425.52 samples/sec Loss 6.3509 Epoch: 3 Global Step: 52400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:57:54,337-Speed 4525.39 samples/sec Loss 6.3531 Epoch: 3 Global Step: 52450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:58:05,815-Speed 4460.99 samples/sec Loss 6.3749 Epoch: 3 Global Step: 52500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:58:17,258-Speed 4474.84 samples/sec Loss 6.3810 Epoch: 3 Global Step: 52550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:58:28,883-Speed 4404.44 samples/sec Loss 6.3687 Epoch: 3 Global Step: 52600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:58:41,080-Speed 4197.82 samples/sec Loss 6.4106 Epoch: 3 Global Step: 52650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:58:52,379-Speed 4531.75 samples/sec Loss 6.4177 Epoch: 3 Global Step: 52700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:59:03,774-Speed 4493.43 samples/sec Loss 6.3955 Epoch: 3 Global Step: 52750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:59:15,214-Speed 4475.47 samples/sec Loss 6.3942 Epoch: 3 Global Step: 52800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:59:26,734-Speed 4444.61 samples/sec Loss 6.4489 Epoch: 3 Global Step: 52850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:59:38,257-Speed 4443.76 samples/sec Loss 6.4010 Epoch: 3 Global Step: 52900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 01:59:49,662-Speed 4489.30 samples/sec Loss 6.4217 Epoch: 3 Global Step: 52950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:00:01,045-Speed 4498.06 samples/sec Loss 6.3908 Epoch: 3 Global Step: 53000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:00:12,552-Speed 4450.00 samples/sec Loss 6.3996 Epoch: 3 Global Step: 53050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:00:24,059-Speed 4449.54 samples/sec Loss 6.3873 Epoch: 3 Global Step: 53100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:00:35,432-Speed 4502.16 samples/sec Loss 6.4064 Epoch: 3 Global Step: 53150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:00:48,597-Speed 3889.37 samples/sec Loss 6.3933 Epoch: 3 Global Step: 53200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:01:00,871-Speed 4171.44 samples/sec Loss 6.4523 Epoch: 3 Global Step: 53250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:01:13,031-Speed 4210.75 samples/sec Loss 6.4568 Epoch: 3 Global Step: 53300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:01:24,542-Speed 4448.43 samples/sec Loss 6.5042 Epoch: 3 Global Step: 53350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:01:35,981-Speed 4476.20 samples/sec Loss 6.4748 Epoch: 3 Global Step: 53400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:01:47,385-Speed 4489.83 samples/sec Loss 6.4743 Epoch: 3 Global Step: 53450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:01:58,870-Speed 4458.09 samples/sec Loss 6.4707 Epoch: 3 Global Step: 53500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:02:10,379-Speed 4448.90 samples/sec Loss 6.4697 Epoch: 3 Global Step: 53550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:02:21,722-Speed 4514.02 samples/sec Loss 6.4111 Epoch: 3 Global Step: 53600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:02:34,120-Speed 4130.17 samples/sec Loss 6.5018 Epoch: 3 Global Step: 53650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:02:45,540-Speed 4483.64 samples/sec Loss 6.4216 Epoch: 3 Global Step: 53700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:02:56,916-Speed 4500.77 samples/sec Loss 6.4575 Epoch: 3 Global Step: 53750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:03:08,398-Speed 4459.18 samples/sec Loss 6.4134 Epoch: 3 Global Step: 53800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:03:19,614-Speed 4565.31 samples/sec Loss 6.4233 Epoch: 3 Global Step: 53850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:03:31,014-Speed 4491.78 samples/sec Loss 6.4221 Epoch: 3 Global Step: 53900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:03:42,349-Speed 4516.98 samples/sec Loss 6.4435 Epoch: 3 Global Step: 53950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:03:53,832-Speed 4458.93 samples/sec Loss 6.4291 Epoch: 3 Global Step: 54000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:04:24,089-[lfw][54000]XNorm: 22.720136 Training: 2021-03-15 02:04:24,089-[lfw][54000]Accuracy-Flip: 0.99567+-0.00281 Training: 2021-03-15 02:04:24,090-[lfw][54000]Accuracy-Highest: 0.99650 Training: 2021-03-15 02:04:59,084-[cfp_fp][54000]XNorm: 19.108253 Training: 2021-03-15 02:04:59,084-[cfp_fp][54000]Accuracy-Flip: 0.92971+-0.00830 Training: 2021-03-15 02:04:59,084-[cfp_fp][54000]Accuracy-Highest: 0.93857 Training: 2021-03-15 02:05:29,299-[agedb_30][54000]XNorm: 22.336335 Training: 2021-03-15 02:05:29,299-[agedb_30][54000]Accuracy-Flip: 0.94850+-0.00794 Training: 2021-03-15 02:05:29,299-[agedb_30][54000]Accuracy-Highest: 0.96083 Training: 2021-03-15 02:05:40,647-Speed 479.34 samples/sec Loss 6.4873 Epoch: 3 Global Step: 54050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:05:53,041-Speed 4131.04 samples/sec Loss 6.4870 Epoch: 3 Global Step: 54100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:06:04,599-Speed 4430.11 samples/sec Loss 6.4877 Epoch: 3 Global Step: 54150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:06:16,060-Speed 4467.64 samples/sec Loss 6.4398 Epoch: 3 Global Step: 54200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:06:27,451-Speed 4495.13 samples/sec Loss 6.4867 Epoch: 3 Global Step: 54250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:06:38,860-Speed 4487.66 samples/sec Loss 6.4369 Epoch: 3 Global Step: 54300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:06:50,436-Speed 4423.26 samples/sec Loss 6.4610 Epoch: 3 Global Step: 54350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:07:01,915-Speed 4460.48 samples/sec Loss 6.4760 Epoch: 3 Global Step: 54400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:07:13,385-Speed 4464.04 samples/sec Loss 6.4744 Epoch: 3 Global Step: 54450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:07:25,616-Speed 4186.30 samples/sec Loss 6.4936 Epoch: 3 Global Step: 54500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:07:36,953-Speed 4516.26 samples/sec Loss 6.4601 Epoch: 3 Global Step: 54550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:07:48,391-Speed 4476.51 samples/sec Loss 6.5042 Epoch: 3 Global Step: 54600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:07:59,906-Speed 4446.77 samples/sec Loss 6.4752 Epoch: 3 Global Step: 54650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:08:11,347-Speed 4475.43 samples/sec Loss 6.4649 Epoch: 3 Global Step: 54700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:08:22,823-Speed 4461.57 samples/sec Loss 6.4738 Epoch: 3 Global Step: 54750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:08:34,314-Speed 4456.04 samples/sec Loss 6.4772 Epoch: 3 Global Step: 54800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:08:45,695-Speed 4498.79 samples/sec Loss 6.4714 Epoch: 3 Global Step: 54850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:08:57,103-Speed 4488.41 samples/sec Loss 6.4787 Epoch: 3 Global Step: 54900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:09:08,600-Speed 4453.51 samples/sec Loss 6.4692 Epoch: 3 Global Step: 54950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:09:19,901-Speed 4530.43 samples/sec Loss 6.5290 Epoch: 3 Global Step: 55000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:09:31,367-Speed 4465.91 samples/sec Loss 6.4465 Epoch: 3 Global Step: 55050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:09:43,764-Speed 4130.18 samples/sec Loss 6.5147 Epoch: 3 Global Step: 55100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:09:55,257-Speed 4454.86 samples/sec Loss 6.4895 Epoch: 3 Global Step: 55150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:10:06,560-Speed 4530.31 samples/sec Loss 6.5079 Epoch: 3 Global Step: 55200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:10:17,975-Speed 4485.45 samples/sec Loss 6.5109 Epoch: 3 Global Step: 55250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:10:29,436-Speed 4467.90 samples/sec Loss 6.5039 Epoch: 3 Global Step: 55300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:10:40,929-Speed 4455.08 samples/sec Loss 6.5697 Epoch: 3 Global Step: 55350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:10:52,167-Speed 4556.00 samples/sec Loss 6.4672 Epoch: 3 Global Step: 55400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:11:03,648-Speed 4459.94 samples/sec Loss 6.5416 Epoch: 3 Global Step: 55450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:11:14,902-Speed 4549.51 samples/sec Loss 6.5254 Epoch: 3 Global Step: 55500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:11:26,295-Speed 4494.46 samples/sec Loss 6.4715 Epoch: 3 Global Step: 55550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:11:37,784-Speed 4456.37 samples/sec Loss 6.5201 Epoch: 3 Global Step: 55600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:11:49,941-Speed 4212.00 samples/sec Loss 6.4819 Epoch: 3 Global Step: 55650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:12:02,056-Speed 4226.26 samples/sec Loss 6.4453 Epoch: 3 Global Step: 55700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:12:14,353-Speed 4163.79 samples/sec Loss 6.5004 Epoch: 3 Global Step: 55750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:12:26,548-Speed 4198.54 samples/sec Loss 6.5378 Epoch: 3 Global Step: 55800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:12:37,953-Speed 4489.44 samples/sec Loss 6.4905 Epoch: 3 Global Step: 55850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:12:49,413-Speed 4468.15 samples/sec Loss 6.4952 Epoch: 3 Global Step: 55900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:13:01,056-Speed 4397.64 samples/sec Loss 6.5168 Epoch: 3 Global Step: 55950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:13:12,592-Speed 4438.35 samples/sec Loss 6.5682 Epoch: 3 Global Step: 56000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:13:42,947-[lfw][56000]XNorm: 23.321237 Training: 2021-03-15 02:13:42,947-[lfw][56000]Accuracy-Flip: 0.99633+-0.00287 Training: 2021-03-15 02:13:42,947-[lfw][56000]Accuracy-Highest: 0.99650 Training: 2021-03-15 02:14:17,844-[cfp_fp][56000]XNorm: 19.507751 Training: 2021-03-15 02:14:17,844-[cfp_fp][56000]Accuracy-Flip: 0.93300+-0.01189 Training: 2021-03-15 02:14:17,845-[cfp_fp][56000]Accuracy-Highest: 0.93857 Training: 2021-03-15 02:14:48,180-[agedb_30][56000]XNorm: 22.529084 Training: 2021-03-15 02:14:48,181-[agedb_30][56000]Accuracy-Flip: 0.95233+-0.00879 Training: 2021-03-15 02:14:48,181-[agedb_30][56000]Accuracy-Highest: 0.96083 Training: 2021-03-15 02:14:59,510-Speed 478.88 samples/sec Loss 6.5016 Epoch: 3 Global Step: 56050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:15:10,811-Speed 4530.67 samples/sec Loss 6.5030 Epoch: 3 Global Step: 56100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:15:22,408-Speed 4415.35 samples/sec Loss 6.5276 Epoch: 3 Global Step: 56150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:15:34,775-Speed 4139.94 samples/sec Loss 6.4782 Epoch: 3 Global Step: 56200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:15:46,134-Speed 4507.95 samples/sec Loss 6.5100 Epoch: 3 Global Step: 56250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:15:57,490-Speed 4508.64 samples/sec Loss 6.4682 Epoch: 3 Global Step: 56300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:16:08,839-Speed 4511.66 samples/sec Loss 6.4991 Epoch: 3 Global Step: 56350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:16:20,304-Speed 4466.31 samples/sec Loss 6.4922 Epoch: 3 Global Step: 56400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:16:31,735-Speed 4479.14 samples/sec Loss 6.5244 Epoch: 3 Global Step: 56450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:16:43,994-Speed 4176.75 samples/sec Loss 6.5211 Epoch: 3 Global Step: 56500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:16:55,590-Speed 4415.46 samples/sec Loss 6.5167 Epoch: 3 Global Step: 56550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:17:06,867-Speed 4540.61 samples/sec Loss 6.5465 Epoch: 3 Global Step: 56600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:17:18,497-Speed 4402.42 samples/sec Loss 6.5174 Epoch: 3 Global Step: 56650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:17:29,906-Speed 4487.96 samples/sec Loss 6.5071 Epoch: 3 Global Step: 56700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:17:41,306-Speed 4491.50 samples/sec Loss 6.5193 Epoch: 3 Global Step: 56750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:17:52,805-Speed 4452.77 samples/sec Loss 6.5436 Epoch: 3 Global Step: 56800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:18:04,335-Speed 4440.66 samples/sec Loss 6.5335 Epoch: 3 Global Step: 56850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:18:15,593-Speed 4548.26 samples/sec Loss 6.5863 Epoch: 3 Global Step: 56900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:18:27,016-Speed 4482.37 samples/sec Loss 6.5113 Epoch: 3 Global Step: 56950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:18:39,258-Speed 4182.48 samples/sec Loss 6.5363 Epoch: 3 Global Step: 57000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:18:50,771-Speed 4447.35 samples/sec Loss 6.5259 Epoch: 3 Global Step: 57050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:19:02,221-Speed 4472.18 samples/sec Loss 6.5279 Epoch: 3 Global Step: 57100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:19:13,943-Speed 4368.17 samples/sec Loss 6.5427 Epoch: 3 Global Step: 57150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:19:25,422-Speed 4460.33 samples/sec Loss 6.5117 Epoch: 3 Global Step: 57200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:19:36,834-Speed 4486.69 samples/sec Loss 6.4979 Epoch: 3 Global Step: 57250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:19:48,202-Speed 4504.30 samples/sec Loss 6.5218 Epoch: 3 Global Step: 57300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:19:59,833-Speed 4402.30 samples/sec Loss 6.5768 Epoch: 3 Global Step: 57350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:20:11,307-Speed 4462.54 samples/sec Loss 6.5172 Epoch: 3 Global Step: 57400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:20:22,627-Speed 4523.02 samples/sec Loss 6.5254 Epoch: 3 Global Step: 57450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:20:34,077-Speed 4471.62 samples/sec Loss 6.5099 Epoch: 3 Global Step: 57500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:20:45,304-Speed 4560.93 samples/sec Loss 6.4761 Epoch: 3 Global Step: 57550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:20:57,654-Speed 4145.89 samples/sec Loss 6.5940 Epoch: 3 Global Step: 57600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:21:09,151-Speed 4453.57 samples/sec Loss 6.5372 Epoch: 3 Global Step: 57650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:21:20,683-Speed 4439.89 samples/sec Loss 6.5255 Epoch: 3 Global Step: 57700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:21:32,176-Speed 4455.16 samples/sec Loss 6.5578 Epoch: 3 Global Step: 57750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:21:43,482-Speed 4528.93 samples/sec Loss 6.5193 Epoch: 3 Global Step: 57800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:21:54,865-Speed 4497.85 samples/sec Loss 6.5438 Epoch: 3 Global Step: 57850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:22:06,321-Speed 4469.64 samples/sec Loss 6.5076 Epoch: 3 Global Step: 57900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:22:17,931-Speed 4410.22 samples/sec Loss 6.4878 Epoch: 3 Global Step: 57950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:22:29,427-Speed 4454.05 samples/sec Loss 6.4672 Epoch: 3 Global Step: 58000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:22:59,651-[lfw][58000]XNorm: 24.293617 Training: 2021-03-15 02:22:59,652-[lfw][58000]Accuracy-Flip: 0.99383+-0.00428 Training: 2021-03-15 02:22:59,652-[lfw][58000]Accuracy-Highest: 0.99650 Training: 2021-03-15 02:23:34,705-[cfp_fp][58000]XNorm: 20.142363 Training: 2021-03-15 02:23:34,706-[cfp_fp][58000]Accuracy-Flip: 0.92186+-0.01040 Training: 2021-03-15 02:23:34,706-[cfp_fp][58000]Accuracy-Highest: 0.93857 Training: 2021-03-15 02:24:04,938-[agedb_30][58000]XNorm: 23.187815 Training: 2021-03-15 02:24:04,939-[agedb_30][58000]Accuracy-Flip: 0.94833+-0.01216 Training: 2021-03-15 02:24:04,939-[agedb_30][58000]Accuracy-Highest: 0.96083 Training: 2021-03-15 02:24:16,161-Speed 479.70 samples/sec Loss 6.5281 Epoch: 3 Global Step: 58050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:24:28,311-Speed 4214.16 samples/sec Loss 6.6196 Epoch: 3 Global Step: 58100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:24:39,715-Speed 4489.93 samples/sec Loss 6.5183 Epoch: 3 Global Step: 58150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:24:51,062-Speed 4512.44 samples/sec Loss 6.5368 Epoch: 3 Global Step: 58200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:25:03,291-Speed 4186.80 samples/sec Loss 6.5065 Epoch: 3 Global Step: 58250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:25:15,477-Speed 4201.96 samples/sec Loss 6.5025 Epoch: 3 Global Step: 58300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:25:27,774-Speed 4163.79 samples/sec Loss 6.5854 Epoch: 3 Global Step: 58350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:25:39,312-Speed 4437.50 samples/sec Loss 6.5006 Epoch: 3 Global Step: 58400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:25:50,767-Speed 4469.88 samples/sec Loss 6.5740 Epoch: 3 Global Step: 58450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:26:02,272-Speed 4450.65 samples/sec Loss 6.4832 Epoch: 3 Global Step: 58500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:26:13,660-Speed 4496.30 samples/sec Loss 6.5026 Epoch: 3 Global Step: 58550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:26:25,048-Speed 4496.25 samples/sec Loss 6.5303 Epoch: 3 Global Step: 58600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:26:36,367-Speed 4523.21 samples/sec Loss 6.5485 Epoch: 3 Global Step: 58650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:26:48,649-Speed 4169.02 samples/sec Loss 6.5608 Epoch: 3 Global Step: 58700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:26:59,877-Speed 4560.09 samples/sec Loss 6.5212 Epoch: 3 Global Step: 58750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:27:11,321-Speed 4474.51 samples/sec Loss 6.5295 Epoch: 3 Global Step: 58800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:27:22,833-Speed 4447.60 samples/sec Loss 6.4925 Epoch: 3 Global Step: 58850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:27:35,080-Speed 4180.82 samples/sec Loss 6.5577 Epoch: 3 Global Step: 58900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:27:46,459-Speed 4499.84 samples/sec Loss 6.5077 Epoch: 3 Global Step: 58950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:27:57,694-Speed 4557.46 samples/sec Loss 6.5312 Epoch: 3 Global Step: 59000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:28:09,112-Speed 4484.56 samples/sec Loss 6.5564 Epoch: 3 Global Step: 59050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:28:20,556-Speed 4474.07 samples/sec Loss 6.5453 Epoch: 3 Global Step: 59100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:28:31,980-Speed 4482.07 samples/sec Loss 6.5211 Epoch: 3 Global Step: 59150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:28:43,419-Speed 4476.12 samples/sec Loss 6.5181 Epoch: 3 Global Step: 59200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:28:55,003-Speed 4420.08 samples/sec Loss 6.5207 Epoch: 3 Global Step: 59250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:29:06,527-Speed 4443.19 samples/sec Loss 6.5552 Epoch: 3 Global Step: 59300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:29:17,859-Speed 4518.47 samples/sec Loss 6.5528 Epoch: 3 Global Step: 59350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:29:30,086-Speed 4187.51 samples/sec Loss 6.5435 Epoch: 3 Global Step: 59400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:29:41,539-Speed 4470.87 samples/sec Loss 6.5501 Epoch: 3 Global Step: 59450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:29:52,791-Speed 4550.31 samples/sec Loss 6.5100 Epoch: 3 Global Step: 59500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:30:04,304-Speed 4447.41 samples/sec Loss 6.5547 Epoch: 3 Global Step: 59550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:30:15,763-Speed 4468.27 samples/sec Loss 6.5520 Epoch: 3 Global Step: 59600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:30:27,278-Speed 4446.60 samples/sec Loss 6.5067 Epoch: 3 Global Step: 59650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:30:38,712-Speed 4477.89 samples/sec Loss 6.5330 Epoch: 3 Global Step: 59700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:30:49,905-Speed 4574.71 samples/sec Loss 6.4919 Epoch: 3 Global Step: 59750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:31:01,184-Speed 4539.72 samples/sec Loss 6.5373 Epoch: 3 Global Step: 59800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:31:12,643-Speed 4468.28 samples/sec Loss 6.5563 Epoch: 3 Global Step: 59850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:31:24,146-Speed 4451.21 samples/sec Loss 6.5355 Epoch: 3 Global Step: 59900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:31:35,633-Speed 4457.34 samples/sec Loss 6.5722 Epoch: 3 Global Step: 59950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:31:47,063-Speed 4479.69 samples/sec Loss 6.5679 Epoch: 3 Global Step: 60000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:32:17,210-[lfw][60000]XNorm: 21.950607 Training: 2021-03-15 02:32:17,211-[lfw][60000]Accuracy-Flip: 0.99617+-0.00259 Training: 2021-03-15 02:32:17,211-[lfw][60000]Accuracy-Highest: 0.99650 Training: 2021-03-15 02:32:52,057-[cfp_fp][60000]XNorm: 18.251346 Training: 2021-03-15 02:32:52,058-[cfp_fp][60000]Accuracy-Flip: 0.92700+-0.00925 Training: 2021-03-15 02:32:52,058-[cfp_fp][60000]Accuracy-Highest: 0.93857 Training: 2021-03-15 02:33:22,085-[agedb_30][60000]XNorm: 21.422744 Training: 2021-03-15 02:33:22,085-[agedb_30][60000]Accuracy-Flip: 0.95633+-0.01122 Training: 2021-03-15 02:33:22,085-[agedb_30][60000]Accuracy-Highest: 0.96083 Training: 2021-03-15 02:33:33,363-Speed 481.66 samples/sec Loss 6.4734 Epoch: 3 Global Step: 60050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:33:45,061-Speed 4377.16 samples/sec Loss 6.5154 Epoch: 3 Global Step: 60100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:33:56,283-Speed 4562.71 samples/sec Loss 6.4735 Epoch: 3 Global Step: 60150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:34:08,532-Speed 4180.00 samples/sec Loss 6.5217 Epoch: 3 Global Step: 60200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:34:19,967-Speed 4478.00 samples/sec Loss 6.5188 Epoch: 3 Global Step: 60250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:34:31,438-Speed 4463.52 samples/sec Loss 6.4838 Epoch: 3 Global Step: 60300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-15 02:34:42,841-Speed 4490.48 samples/sec Loss 6.5008 Epoch: 3 Global Step: 60350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:34:54,225-Speed 4497.81 samples/sec Loss 6.4923 Epoch: 3 Global Step: 60400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:35:05,535-Speed 4527.03 samples/sec Loss 6.5025 Epoch: 3 Global Step: 60450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:35:16,931-Speed 4492.90 samples/sec Loss 6.5031 Epoch: 3 Global Step: 60500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:35:28,154-Speed 4562.24 samples/sec Loss 6.4830 Epoch: 3 Global Step: 60550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:35:39,609-Speed 4469.98 samples/sec Loss 6.5153 Epoch: 3 Global Step: 60600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:35:51,933-Speed 4154.59 samples/sec Loss 6.5327 Epoch: 3 Global Step: 60650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:36:03,524-Speed 4417.66 samples/sec Loss 6.4982 Epoch: 3 Global Step: 60700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:36:16,703-Speed 3885.02 samples/sec Loss 6.5189 Epoch: 3 Global Step: 60750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:36:28,094-Speed 4495.01 samples/sec Loss 6.5216 Epoch: 3 Global Step: 60800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:36:40,251-Speed 4211.90 samples/sec Loss 6.4969 Epoch: 3 Global Step: 60850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:36:51,766-Speed 4446.62 samples/sec Loss 6.5260 Epoch: 3 Global Step: 60900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:37:03,157-Speed 4495.03 samples/sec Loss 6.4689 Epoch: 3 Global Step: 60950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:37:14,534-Speed 4500.38 samples/sec Loss 6.5622 Epoch: 3 Global Step: 61000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:37:26,140-Speed 4411.90 samples/sec Loss 6.5552 Epoch: 3 Global Step: 61050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:37:37,689-Speed 4433.41 samples/sec Loss 6.5588 Epoch: 3 Global Step: 61100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:37:49,082-Speed 4494.14 samples/sec Loss 6.4939 Epoch: 3 Global Step: 61150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:38:01,268-Speed 4201.92 samples/sec Loss 6.4914 Epoch: 3 Global Step: 61200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:38:12,518-Speed 4551.25 samples/sec Loss 6.4848 Epoch: 3 Global Step: 61250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:38:24,758-Speed 4183.05 samples/sec Loss 6.4527 Epoch: 3 Global Step: 61300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:38:36,263-Speed 4450.52 samples/sec Loss 6.5118 Epoch: 3 Global Step: 61350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:38:47,728-Speed 4466.11 samples/sec Loss 6.4870 Epoch: 3 Global Step: 61400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:38:59,090-Speed 4506.31 samples/sec Loss 6.5544 Epoch: 3 Global Step: 61450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:39:10,591-Speed 4452.08 samples/sec Loss 6.4787 Epoch: 3 Global Step: 61500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:39:21,999-Speed 4488.35 samples/sec Loss 6.5150 Epoch: 3 Global Step: 61550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:39:33,406-Speed 4488.38 samples/sec Loss 6.4992 Epoch: 3 Global Step: 61600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:39:45,026-Speed 4406.49 samples/sec Loss 6.5311 Epoch: 3 Global Step: 61650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:39:56,426-Speed 4491.50 samples/sec Loss 6.5462 Epoch: 3 Global Step: 61700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:40:07,972-Speed 4434.67 samples/sec Loss 6.5374 Epoch: 3 Global Step: 61750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:40:19,305-Speed 4517.96 samples/sec Loss 6.5202 Epoch: 3 Global Step: 61800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:40:30,574-Speed 4544.01 samples/sec Loss 6.5239 Epoch: 3 Global Step: 61850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:40:43,000-Speed 4120.52 samples/sec Loss 6.5509 Epoch: 3 Global Step: 61900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:40:54,458-Speed 4468.76 samples/sec Loss 6.4925 Epoch: 3 Global Step: 61950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:41:05,889-Speed 4479.40 samples/sec Loss 6.4650 Epoch: 3 Global Step: 62000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:41:35,918-[lfw][62000]XNorm: 22.988556 Training: 2021-03-15 02:41:35,919-[lfw][62000]Accuracy-Flip: 0.99567+-0.00318 Training: 2021-03-15 02:41:35,919-[lfw][62000]Accuracy-Highest: 0.99650 Training: 2021-03-15 02:42:10,940-[cfp_fp][62000]XNorm: 19.164429 Training: 2021-03-15 02:42:10,941-[cfp_fp][62000]Accuracy-Flip: 0.93486+-0.00948 Training: 2021-03-15 02:42:10,941-[cfp_fp][62000]Accuracy-Highest: 0.93857 Training: 2021-03-15 02:42:41,553-[agedb_30][62000]XNorm: 21.976073 Training: 2021-03-15 02:42:41,553-[agedb_30][62000]Accuracy-Flip: 0.95850+-0.00953 Training: 2021-03-15 02:42:41,553-[agedb_30][62000]Accuracy-Highest: 0.96083 Training: 2021-03-15 02:42:53,123-Speed 477.46 samples/sec Loss 6.4986 Epoch: 3 Global Step: 62050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:43:04,512-Speed 4496.12 samples/sec Loss 6.5124 Epoch: 3 Global Step: 62100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:43:16,027-Speed 4446.56 samples/sec Loss 6.5236 Epoch: 3 Global Step: 62150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:43:27,554-Speed 4441.88 samples/sec Loss 6.4891 Epoch: 3 Global Step: 62200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:43:38,894-Speed 4515.34 samples/sec Loss 6.5355 Epoch: 3 Global Step: 62250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:43:50,299-Speed 4489.48 samples/sec Loss 6.5535 Epoch: 3 Global Step: 62300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:44:01,590-Speed 4534.69 samples/sec Loss 6.5363 Epoch: 3 Global Step: 62350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:44:13,318-Speed 4365.78 samples/sec Loss 6.5276 Epoch: 3 Global Step: 62400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:44:24,746-Speed 4480.46 samples/sec Loss 6.4828 Epoch: 3 Global Step: 62450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:44:36,156-Speed 4487.39 samples/sec Loss 6.5341 Epoch: 3 Global Step: 62500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:44:47,558-Speed 4490.79 samples/sec Loss 6.4870 Epoch: 3 Global Step: 62550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:44:58,960-Speed 4490.45 samples/sec Loss 6.4744 Epoch: 3 Global Step: 62600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:45:11,230-Speed 4173.17 samples/sec Loss 6.5238 Epoch: 3 Global Step: 62650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:45:22,629-Speed 4491.67 samples/sec Loss 6.5173 Epoch: 3 Global Step: 62700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:45:34,110-Speed 4459.75 samples/sec Loss 6.5295 Epoch: 3 Global Step: 62750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:45:45,445-Speed 4517.15 samples/sec Loss 6.5665 Epoch: 3 Global Step: 62800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:45:56,969-Speed 4443.19 samples/sec Loss 6.5192 Epoch: 3 Global Step: 62850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:46:08,463-Speed 4454.96 samples/sec Loss 6.4741 Epoch: 3 Global Step: 62900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:46:19,905-Speed 4474.82 samples/sec Loss 6.4308 Epoch: 3 Global Step: 62950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:46:31,399-Speed 4454.84 samples/sec Loss 6.5120 Epoch: 3 Global Step: 63000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:46:42,783-Speed 4497.83 samples/sec Loss 6.4858 Epoch: 3 Global Step: 63050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:46:54,574-Speed 4342.67 samples/sec Loss 6.5186 Epoch: 3 Global Step: 63100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:47:06,126-Speed 4432.28 samples/sec Loss 6.5066 Epoch: 3 Global Step: 63150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:47:17,476-Speed 4511.51 samples/sec Loss 6.5016 Epoch: 3 Global Step: 63200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:47:30,639-Speed 3889.71 samples/sec Loss 6.4799 Epoch: 3 Global Step: 63250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:47:43,054-Speed 4124.44 samples/sec Loss 6.4900 Epoch: 3 Global Step: 63300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:47:54,350-Speed 4532.65 samples/sec Loss 6.5537 Epoch: 3 Global Step: 63350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:48:05,927-Speed 4422.88 samples/sec Loss 6.4988 Epoch: 3 Global Step: 63400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:48:17,502-Speed 4423.45 samples/sec Loss 6.5446 Epoch: 3 Global Step: 63450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:48:29,696-Speed 4198.89 samples/sec Loss 6.4804 Epoch: 3 Global Step: 63500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:48:41,186-Speed 4456.29 samples/sec Loss 6.5050 Epoch: 3 Global Step: 63550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:48:52,626-Speed 4475.88 samples/sec Loss 6.5738 Epoch: 3 Global Step: 63600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:49:04,133-Speed 4449.65 samples/sec Loss 6.5603 Epoch: 3 Global Step: 63650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:49:16,332-Speed 4197.24 samples/sec Loss 6.5186 Epoch: 3 Global Step: 63700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:49:27,612-Speed 4539.21 samples/sec Loss 6.4992 Epoch: 3 Global Step: 63750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:49:40,024-Speed 4125.18 samples/sec Loss 6.5346 Epoch: 3 Global Step: 63800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:49:51,504-Speed 4460.19 samples/sec Loss 6.5224 Epoch: 3 Global Step: 63850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:50:02,933-Speed 4479.94 samples/sec Loss 6.4781 Epoch: 3 Global Step: 63900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:50:14,363-Speed 4479.99 samples/sec Loss 6.4404 Epoch: 3 Global Step: 63950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:50:25,864-Speed 4452.11 samples/sec Loss 6.4673 Epoch: 3 Global Step: 64000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:50:56,061-[lfw][64000]XNorm: 21.317652 Training: 2021-03-15 02:50:56,061-[lfw][64000]Accuracy-Flip: 0.99567+-0.00382 Training: 2021-03-15 02:50:56,061-[lfw][64000]Accuracy-Highest: 0.99650 Training: 2021-03-15 02:51:31,123-[cfp_fp][64000]XNorm: 17.825720 Training: 2021-03-15 02:51:31,124-[cfp_fp][64000]Accuracy-Flip: 0.92829+-0.01183 Training: 2021-03-15 02:51:31,124-[cfp_fp][64000]Accuracy-Highest: 0.93857 Training: 2021-03-15 02:52:01,354-[agedb_30][64000]XNorm: 20.244293 Training: 2021-03-15 02:52:01,354-[agedb_30][64000]Accuracy-Flip: 0.95633+-0.00977 Training: 2021-03-15 02:52:01,354-[agedb_30][64000]Accuracy-Highest: 0.96083 Training: 2021-03-15 02:52:12,531-Speed 480.00 samples/sec Loss 6.4907 Epoch: 3 Global Step: 64050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:52:23,720-Speed 4576.31 samples/sec Loss 6.4678 Epoch: 3 Global Step: 64100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:52:34,892-Speed 4583.11 samples/sec Loss 6.5191 Epoch: 3 Global Step: 64150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:52:46,307-Speed 4485.55 samples/sec Loss 6.4588 Epoch: 3 Global Step: 64200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:52:57,689-Speed 4498.56 samples/sec Loss 6.4290 Epoch: 3 Global Step: 64250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:53:09,185-Speed 4453.90 samples/sec Loss 6.4753 Epoch: 3 Global Step: 64300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:53:20,654-Speed 4464.31 samples/sec Loss 6.4652 Epoch: 3 Global Step: 64350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:53:33,138-Speed 4101.35 samples/sec Loss 6.4999 Epoch: 3 Global Step: 64400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:53:44,310-Speed 4583.28 samples/sec Loss 6.5067 Epoch: 3 Global Step: 64450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:53:55,857-Speed 4434.40 samples/sec Loss 6.5208 Epoch: 3 Global Step: 64500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:54:07,240-Speed 4497.85 samples/sec Loss 6.5230 Epoch: 3 Global Step: 64550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:54:18,476-Speed 4557.06 samples/sec Loss 6.5164 Epoch: 3 Global Step: 64600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:54:29,788-Speed 4526.30 samples/sec Loss 6.4842 Epoch: 3 Global Step: 64650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:54:41,254-Speed 4465.54 samples/sec Loss 6.5371 Epoch: 3 Global Step: 64700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:54:52,676-Speed 4483.03 samples/sec Loss 6.5338 Epoch: 3 Global Step: 64750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:55:03,905-Speed 4559.84 samples/sec Loss 6.5462 Epoch: 3 Global Step: 64800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:55:15,288-Speed 4498.17 samples/sec Loss 6.5208 Epoch: 3 Global Step: 64850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:55:26,528-Speed 4555.50 samples/sec Loss 6.4671 Epoch: 3 Global Step: 64900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:55:38,150-Speed 4405.56 samples/sec Loss 6.4666 Epoch: 3 Global Step: 64950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:55:49,475-Speed 4521.21 samples/sec Loss 6.5070 Epoch: 3 Global Step: 65000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:56:00,932-Speed 4468.97 samples/sec Loss 6.4977 Epoch: 3 Global Step: 65050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:56:12,545-Speed 4409.04 samples/sec Loss 6.3958 Epoch: 3 Global Step: 65100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:56:24,884-Speed 4149.76 samples/sec Loss 6.4754 Epoch: 3 Global Step: 65150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:56:36,257-Speed 4501.97 samples/sec Loss 6.4814 Epoch: 3 Global Step: 65200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:56:47,740-Speed 4459.10 samples/sec Loss 6.5236 Epoch: 3 Global Step: 65250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:56:59,194-Speed 4470.05 samples/sec Loss 6.5018 Epoch: 3 Global Step: 65300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:57:10,743-Speed 4433.54 samples/sec Loss 6.4815 Epoch: 3 Global Step: 65350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:57:22,132-Speed 4495.89 samples/sec Loss 6.4664 Epoch: 3 Global Step: 65400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:57:33,628-Speed 4453.99 samples/sec Loss 6.4826 Epoch: 3 Global Step: 65450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:57:45,092-Speed 4466.36 samples/sec Loss 6.5052 Epoch: 3 Global Step: 65500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:57:56,567-Speed 4462.22 samples/sec Loss 6.4430 Epoch: 3 Global Step: 65550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:58:08,033-Speed 4465.39 samples/sec Loss 6.4632 Epoch: 3 Global Step: 65600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:58:19,329-Speed 4532.73 samples/sec Loss 6.4884 Epoch: 3 Global Step: 65650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:58:30,978-Speed 4395.42 samples/sec Loss 6.5568 Epoch: 3 Global Step: 65700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:58:42,531-Speed 4432.07 samples/sec Loss 6.5530 Epoch: 3 Global Step: 65750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:58:55,590-Speed 3920.96 samples/sec Loss 6.4405 Epoch: 3 Global Step: 65800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:59:07,742-Speed 4213.53 samples/sec Loss 6.4927 Epoch: 3 Global Step: 65850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:59:19,065-Speed 4521.78 samples/sec Loss 6.5363 Epoch: 3 Global Step: 65900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:59:30,272-Speed 4568.78 samples/sec Loss 6.4745 Epoch: 3 Global Step: 65950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 02:59:41,641-Speed 4503.68 samples/sec Loss 6.4599 Epoch: 3 Global Step: 66000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:00:11,950-[lfw][66000]XNorm: 23.341014 Training: 2021-03-15 03:00:11,950-[lfw][66000]Accuracy-Flip: 0.99633+-0.00296 Training: 2021-03-15 03:00:11,950-[lfw][66000]Accuracy-Highest: 0.99650 Training: 2021-03-15 03:00:47,170-[cfp_fp][66000]XNorm: 19.829844 Training: 2021-03-15 03:00:47,170-[cfp_fp][66000]Accuracy-Flip: 0.92986+-0.00976 Training: 2021-03-15 03:00:47,170-[cfp_fp][66000]Accuracy-Highest: 0.93857 Training: 2021-03-15 03:01:17,510-[agedb_30][66000]XNorm: 22.365957 Training: 2021-03-15 03:01:17,510-[agedb_30][66000]Accuracy-Flip: 0.95900+-0.00926 Training: 2021-03-15 03:01:17,510-[agedb_30][66000]Accuracy-Highest: 0.96083 Training: 2021-03-15 03:01:29,000-Speed 476.91 samples/sec Loss 6.4947 Epoch: 3 Global Step: 66050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:01:41,128-Speed 4221.58 samples/sec Loss 6.5428 Epoch: 3 Global Step: 66100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:01:53,414-Speed 4167.64 samples/sec Loss 6.4715 Epoch: 3 Global Step: 66150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:02:04,720-Speed 4528.67 samples/sec Loss 6.4212 Epoch: 3 Global Step: 66200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:02:16,877-Speed 4211.87 samples/sec Loss 6.4469 Epoch: 3 Global Step: 66250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:02:28,447-Speed 4425.44 samples/sec Loss 6.4476 Epoch: 3 Global Step: 66300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:02:39,964-Speed 4446.02 samples/sec Loss 6.4964 Epoch: 3 Global Step: 66350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:02:51,490-Speed 4442.30 samples/sec Loss 6.4820 Epoch: 3 Global Step: 66400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:03:02,906-Speed 4485.17 samples/sec Loss 6.5012 Epoch: 3 Global Step: 66450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:03:14,196-Speed 4535.01 samples/sec Loss 6.5228 Epoch: 3 Global Step: 66500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:03:25,568-Speed 4502.62 samples/sec Loss 6.4469 Epoch: 3 Global Step: 66550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:03:36,941-Speed 4501.91 samples/sec Loss 6.4833 Epoch: 3 Global Step: 66600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:03:48,451-Speed 4448.78 samples/sec Loss 6.4838 Epoch: 3 Global Step: 66650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:03:59,989-Speed 4437.53 samples/sec Loss 6.4753 Epoch: 3 Global Step: 66700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:04:11,463-Speed 4462.36 samples/sec Loss 6.4954 Epoch: 3 Global Step: 66750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:04:36,399-Speed 2053.36 samples/sec Loss 6.0562 Epoch: 4 Global Step: 66800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:04:48,826-Speed 4120.28 samples/sec Loss 5.8159 Epoch: 4 Global Step: 66850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:05:00,938-Speed 4227.62 samples/sec Loss 5.8592 Epoch: 4 Global Step: 66900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:05:13,706-Speed 4010.13 samples/sec Loss 5.8411 Epoch: 4 Global Step: 66950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:05:25,661-Speed 4283.26 samples/sec Loss 5.8609 Epoch: 4 Global Step: 67000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:05:37,654-Speed 4269.31 samples/sec Loss 5.8688 Epoch: 4 Global Step: 67050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:05:49,374-Speed 4368.83 samples/sec Loss 5.9010 Epoch: 4 Global Step: 67100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:06:01,397-Speed 4258.95 samples/sec Loss 5.9103 Epoch: 4 Global Step: 67150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:06:13,358-Speed 4280.85 samples/sec Loss 5.9483 Epoch: 4 Global Step: 67200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:06:25,273-Speed 4297.45 samples/sec Loss 5.9680 Epoch: 4 Global Step: 67250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:06:37,290-Speed 4260.47 samples/sec Loss 5.9546 Epoch: 4 Global Step: 67300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:06:49,101-Speed 4335.50 samples/sec Loss 6.0044 Epoch: 4 Global Step: 67350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:07:01,043-Speed 4287.38 samples/sec Loss 5.9549 Epoch: 4 Global Step: 67400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:07:12,747-Speed 4374.97 samples/sec Loss 6.0196 Epoch: 4 Global Step: 67450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:07:24,576-Speed 4328.76 samples/sec Loss 6.0017 Epoch: 4 Global Step: 67500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:07:36,317-Speed 4360.94 samples/sec Loss 6.0078 Epoch: 4 Global Step: 67550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:07:49,177-Speed 3981.78 samples/sec Loss 5.9901 Epoch: 4 Global Step: 67600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:08:00,819-Speed 4398.20 samples/sec Loss 6.0749 Epoch: 4 Global Step: 67650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:08:12,705-Speed 4307.70 samples/sec Loss 6.0101 Epoch: 4 Global Step: 67700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:08:24,639-Speed 4290.45 samples/sec Loss 6.0670 Epoch: 4 Global Step: 67750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:08:36,480-Speed 4324.27 samples/sec Loss 6.0775 Epoch: 4 Global Step: 67800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:08:48,396-Speed 4297.28 samples/sec Loss 6.0961 Epoch: 4 Global Step: 67850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:09:00,408-Speed 4262.62 samples/sec Loss 6.1143 Epoch: 4 Global Step: 67900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:09:12,323-Speed 4297.02 samples/sec Loss 6.0691 Epoch: 4 Global Step: 67950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:09:24,302-Speed 4274.49 samples/sec Loss 6.0927 Epoch: 4 Global Step: 68000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:09:54,842-[lfw][68000]XNorm: 23.024013 Training: 2021-03-15 03:09:54,842-[lfw][68000]Accuracy-Flip: 0.99583+-0.00201 Training: 2021-03-15 03:09:54,842-[lfw][68000]Accuracy-Highest: 0.99650 Training: 2021-03-15 03:10:30,263-[cfp_fp][68000]XNorm: 19.171282 Training: 2021-03-15 03:10:30,263-[cfp_fp][68000]Accuracy-Flip: 0.92886+-0.01281 Training: 2021-03-15 03:10:30,263-[cfp_fp][68000]Accuracy-Highest: 0.93857 Training: 2021-03-15 03:11:00,626-[agedb_30][68000]XNorm: 22.514177 Training: 2021-03-15 03:11:00,627-[agedb_30][68000]Accuracy-Flip: 0.94833+-0.01453 Training: 2021-03-15 03:11:00,627-[agedb_30][68000]Accuracy-Highest: 0.96083 Training: 2021-03-15 03:11:12,292-Speed 474.12 samples/sec Loss 6.1857 Epoch: 4 Global Step: 68050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:11:24,136-Speed 4323.01 samples/sec Loss 6.1669 Epoch: 4 Global Step: 68100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:11:35,952-Speed 4333.36 samples/sec Loss 6.1309 Epoch: 4 Global Step: 68150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:11:48,633-Speed 4037.63 samples/sec Loss 6.1454 Epoch: 4 Global Step: 68200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:12:00,191-Speed 4430.24 samples/sec Loss 6.1435 Epoch: 4 Global Step: 68250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:12:12,198-Speed 4264.48 samples/sec Loss 6.1181 Epoch: 4 Global Step: 68300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:12:24,738-Speed 4083.21 samples/sec Loss 6.1693 Epoch: 4 Global Step: 68350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:12:37,747-Speed 3935.93 samples/sec Loss 6.1882 Epoch: 4 Global Step: 68400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:12:50,078-Speed 4152.43 samples/sec Loss 6.1384 Epoch: 4 Global Step: 68450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:13:01,866-Speed 4343.67 samples/sec Loss 6.2290 Epoch: 4 Global Step: 68500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:13:13,610-Speed 4359.91 samples/sec Loss 6.1779 Epoch: 4 Global Step: 68550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:13:25,524-Speed 4297.54 samples/sec Loss 6.1293 Epoch: 4 Global Step: 68600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:13:37,058-Speed 4439.49 samples/sec Loss 6.1785 Epoch: 4 Global Step: 68650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:13:50,012-Speed 3952.52 samples/sec Loss 6.2313 Epoch: 4 Global Step: 68700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:14:02,833-Speed 3993.80 samples/sec Loss 6.1836 Epoch: 4 Global Step: 68750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:14:14,655-Speed 4331.23 samples/sec Loss 6.2168 Epoch: 4 Global Step: 68800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:14:26,590-Speed 4290.24 samples/sec Loss 6.2347 Epoch: 4 Global Step: 68850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:14:38,262-Speed 4386.61 samples/sec Loss 6.1740 Epoch: 4 Global Step: 68900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:14:50,026-Speed 4352.79 samples/sec Loss 6.2730 Epoch: 4 Global Step: 68950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:15:01,756-Speed 4365.02 samples/sec Loss 6.2577 Epoch: 4 Global Step: 69000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:15:13,488-Speed 4364.25 samples/sec Loss 6.2377 Epoch: 4 Global Step: 69050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:15:25,479-Speed 4270.10 samples/sec Loss 6.2474 Epoch: 4 Global Step: 69100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:15:37,254-Speed 4348.76 samples/sec Loss 6.2433 Epoch: 4 Global Step: 69150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:15:49,097-Speed 4323.30 samples/sec Loss 6.2147 Epoch: 4 Global Step: 69200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:16:01,062-Speed 4279.69 samples/sec Loss 6.2436 Epoch: 4 Global Step: 69250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:16:13,964-Speed 3968.59 samples/sec Loss 6.2666 Epoch: 4 Global Step: 69300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:16:25,873-Speed 4299.52 samples/sec Loss 6.2670 Epoch: 4 Global Step: 69350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:16:37,720-Speed 4321.86 samples/sec Loss 6.3218 Epoch: 4 Global Step: 69400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:16:49,639-Speed 4296.22 samples/sec Loss 6.2700 Epoch: 4 Global Step: 69450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:17:01,424-Speed 4344.91 samples/sec Loss 6.2890 Epoch: 4 Global Step: 69500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:17:13,187-Speed 4352.91 samples/sec Loss 6.2951 Epoch: 4 Global Step: 69550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:17:24,926-Speed 4361.65 samples/sec Loss 6.2659 Epoch: 4 Global Step: 69600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:17:36,827-Speed 4302.63 samples/sec Loss 6.3195 Epoch: 4 Global Step: 69650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:17:48,716-Speed 4306.64 samples/sec Loss 6.3104 Epoch: 4 Global Step: 69700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:18:00,670-Speed 4283.42 samples/sec Loss 6.3312 Epoch: 4 Global Step: 69750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:18:12,497-Speed 4329.32 samples/sec Loss 6.3781 Epoch: 4 Global Step: 69800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:18:24,194-Speed 4377.27 samples/sec Loss 6.3263 Epoch: 4 Global Step: 69850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:18:36,819-Speed 4055.83 samples/sec Loss 6.3228 Epoch: 4 Global Step: 69900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:18:48,410-Speed 4417.54 samples/sec Loss 6.2816 Epoch: 4 Global Step: 69950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:19:00,075-Speed 4389.38 samples/sec Loss 6.2751 Epoch: 4 Global Step: 70000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:19:30,649-[lfw][70000]XNorm: 21.851341 Training: 2021-03-15 03:19:30,650-[lfw][70000]Accuracy-Flip: 0.99467+-0.00256 Training: 2021-03-15 03:19:30,650-[lfw][70000]Accuracy-Highest: 0.99650 Training: 2021-03-15 03:20:06,352-[cfp_fp][70000]XNorm: 18.314594 Training: 2021-03-15 03:20:06,352-[cfp_fp][70000]Accuracy-Flip: 0.93929+-0.00848 Training: 2021-03-15 03:20:06,352-[cfp_fp][70000]Accuracy-Highest: 0.93929 Training: 2021-03-15 03:20:36,850-[agedb_30][70000]XNorm: 20.978793 Training: 2021-03-15 03:20:36,851-[agedb_30][70000]Accuracy-Flip: 0.95483+-0.00867 Training: 2021-03-15 03:20:36,851-[agedb_30][70000]Accuracy-Highest: 0.96083 Training: 2021-03-15 03:20:48,828-Speed 470.79 samples/sec Loss 6.3627 Epoch: 4 Global Step: 70050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:21:00,636-Speed 4336.12 samples/sec Loss 6.3537 Epoch: 4 Global Step: 70100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:21:12,562-Speed 4293.58 samples/sec Loss 6.2941 Epoch: 4 Global Step: 70150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:21:24,164-Speed 4413.28 samples/sec Loss 6.3092 Epoch: 4 Global Step: 70200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:21:35,967-Speed 4338.29 samples/sec Loss 6.4032 Epoch: 4 Global Step: 70250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:21:47,788-Speed 4331.51 samples/sec Loss 6.3136 Epoch: 4 Global Step: 70300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:21:59,568-Speed 4346.75 samples/sec Loss 6.3518 Epoch: 4 Global Step: 70350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:22:11,452-Speed 4308.77 samples/sec Loss 6.3560 Epoch: 4 Global Step: 70400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:22:23,144-Speed 4379.16 samples/sec Loss 6.3145 Epoch: 4 Global Step: 70450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:22:35,941-Speed 4001.35 samples/sec Loss 6.3435 Epoch: 4 Global Step: 70500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:22:47,793-Speed 4320.23 samples/sec Loss 6.3142 Epoch: 4 Global Step: 70550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:22:59,524-Speed 4364.59 samples/sec Loss 6.3600 Epoch: 4 Global Step: 70600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:23:11,159-Speed 4401.08 samples/sec Loss 6.3371 Epoch: 4 Global Step: 70650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:23:25,057-Speed 3684.05 samples/sec Loss 6.3441 Epoch: 4 Global Step: 70700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:23:37,668-Speed 4060.08 samples/sec Loss 6.3841 Epoch: 4 Global Step: 70750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:23:49,262-Speed 4416.45 samples/sec Loss 6.3700 Epoch: 4 Global Step: 70800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:24:00,788-Speed 4442.63 samples/sec Loss 6.3683 Epoch: 4 Global Step: 70850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:24:12,506-Speed 4369.55 samples/sec Loss 6.3698 Epoch: 4 Global Step: 70900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:24:24,130-Speed 4405.02 samples/sec Loss 6.3784 Epoch: 4 Global Step: 70950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:24:35,728-Speed 4414.65 samples/sec Loss 6.3364 Epoch: 4 Global Step: 71000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:24:48,736-Speed 3936.30 samples/sec Loss 6.3376 Epoch: 4 Global Step: 71050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:25:00,474-Speed 4362.43 samples/sec Loss 6.3417 Epoch: 4 Global Step: 71100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:25:12,212-Speed 4361.88 samples/sec Loss 6.3324 Epoch: 4 Global Step: 71150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:25:23,961-Speed 4358.35 samples/sec Loss 6.3180 Epoch: 4 Global Step: 71200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:25:35,727-Speed 4351.69 samples/sec Loss 6.3201 Epoch: 4 Global Step: 71250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:25:47,568-Speed 4324.32 samples/sec Loss 6.3641 Epoch: 4 Global Step: 71300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:25:58,949-Speed 4498.81 samples/sec Loss 6.3348 Epoch: 4 Global Step: 71350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:26:11,625-Speed 4039.53 samples/sec Loss 6.3955 Epoch: 4 Global Step: 71400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:26:23,297-Speed 4386.69 samples/sec Loss 6.3829 Epoch: 4 Global Step: 71450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:26:35,326-Speed 4256.69 samples/sec Loss 6.3588 Epoch: 4 Global Step: 71500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:26:47,159-Speed 4327.33 samples/sec Loss 6.3460 Epoch: 4 Global Step: 71550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:26:59,031-Speed 4313.01 samples/sec Loss 6.3429 Epoch: 4 Global Step: 71600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:27:11,381-Speed 4145.93 samples/sec Loss 6.4146 Epoch: 4 Global Step: 71650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:27:23,321-Speed 4288.19 samples/sec Loss 6.4392 Epoch: 4 Global Step: 71700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:27:35,109-Speed 4343.59 samples/sec Loss 6.3608 Epoch: 4 Global Step: 71750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:27:47,123-Speed 4261.80 samples/sec Loss 6.3520 Epoch: 4 Global Step: 71800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:27:58,782-Speed 4391.70 samples/sec Loss 6.3912 Epoch: 4 Global Step: 71850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:28:10,541-Speed 4354.24 samples/sec Loss 6.4164 Epoch: 4 Global Step: 71900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:28:22,263-Speed 4368.12 samples/sec Loss 6.4396 Epoch: 4 Global Step: 71950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:28:34,089-Speed 4329.92 samples/sec Loss 6.4063 Epoch: 4 Global Step: 72000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:29:04,751-[lfw][72000]XNorm: 22.716878 Training: 2021-03-15 03:29:04,751-[lfw][72000]Accuracy-Flip: 0.99483+-0.00241 Training: 2021-03-15 03:29:04,751-[lfw][72000]Accuracy-Highest: 0.99650 Training: 2021-03-15 03:29:39,948-[cfp_fp][72000]XNorm: 18.549333 Training: 2021-03-15 03:29:39,948-[cfp_fp][72000]Accuracy-Flip: 0.93329+-0.00785 Training: 2021-03-15 03:29:39,948-[cfp_fp][72000]Accuracy-Highest: 0.93929 Training: 2021-03-15 03:30:10,165-[agedb_30][72000]XNorm: 21.629643 Training: 2021-03-15 03:30:10,166-[agedb_30][72000]Accuracy-Flip: 0.95500+-0.01059 Training: 2021-03-15 03:30:10,166-[agedb_30][72000]Accuracy-Highest: 0.96083 Training: 2021-03-15 03:30:21,689-Speed 475.84 samples/sec Loss 6.4134 Epoch: 4 Global Step: 72050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:30:33,383-Speed 4378.24 samples/sec Loss 6.4107 Epoch: 4 Global Step: 72100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:30:45,168-Speed 4345.06 samples/sec Loss 6.3937 Epoch: 4 Global Step: 72150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:30:56,819-Speed 4394.68 samples/sec Loss 6.4262 Epoch: 4 Global Step: 72200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:31:08,554-Speed 4363.32 samples/sec Loss 6.3862 Epoch: 4 Global Step: 72250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:31:21,164-Speed 4060.51 samples/sec Loss 6.4146 Epoch: 4 Global Step: 72300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:31:32,887-Speed 4367.65 samples/sec Loss 6.3894 Epoch: 4 Global Step: 72350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:31:44,637-Speed 4357.85 samples/sec Loss 6.4088 Epoch: 4 Global Step: 72400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:31:56,632-Speed 4268.71 samples/sec Loss 6.3772 Epoch: 4 Global Step: 72450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:32:08,547-Speed 4297.70 samples/sec Loss 6.3890 Epoch: 4 Global Step: 72500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:32:20,323-Speed 4348.07 samples/sec Loss 6.3713 Epoch: 4 Global Step: 72550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:32:32,040-Speed 4370.15 samples/sec Loss 6.4131 Epoch: 4 Global Step: 72600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:32:43,690-Speed 4394.84 samples/sec Loss 6.4156 Epoch: 4 Global Step: 72650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:32:55,226-Speed 4438.59 samples/sec Loss 6.3861 Epoch: 4 Global Step: 72700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:33:07,118-Speed 4305.85 samples/sec Loss 6.4463 Epoch: 4 Global Step: 72750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:33:19,099-Speed 4273.81 samples/sec Loss 6.3632 Epoch: 4 Global Step: 72800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:33:31,970-Speed 3978.13 samples/sec Loss 6.4055 Epoch: 4 Global Step: 72850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:33:43,958-Speed 4271.03 samples/sec Loss 6.4154 Epoch: 4 Global Step: 72900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-15 03:33:55,841-Speed 4309.15 samples/sec Loss 6.4208 Epoch: 4 Global Step: 72950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:34:08,514-Speed 4040.12 samples/sec Loss 6.4285 Epoch: 4 Global Step: 73000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:34:20,593-Speed 4239.03 samples/sec Loss 6.3889 Epoch: 4 Global Step: 73050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:34:32,456-Speed 4316.21 samples/sec Loss 6.3377 Epoch: 4 Global Step: 73100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:34:45,406-Speed 3953.77 samples/sec Loss 6.4058 Epoch: 4 Global Step: 73150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:34:57,196-Speed 4343.11 samples/sec Loss 6.4378 Epoch: 4 Global Step: 73200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:35:08,995-Speed 4339.58 samples/sec Loss 6.4196 Epoch: 4 Global Step: 73250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:35:20,783-Speed 4343.62 samples/sec Loss 6.3703 Epoch: 4 Global Step: 73300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:35:32,602-Speed 4332.06 samples/sec Loss 6.4545 Epoch: 4 Global Step: 73350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:35:44,718-Speed 4226.08 samples/sec Loss 6.4016 Epoch: 4 Global Step: 73400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:35:56,696-Speed 4274.96 samples/sec Loss 6.3547 Epoch: 4 Global Step: 73450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:36:09,140-Speed 4114.59 samples/sec Loss 6.4297 Epoch: 4 Global Step: 73500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:36:20,804-Speed 4389.84 samples/sec Loss 6.4203 Epoch: 4 Global Step: 73550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:36:32,587-Speed 4345.51 samples/sec Loss 6.4300 Epoch: 4 Global Step: 73600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:36:44,376-Speed 4343.20 samples/sec Loss 6.4565 Epoch: 4 Global Step: 73650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:36:55,986-Speed 4410.18 samples/sec Loss 6.3941 Epoch: 4 Global Step: 73700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:37:07,760-Speed 4348.80 samples/sec Loss 6.4240 Epoch: 4 Global Step: 73750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:37:19,700-Speed 4288.31 samples/sec Loss 6.4705 Epoch: 4 Global Step: 73800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:37:31,619-Speed 4296.04 samples/sec Loss 6.3975 Epoch: 4 Global Step: 73850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:37:43,336-Speed 4369.83 samples/sec Loss 6.4513 Epoch: 4 Global Step: 73900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:37:55,253-Speed 4296.66 samples/sec Loss 6.3676 Epoch: 4 Global Step: 73950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:38:06,980-Speed 4366.30 samples/sec Loss 6.3863 Epoch: 4 Global Step: 74000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:38:37,254-[lfw][74000]XNorm: 21.983966 Training: 2021-03-15 03:38:37,254-[lfw][74000]Accuracy-Flip: 0.99533+-0.00364 Training: 2021-03-15 03:38:37,254-[lfw][74000]Accuracy-Highest: 0.99650 Training: 2021-03-15 03:39:12,500-[cfp_fp][74000]XNorm: 18.388811 Training: 2021-03-15 03:39:12,501-[cfp_fp][74000]Accuracy-Flip: 0.93971+-0.01165 Training: 2021-03-15 03:39:12,501-[cfp_fp][74000]Accuracy-Highest: 0.93971 Training: 2021-03-15 03:39:43,253-[agedb_30][74000]XNorm: 21.357583 Training: 2021-03-15 03:39:43,254-[agedb_30][74000]Accuracy-Flip: 0.95367+-0.00918 Training: 2021-03-15 03:39:43,254-[agedb_30][74000]Accuracy-Highest: 0.96083 Training: 2021-03-15 03:39:55,838-Speed 470.34 samples/sec Loss 6.3955 Epoch: 4 Global Step: 74050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:40:07,506-Speed 4388.27 samples/sec Loss 6.4016 Epoch: 4 Global Step: 74100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:40:19,267-Speed 4353.86 samples/sec Loss 6.4563 Epoch: 4 Global Step: 74150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:40:32,033-Speed 4010.90 samples/sec Loss 6.4218 Epoch: 4 Global Step: 74200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:40:44,023-Speed 4270.29 samples/sec Loss 6.4769 Epoch: 4 Global Step: 74250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:40:55,715-Speed 4379.30 samples/sec Loss 6.4276 Epoch: 4 Global Step: 74300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:41:07,475-Speed 4354.15 samples/sec Loss 6.4280 Epoch: 4 Global Step: 74350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:41:18,930-Speed 4469.89 samples/sec Loss 6.4520 Epoch: 4 Global Step: 74400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:41:30,712-Speed 4346.02 samples/sec Loss 6.4159 Epoch: 4 Global Step: 74450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:41:42,381-Speed 4387.67 samples/sec Loss 6.4495 Epoch: 4 Global Step: 74500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:41:53,943-Speed 4428.73 samples/sec Loss 6.4176 Epoch: 4 Global Step: 74550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:42:05,781-Speed 4325.36 samples/sec Loss 6.4369 Epoch: 4 Global Step: 74600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:42:18,231-Speed 4112.84 samples/sec Loss 6.4073 Epoch: 4 Global Step: 74650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:42:29,799-Speed 4426.19 samples/sec Loss 6.4386 Epoch: 4 Global Step: 74700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:42:41,461-Speed 4390.61 samples/sec Loss 6.3810 Epoch: 4 Global Step: 74750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:42:53,119-Speed 4392.08 samples/sec Loss 6.3985 Epoch: 4 Global Step: 74800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:43:04,879-Speed 4354.09 samples/sec Loss 6.4517 Epoch: 4 Global Step: 74850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:43:16,575-Speed 4378.13 samples/sec Loss 6.4509 Epoch: 4 Global Step: 74900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:43:28,308-Speed 4363.85 samples/sec Loss 6.4583 Epoch: 4 Global Step: 74950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:43:40,210-Speed 4302.12 samples/sec Loss 6.3741 Epoch: 4 Global Step: 75000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:43:52,058-Speed 4321.85 samples/sec Loss 6.4368 Epoch: 4 Global Step: 75050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:44:03,710-Speed 4394.32 samples/sec Loss 6.3701 Epoch: 4 Global Step: 75100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:44:15,596-Speed 4307.79 samples/sec Loss 6.4663 Epoch: 4 Global Step: 75150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:44:28,318-Speed 4024.72 samples/sec Loss 6.4855 Epoch: 4 Global Step: 75200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:44:41,026-Speed 4029.47 samples/sec Loss 6.4198 Epoch: 4 Global Step: 75250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:44:52,834-Speed 4336.39 samples/sec Loss 6.4222 Epoch: 4 Global Step: 75300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:45:04,689-Speed 4319.04 samples/sec Loss 6.4535 Epoch: 4 Global Step: 75350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:45:16,601-Speed 4298.25 samples/sec Loss 6.4173 Epoch: 4 Global Step: 75400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:45:28,277-Speed 4385.41 samples/sec Loss 6.4604 Epoch: 4 Global Step: 75450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:45:40,083-Speed 4337.13 samples/sec Loss 6.4435 Epoch: 4 Global Step: 75500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:45:52,009-Speed 4293.27 samples/sec Loss 6.4428 Epoch: 4 Global Step: 75550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:46:04,891-Speed 3974.79 samples/sec Loss 6.4484 Epoch: 4 Global Step: 75600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:46:17,425-Speed 4085.34 samples/sec Loss 6.3921 Epoch: 4 Global Step: 75650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:46:29,164-Speed 4361.64 samples/sec Loss 6.3877 Epoch: 4 Global Step: 75700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:46:41,021-Speed 4318.25 samples/sec Loss 6.3963 Epoch: 4 Global Step: 75750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:46:52,726-Speed 4374.75 samples/sec Loss 6.4286 Epoch: 4 Global Step: 75800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:47:04,643-Speed 4296.61 samples/sec Loss 6.4165 Epoch: 4 Global Step: 75850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:47:16,589-Speed 4286.20 samples/sec Loss 6.4454 Epoch: 4 Global Step: 75900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:47:29,126-Speed 4084.12 samples/sec Loss 6.3996 Epoch: 4 Global Step: 75950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:47:41,054-Speed 4292.70 samples/sec Loss 6.4127 Epoch: 4 Global Step: 76000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:48:11,399-[lfw][76000]XNorm: 23.233263 Training: 2021-03-15 03:48:11,400-[lfw][76000]Accuracy-Flip: 0.99433+-0.00359 Training: 2021-03-15 03:48:11,400-[lfw][76000]Accuracy-Highest: 0.99650 Training: 2021-03-15 03:48:46,604-[cfp_fp][76000]XNorm: 19.400579 Training: 2021-03-15 03:48:46,604-[cfp_fp][76000]Accuracy-Flip: 0.93457+-0.00954 Training: 2021-03-15 03:48:46,606-[cfp_fp][76000]Accuracy-Highest: 0.93971 Training: 2021-03-15 03:49:17,009-[agedb_30][76000]XNorm: 22.489102 Training: 2021-03-15 03:49:17,009-[agedb_30][76000]Accuracy-Flip: 0.95533+-0.01194 Training: 2021-03-15 03:49:17,009-[agedb_30][76000]Accuracy-Highest: 0.96083 Training: 2021-03-15 03:49:28,660-Speed 475.81 samples/sec Loss 6.4461 Epoch: 4 Global Step: 76050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:49:40,586-Speed 4293.28 samples/sec Loss 6.4482 Epoch: 4 Global Step: 76100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:49:52,371-Speed 4344.66 samples/sec Loss 6.4494 Epoch: 4 Global Step: 76150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:50:04,197-Speed 4329.65 samples/sec Loss 6.4286 Epoch: 4 Global Step: 76200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:50:16,183-Speed 4271.93 samples/sec Loss 6.4177 Epoch: 4 Global Step: 76250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:50:28,004-Speed 4331.41 samples/sec Loss 6.4079 Epoch: 4 Global Step: 76300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:50:39,756-Speed 4356.97 samples/sec Loss 6.3564 Epoch: 4 Global Step: 76350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:50:51,684-Speed 4292.70 samples/sec Loss 6.3537 Epoch: 4 Global Step: 76400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:51:04,281-Speed 4064.51 samples/sec Loss 6.3641 Epoch: 4 Global Step: 76450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:51:16,170-Speed 4307.29 samples/sec Loss 6.4892 Epoch: 4 Global Step: 76500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:51:28,088-Speed 4296.03 samples/sec Loss 6.4215 Epoch: 4 Global Step: 76550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:51:39,823-Speed 4363.51 samples/sec Loss 6.4555 Epoch: 4 Global Step: 76600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:51:51,551-Speed 4365.80 samples/sec Loss 6.3940 Epoch: 4 Global Step: 76650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:52:03,335-Speed 4345.34 samples/sec Loss 6.3612 Epoch: 4 Global Step: 76700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:52:15,179-Speed 4322.94 samples/sec Loss 6.4181 Epoch: 4 Global Step: 76750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:52:26,903-Speed 4367.22 samples/sec Loss 6.3890 Epoch: 4 Global Step: 76800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:52:38,886-Speed 4273.11 samples/sec Loss 6.4281 Epoch: 4 Global Step: 76850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:52:51,589-Speed 4030.68 samples/sec Loss 6.4540 Epoch: 4 Global Step: 76900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:53:03,520-Speed 4291.56 samples/sec Loss 6.4414 Epoch: 4 Global Step: 76950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:53:15,537-Speed 4260.95 samples/sec Loss 6.4923 Epoch: 4 Global Step: 77000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:53:27,225-Speed 4380.88 samples/sec Loss 6.4430 Epoch: 4 Global Step: 77050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:53:39,900-Speed 4039.79 samples/sec Loss 6.4484 Epoch: 4 Global Step: 77100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:53:51,674-Speed 4348.64 samples/sec Loss 6.3989 Epoch: 4 Global Step: 77150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:54:03,696-Speed 4259.22 samples/sec Loss 6.4488 Epoch: 4 Global Step: 77200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:54:15,540-Speed 4322.93 samples/sec Loss 6.4403 Epoch: 4 Global Step: 77250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:54:27,327-Speed 4343.86 samples/sec Loss 6.4280 Epoch: 4 Global Step: 77300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:54:39,356-Speed 4256.55 samples/sec Loss 6.4507 Epoch: 4 Global Step: 77350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:54:50,989-Speed 4401.58 samples/sec Loss 6.3811 Epoch: 4 Global Step: 77400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:55:02,882-Speed 4305.45 samples/sec Loss 6.4037 Epoch: 4 Global Step: 77450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:55:15,537-Speed 4046.24 samples/sec Loss 6.4518 Epoch: 4 Global Step: 77500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:55:28,163-Speed 4055.21 samples/sec Loss 6.5162 Epoch: 4 Global Step: 77550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:55:39,988-Speed 4330.36 samples/sec Loss 6.4406 Epoch: 4 Global Step: 77600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:55:51,748-Speed 4354.03 samples/sec Loss 6.4143 Epoch: 4 Global Step: 77650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:56:03,622-Speed 4312.05 samples/sec Loss 6.3843 Epoch: 4 Global Step: 77700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:56:15,396-Speed 4349.03 samples/sec Loss 6.3697 Epoch: 4 Global Step: 77750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:56:27,155-Speed 4354.59 samples/sec Loss 6.4450 Epoch: 4 Global Step: 77800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:56:38,753-Speed 4414.51 samples/sec Loss 6.4473 Epoch: 4 Global Step: 77850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:56:50,345-Speed 4417.39 samples/sec Loss 6.3647 Epoch: 4 Global Step: 77900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:57:02,330-Speed 4272.14 samples/sec Loss 6.4040 Epoch: 4 Global Step: 77950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:57:15,000-Speed 4041.26 samples/sec Loss 6.4028 Epoch: 4 Global Step: 78000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:57:45,681-[lfw][78000]XNorm: 22.616188 Training: 2021-03-15 03:57:45,681-[lfw][78000]Accuracy-Flip: 0.99483+-0.00189 Training: 2021-03-15 03:57:45,681-[lfw][78000]Accuracy-Highest: 0.99650 Training: 2021-03-15 03:58:20,967-[cfp_fp][78000]XNorm: 18.761890 Training: 2021-03-15 03:58:20,967-[cfp_fp][78000]Accuracy-Flip: 0.93086+-0.01114 Training: 2021-03-15 03:58:20,967-[cfp_fp][78000]Accuracy-Highest: 0.93971 Training: 2021-03-15 03:58:51,885-[agedb_30][78000]XNorm: 21.528658 Training: 2021-03-15 03:58:51,886-[agedb_30][78000]Accuracy-Flip: 0.95383+-0.01014 Training: 2021-03-15 03:58:51,886-[agedb_30][78000]Accuracy-Highest: 0.96083 Training: 2021-03-15 03:59:04,455-Speed 467.78 samples/sec Loss 6.4411 Epoch: 4 Global Step: 78050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:59:15,989-Speed 4439.39 samples/sec Loss 6.3563 Epoch: 4 Global Step: 78100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:59:27,850-Speed 4316.55 samples/sec Loss 6.5370 Epoch: 4 Global Step: 78150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:59:39,726-Speed 4311.49 samples/sec Loss 6.4526 Epoch: 4 Global Step: 78200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 03:59:51,853-Speed 4222.49 samples/sec Loss 6.3916 Epoch: 4 Global Step: 78250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:00:04,703-Speed 3984.52 samples/sec Loss 6.4414 Epoch: 4 Global Step: 78300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:00:16,369-Speed 4389.18 samples/sec Loss 6.3780 Epoch: 4 Global Step: 78350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:00:28,137-Speed 4350.87 samples/sec Loss 6.4561 Epoch: 4 Global Step: 78400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:00:39,747-Speed 4410.23 samples/sec Loss 6.3871 Epoch: 4 Global Step: 78450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:00:51,463-Speed 4370.57 samples/sec Loss 6.3960 Epoch: 4 Global Step: 78500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:01:03,333-Speed 4313.32 samples/sec Loss 6.4527 Epoch: 4 Global Step: 78550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:01:15,025-Speed 4379.27 samples/sec Loss 6.4566 Epoch: 4 Global Step: 78600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:01:26,862-Speed 4326.04 samples/sec Loss 6.4421 Epoch: 4 Global Step: 78650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:01:38,519-Speed 4392.32 samples/sec Loss 6.3628 Epoch: 4 Global Step: 78700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:01:50,363-Speed 4323.05 samples/sec Loss 6.4275 Epoch: 4 Global Step: 78750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:02:02,990-Speed 4055.14 samples/sec Loss 6.4158 Epoch: 4 Global Step: 78800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:02:14,923-Speed 4290.76 samples/sec Loss 6.4109 Epoch: 4 Global Step: 78850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:02:26,866-Speed 4287.55 samples/sec Loss 6.4116 Epoch: 4 Global Step: 78900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:02:38,609-Speed 4360.14 samples/sec Loss 6.4146 Epoch: 4 Global Step: 78950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:02:50,562-Speed 4283.64 samples/sec Loss 6.4588 Epoch: 4 Global Step: 79000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:03:02,398-Speed 4326.34 samples/sec Loss 6.4113 Epoch: 4 Global Step: 79050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:03:14,106-Speed 4373.42 samples/sec Loss 6.4212 Epoch: 4 Global Step: 79100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:03:25,782-Speed 4385.25 samples/sec Loss 6.4223 Epoch: 4 Global Step: 79150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:03:37,324-Speed 4436.16 samples/sec Loss 6.3811 Epoch: 4 Global Step: 79200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:03:49,208-Speed 4308.62 samples/sec Loss 6.3893 Epoch: 4 Global Step: 79250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:04:01,563-Speed 4144.44 samples/sec Loss 6.3796 Epoch: 4 Global Step: 79300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:04:13,341-Speed 4347.23 samples/sec Loss 6.4161 Epoch: 4 Global Step: 79350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:04:25,166-Speed 4330.08 samples/sec Loss 6.4210 Epoch: 4 Global Step: 79400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:04:36,742-Speed 4423.08 samples/sec Loss 6.4736 Epoch: 4 Global Step: 79450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:04:48,575-Speed 4327.15 samples/sec Loss 6.4341 Epoch: 4 Global Step: 79500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:05:00,577-Speed 4266.07 samples/sec Loss 6.4116 Epoch: 4 Global Step: 79550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:05:12,454-Speed 4311.48 samples/sec Loss 6.3933 Epoch: 4 Global Step: 79600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:05:24,410-Speed 4282.62 samples/sec Loss 6.4765 Epoch: 4 Global Step: 79650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:05:36,113-Speed 4374.94 samples/sec Loss 6.4518 Epoch: 4 Global Step: 79700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:05:49,197-Speed 3913.54 samples/sec Loss 6.4942 Epoch: 4 Global Step: 79750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:06:01,815-Speed 4058.06 samples/sec Loss 6.4201 Epoch: 4 Global Step: 79800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:06:13,589-Speed 4348.69 samples/sec Loss 6.3622 Epoch: 4 Global Step: 79850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:06:25,216-Speed 4403.81 samples/sec Loss 6.4385 Epoch: 4 Global Step: 79900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:06:36,956-Speed 4361.47 samples/sec Loss 6.4507 Epoch: 4 Global Step: 79950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:06:48,687-Speed 4364.93 samples/sec Loss 6.4285 Epoch: 4 Global Step: 80000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:07:19,085-[lfw][80000]XNorm: 21.741372 Training: 2021-03-15 04:07:19,086-[lfw][80000]Accuracy-Flip: 0.99533+-0.00420 Training: 2021-03-15 04:07:19,086-[lfw][80000]Accuracy-Highest: 0.99650 Training: 2021-03-15 04:07:54,100-[cfp_fp][80000]XNorm: 17.899919 Training: 2021-03-15 04:07:54,101-[cfp_fp][80000]Accuracy-Flip: 0.93243+-0.00658 Training: 2021-03-15 04:07:54,101-[cfp_fp][80000]Accuracy-Highest: 0.93971 Training: 2021-03-15 04:08:24,249-[agedb_30][80000]XNorm: 21.215334 Training: 2021-03-15 04:08:24,250-[agedb_30][80000]Accuracy-Flip: 0.95250+-0.01070 Training: 2021-03-15 04:08:24,250-[agedb_30][80000]Accuracy-Highest: 0.96083 Training: 2021-03-15 04:08:35,969-Speed 477.25 samples/sec Loss 6.4299 Epoch: 4 Global Step: 80050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:08:47,566-Speed 4414.97 samples/sec Loss 6.3647 Epoch: 4 Global Step: 80100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:08:59,532-Speed 4278.93 samples/sec Loss 6.3733 Epoch: 4 Global Step: 80150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:09:11,200-Speed 4388.34 samples/sec Loss 6.3837 Epoch: 4 Global Step: 80200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:09:22,961-Speed 4353.70 samples/sec Loss 6.3829 Epoch: 4 Global Step: 80250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:09:34,842-Speed 4309.43 samples/sec Loss 6.3884 Epoch: 4 Global Step: 80300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:09:46,614-Speed 4349.85 samples/sec Loss 6.4158 Epoch: 4 Global Step: 80350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:09:58,378-Speed 4352.47 samples/sec Loss 6.3943 Epoch: 4 Global Step: 80400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:10:11,153-Speed 4007.84 samples/sec Loss 6.4136 Epoch: 4 Global Step: 80450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:10:23,773-Speed 4057.50 samples/sec Loss 6.4462 Epoch: 4 Global Step: 80500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:10:35,591-Speed 4332.28 samples/sec Loss 6.3440 Epoch: 4 Global Step: 80550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:10:47,468-Speed 4311.14 samples/sec Loss 6.4256 Epoch: 4 Global Step: 80600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:10:59,196-Speed 4366.06 samples/sec Loss 6.3619 Epoch: 4 Global Step: 80650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:11:10,869-Speed 4386.52 samples/sec Loss 6.3648 Epoch: 4 Global Step: 80700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:11:22,799-Speed 4291.74 samples/sec Loss 6.4317 Epoch: 4 Global Step: 80750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:11:34,529-Speed 4365.25 samples/sec Loss 6.4100 Epoch: 4 Global Step: 80800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:11:47,207-Speed 4038.84 samples/sec Loss 6.3689 Epoch: 4 Global Step: 80850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:11:58,880-Speed 4386.45 samples/sec Loss 6.4364 Epoch: 4 Global Step: 80900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:12:10,760-Speed 4309.94 samples/sec Loss 6.3876 Epoch: 4 Global Step: 80950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:12:22,645-Speed 4307.94 samples/sec Loss 6.3889 Epoch: 4 Global Step: 81000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:12:34,435-Speed 4343.03 samples/sec Loss 6.4256 Epoch: 4 Global Step: 81050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:12:46,155-Speed 4369.01 samples/sec Loss 6.3878 Epoch: 4 Global Step: 81100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:12:58,075-Speed 4295.59 samples/sec Loss 6.3725 Epoch: 4 Global Step: 81150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:13:10,540-Speed 4107.65 samples/sec Loss 6.4314 Epoch: 4 Global Step: 81200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:13:22,360-Speed 4331.86 samples/sec Loss 6.3922 Epoch: 4 Global Step: 81250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:13:34,108-Speed 4358.47 samples/sec Loss 6.4227 Epoch: 4 Global Step: 81300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:13:46,009-Speed 4302.24 samples/sec Loss 6.4462 Epoch: 4 Global Step: 81350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:13:57,734-Speed 4367.22 samples/sec Loss 6.3964 Epoch: 4 Global Step: 81400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:14:09,397-Speed 4389.86 samples/sec Loss 6.4174 Epoch: 4 Global Step: 81450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:14:21,096-Speed 4376.76 samples/sec Loss 6.4033 Epoch: 4 Global Step: 81500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:14:32,926-Speed 4328.54 samples/sec Loss 6.4163 Epoch: 4 Global Step: 81550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:14:45,669-Speed 4018.04 samples/sec Loss 6.4155 Epoch: 4 Global Step: 81600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:14:57,243-Speed 4423.92 samples/sec Loss 6.4236 Epoch: 4 Global Step: 81650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:15:09,275-Speed 4255.47 samples/sec Loss 6.4775 Epoch: 4 Global Step: 81700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:15:21,125-Speed 4321.09 samples/sec Loss 6.3767 Epoch: 4 Global Step: 81750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:15:33,062-Speed 4289.28 samples/sec Loss 6.4123 Epoch: 4 Global Step: 81800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:15:44,731-Speed 4387.91 samples/sec Loss 6.4455 Epoch: 4 Global Step: 81850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:15:57,562-Speed 3990.78 samples/sec Loss 6.3923 Epoch: 4 Global Step: 81900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:16:09,286-Speed 4367.41 samples/sec Loss 6.4722 Epoch: 4 Global Step: 81950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:16:21,082-Speed 4340.62 samples/sec Loss 6.4173 Epoch: 4 Global Step: 82000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:16:51,341-[lfw][82000]XNorm: 22.922236 Training: 2021-03-15 04:16:51,341-[lfw][82000]Accuracy-Flip: 0.99583+-0.00271 Training: 2021-03-15 04:16:51,341-[lfw][82000]Accuracy-Highest: 0.99650 Training: 2021-03-15 04:17:26,126-[cfp_fp][82000]XNorm: 19.396101 Training: 2021-03-15 04:17:26,126-[cfp_fp][82000]Accuracy-Flip: 0.93386+-0.01129 Training: 2021-03-15 04:17:26,126-[cfp_fp][82000]Accuracy-Highest: 0.93971 Training: 2021-03-15 04:17:56,255-[agedb_30][82000]XNorm: 22.526123 Training: 2021-03-15 04:17:56,256-[agedb_30][82000]Accuracy-Flip: 0.95783+-0.00803 Training: 2021-03-15 04:17:56,256-[agedb_30][82000]Accuracy-Highest: 0.96083 Training: 2021-03-15 04:18:08,907-Speed 474.85 samples/sec Loss 6.4027 Epoch: 4 Global Step: 82050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:18:20,640-Speed 4363.84 samples/sec Loss 6.4470 Epoch: 4 Global Step: 82100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:18:32,314-Speed 4386.23 samples/sec Loss 6.4013 Epoch: 4 Global Step: 82150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:18:43,936-Speed 4405.36 samples/sec Loss 6.4277 Epoch: 4 Global Step: 82200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:18:55,752-Speed 4333.57 samples/sec Loss 6.4088 Epoch: 4 Global Step: 82250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:19:07,610-Speed 4317.79 samples/sec Loss 6.4271 Epoch: 4 Global Step: 82300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:19:19,371-Speed 4353.78 samples/sec Loss 6.3716 Epoch: 4 Global Step: 82350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:19:31,058-Speed 4381.33 samples/sec Loss 6.3701 Epoch: 4 Global Step: 82400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:19:42,599-Speed 4436.44 samples/sec Loss 6.4387 Epoch: 4 Global Step: 82450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:19:54,209-Speed 4410.08 samples/sec Loss 6.4516 Epoch: 4 Global Step: 82500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:20:06,679-Speed 4106.26 samples/sec Loss 6.4124 Epoch: 4 Global Step: 82550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:20:18,436-Speed 4355.25 samples/sec Loss 6.4147 Epoch: 4 Global Step: 82600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:20:30,165-Speed 4365.39 samples/sec Loss 6.4024 Epoch: 4 Global Step: 82650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:20:41,902-Speed 4362.50 samples/sec Loss 6.3173 Epoch: 4 Global Step: 82700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:20:53,538-Speed 4400.39 samples/sec Loss 6.4311 Epoch: 4 Global Step: 82750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:21:06,103-Speed 4075.13 samples/sec Loss 6.4253 Epoch: 4 Global Step: 82800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:21:17,973-Speed 4313.71 samples/sec Loss 6.4046 Epoch: 4 Global Step: 82850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:21:30,769-Speed 4001.43 samples/sec Loss 6.3841 Epoch: 4 Global Step: 82900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:21:42,422-Speed 4394.00 samples/sec Loss 6.3789 Epoch: 4 Global Step: 82950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:21:54,329-Speed 4300.42 samples/sec Loss 6.4918 Epoch: 4 Global Step: 83000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:22:05,788-Speed 4468.15 samples/sec Loss 6.3887 Epoch: 4 Global Step: 83050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:22:17,742-Speed 4283.53 samples/sec Loss 6.4815 Epoch: 4 Global Step: 83100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:22:29,581-Speed 4324.76 samples/sec Loss 6.3654 Epoch: 4 Global Step: 83150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:22:41,284-Speed 4375.10 samples/sec Loss 6.3865 Epoch: 4 Global Step: 83200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:22:52,986-Speed 4375.69 samples/sec Loss 6.4169 Epoch: 4 Global Step: 83250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:23:04,871-Speed 4308.24 samples/sec Loss 6.3787 Epoch: 4 Global Step: 83300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:23:17,603-Speed 4021.58 samples/sec Loss 6.3318 Epoch: 4 Global Step: 83350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:23:29,334-Speed 4365.08 samples/sec Loss 6.4155 Epoch: 4 Global Step: 83400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:23:41,508-Speed 4205.85 samples/sec Loss 6.4276 Epoch: 4 Global Step: 83450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:24:05,996-Speed 2090.93 samples/sec Loss 5.8517 Epoch: 5 Global Step: 83500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:24:18,607-Speed 4060.28 samples/sec Loss 5.7205 Epoch: 5 Global Step: 83550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:24:29,970-Speed 4505.91 samples/sec Loss 5.7212 Epoch: 5 Global Step: 83600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:24:41,560-Speed 4417.86 samples/sec Loss 5.8082 Epoch: 5 Global Step: 83650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:24:53,202-Speed 4398.21 samples/sec Loss 5.7738 Epoch: 5 Global Step: 83700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:25:04,808-Speed 4411.60 samples/sec Loss 5.7504 Epoch: 5 Global Step: 83750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:25:16,289-Speed 4460.01 samples/sec Loss 5.7985 Epoch: 5 Global Step: 83800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:25:27,855-Speed 4426.70 samples/sec Loss 5.8419 Epoch: 5 Global Step: 83850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:25:39,289-Speed 4478.29 samples/sec Loss 5.8646 Epoch: 5 Global Step: 83900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:25:50,913-Speed 4404.91 samples/sec Loss 5.8656 Epoch: 5 Global Step: 83950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:26:02,544-Speed 4402.26 samples/sec Loss 5.8916 Epoch: 5 Global Step: 84000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:26:32,936-[lfw][84000]XNorm: 21.798989 Training: 2021-03-15 04:26:32,936-[lfw][84000]Accuracy-Flip: 0.99683+-0.00293 Training: 2021-03-15 04:26:32,936-[lfw][84000]Accuracy-Highest: 0.99683 Training: 2021-03-15 04:27:08,156-[cfp_fp][84000]XNorm: 18.341610 Training: 2021-03-15 04:27:08,157-[cfp_fp][84000]Accuracy-Flip: 0.93643+-0.01088 Training: 2021-03-15 04:27:08,157-[cfp_fp][84000]Accuracy-Highest: 0.93971 Training: 2021-03-15 04:27:38,296-[agedb_30][84000]XNorm: 20.874474 Training: 2021-03-15 04:27:38,296-[agedb_30][84000]Accuracy-Flip: 0.95700+-0.00897 Training: 2021-03-15 04:27:38,298-[agedb_30][84000]Accuracy-Highest: 0.96083 Training: 2021-03-15 04:27:50,682-Speed 473.47 samples/sec Loss 5.9156 Epoch: 5 Global Step: 84050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:28:01,962-Speed 4539.10 samples/sec Loss 5.9296 Epoch: 5 Global Step: 84100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:28:14,341-Speed 4136.39 samples/sec Loss 5.9503 Epoch: 5 Global Step: 84150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:28:25,627-Speed 4536.94 samples/sec Loss 5.9544 Epoch: 5 Global Step: 84200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:28:37,208-Speed 4421.07 samples/sec Loss 5.9271 Epoch: 5 Global Step: 84250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:28:48,730-Speed 4444.05 samples/sec Loss 5.9674 Epoch: 5 Global Step: 84300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:29:00,205-Speed 4461.93 samples/sec Loss 5.9632 Epoch: 5 Global Step: 84350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:29:12,160-Speed 4283.12 samples/sec Loss 6.0066 Epoch: 5 Global Step: 84400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:29:23,581-Speed 4483.19 samples/sec Loss 6.0031 Epoch: 5 Global Step: 84450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:29:34,953-Speed 4502.43 samples/sec Loss 6.0166 Epoch: 5 Global Step: 84500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:29:46,430-Speed 4461.31 samples/sec Loss 6.0128 Epoch: 5 Global Step: 84550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:29:57,767-Speed 4516.55 samples/sec Loss 6.0659 Epoch: 5 Global Step: 84600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:30:09,151-Speed 4497.52 samples/sec Loss 6.0546 Epoch: 5 Global Step: 84650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:30:20,660-Speed 4448.99 samples/sec Loss 5.9683 Epoch: 5 Global Step: 84700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:30:32,092-Speed 4478.83 samples/sec Loss 6.0866 Epoch: 5 Global Step: 84750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:30:43,466-Speed 4501.74 samples/sec Loss 6.0927 Epoch: 5 Global Step: 84800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:30:54,876-Speed 4487.35 samples/sec Loss 6.0468 Epoch: 5 Global Step: 84850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:31:06,191-Speed 4525.17 samples/sec Loss 6.1297 Epoch: 5 Global Step: 84900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:31:17,597-Speed 4489.32 samples/sec Loss 6.0905 Epoch: 5 Global Step: 84950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:31:28,917-Speed 4523.04 samples/sec Loss 6.1403 Epoch: 5 Global Step: 85000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:31:40,182-Speed 4545.26 samples/sec Loss 6.0984 Epoch: 5 Global Step: 85050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:31:51,510-Speed 4520.09 samples/sec Loss 6.0876 Epoch: 5 Global Step: 85100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:32:02,871-Speed 4507.06 samples/sec Loss 6.1927 Epoch: 5 Global Step: 85150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:32:14,241-Speed 4503.15 samples/sec Loss 6.1285 Epoch: 5 Global Step: 85200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:32:27,087-Speed 3986.09 samples/sec Loss 6.1290 Epoch: 5 Global Step: 85250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:32:38,665-Speed 4422.36 samples/sec Loss 6.1546 Epoch: 5 Global Step: 85300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:32:51,228-Speed 4075.66 samples/sec Loss 6.1229 Epoch: 5 Global Step: 85350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:33:02,544-Speed 4524.56 samples/sec Loss 6.1184 Epoch: 5 Global Step: 85400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:33:13,866-Speed 4522.58 samples/sec Loss 6.1450 Epoch: 5 Global Step: 85450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:33:25,399-Speed 4439.66 samples/sec Loss 6.1283 Epoch: 5 Global Step: 85500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:33:36,779-Speed 4499.44 samples/sec Loss 6.1631 Epoch: 5 Global Step: 85550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:33:48,208-Speed 4479.88 samples/sec Loss 6.1413 Epoch: 5 Global Step: 85600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:33:59,774-Speed 4427.22 samples/sec Loss 6.1258 Epoch: 5 Global Step: 85650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:34:11,089-Speed 4524.99 samples/sec Loss 6.1614 Epoch: 5 Global Step: 85700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:34:22,287-Speed 4572.51 samples/sec Loss 6.2326 Epoch: 5 Global Step: 85750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:34:33,784-Speed 4453.67 samples/sec Loss 6.1509 Epoch: 5 Global Step: 85800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:34:46,071-Speed 4167.25 samples/sec Loss 6.1893 Epoch: 5 Global Step: 85850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:34:57,339-Speed 4543.89 samples/sec Loss 6.1785 Epoch: 5 Global Step: 85900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:35:08,575-Speed 4557.18 samples/sec Loss 6.1603 Epoch: 5 Global Step: 85950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:35:19,759-Speed 4578.20 samples/sec Loss 6.1590 Epoch: 5 Global Step: 86000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:35:50,054-[lfw][86000]XNorm: 24.369923 Training: 2021-03-15 04:35:50,054-[lfw][86000]Accuracy-Flip: 0.99517+-0.00252 Training: 2021-03-15 04:35:50,054-[lfw][86000]Accuracy-Highest: 0.99683 Training: 2021-03-15 04:36:25,287-[cfp_fp][86000]XNorm: 20.421685 Training: 2021-03-15 04:36:25,288-[cfp_fp][86000]Accuracy-Flip: 0.93414+-0.01368 Training: 2021-03-15 04:36:25,288-[cfp_fp][86000]Accuracy-Highest: 0.93971 Training: 2021-03-15 04:36:55,597-[agedb_30][86000]XNorm: 23.602139 Training: 2021-03-15 04:36:55,598-[agedb_30][86000]Accuracy-Flip: 0.95750+-0.01302 Training: 2021-03-15 04:36:55,598-[agedb_30][86000]Accuracy-Highest: 0.96083 Training: 2021-03-15 04:37:07,694-Speed 474.36 samples/sec Loss 6.2311 Epoch: 5 Global Step: 86050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:37:18,940-Speed 4553.15 samples/sec Loss 6.1649 Epoch: 5 Global Step: 86100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:37:30,348-Speed 4488.27 samples/sec Loss 6.1811 Epoch: 5 Global Step: 86150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:37:41,870-Speed 4443.86 samples/sec Loss 6.1524 Epoch: 5 Global Step: 86200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:37:53,177-Speed 4528.59 samples/sec Loss 6.1953 Epoch: 5 Global Step: 86250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:38:04,499-Speed 4522.26 samples/sec Loss 6.1645 Epoch: 5 Global Step: 86300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-15 04:38:15,972-Speed 4462.85 samples/sec Loss 6.2516 Epoch: 5 Global Step: 86350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:38:27,514-Speed 4436.46 samples/sec Loss 6.2176 Epoch: 5 Global Step: 86400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:38:39,060-Speed 4434.57 samples/sec Loss 6.2350 Epoch: 5 Global Step: 86450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:38:50,645-Speed 4419.94 samples/sec Loss 6.2592 Epoch: 5 Global Step: 86500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:39:01,825-Speed 4579.85 samples/sec Loss 6.1913 Epoch: 5 Global Step: 86550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:39:14,102-Speed 4170.38 samples/sec Loss 6.2003 Epoch: 5 Global Step: 86600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:39:25,380-Speed 4540.33 samples/sec Loss 6.2562 Epoch: 5 Global Step: 86650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:39:36,855-Speed 4462.11 samples/sec Loss 6.2402 Epoch: 5 Global Step: 86700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:39:49,175-Speed 4155.81 samples/sec Loss 6.1679 Epoch: 5 Global Step: 86750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:40:00,609-Speed 4478.10 samples/sec Loss 6.2334 Epoch: 5 Global Step: 86800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:40:12,009-Speed 4491.54 samples/sec Loss 6.2820 Epoch: 5 Global Step: 86850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:40:23,336-Speed 4520.35 samples/sec Loss 6.2054 Epoch: 5 Global Step: 86900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:40:35,590-Speed 4178.42 samples/sec Loss 6.2516 Epoch: 5 Global Step: 86950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:40:46,948-Speed 4508.55 samples/sec Loss 6.3041 Epoch: 5 Global Step: 87000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:40:58,353-Speed 4489.52 samples/sec Loss 6.2633 Epoch: 5 Global Step: 87050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:41:09,692-Speed 4515.68 samples/sec Loss 6.2705 Epoch: 5 Global Step: 87100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:41:21,048-Speed 4508.62 samples/sec Loss 6.2362 Epoch: 5 Global Step: 87150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:41:32,378-Speed 4519.25 samples/sec Loss 6.2908 Epoch: 5 Global Step: 87200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:41:43,770-Speed 4494.65 samples/sec Loss 6.2376 Epoch: 5 Global Step: 87250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:41:54,934-Speed 4586.54 samples/sec Loss 6.2962 Epoch: 5 Global Step: 87300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:42:06,530-Speed 4415.47 samples/sec Loss 6.2383 Epoch: 5 Global Step: 87350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:42:18,047-Speed 4445.93 samples/sec Loss 6.2224 Epoch: 5 Global Step: 87400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:42:29,386-Speed 4515.33 samples/sec Loss 6.3635 Epoch: 5 Global Step: 87450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:42:41,017-Speed 4402.39 samples/sec Loss 6.2504 Epoch: 5 Global Step: 87500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:42:52,489-Speed 4463.24 samples/sec Loss 6.2360 Epoch: 5 Global Step: 87550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:43:03,866-Speed 4500.78 samples/sec Loss 6.2362 Epoch: 5 Global Step: 87600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:43:15,139-Speed 4541.84 samples/sec Loss 6.2705 Epoch: 5 Global Step: 87650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:43:26,373-Speed 4557.87 samples/sec Loss 6.2517 Epoch: 5 Global Step: 87700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:43:37,604-Speed 4559.16 samples/sec Loss 6.3217 Epoch: 5 Global Step: 87750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:43:49,037-Speed 4478.44 samples/sec Loss 6.3295 Epoch: 5 Global Step: 87800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:44:01,148-Speed 4227.72 samples/sec Loss 6.2658 Epoch: 5 Global Step: 87850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:44:14,315-Speed 3888.66 samples/sec Loss 6.3266 Epoch: 5 Global Step: 87900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:44:25,577-Speed 4546.51 samples/sec Loss 6.2919 Epoch: 5 Global Step: 87950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:44:37,049-Speed 4463.23 samples/sec Loss 6.3142 Epoch: 5 Global Step: 88000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:45:07,201-[lfw][88000]XNorm: 22.220488 Training: 2021-03-15 04:45:07,201-[lfw][88000]Accuracy-Flip: 0.99617+-0.00269 Training: 2021-03-15 04:45:07,201-[lfw][88000]Accuracy-Highest: 0.99683 Training: 2021-03-15 04:45:42,201-[cfp_fp][88000]XNorm: 18.743801 Training: 2021-03-15 04:45:42,202-[cfp_fp][88000]Accuracy-Flip: 0.93500+-0.01138 Training: 2021-03-15 04:45:42,202-[cfp_fp][88000]Accuracy-Highest: 0.93971 Training: 2021-03-15 04:46:12,427-[agedb_30][88000]XNorm: 21.790101 Training: 2021-03-15 04:46:12,427-[agedb_30][88000]Accuracy-Flip: 0.95333+-0.01033 Training: 2021-03-15 04:46:12,427-[agedb_30][88000]Accuracy-Highest: 0.96083 Training: 2021-03-15 04:46:23,876-Speed 479.28 samples/sec Loss 6.3232 Epoch: 5 Global Step: 88050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:46:35,044-Speed 4584.89 samples/sec Loss 6.3608 Epoch: 5 Global Step: 88100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:46:46,482-Speed 4476.42 samples/sec Loss 6.3795 Epoch: 5 Global Step: 88150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:46:57,779-Speed 4532.50 samples/sec Loss 6.3029 Epoch: 5 Global Step: 88200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:47:09,221-Speed 4475.03 samples/sec Loss 6.2666 Epoch: 5 Global Step: 88250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:47:20,598-Speed 4500.45 samples/sec Loss 6.2887 Epoch: 5 Global Step: 88300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:47:32,981-Speed 4135.03 samples/sec Loss 6.3235 Epoch: 5 Global Step: 88350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:47:44,633-Speed 4394.23 samples/sec Loss 6.2594 Epoch: 5 Global Step: 88400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:47:56,130-Speed 4453.58 samples/sec Loss 6.2989 Epoch: 5 Global Step: 88450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:48:07,442-Speed 4526.56 samples/sec Loss 6.3026 Epoch: 5 Global Step: 88500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:48:18,838-Speed 4492.96 samples/sec Loss 6.3356 Epoch: 5 Global Step: 88550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:48:31,207-Speed 4139.51 samples/sec Loss 6.3340 Epoch: 5 Global Step: 88600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:48:42,626-Speed 4483.98 samples/sec Loss 6.3957 Epoch: 5 Global Step: 88650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:48:53,865-Speed 4555.63 samples/sec Loss 6.3750 Epoch: 5 Global Step: 88700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:49:05,392-Speed 4442.06 samples/sec Loss 6.2937 Epoch: 5 Global Step: 88750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:49:16,723-Speed 4518.83 samples/sec Loss 6.3077 Epoch: 5 Global Step: 88800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:49:28,116-Speed 4493.97 samples/sec Loss 6.2696 Epoch: 5 Global Step: 88850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:49:39,430-Speed 4525.76 samples/sec Loss 6.3257 Epoch: 5 Global Step: 88900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:49:50,888-Speed 4468.50 samples/sec Loss 6.3933 Epoch: 5 Global Step: 88950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:50:02,334-Speed 4473.62 samples/sec Loss 6.3369 Epoch: 5 Global Step: 89000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:50:13,897-Speed 4427.98 samples/sec Loss 6.3252 Epoch: 5 Global Step: 89050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:50:25,174-Speed 4540.52 samples/sec Loss 6.3326 Epoch: 5 Global Step: 89100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:50:37,217-Speed 4251.67 samples/sec Loss 6.3368 Epoch: 5 Global Step: 89150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:50:48,580-Speed 4506.20 samples/sec Loss 6.2922 Epoch: 5 Global Step: 89200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:51:00,587-Speed 4264.23 samples/sec Loss 6.3392 Epoch: 5 Global Step: 89250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:51:11,877-Speed 4535.24 samples/sec Loss 6.3082 Epoch: 5 Global Step: 89300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:51:23,311-Speed 4478.25 samples/sec Loss 6.3341 Epoch: 5 Global Step: 89350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:51:34,441-Speed 4600.51 samples/sec Loss 6.3530 Epoch: 5 Global Step: 89400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:51:45,735-Speed 4533.47 samples/sec Loss 6.3436 Epoch: 5 Global Step: 89450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:51:57,158-Speed 4482.19 samples/sec Loss 6.3092 Epoch: 5 Global Step: 89500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:52:09,452-Speed 4165.00 samples/sec Loss 6.3352 Epoch: 5 Global Step: 89550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:52:20,768-Speed 4524.71 samples/sec Loss 6.3820 Epoch: 5 Global Step: 89600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:52:32,191-Speed 4482.28 samples/sec Loss 6.3678 Epoch: 5 Global Step: 89650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:52:43,609-Speed 4484.42 samples/sec Loss 6.2680 Epoch: 5 Global Step: 89700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:52:55,037-Speed 4480.68 samples/sec Loss 6.3442 Epoch: 5 Global Step: 89750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:53:06,459-Speed 4482.70 samples/sec Loss 6.2708 Epoch: 5 Global Step: 89800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:53:17,896-Speed 4476.96 samples/sec Loss 6.3567 Epoch: 5 Global Step: 89850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:53:29,429-Speed 4439.73 samples/sec Loss 6.3558 Epoch: 5 Global Step: 89900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:53:40,901-Speed 4463.34 samples/sec Loss 6.3399 Epoch: 5 Global Step: 89950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:53:52,297-Speed 4492.95 samples/sec Loss 6.2357 Epoch: 5 Global Step: 90000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:54:22,484-[lfw][90000]XNorm: 21.771358 Training: 2021-03-15 04:54:22,485-[lfw][90000]Accuracy-Flip: 0.99600+-0.00318 Training: 2021-03-15 04:54:22,485-[lfw][90000]Accuracy-Highest: 0.99683 Training: 2021-03-15 04:54:57,384-[cfp_fp][90000]XNorm: 18.492584 Training: 2021-03-15 04:54:57,384-[cfp_fp][90000]Accuracy-Flip: 0.93614+-0.01050 Training: 2021-03-15 04:54:57,385-[cfp_fp][90000]Accuracy-Highest: 0.93971 Training: 2021-03-15 04:55:27,410-[agedb_30][90000]XNorm: 20.924728 Training: 2021-03-15 04:55:27,411-[agedb_30][90000]Accuracy-Flip: 0.95367+-0.00977 Training: 2021-03-15 04:55:27,411-[agedb_30][90000]Accuracy-Highest: 0.96083 Training: 2021-03-15 04:55:38,831-Speed 480.60 samples/sec Loss 6.3736 Epoch: 5 Global Step: 90050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:55:50,115-Speed 4537.90 samples/sec Loss 6.3379 Epoch: 5 Global Step: 90100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:56:01,597-Speed 4459.42 samples/sec Loss 6.2870 Epoch: 5 Global Step: 90150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:56:13,113-Speed 4446.06 samples/sec Loss 6.2935 Epoch: 5 Global Step: 90200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:56:24,546-Speed 4478.60 samples/sec Loss 6.3470 Epoch: 5 Global Step: 90250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:56:36,044-Speed 4453.04 samples/sec Loss 6.4084 Epoch: 5 Global Step: 90300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:56:47,287-Speed 4554.02 samples/sec Loss 6.3451 Epoch: 5 Global Step: 90350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:56:59,650-Speed 4141.80 samples/sec Loss 6.3694 Epoch: 5 Global Step: 90400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:57:11,898-Speed 4180.13 samples/sec Loss 6.3717 Epoch: 5 Global Step: 90450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:57:23,351-Speed 4470.95 samples/sec Loss 6.3750 Epoch: 5 Global Step: 90500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:57:34,768-Speed 4484.61 samples/sec Loss 6.3680 Epoch: 5 Global Step: 90550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:57:46,507-Speed 4361.63 samples/sec Loss 6.2752 Epoch: 5 Global Step: 90600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:57:58,660-Speed 4213.11 samples/sec Loss 6.3313 Epoch: 5 Global Step: 90650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:58:10,049-Speed 4495.94 samples/sec Loss 6.3189 Epoch: 5 Global Step: 90700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:58:21,507-Speed 4468.61 samples/sec Loss 6.3268 Epoch: 5 Global Step: 90750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:58:32,726-Speed 4564.10 samples/sec Loss 6.3482 Epoch: 5 Global Step: 90800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:58:44,006-Speed 4539.13 samples/sec Loss 6.2579 Epoch: 5 Global Step: 90850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:58:56,137-Speed 4220.84 samples/sec Loss 6.3056 Epoch: 5 Global Step: 90900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:59:07,646-Speed 4449.07 samples/sec Loss 6.3268 Epoch: 5 Global Step: 90950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:59:19,187-Speed 4436.79 samples/sec Loss 6.3885 Epoch: 5 Global Step: 91000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:59:31,437-Speed 4179.59 samples/sec Loss 6.3612 Epoch: 5 Global Step: 91050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:59:42,747-Speed 4527.06 samples/sec Loss 6.3490 Epoch: 5 Global Step: 91100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 04:59:54,318-Speed 4425.41 samples/sec Loss 6.3327 Epoch: 5 Global Step: 91150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:00:05,805-Speed 4457.34 samples/sec Loss 6.3526 Epoch: 5 Global Step: 91200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:00:17,082-Speed 4540.57 samples/sec Loss 6.3451 Epoch: 5 Global Step: 91250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:00:28,328-Speed 4553.00 samples/sec Loss 6.3324 Epoch: 5 Global Step: 91300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:00:39,833-Speed 4450.61 samples/sec Loss 6.3591 Epoch: 5 Global Step: 91350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:00:51,166-Speed 4517.83 samples/sec Loss 6.3288 Epoch: 5 Global Step: 91400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:01:02,515-Speed 4511.57 samples/sec Loss 6.3827 Epoch: 5 Global Step: 91450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:01:13,821-Speed 4528.88 samples/sec Loss 6.4170 Epoch: 5 Global Step: 91500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:01:25,385-Speed 4427.65 samples/sec Loss 6.4177 Epoch: 5 Global Step: 91550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:01:37,490-Speed 4230.15 samples/sec Loss 6.3834 Epoch: 5 Global Step: 91600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:01:48,845-Speed 4509.12 samples/sec Loss 6.4036 Epoch: 5 Global Step: 91650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:02:00,163-Speed 4524.16 samples/sec Loss 6.3496 Epoch: 5 Global Step: 91700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:02:12,386-Speed 4188.91 samples/sec Loss 6.3966 Epoch: 5 Global Step: 91750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:02:23,699-Speed 4526.02 samples/sec Loss 6.3144 Epoch: 5 Global Step: 91800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:02:35,348-Speed 4395.40 samples/sec Loss 6.3104 Epoch: 5 Global Step: 91850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:02:46,795-Speed 4473.03 samples/sec Loss 6.2964 Epoch: 5 Global Step: 91900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:02:58,063-Speed 4543.97 samples/sec Loss 6.3581 Epoch: 5 Global Step: 91950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:03:09,398-Speed 4517.36 samples/sec Loss 6.3397 Epoch: 5 Global Step: 92000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:03:39,934-[lfw][92000]XNorm: 21.433957 Training: 2021-03-15 05:03:39,934-[lfw][92000]Accuracy-Flip: 0.99517+-0.00217 Training: 2021-03-15 05:03:39,934-[lfw][92000]Accuracy-Highest: 0.99683 Training: 2021-03-15 05:04:15,248-[cfp_fp][92000]XNorm: 17.290130 Training: 2021-03-15 05:04:15,248-[cfp_fp][92000]Accuracy-Flip: 0.93757+-0.01060 Training: 2021-03-15 05:04:15,248-[cfp_fp][92000]Accuracy-Highest: 0.93971 Training: 2021-03-15 05:04:45,474-[agedb_30][92000]XNorm: 20.034763 Training: 2021-03-15 05:04:45,474-[agedb_30][92000]Accuracy-Flip: 0.95517+-0.00724 Training: 2021-03-15 05:04:45,475-[agedb_30][92000]Accuracy-Highest: 0.96083 Training: 2021-03-15 05:04:57,612-Speed 473.14 samples/sec Loss 6.3925 Epoch: 5 Global Step: 92050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:05:09,171-Speed 4429.48 samples/sec Loss 6.4049 Epoch: 5 Global Step: 92100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:05:20,707-Speed 4438.69 samples/sec Loss 6.3638 Epoch: 5 Global Step: 92150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:05:32,276-Speed 4425.92 samples/sec Loss 6.3543 Epoch: 5 Global Step: 92200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:05:43,534-Speed 4548.23 samples/sec Loss 6.3566 Epoch: 5 Global Step: 92250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:05:54,841-Speed 4528.44 samples/sec Loss 6.3531 Epoch: 5 Global Step: 92300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:06:06,297-Speed 4469.37 samples/sec Loss 6.3211 Epoch: 5 Global Step: 92350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:06:17,975-Speed 4384.55 samples/sec Loss 6.3576 Epoch: 5 Global Step: 92400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:06:29,444-Speed 4464.19 samples/sec Loss 6.3854 Epoch: 5 Global Step: 92450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:06:40,885-Speed 4475.50 samples/sec Loss 6.4044 Epoch: 5 Global Step: 92500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:06:52,074-Speed 4576.15 samples/sec Loss 6.3450 Epoch: 5 Global Step: 92550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:07:03,440-Speed 4504.80 samples/sec Loss 6.3391 Epoch: 5 Global Step: 92600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:07:14,904-Speed 4466.56 samples/sec Loss 6.3567 Epoch: 5 Global Step: 92650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:07:26,123-Speed 4564.10 samples/sec Loss 6.3741 Epoch: 5 Global Step: 92700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:07:37,644-Speed 4444.18 samples/sec Loss 6.3530 Epoch: 5 Global Step: 92750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:07:49,174-Speed 4440.82 samples/sec Loss 6.4170 Epoch: 5 Global Step: 92800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:08:00,565-Speed 4494.84 samples/sec Loss 6.3513 Epoch: 5 Global Step: 92850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:08:12,916-Speed 4145.58 samples/sec Loss 6.2963 Epoch: 5 Global Step: 92900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:08:25,180-Speed 4175.01 samples/sec Loss 6.3422 Epoch: 5 Global Step: 92950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:08:36,518-Speed 4516.34 samples/sec Loss 6.3470 Epoch: 5 Global Step: 93000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:08:47,866-Speed 4512.14 samples/sec Loss 6.3636 Epoch: 5 Global Step: 93050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:08:59,404-Speed 4437.73 samples/sec Loss 6.3076 Epoch: 5 Global Step: 93100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:09:10,682-Speed 4540.03 samples/sec Loss 6.4177 Epoch: 5 Global Step: 93150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:09:22,053-Speed 4502.83 samples/sec Loss 6.3961 Epoch: 5 Global Step: 93200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:09:33,445-Speed 4494.71 samples/sec Loss 6.3760 Epoch: 5 Global Step: 93250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:09:44,855-Speed 4487.22 samples/sec Loss 6.3851 Epoch: 5 Global Step: 93300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:09:56,299-Speed 4474.19 samples/sec Loss 6.3329 Epoch: 5 Global Step: 93350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:10:08,534-Speed 4185.06 samples/sec Loss 6.3982 Epoch: 5 Global Step: 93400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:10:20,696-Speed 4210.08 samples/sec Loss 6.3691 Epoch: 5 Global Step: 93450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:10:32,040-Speed 4513.67 samples/sec Loss 6.3392 Epoch: 5 Global Step: 93500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:10:43,542-Speed 4451.71 samples/sec Loss 6.3612 Epoch: 5 Global Step: 93550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:10:55,615-Speed 4241.11 samples/sec Loss 6.3650 Epoch: 5 Global Step: 93600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:11:06,965-Speed 4511.03 samples/sec Loss 6.3830 Epoch: 5 Global Step: 93650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:11:18,372-Speed 4489.05 samples/sec Loss 6.3016 Epoch: 5 Global Step: 93700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:11:29,895-Speed 4443.30 samples/sec Loss 6.3959 Epoch: 5 Global Step: 93750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:11:41,515-Speed 4406.78 samples/sec Loss 6.3618 Epoch: 5 Global Step: 93800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:11:52,979-Speed 4466.41 samples/sec Loss 6.3774 Epoch: 5 Global Step: 93850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:12:04,374-Speed 4493.26 samples/sec Loss 6.3915 Epoch: 5 Global Step: 93900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:12:15,574-Speed 4571.36 samples/sec Loss 6.4189 Epoch: 5 Global Step: 93950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:12:27,897-Speed 4155.12 samples/sec Loss 6.3706 Epoch: 5 Global Step: 94000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:12:58,294-[lfw][94000]XNorm: 21.448588 Training: 2021-03-15 05:12:58,294-[lfw][94000]Accuracy-Flip: 0.99533+-0.00287 Training: 2021-03-15 05:12:58,294-[lfw][94000]Accuracy-Highest: 0.99683 Training: 2021-03-15 05:13:33,470-[cfp_fp][94000]XNorm: 17.957500 Training: 2021-03-15 05:13:33,470-[cfp_fp][94000]Accuracy-Flip: 0.93571+-0.01262 Training: 2021-03-15 05:13:33,470-[cfp_fp][94000]Accuracy-Highest: 0.93971 Training: 2021-03-15 05:14:03,478-[agedb_30][94000]XNorm: 21.081120 Training: 2021-03-15 05:14:03,478-[agedb_30][94000]Accuracy-Flip: 0.95200+-0.00888 Training: 2021-03-15 05:14:03,478-[agedb_30][94000]Accuracy-Highest: 0.96083 Training: 2021-03-15 05:14:14,773-Speed 479.07 samples/sec Loss 6.3613 Epoch: 5 Global Step: 94050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:14:26,018-Speed 4553.23 samples/sec Loss 6.3293 Epoch: 5 Global Step: 94100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:14:37,362-Speed 4513.47 samples/sec Loss 6.3561 Epoch: 5 Global Step: 94150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:14:48,820-Speed 4468.84 samples/sec Loss 6.4380 Epoch: 5 Global Step: 94200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:15:00,930-Speed 4228.17 samples/sec Loss 6.3709 Epoch: 5 Global Step: 94250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:15:12,423-Speed 4455.17 samples/sec Loss 6.3366 Epoch: 5 Global Step: 94300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:15:23,831-Speed 4488.19 samples/sec Loss 6.3620 Epoch: 5 Global Step: 94350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:15:35,268-Speed 4476.81 samples/sec Loss 6.3791 Epoch: 5 Global Step: 94400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:15:46,773-Speed 4450.68 samples/sec Loss 6.3833 Epoch: 5 Global Step: 94450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:15:58,169-Speed 4492.72 samples/sec Loss 6.3279 Epoch: 5 Global Step: 94500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:16:09,596-Speed 4481.11 samples/sec Loss 6.3606 Epoch: 5 Global Step: 94550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:16:21,806-Speed 4193.32 samples/sec Loss 6.3788 Epoch: 5 Global Step: 94600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:16:33,263-Speed 4469.38 samples/sec Loss 6.3605 Epoch: 5 Global Step: 94650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:16:44,614-Speed 4510.72 samples/sec Loss 6.3819 Epoch: 5 Global Step: 94700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:16:56,059-Speed 4473.63 samples/sec Loss 6.3329 Epoch: 5 Global Step: 94750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:17:07,447-Speed 4496.31 samples/sec Loss 6.3707 Epoch: 5 Global Step: 94800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:17:18,908-Speed 4467.87 samples/sec Loss 6.3924 Epoch: 5 Global Step: 94850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:17:30,424-Speed 4446.18 samples/sec Loss 6.3048 Epoch: 5 Global Step: 94900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:17:41,638-Speed 4565.94 samples/sec Loss 6.3470 Epoch: 5 Global Step: 94950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:17:52,951-Speed 4526.01 samples/sec Loss 6.2995 Epoch: 5 Global Step: 95000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:18:04,453-Speed 4451.50 samples/sec Loss 6.3656 Epoch: 5 Global Step: 95050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:18:15,783-Speed 4519.34 samples/sec Loss 6.4112 Epoch: 5 Global Step: 95100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:18:27,145-Speed 4506.37 samples/sec Loss 6.3612 Epoch: 5 Global Step: 95150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:18:38,522-Speed 4500.38 samples/sec Loss 6.3074 Epoch: 5 Global Step: 95200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:18:49,818-Speed 4533.16 samples/sec Loss 6.3591 Epoch: 5 Global Step: 95250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:19:01,171-Speed 4509.94 samples/sec Loss 6.3328 Epoch: 5 Global Step: 95300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:19:12,659-Speed 4457.12 samples/sec Loss 6.3476 Epoch: 5 Global Step: 95350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:19:24,007-Speed 4511.89 samples/sec Loss 6.3717 Epoch: 5 Global Step: 95400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:19:36,143-Speed 4219.25 samples/sec Loss 6.3226 Epoch: 5 Global Step: 95450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:19:48,242-Speed 4231.99 samples/sec Loss 6.3777 Epoch: 5 Global Step: 95500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:19:59,600-Speed 4508.07 samples/sec Loss 6.3901 Epoch: 5 Global Step: 95550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:20:11,080-Speed 4459.96 samples/sec Loss 6.3514 Epoch: 5 Global Step: 95600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:20:22,497-Speed 4485.06 samples/sec Loss 6.4248 Epoch: 5 Global Step: 95650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:20:33,756-Speed 4547.53 samples/sec Loss 6.3434 Epoch: 5 Global Step: 95700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:20:45,118-Speed 4506.37 samples/sec Loss 6.2860 Epoch: 5 Global Step: 95750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:20:56,569-Speed 4471.36 samples/sec Loss 6.3499 Epoch: 5 Global Step: 95800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:21:07,870-Speed 4530.81 samples/sec Loss 6.3185 Epoch: 5 Global Step: 95850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:21:19,239-Speed 4503.68 samples/sec Loss 6.3638 Epoch: 5 Global Step: 95900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:21:30,416-Speed 4581.45 samples/sec Loss 6.3656 Epoch: 5 Global Step: 95950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:21:41,733-Speed 4524.13 samples/sec Loss 6.3406 Epoch: 5 Global Step: 96000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:22:12,106-[lfw][96000]XNorm: 22.397084 Training: 2021-03-15 05:22:12,106-[lfw][96000]Accuracy-Flip: 0.99517+-0.00263 Training: 2021-03-15 05:22:12,106-[lfw][96000]Accuracy-Highest: 0.99683 Training: 2021-03-15 05:22:47,118-[cfp_fp][96000]XNorm: 18.672205 Training: 2021-03-15 05:22:47,118-[cfp_fp][96000]Accuracy-Flip: 0.93957+-0.00950 Training: 2021-03-15 05:22:47,118-[cfp_fp][96000]Accuracy-Highest: 0.93971 Training: 2021-03-15 05:23:17,307-[agedb_30][96000]XNorm: 21.602866 Training: 2021-03-15 05:23:17,307-[agedb_30][96000]Accuracy-Flip: 0.94983+-0.01127 Training: 2021-03-15 05:23:17,307-[agedb_30][96000]Accuracy-Highest: 0.96083 Training: 2021-03-15 05:23:29,562-Speed 474.83 samples/sec Loss 6.3512 Epoch: 5 Global Step: 96050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:23:41,867-Speed 4161.04 samples/sec Loss 6.3935 Epoch: 5 Global Step: 96100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:23:53,015-Speed 4592.68 samples/sec Loss 6.3448 Epoch: 5 Global Step: 96150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:24:05,522-Speed 4093.96 samples/sec Loss 6.3612 Epoch: 5 Global Step: 96200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:24:16,881-Speed 4507.81 samples/sec Loss 6.3617 Epoch: 5 Global Step: 96250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:24:28,360-Speed 4460.54 samples/sec Loss 6.3603 Epoch: 5 Global Step: 96300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:24:39,717-Speed 4508.51 samples/sec Loss 6.3751 Epoch: 5 Global Step: 96350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:24:51,173-Speed 4469.42 samples/sec Loss 6.3790 Epoch: 5 Global Step: 96400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:25:02,762-Speed 4418.18 samples/sec Loss 6.3265 Epoch: 5 Global Step: 96450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:25:14,046-Speed 4537.55 samples/sec Loss 6.3430 Epoch: 5 Global Step: 96500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:25:25,456-Speed 4487.69 samples/sec Loss 6.3785 Epoch: 5 Global Step: 96550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:25:37,638-Speed 4203.00 samples/sec Loss 6.3750 Epoch: 5 Global Step: 96600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:25:48,954-Speed 4524.92 samples/sec Loss 6.3810 Epoch: 5 Global Step: 96650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:26:00,389-Speed 4477.56 samples/sec Loss 6.3834 Epoch: 5 Global Step: 96700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:26:11,727-Speed 4516.44 samples/sec Loss 6.3274 Epoch: 5 Global Step: 96750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:26:23,811-Speed 4237.04 samples/sec Loss 6.3577 Epoch: 5 Global Step: 96800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:26:35,085-Speed 4541.63 samples/sec Loss 6.3375 Epoch: 5 Global Step: 96850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:26:46,551-Speed 4465.77 samples/sec Loss 6.4087 Epoch: 5 Global Step: 96900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:26:57,743-Speed 4575.04 samples/sec Loss 6.3218 Epoch: 5 Global Step: 96950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:27:08,939-Speed 4573.24 samples/sec Loss 6.3213 Epoch: 5 Global Step: 97000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:27:20,284-Speed 4512.93 samples/sec Loss 6.4250 Epoch: 5 Global Step: 97050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:27:32,445-Speed 4210.40 samples/sec Loss 6.3413 Epoch: 5 Global Step: 97100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:27:44,044-Speed 4414.44 samples/sec Loss 6.4275 Epoch: 5 Global Step: 97150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:27:55,394-Speed 4511.57 samples/sec Loss 6.4501 Epoch: 5 Global Step: 97200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:28:06,819-Speed 4481.64 samples/sec Loss 6.3536 Epoch: 5 Global Step: 97250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:28:18,333-Speed 4447.07 samples/sec Loss 6.3635 Epoch: 5 Global Step: 97300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:28:29,539-Speed 4569.41 samples/sec Loss 6.2962 Epoch: 5 Global Step: 97350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:28:40,859-Speed 4523.25 samples/sec Loss 6.3082 Epoch: 5 Global Step: 97400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:28:52,323-Speed 4466.37 samples/sec Loss 6.3238 Epoch: 5 Global Step: 97450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:29:03,689-Speed 4504.79 samples/sec Loss 6.3653 Epoch: 5 Global Step: 97500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:29:15,161-Speed 4463.32 samples/sec Loss 6.3172 Epoch: 5 Global Step: 97550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:29:26,447-Speed 4536.69 samples/sec Loss 6.3635 Epoch: 5 Global Step: 97600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:29:37,802-Speed 4509.38 samples/sec Loss 6.3462 Epoch: 5 Global Step: 97650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:29:49,258-Speed 4469.24 samples/sec Loss 6.3072 Epoch: 5 Global Step: 97700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:30:00,732-Speed 4462.49 samples/sec Loss 6.3406 Epoch: 5 Global Step: 97750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:30:12,090-Speed 4508.06 samples/sec Loss 6.4367 Epoch: 5 Global Step: 97800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:30:23,466-Speed 4501.20 samples/sec Loss 6.4156 Epoch: 5 Global Step: 97850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:30:35,075-Speed 4410.66 samples/sec Loss 6.3820 Epoch: 5 Global Step: 97900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:30:46,470-Speed 4493.18 samples/sec Loss 6.3325 Epoch: 5 Global Step: 97950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:30:59,406-Speed 3958.04 samples/sec Loss 6.3917 Epoch: 5 Global Step: 98000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:31:29,432-[lfw][98000]XNorm: 22.464468 Training: 2021-03-15 05:31:29,432-[lfw][98000]Accuracy-Flip: 0.99633+-0.00245 Training: 2021-03-15 05:31:29,433-[lfw][98000]Accuracy-Highest: 0.99683 Training: 2021-03-15 05:32:04,386-[cfp_fp][98000]XNorm: 18.897389 Training: 2021-03-15 05:32:04,386-[cfp_fp][98000]Accuracy-Flip: 0.93429+-0.01204 Training: 2021-03-15 05:32:04,386-[cfp_fp][98000]Accuracy-Highest: 0.93971 Training: 2021-03-15 05:32:34,559-[agedb_30][98000]XNorm: 21.562872 Training: 2021-03-15 05:32:34,560-[agedb_30][98000]Accuracy-Flip: 0.95083+-0.01184 Training: 2021-03-15 05:32:34,560-[agedb_30][98000]Accuracy-Highest: 0.96083 Training: 2021-03-15 05:32:45,963-Speed 480.50 samples/sec Loss 6.3227 Epoch: 5 Global Step: 98050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:32:57,632-Speed 4387.83 samples/sec Loss 6.3063 Epoch: 5 Global Step: 98100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:33:09,136-Speed 4450.62 samples/sec Loss 6.3781 Epoch: 5 Global Step: 98150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:33:20,504-Speed 4504.15 samples/sec Loss 6.3289 Epoch: 5 Global Step: 98200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:33:31,866-Speed 4506.53 samples/sec Loss 6.3214 Epoch: 5 Global Step: 98250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:33:43,125-Speed 4547.71 samples/sec Loss 6.3506 Epoch: 5 Global Step: 98300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:33:54,465-Speed 4515.32 samples/sec Loss 6.3128 Epoch: 5 Global Step: 98350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:34:05,849-Speed 4497.94 samples/sec Loss 6.3277 Epoch: 5 Global Step: 98400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-15 05:34:17,285-Speed 4477.32 samples/sec Loss 6.3303 Epoch: 5 Global Step: 98450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:34:28,706-Speed 4482.82 samples/sec Loss 6.3435 Epoch: 5 Global Step: 98500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:34:40,984-Speed 4170.50 samples/sec Loss 6.3563 Epoch: 5 Global Step: 98550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:34:52,315-Speed 4518.73 samples/sec Loss 6.3793 Epoch: 5 Global Step: 98600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:35:04,416-Speed 4231.14 samples/sec Loss 6.3319 Epoch: 5 Global Step: 98650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:35:15,797-Speed 4498.95 samples/sec Loss 6.3968 Epoch: 5 Global Step: 98700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:35:27,117-Speed 4523.13 samples/sec Loss 6.3239 Epoch: 5 Global Step: 98750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:35:38,573-Speed 4469.68 samples/sec Loss 6.3292 Epoch: 5 Global Step: 98800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:35:49,849-Speed 4540.81 samples/sec Loss 6.3906 Epoch: 5 Global Step: 98850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:36:02,001-Speed 4213.71 samples/sec Loss 6.3587 Epoch: 5 Global Step: 98900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:36:13,290-Speed 4535.58 samples/sec Loss 6.3576 Epoch: 5 Global Step: 98950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:36:24,849-Speed 4429.59 samples/sec Loss 6.4100 Epoch: 5 Global Step: 99000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:36:36,343-Speed 4454.64 samples/sec Loss 6.3751 Epoch: 5 Global Step: 99050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:36:48,663-Speed 4156.08 samples/sec Loss 6.3735 Epoch: 5 Global Step: 99100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:36:59,970-Speed 4528.37 samples/sec Loss 6.3633 Epoch: 5 Global Step: 99150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:37:11,481-Speed 4448.02 samples/sec Loss 6.3307 Epoch: 5 Global Step: 99200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:37:22,792-Speed 4526.95 samples/sec Loss 6.3230 Epoch: 5 Global Step: 99250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:37:34,962-Speed 4207.26 samples/sec Loss 6.3812 Epoch: 5 Global Step: 99300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:37:46,185-Speed 4562.26 samples/sec Loss 6.2981 Epoch: 5 Global Step: 99350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:37:57,416-Speed 4558.89 samples/sec Loss 6.3546 Epoch: 5 Global Step: 99400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:38:08,868-Speed 4471.01 samples/sec Loss 6.3397 Epoch: 5 Global Step: 99450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:38:20,091-Speed 4562.55 samples/sec Loss 6.3357 Epoch: 5 Global Step: 99500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:38:31,547-Speed 4469.62 samples/sec Loss 6.3618 Epoch: 5 Global Step: 99550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:38:43,095-Speed 4433.65 samples/sec Loss 6.3254 Epoch: 5 Global Step: 99600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:38:55,223-Speed 4222.04 samples/sec Loss 6.3434 Epoch: 5 Global Step: 99650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:39:06,437-Speed 4565.86 samples/sec Loss 6.3304 Epoch: 5 Global Step: 99700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:39:17,858-Speed 4483.16 samples/sec Loss 6.3641 Epoch: 5 Global Step: 99750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:39:29,068-Speed 4567.43 samples/sec Loss 6.3207 Epoch: 5 Global Step: 99800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:39:40,687-Speed 4406.99 samples/sec Loss 6.3496 Epoch: 5 Global Step: 99850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:39:51,938-Speed 4551.03 samples/sec Loss 6.4110 Epoch: 5 Global Step: 99900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:40:03,313-Speed 4501.21 samples/sec Loss 6.3213 Epoch: 5 Global Step: 99950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:40:14,593-Speed 4539.28 samples/sec Loss 6.3230 Epoch: 5 Global Step: 100000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:40:44,799-[lfw][100000]XNorm: 23.834990 Training: 2021-03-15 05:40:44,800-[lfw][100000]Accuracy-Flip: 0.99517+-0.00329 Training: 2021-03-15 05:40:44,800-[lfw][100000]Accuracy-Highest: 0.99683 Training: 2021-03-15 05:41:19,762-[cfp_fp][100000]XNorm: 20.201047 Training: 2021-03-15 05:41:19,763-[cfp_fp][100000]Accuracy-Flip: 0.92957+-0.01129 Training: 2021-03-15 05:41:19,763-[cfp_fp][100000]Accuracy-Highest: 0.93971 Training: 2021-03-15 05:41:49,948-[agedb_30][100000]XNorm: 23.182780 Training: 2021-03-15 05:41:49,948-[agedb_30][100000]Accuracy-Flip: 0.95583+-0.00638 Training: 2021-03-15 05:41:49,948-[agedb_30][100000]Accuracy-Highest: 0.96083 Training: 2021-03-15 05:42:01,445-Speed 479.17 samples/sec Loss 6.3367 Epoch: 5 Global Step: 100050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:42:12,834-Speed 4495.53 samples/sec Loss 6.3845 Epoch: 5 Global Step: 100100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:42:37,424-Speed 2082.23 samples/sec Loss 6.3249 Epoch: 6 Global Step: 100150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:42:49,537-Speed 4227.40 samples/sec Loss 5.6827 Epoch: 6 Global Step: 100200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:43:01,160-Speed 4405.28 samples/sec Loss 5.6338 Epoch: 6 Global Step: 100250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:43:12,605-Speed 4473.96 samples/sec Loss 5.7001 Epoch: 6 Global Step: 100300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:43:23,990-Speed 4497.49 samples/sec Loss 5.7009 Epoch: 6 Global Step: 100350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:43:35,360-Speed 4503.27 samples/sec Loss 5.7255 Epoch: 6 Global Step: 100400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:43:46,851-Speed 4455.93 samples/sec Loss 5.7340 Epoch: 6 Global Step: 100450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:43:58,214-Speed 4506.02 samples/sec Loss 5.7643 Epoch: 6 Global Step: 100500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:44:11,333-Speed 3902.88 samples/sec Loss 5.8082 Epoch: 6 Global Step: 100550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:44:22,753-Speed 4483.90 samples/sec Loss 5.8410 Epoch: 6 Global Step: 100600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:44:34,178-Speed 4481.57 samples/sec Loss 5.8900 Epoch: 6 Global Step: 100650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:44:45,439-Speed 4547.06 samples/sec Loss 5.8767 Epoch: 6 Global Step: 100700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:44:56,954-Speed 4446.43 samples/sec Loss 5.8167 Epoch: 6 Global Step: 100750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:45:08,190-Speed 4557.30 samples/sec Loss 5.8808 Epoch: 6 Global Step: 100800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:45:19,636-Speed 4473.42 samples/sec Loss 5.8880 Epoch: 6 Global Step: 100850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:45:30,831-Speed 4573.62 samples/sec Loss 5.9092 Epoch: 6 Global Step: 100900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:45:42,168-Speed 4516.43 samples/sec Loss 5.9016 Epoch: 6 Global Step: 100950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:45:53,554-Speed 4497.26 samples/sec Loss 5.9225 Epoch: 6 Global Step: 101000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:46:04,829-Speed 4541.23 samples/sec Loss 5.9154 Epoch: 6 Global Step: 101050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:46:16,198-Speed 4503.72 samples/sec Loss 5.9210 Epoch: 6 Global Step: 101100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:46:28,376-Speed 4204.32 samples/sec Loss 5.9740 Epoch: 6 Global Step: 101150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:46:40,663-Speed 4167.54 samples/sec Loss 5.9339 Epoch: 6 Global Step: 101200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:46:52,036-Speed 4501.92 samples/sec Loss 5.9696 Epoch: 6 Global Step: 101250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:47:03,422-Speed 4497.04 samples/sec Loss 5.9958 Epoch: 6 Global Step: 101300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:47:14,723-Speed 4530.71 samples/sec Loss 5.9702 Epoch: 6 Global Step: 101350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:47:25,993-Speed 4543.54 samples/sec Loss 6.0170 Epoch: 6 Global Step: 101400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:47:37,553-Speed 4429.48 samples/sec Loss 5.9784 Epoch: 6 Global Step: 101450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:47:49,720-Speed 4208.16 samples/sec Loss 6.0375 Epoch: 6 Global Step: 101500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:48:01,205-Speed 4458.12 samples/sec Loss 5.9822 Epoch: 6 Global Step: 101550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:48:12,533-Speed 4520.11 samples/sec Loss 6.0325 Epoch: 6 Global Step: 101600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:48:24,643-Speed 4228.45 samples/sec Loss 6.0192 Epoch: 6 Global Step: 101650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:48:36,161-Speed 4445.22 samples/sec Loss 6.0152 Epoch: 6 Global Step: 101700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:48:47,481-Speed 4523.49 samples/sec Loss 6.0160 Epoch: 6 Global Step: 101750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:48:58,814-Speed 4517.65 samples/sec Loss 6.0270 Epoch: 6 Global Step: 101800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:49:10,947-Speed 4220.38 samples/sec Loss 6.0293 Epoch: 6 Global Step: 101850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:49:22,371-Speed 4481.71 samples/sec Loss 6.0437 Epoch: 6 Global Step: 101900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:49:33,503-Speed 4599.81 samples/sec Loss 6.1138 Epoch: 6 Global Step: 101950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:49:44,912-Speed 4488.07 samples/sec Loss 6.1040 Epoch: 6 Global Step: 102000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:50:15,225-[lfw][102000]XNorm: 22.824346 Training: 2021-03-15 05:50:15,225-[lfw][102000]Accuracy-Flip: 0.99567+-0.00396 Training: 2021-03-15 05:50:15,225-[lfw][102000]Accuracy-Highest: 0.99683 Training: 2021-03-15 05:50:50,420-[cfp_fp][102000]XNorm: 19.088704 Training: 2021-03-15 05:50:50,420-[cfp_fp][102000]Accuracy-Flip: 0.94043+-0.01026 Training: 2021-03-15 05:50:50,420-[cfp_fp][102000]Accuracy-Highest: 0.94043 Training: 2021-03-15 05:51:21,661-[agedb_30][102000]XNorm: 22.023752 Training: 2021-03-15 05:51:21,662-[agedb_30][102000]Accuracy-Flip: 0.95050+-0.01135 Training: 2021-03-15 05:51:21,662-[agedb_30][102000]Accuracy-Highest: 0.96083 Training: 2021-03-15 05:51:33,050-Speed 473.47 samples/sec Loss 6.1044 Epoch: 6 Global Step: 102050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:51:44,213-Speed 4586.57 samples/sec Loss 6.0974 Epoch: 6 Global Step: 102100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:51:55,494-Speed 4539.13 samples/sec Loss 6.0895 Epoch: 6 Global Step: 102150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:52:06,820-Speed 4520.76 samples/sec Loss 6.0980 Epoch: 6 Global Step: 102200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:52:18,989-Speed 4207.61 samples/sec Loss 6.0726 Epoch: 6 Global Step: 102250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:52:30,366-Speed 4500.30 samples/sec Loss 6.0562 Epoch: 6 Global Step: 102300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:52:41,521-Speed 4590.33 samples/sec Loss 6.1077 Epoch: 6 Global Step: 102350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:52:53,187-Speed 4388.99 samples/sec Loss 6.0953 Epoch: 6 Global Step: 102400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:53:04,643-Speed 4469.29 samples/sec Loss 6.0802 Epoch: 6 Global Step: 102450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:53:15,906-Speed 4546.15 samples/sec Loss 6.0852 Epoch: 6 Global Step: 102500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:53:27,315-Speed 4488.13 samples/sec Loss 6.1451 Epoch: 6 Global Step: 102550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:53:38,569-Speed 4549.60 samples/sec Loss 6.1352 Epoch: 6 Global Step: 102600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:53:50,029-Speed 4467.96 samples/sec Loss 6.0880 Epoch: 6 Global Step: 102650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:54:01,395-Speed 4505.09 samples/sec Loss 6.1341 Epoch: 6 Global Step: 102700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:54:12,651-Speed 4548.65 samples/sec Loss 6.0944 Epoch: 6 Global Step: 102750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:54:24,227-Speed 4423.34 samples/sec Loss 6.1357 Epoch: 6 Global Step: 102800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:54:35,647-Speed 4483.86 samples/sec Loss 6.1443 Epoch: 6 Global Step: 102850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:54:47,075-Speed 4480.15 samples/sec Loss 6.1414 Epoch: 6 Global Step: 102900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:54:58,520-Speed 4473.98 samples/sec Loss 6.1312 Epoch: 6 Global Step: 102950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:55:09,794-Speed 4541.89 samples/sec Loss 6.1258 Epoch: 6 Global Step: 103000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:55:21,941-Speed 4215.07 samples/sec Loss 6.2054 Epoch: 6 Global Step: 103050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:55:33,274-Speed 4518.10 samples/sec Loss 6.1183 Epoch: 6 Global Step: 103100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:55:45,516-Speed 4182.44 samples/sec Loss 6.1544 Epoch: 6 Global Step: 103150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:55:56,832-Speed 4524.97 samples/sec Loss 6.1639 Epoch: 6 Global Step: 103200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:56:08,240-Speed 4488.18 samples/sec Loss 6.1858 Epoch: 6 Global Step: 103250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:56:19,523-Speed 4537.78 samples/sec Loss 6.1870 Epoch: 6 Global Step: 103300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:56:30,854-Speed 4519.00 samples/sec Loss 6.1859 Epoch: 6 Global Step: 103350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:56:42,035-Speed 4579.47 samples/sec Loss 6.1880 Epoch: 6 Global Step: 103400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:56:53,269-Speed 4557.74 samples/sec Loss 6.1862 Epoch: 6 Global Step: 103450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:57:04,465-Speed 4573.26 samples/sec Loss 6.1192 Epoch: 6 Global Step: 103500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:57:15,808-Speed 4513.99 samples/sec Loss 6.1515 Epoch: 6 Global Step: 103550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:57:27,246-Speed 4476.89 samples/sec Loss 6.2564 Epoch: 6 Global Step: 103600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:57:39,603-Speed 4143.84 samples/sec Loss 6.2054 Epoch: 6 Global Step: 103650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:57:51,758-Speed 4212.48 samples/sec Loss 6.2099 Epoch: 6 Global Step: 103700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:58:03,007-Speed 4551.47 samples/sec Loss 6.2596 Epoch: 6 Global Step: 103750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:58:14,590-Speed 4420.44 samples/sec Loss 6.2247 Epoch: 6 Global Step: 103800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:58:25,991-Speed 4491.31 samples/sec Loss 6.2001 Epoch: 6 Global Step: 103850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:58:37,349-Speed 4507.82 samples/sec Loss 6.2448 Epoch: 6 Global Step: 103900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:58:48,510-Speed 4587.95 samples/sec Loss 6.1904 Epoch: 6 Global Step: 103950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:58:59,770-Speed 4547.03 samples/sec Loss 6.2495 Epoch: 6 Global Step: 104000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 05:59:29,987-[lfw][104000]XNorm: 21.877295 Training: 2021-03-15 05:59:29,987-[lfw][104000]Accuracy-Flip: 0.99500+-0.00269 Training: 2021-03-15 05:59:29,987-[lfw][104000]Accuracy-Highest: 0.99683 Training: 2021-03-15 06:00:05,140-[cfp_fp][104000]XNorm: 18.594539 Training: 2021-03-15 06:00:05,140-[cfp_fp][104000]Accuracy-Flip: 0.93829+-0.01076 Training: 2021-03-15 06:00:05,140-[cfp_fp][104000]Accuracy-Highest: 0.94043 Training: 2021-03-15 06:00:35,536-[agedb_30][104000]XNorm: 21.228340 Training: 2021-03-15 06:00:35,537-[agedb_30][104000]Accuracy-Flip: 0.95833+-0.00955 Training: 2021-03-15 06:00:35,537-[agedb_30][104000]Accuracy-Highest: 0.96083 Training: 2021-03-15 06:00:47,063-Speed 477.20 samples/sec Loss 6.1871 Epoch: 6 Global Step: 104050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:00:59,383-Speed 4155.99 samples/sec Loss 6.2184 Epoch: 6 Global Step: 104100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:01:10,580-Speed 4573.06 samples/sec Loss 6.2323 Epoch: 6 Global Step: 104150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:01:22,635-Speed 4247.21 samples/sec Loss 6.1853 Epoch: 6 Global Step: 104200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:01:34,284-Speed 4395.28 samples/sec Loss 6.2411 Epoch: 6 Global Step: 104250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:01:45,738-Speed 4470.39 samples/sec Loss 6.1758 Epoch: 6 Global Step: 104300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:01:57,183-Speed 4473.67 samples/sec Loss 6.2733 Epoch: 6 Global Step: 104350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:02:08,397-Speed 4566.03 samples/sec Loss 6.2745 Epoch: 6 Global Step: 104400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:02:19,865-Speed 4465.02 samples/sec Loss 6.2361 Epoch: 6 Global Step: 104450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:02:32,085-Speed 4189.92 samples/sec Loss 6.2110 Epoch: 6 Global Step: 104500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:02:43,296-Speed 4567.29 samples/sec Loss 6.2201 Epoch: 6 Global Step: 104550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:02:54,666-Speed 4503.45 samples/sec Loss 6.2328 Epoch: 6 Global Step: 104600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:03:06,035-Speed 4503.82 samples/sec Loss 6.2355 Epoch: 6 Global Step: 104650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:03:17,388-Speed 4509.97 samples/sec Loss 6.2426 Epoch: 6 Global Step: 104700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:03:29,613-Speed 4188.24 samples/sec Loss 6.2434 Epoch: 6 Global Step: 104750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:03:41,008-Speed 4493.77 samples/sec Loss 6.2394 Epoch: 6 Global Step: 104800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:03:52,174-Speed 4585.41 samples/sec Loss 6.2065 Epoch: 6 Global Step: 104850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:04:03,533-Speed 4507.77 samples/sec Loss 6.2228 Epoch: 6 Global Step: 104900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:04:15,014-Speed 4459.86 samples/sec Loss 6.2464 Epoch: 6 Global Step: 104950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:04:26,364-Speed 4511.15 samples/sec Loss 6.3024 Epoch: 6 Global Step: 105000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:04:37,947-Speed 4420.58 samples/sec Loss 6.2449 Epoch: 6 Global Step: 105050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:04:49,530-Speed 4420.54 samples/sec Loss 6.2849 Epoch: 6 Global Step: 105100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:05:00,960-Speed 4479.64 samples/sec Loss 6.2372 Epoch: 6 Global Step: 105150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:05:12,262-Speed 4530.43 samples/sec Loss 6.3146 Epoch: 6 Global Step: 105200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:05:23,629-Speed 4504.31 samples/sec Loss 6.2504 Epoch: 6 Global Step: 105250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:05:35,080-Speed 4471.57 samples/sec Loss 6.3137 Epoch: 6 Global Step: 105300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:05:46,580-Speed 4452.45 samples/sec Loss 6.2413 Epoch: 6 Global Step: 105350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:05:57,879-Speed 4531.82 samples/sec Loss 6.2566 Epoch: 6 Global Step: 105400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:06:09,535-Speed 4392.67 samples/sec Loss 6.2328 Epoch: 6 Global Step: 105450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:06:20,871-Speed 4516.85 samples/sec Loss 6.2858 Epoch: 6 Global Step: 105500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:06:32,336-Speed 4465.82 samples/sec Loss 6.3253 Epoch: 6 Global Step: 105550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:06:44,652-Speed 4157.58 samples/sec Loss 6.2276 Epoch: 6 Global Step: 105600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:06:56,040-Speed 4495.97 samples/sec Loss 6.2157 Epoch: 6 Global Step: 105650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:07:07,351-Speed 4526.95 samples/sec Loss 6.2791 Epoch: 6 Global Step: 105700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:07:19,682-Speed 4152.37 samples/sec Loss 6.2556 Epoch: 6 Global Step: 105750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:07:31,146-Speed 4466.48 samples/sec Loss 6.2630 Epoch: 6 Global Step: 105800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:07:42,574-Speed 4480.49 samples/sec Loss 6.3174 Epoch: 6 Global Step: 105850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:07:53,930-Speed 4508.60 samples/sec Loss 6.2991 Epoch: 6 Global Step: 105900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:08:05,123-Speed 4574.51 samples/sec Loss 6.2948 Epoch: 6 Global Step: 105950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:08:16,582-Speed 4468.21 samples/sec Loss 6.3375 Epoch: 6 Global Step: 106000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:08:47,021-[lfw][106000]XNorm: 22.699439 Training: 2021-03-15 06:08:47,022-[lfw][106000]Accuracy-Flip: 0.99567+-0.00260 Training: 2021-03-15 06:08:47,022-[lfw][106000]Accuracy-Highest: 0.99683 Training: 2021-03-15 06:09:22,335-[cfp_fp][106000]XNorm: 18.955659 Training: 2021-03-15 06:09:22,335-[cfp_fp][106000]Accuracy-Flip: 0.94129+-0.01078 Training: 2021-03-15 06:09:22,335-[cfp_fp][106000]Accuracy-Highest: 0.94129 Training: 2021-03-15 06:09:52,772-[agedb_30][106000]XNorm: 21.613513 Training: 2021-03-15 06:09:52,772-[agedb_30][106000]Accuracy-Flip: 0.95217+-0.01274 Training: 2021-03-15 06:09:52,772-[agedb_30][106000]Accuracy-Highest: 0.96083 Training: 2021-03-15 06:10:04,000-Speed 476.65 samples/sec Loss 6.3073 Epoch: 6 Global Step: 106050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:10:15,129-Speed 4601.03 samples/sec Loss 6.2278 Epoch: 6 Global Step: 106100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:10:26,362-Speed 4558.18 samples/sec Loss 6.2568 Epoch: 6 Global Step: 106150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:10:38,586-Speed 4188.60 samples/sec Loss 6.2844 Epoch: 6 Global Step: 106200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:10:50,748-Speed 4210.35 samples/sec Loss 6.3288 Epoch: 6 Global Step: 106250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:11:02,227-Speed 4460.56 samples/sec Loss 6.3172 Epoch: 6 Global Step: 106300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:11:13,786-Speed 4429.60 samples/sec Loss 6.2896 Epoch: 6 Global Step: 106350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:11:25,107-Speed 4522.97 samples/sec Loss 6.3174 Epoch: 6 Global Step: 106400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:11:36,573-Speed 4465.69 samples/sec Loss 6.2929 Epoch: 6 Global Step: 106450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:11:48,025-Speed 4471.00 samples/sec Loss 6.2817 Epoch: 6 Global Step: 106500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:11:59,541-Speed 4446.14 samples/sec Loss 6.3193 Epoch: 6 Global Step: 106550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:12:10,898-Speed 4508.63 samples/sec Loss 6.2579 Epoch: 6 Global Step: 106600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:12:22,169-Speed 4542.59 samples/sec Loss 6.3332 Epoch: 6 Global Step: 106650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:12:34,290-Speed 4224.37 samples/sec Loss 6.2595 Epoch: 6 Global Step: 106700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:12:45,787-Speed 4453.85 samples/sec Loss 6.3506 Epoch: 6 Global Step: 106750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:12:57,231-Speed 4474.29 samples/sec Loss 6.2880 Epoch: 6 Global Step: 106800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:13:09,319-Speed 4235.68 samples/sec Loss 6.2904 Epoch: 6 Global Step: 106850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:13:20,843-Speed 4443.26 samples/sec Loss 6.2324 Epoch: 6 Global Step: 106900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:13:32,193-Speed 4511.29 samples/sec Loss 6.2931 Epoch: 6 Global Step: 106950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:13:43,486-Speed 4534.15 samples/sec Loss 6.2956 Epoch: 6 Global Step: 107000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:13:54,824-Speed 4516.04 samples/sec Loss 6.2699 Epoch: 6 Global Step: 107050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:14:07,161-Speed 4150.17 samples/sec Loss 6.2886 Epoch: 6 Global Step: 107100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:14:18,541-Speed 4499.57 samples/sec Loss 6.2950 Epoch: 6 Global Step: 107150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:14:30,112-Speed 4425.00 samples/sec Loss 6.3209 Epoch: 6 Global Step: 107200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:14:42,503-Speed 4132.28 samples/sec Loss 6.3171 Epoch: 6 Global Step: 107250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:14:53,873-Speed 4503.20 samples/sec Loss 6.2856 Epoch: 6 Global Step: 107300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:15:05,098-Speed 4561.25 samples/sec Loss 6.3504 Epoch: 6 Global Step: 107350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:15:16,378-Speed 4539.30 samples/sec Loss 6.3208 Epoch: 6 Global Step: 107400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:15:27,761-Speed 4498.10 samples/sec Loss 6.3363 Epoch: 6 Global Step: 107450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:15:39,113-Speed 4510.43 samples/sec Loss 6.3333 Epoch: 6 Global Step: 107500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:15:50,512-Speed 4492.14 samples/sec Loss 6.2867 Epoch: 6 Global Step: 107550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:16:01,873-Speed 4506.68 samples/sec Loss 6.3252 Epoch: 6 Global Step: 107600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:16:13,191-Speed 4524.28 samples/sec Loss 6.2755 Epoch: 6 Global Step: 107650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:16:24,721-Speed 4440.56 samples/sec Loss 6.3078 Epoch: 6 Global Step: 107700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:16:35,988-Speed 4544.71 samples/sec Loss 6.2943 Epoch: 6 Global Step: 107750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:16:47,394-Speed 4489.14 samples/sec Loss 6.3030 Epoch: 6 Global Step: 107800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:16:59,186-Speed 4342.02 samples/sec Loss 6.3199 Epoch: 6 Global Step: 107850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:17:10,719-Speed 4439.59 samples/sec Loss 6.2990 Epoch: 6 Global Step: 107900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:17:22,048-Speed 4519.84 samples/sec Loss 6.3256 Epoch: 6 Global Step: 107950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:17:33,386-Speed 4515.94 samples/sec Loss 6.2281 Epoch: 6 Global Step: 108000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:18:03,361-[lfw][108000]XNorm: 21.888654 Training: 2021-03-15 06:18:03,361-[lfw][108000]Accuracy-Flip: 0.99567+-0.00343 Training: 2021-03-15 06:18:03,361-[lfw][108000]Accuracy-Highest: 0.99683 Training: 2021-03-15 06:18:38,358-[cfp_fp][108000]XNorm: 18.382821 Training: 2021-03-15 06:18:38,358-[cfp_fp][108000]Accuracy-Flip: 0.94000+-0.00992 Training: 2021-03-15 06:18:38,358-[cfp_fp][108000]Accuracy-Highest: 0.94129 Training: 2021-03-15 06:19:08,574-[agedb_30][108000]XNorm: 21.013366 Training: 2021-03-15 06:19:08,574-[agedb_30][108000]Accuracy-Flip: 0.95633+-0.00865 Training: 2021-03-15 06:19:08,574-[agedb_30][108000]Accuracy-Highest: 0.96083 Training: 2021-03-15 06:19:19,797-Speed 481.15 samples/sec Loss 6.2899 Epoch: 6 Global Step: 108050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:19:31,341-Speed 4435.60 samples/sec Loss 6.3223 Epoch: 6 Global Step: 108100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:19:43,492-Speed 4213.82 samples/sec Loss 6.3495 Epoch: 6 Global Step: 108150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:19:54,822-Speed 4519.31 samples/sec Loss 6.2759 Epoch: 6 Global Step: 108200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:20:07,025-Speed 4196.00 samples/sec Loss 6.3086 Epoch: 6 Global Step: 108250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:20:18,477-Speed 4471.29 samples/sec Loss 6.2936 Epoch: 6 Global Step: 108300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:20:29,772-Speed 4533.13 samples/sec Loss 6.3056 Epoch: 6 Global Step: 108350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:20:41,280-Speed 4449.58 samples/sec Loss 6.3114 Epoch: 6 Global Step: 108400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:20:52,588-Speed 4527.78 samples/sec Loss 6.3408 Epoch: 6 Global Step: 108450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:21:04,118-Speed 4440.79 samples/sec Loss 6.2899 Epoch: 6 Global Step: 108500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:21:15,490-Speed 4502.48 samples/sec Loss 6.3274 Epoch: 6 Global Step: 108550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:21:27,024-Speed 4439.52 samples/sec Loss 6.3112 Epoch: 6 Global Step: 108600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:21:38,405-Speed 4498.90 samples/sec Loss 6.2563 Epoch: 6 Global Step: 108650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:21:49,702-Speed 4532.24 samples/sec Loss 6.2471 Epoch: 6 Global Step: 108700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:22:02,891-Speed 3882.41 samples/sec Loss 6.2320 Epoch: 6 Global Step: 108750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:22:14,291-Speed 4491.40 samples/sec Loss 6.3270 Epoch: 6 Global Step: 108800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:22:25,603-Speed 4526.26 samples/sec Loss 6.3129 Epoch: 6 Global Step: 108850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:22:36,899-Speed 4532.89 samples/sec Loss 6.2927 Epoch: 6 Global Step: 108900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:22:48,383-Speed 4458.82 samples/sec Loss 6.3373 Epoch: 6 Global Step: 108950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:22:59,759-Speed 4500.69 samples/sec Loss 6.2944 Epoch: 6 Global Step: 109000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:23:11,175-Speed 4485.33 samples/sec Loss 6.3529 Epoch: 6 Global Step: 109050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:23:23,367-Speed 4199.63 samples/sec Loss 6.2788 Epoch: 6 Global Step: 109100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:23:34,566-Speed 4572.09 samples/sec Loss 6.2914 Epoch: 6 Global Step: 109150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:23:45,766-Speed 4571.82 samples/sec Loss 6.3166 Epoch: 6 Global Step: 109200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:23:57,100-Speed 4517.43 samples/sec Loss 6.3272 Epoch: 6 Global Step: 109250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:24:08,544-Speed 4474.06 samples/sec Loss 6.4029 Epoch: 6 Global Step: 109300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:24:20,084-Speed 4437.10 samples/sec Loss 6.2829 Epoch: 6 Global Step: 109350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:24:31,497-Speed 4486.11 samples/sec Loss 6.3506 Epoch: 6 Global Step: 109400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:24:43,904-Speed 4127.00 samples/sec Loss 6.2522 Epoch: 6 Global Step: 109450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:24:55,296-Speed 4494.35 samples/sec Loss 6.2797 Epoch: 6 Global Step: 109500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:25:06,530-Speed 4558.00 samples/sec Loss 6.3480 Epoch: 6 Global Step: 109550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:25:18,840-Speed 4159.52 samples/sec Loss 6.2764 Epoch: 6 Global Step: 109600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:25:30,206-Speed 4504.66 samples/sec Loss 6.3594 Epoch: 6 Global Step: 109650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:25:41,639-Speed 4478.45 samples/sec Loss 6.3057 Epoch: 6 Global Step: 109700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:25:53,080-Speed 4475.61 samples/sec Loss 6.3029 Epoch: 6 Global Step: 109750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:26:05,248-Speed 4207.67 samples/sec Loss 6.3141 Epoch: 6 Global Step: 109800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:26:16,504-Speed 4549.14 samples/sec Loss 6.2959 Epoch: 6 Global Step: 109850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:26:27,922-Speed 4484.43 samples/sec Loss 6.2778 Epoch: 6 Global Step: 109900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:26:39,374-Speed 4471.19 samples/sec Loss 6.3167 Epoch: 6 Global Step: 109950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:26:50,768-Speed 4493.63 samples/sec Loss 6.3388 Epoch: 6 Global Step: 110000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:27:21,155-[lfw][110000]XNorm: 23.370308 Training: 2021-03-15 06:27:21,155-[lfw][110000]Accuracy-Flip: 0.99667+-0.00316 Training: 2021-03-15 06:27:21,155-[lfw][110000]Accuracy-Highest: 0.99683 Training: 2021-03-15 06:27:56,378-[cfp_fp][110000]XNorm: 19.410597 Training: 2021-03-15 06:27:56,378-[cfp_fp][110000]Accuracy-Flip: 0.94000+-0.01441 Training: 2021-03-15 06:27:56,379-[cfp_fp][110000]Accuracy-Highest: 0.94129 Training: 2021-03-15 06:28:26,789-[agedb_30][110000]XNorm: 22.538324 Training: 2021-03-15 06:28:26,789-[agedb_30][110000]Accuracy-Flip: 0.95783+-0.01216 Training: 2021-03-15 06:28:26,789-[agedb_30][110000]Accuracy-Highest: 0.96083 Training: 2021-03-15 06:28:38,015-Speed 477.41 samples/sec Loss 6.3176 Epoch: 6 Global Step: 110050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:28:49,381-Speed 4504.91 samples/sec Loss 6.3191 Epoch: 6 Global Step: 110100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:29:00,740-Speed 4507.77 samples/sec Loss 6.3486 Epoch: 6 Global Step: 110150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:29:12,329-Speed 4418.43 samples/sec Loss 6.2575 Epoch: 6 Global Step: 110200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:29:23,785-Speed 4469.41 samples/sec Loss 6.3307 Epoch: 6 Global Step: 110250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:29:35,035-Speed 4551.58 samples/sec Loss 6.2846 Epoch: 6 Global Step: 110300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:29:46,490-Speed 4469.94 samples/sec Loss 6.3020 Epoch: 6 Global Step: 110350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:29:57,945-Speed 4469.55 samples/sec Loss 6.3314 Epoch: 6 Global Step: 110400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:30:09,192-Speed 4552.85 samples/sec Loss 6.3764 Epoch: 6 Global Step: 110450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:30:20,416-Speed 4561.53 samples/sec Loss 6.2573 Epoch: 6 Global Step: 110500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:30:31,803-Speed 4496.92 samples/sec Loss 6.3371 Epoch: 6 Global Step: 110550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:30:43,309-Speed 4450.06 samples/sec Loss 6.3144 Epoch: 6 Global Step: 110600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:30:55,575-Speed 4174.10 samples/sec Loss 6.3349 Epoch: 6 Global Step: 110650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:31:06,738-Speed 4587.14 samples/sec Loss 6.2606 Epoch: 6 Global Step: 110700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-15 06:31:19,102-Speed 4141.17 samples/sec Loss 6.2703 Epoch: 6 Global Step: 110750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:31:30,261-Speed 4588.52 samples/sec Loss 6.2864 Epoch: 6 Global Step: 110800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:31:41,654-Speed 4494.21 samples/sec Loss 6.3087 Epoch: 6 Global Step: 110850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:31:53,013-Speed 4507.34 samples/sec Loss 6.3681 Epoch: 6 Global Step: 110900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:32:04,631-Speed 4407.25 samples/sec Loss 6.3414 Epoch: 6 Global Step: 110950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:32:16,237-Speed 4411.68 samples/sec Loss 6.2867 Epoch: 6 Global Step: 111000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:32:27,652-Speed 4485.79 samples/sec Loss 6.3393 Epoch: 6 Global Step: 111050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:32:38,851-Speed 4571.69 samples/sec Loss 6.3370 Epoch: 6 Global Step: 111100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:32:50,188-Speed 4516.41 samples/sec Loss 6.2894 Epoch: 6 Global Step: 111150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:33:01,675-Speed 4457.37 samples/sec Loss 6.2885 Epoch: 6 Global Step: 111200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:33:13,924-Speed 4180.33 samples/sec Loss 6.2848 Epoch: 6 Global Step: 111250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:33:26,225-Speed 4162.51 samples/sec Loss 6.2769 Epoch: 6 Global Step: 111300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:33:37,667-Speed 4474.68 samples/sec Loss 6.3555 Epoch: 6 Global Step: 111350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:33:49,036-Speed 4503.98 samples/sec Loss 6.4020 Epoch: 6 Global Step: 111400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:34:00,300-Speed 4545.69 samples/sec Loss 6.3052 Epoch: 6 Global Step: 111450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:34:11,645-Speed 4513.09 samples/sec Loss 6.3132 Epoch: 6 Global Step: 111500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:34:23,077-Speed 4478.87 samples/sec Loss 6.3513 Epoch: 6 Global Step: 111550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:34:35,350-Speed 4171.97 samples/sec Loss 6.2598 Epoch: 6 Global Step: 111600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:34:46,593-Speed 4554.13 samples/sec Loss 6.2600 Epoch: 6 Global Step: 111650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:34:57,953-Speed 4507.27 samples/sec Loss 6.2992 Epoch: 6 Global Step: 111700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:35:09,370-Speed 4484.79 samples/sec Loss 6.2663 Epoch: 6 Global Step: 111750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:35:20,761-Speed 4495.12 samples/sec Loss 6.2906 Epoch: 6 Global Step: 111800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:35:32,221-Speed 4467.85 samples/sec Loss 6.3042 Epoch: 6 Global Step: 111850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:35:43,688-Speed 4465.30 samples/sec Loss 6.3614 Epoch: 6 Global Step: 111900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:35:55,014-Speed 4520.92 samples/sec Loss 6.3352 Epoch: 6 Global Step: 111950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:36:06,451-Speed 4477.07 samples/sec Loss 6.3488 Epoch: 6 Global Step: 112000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:36:36,802-[lfw][112000]XNorm: 21.007831 Training: 2021-03-15 06:36:36,802-[lfw][112000]Accuracy-Flip: 0.99500+-0.00325 Training: 2021-03-15 06:36:36,802-[lfw][112000]Accuracy-Highest: 0.99683 Training: 2021-03-15 06:37:12,004-[cfp_fp][112000]XNorm: 17.703900 Training: 2021-03-15 06:37:12,005-[cfp_fp][112000]Accuracy-Flip: 0.93186+-0.00948 Training: 2021-03-15 06:37:12,005-[cfp_fp][112000]Accuracy-Highest: 0.94129 Training: 2021-03-15 06:37:42,350-[agedb_30][112000]XNorm: 20.072852 Training: 2021-03-15 06:37:42,350-[agedb_30][112000]Accuracy-Flip: 0.95867+-0.00924 Training: 2021-03-15 06:37:42,351-[agedb_30][112000]Accuracy-Highest: 0.96083 Training: 2021-03-15 06:37:53,605-Speed 477.82 samples/sec Loss 6.3463 Epoch: 6 Global Step: 112050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:38:04,871-Speed 4544.58 samples/sec Loss 6.3605 Epoch: 6 Global Step: 112100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:38:17,151-Speed 4169.90 samples/sec Loss 6.2915 Epoch: 6 Global Step: 112150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:38:28,455-Speed 4529.47 samples/sec Loss 6.2499 Epoch: 6 Global Step: 112200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:38:40,631-Speed 4205.25 samples/sec Loss 6.3187 Epoch: 6 Global Step: 112250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:38:53,068-Speed 4117.01 samples/sec Loss 6.3360 Epoch: 6 Global Step: 112300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:39:04,421-Speed 4509.93 samples/sec Loss 6.2850 Epoch: 6 Global Step: 112350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:39:15,797-Speed 4500.84 samples/sec Loss 6.2988 Epoch: 6 Global Step: 112400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:39:27,165-Speed 4504.09 samples/sec Loss 6.3252 Epoch: 6 Global Step: 112450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:39:38,713-Speed 4434.10 samples/sec Loss 6.3651 Epoch: 6 Global Step: 112500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:39:50,020-Speed 4528.36 samples/sec Loss 6.2842 Epoch: 6 Global Step: 112550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:40:01,294-Speed 4541.60 samples/sec Loss 6.3116 Epoch: 6 Global Step: 112600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:40:12,679-Speed 4497.41 samples/sec Loss 6.3052 Epoch: 6 Global Step: 112650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:40:24,116-Speed 4477.09 samples/sec Loss 6.2684 Epoch: 6 Global Step: 112700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:40:35,598-Speed 4459.09 samples/sec Loss 6.3130 Epoch: 6 Global Step: 112750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:40:47,022-Speed 4482.00 samples/sec Loss 6.3212 Epoch: 6 Global Step: 112800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:40:58,458-Speed 4477.42 samples/sec Loss 6.3702 Epoch: 6 Global Step: 112850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:41:09,943-Speed 4458.24 samples/sec Loss 6.3144 Epoch: 6 Global Step: 112900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:41:21,199-Speed 4548.96 samples/sec Loss 6.3412 Epoch: 6 Global Step: 112950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:41:32,615-Speed 4484.93 samples/sec Loss 6.2811 Epoch: 6 Global Step: 113000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:41:43,893-Speed 4540.06 samples/sec Loss 6.3308 Epoch: 6 Global Step: 113050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:41:55,254-Speed 4507.01 samples/sec Loss 6.3557 Epoch: 6 Global Step: 113100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:42:06,580-Speed 4520.44 samples/sec Loss 6.3493 Epoch: 6 Global Step: 113150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:42:18,735-Speed 4212.47 samples/sec Loss 6.2364 Epoch: 6 Global Step: 113200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:42:30,270-Speed 4438.99 samples/sec Loss 6.3804 Epoch: 6 Global Step: 113250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:42:42,391-Speed 4224.46 samples/sec Loss 6.3579 Epoch: 6 Global Step: 113300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:42:53,664-Speed 4541.93 samples/sec Loss 6.3256 Epoch: 6 Global Step: 113350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:43:05,025-Speed 4506.95 samples/sec Loss 6.3071 Epoch: 6 Global Step: 113400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:43:16,436-Speed 4486.96 samples/sec Loss 6.3238 Epoch: 6 Global Step: 113450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:43:27,783-Speed 4512.26 samples/sec Loss 6.2733 Epoch: 6 Global Step: 113500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:43:39,231-Speed 4472.93 samples/sec Loss 6.2931 Epoch: 6 Global Step: 113550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:43:50,473-Speed 4554.58 samples/sec Loss 6.2961 Epoch: 6 Global Step: 113600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:44:01,907-Speed 4477.74 samples/sec Loss 6.3363 Epoch: 6 Global Step: 113650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:44:13,308-Speed 4491.43 samples/sec Loss 6.2729 Epoch: 6 Global Step: 113700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:44:24,935-Speed 4403.79 samples/sec Loss 6.2810 Epoch: 6 Global Step: 113750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:44:37,970-Speed 3928.09 samples/sec Loss 6.3297 Epoch: 6 Global Step: 113800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:44:49,342-Speed 4502.32 samples/sec Loss 6.3479 Epoch: 6 Global Step: 113850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:45:00,620-Speed 4540.41 samples/sec Loss 6.3417 Epoch: 6 Global Step: 113900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:45:11,995-Speed 4501.14 samples/sec Loss 6.3144 Epoch: 6 Global Step: 113950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:45:23,450-Speed 4470.04 samples/sec Loss 6.2589 Epoch: 6 Global Step: 114000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:45:53,791-[lfw][114000]XNorm: 23.548013 Training: 2021-03-15 06:45:53,791-[lfw][114000]Accuracy-Flip: 0.99617+-0.00334 Training: 2021-03-15 06:45:53,791-[lfw][114000]Accuracy-Highest: 0.99683 Training: 2021-03-15 06:46:28,964-[cfp_fp][114000]XNorm: 19.919824 Training: 2021-03-15 06:46:28,964-[cfp_fp][114000]Accuracy-Flip: 0.93000+-0.01239 Training: 2021-03-15 06:46:28,964-[cfp_fp][114000]Accuracy-Highest: 0.94129 Training: 2021-03-15 06:46:59,275-[agedb_30][114000]XNorm: 22.933715 Training: 2021-03-15 06:46:59,275-[agedb_30][114000]Accuracy-Flip: 0.95483+-0.00864 Training: 2021-03-15 06:46:59,275-[agedb_30][114000]Accuracy-Highest: 0.96083 Training: 2021-03-15 06:47:10,526-Speed 478.17 samples/sec Loss 6.3326 Epoch: 6 Global Step: 114050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:47:21,762-Speed 4557.29 samples/sec Loss 6.3579 Epoch: 6 Global Step: 114100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:47:32,996-Speed 4557.45 samples/sec Loss 6.2779 Epoch: 6 Global Step: 114150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:47:45,021-Speed 4258.09 samples/sec Loss 6.2802 Epoch: 6 Global Step: 114200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:47:56,282-Speed 4546.85 samples/sec Loss 6.3103 Epoch: 6 Global Step: 114250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:48:07,761-Speed 4460.67 samples/sec Loss 6.3126 Epoch: 6 Global Step: 114300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:48:19,303-Speed 4436.20 samples/sec Loss 6.2945 Epoch: 6 Global Step: 114350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:48:30,737-Speed 4477.92 samples/sec Loss 6.2717 Epoch: 6 Global Step: 114400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:48:42,275-Speed 4438.03 samples/sec Loss 6.3098 Epoch: 6 Global Step: 114450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:48:53,765-Speed 4456.05 samples/sec Loss 6.2887 Epoch: 6 Global Step: 114500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:49:05,066-Speed 4530.87 samples/sec Loss 6.2273 Epoch: 6 Global Step: 114550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:49:16,260-Speed 4574.08 samples/sec Loss 6.1973 Epoch: 6 Global Step: 114600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:49:27,580-Speed 4523.28 samples/sec Loss 6.3406 Epoch: 6 Global Step: 114650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:49:40,135-Speed 4078.14 samples/sec Loss 6.3035 Epoch: 6 Global Step: 114700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:49:51,428-Speed 4533.96 samples/sec Loss 6.3559 Epoch: 6 Global Step: 114750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:50:02,832-Speed 4489.82 samples/sec Loss 6.2641 Epoch: 6 Global Step: 114800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:50:14,090-Speed 4548.20 samples/sec Loss 6.3143 Epoch: 6 Global Step: 114850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:50:27,089-Speed 3938.80 samples/sec Loss 6.3387 Epoch: 6 Global Step: 114900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:50:38,347-Speed 4548.14 samples/sec Loss 6.3091 Epoch: 6 Global Step: 114950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:50:49,483-Speed 4598.14 samples/sec Loss 6.3251 Epoch: 6 Global Step: 115000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:51:00,856-Speed 4502.18 samples/sec Loss 6.2816 Epoch: 6 Global Step: 115050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:51:12,287-Speed 4479.27 samples/sec Loss 6.3128 Epoch: 6 Global Step: 115100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:51:23,678-Speed 4495.21 samples/sec Loss 6.3377 Epoch: 6 Global Step: 115150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:51:34,894-Speed 4564.98 samples/sec Loss 6.2981 Epoch: 6 Global Step: 115200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:51:46,109-Speed 4565.30 samples/sec Loss 6.3250 Epoch: 6 Global Step: 115250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:51:57,387-Speed 4540.10 samples/sec Loss 6.3337 Epoch: 6 Global Step: 115300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:52:08,666-Speed 4539.91 samples/sec Loss 6.2807 Epoch: 6 Global Step: 115350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:52:19,970-Speed 4529.27 samples/sec Loss 6.2501 Epoch: 6 Global Step: 115400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:52:31,304-Speed 4517.85 samples/sec Loss 6.2897 Epoch: 6 Global Step: 115450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:52:42,571-Speed 4544.51 samples/sec Loss 6.3026 Epoch: 6 Global Step: 115500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:52:54,113-Speed 4436.11 samples/sec Loss 6.3054 Epoch: 6 Global Step: 115550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:53:05,701-Speed 4418.52 samples/sec Loss 6.2194 Epoch: 6 Global Step: 115600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:53:16,934-Speed 4558.16 samples/sec Loss 6.3401 Epoch: 6 Global Step: 115650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:53:28,549-Speed 4408.17 samples/sec Loss 6.2871 Epoch: 6 Global Step: 115700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:53:40,709-Speed 4210.92 samples/sec Loss 6.2910 Epoch: 6 Global Step: 115750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:53:52,079-Speed 4503.35 samples/sec Loss 6.3439 Epoch: 6 Global Step: 115800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:54:04,481-Speed 4128.55 samples/sec Loss 6.2901 Epoch: 6 Global Step: 115850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:54:16,039-Speed 4429.96 samples/sec Loss 6.2838 Epoch: 6 Global Step: 115900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:54:27,191-Speed 4591.43 samples/sec Loss 6.3297 Epoch: 6 Global Step: 115950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:54:38,650-Speed 4468.26 samples/sec Loss 6.3014 Epoch: 6 Global Step: 116000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:55:08,918-[lfw][116000]XNorm: 24.880650 Training: 2021-03-15 06:55:08,919-[lfw][116000]Accuracy-Flip: 0.99633+-0.00287 Training: 2021-03-15 06:55:08,919-[lfw][116000]Accuracy-Highest: 0.99683 Training: 2021-03-15 06:55:44,105-[cfp_fp][116000]XNorm: 21.263052 Training: 2021-03-15 06:55:44,105-[cfp_fp][116000]Accuracy-Flip: 0.93757+-0.00769 Training: 2021-03-15 06:55:44,105-[cfp_fp][116000]Accuracy-Highest: 0.94129 Training: 2021-03-15 06:56:14,394-[agedb_30][116000]XNorm: 23.907637 Training: 2021-03-15 06:56:14,395-[agedb_30][116000]Accuracy-Flip: 0.95850+-0.01071 Training: 2021-03-15 06:56:14,395-[agedb_30][116000]Accuracy-Highest: 0.96083 Training: 2021-03-15 06:56:25,847-Speed 477.63 samples/sec Loss 6.3357 Epoch: 6 Global Step: 116050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:56:37,415-Speed 4426.22 samples/sec Loss 6.2522 Epoch: 6 Global Step: 116100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:56:48,637-Speed 4562.73 samples/sec Loss 6.2775 Epoch: 6 Global Step: 116150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:56:59,888-Speed 4550.61 samples/sec Loss 6.2803 Epoch: 6 Global Step: 116200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:57:11,407-Speed 4445.05 samples/sec Loss 6.3017 Epoch: 6 Global Step: 116250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:57:22,742-Speed 4517.44 samples/sec Loss 6.2913 Epoch: 6 Global Step: 116300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:57:35,131-Speed 4132.95 samples/sec Loss 6.2186 Epoch: 6 Global Step: 116350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:57:47,350-Speed 4190.39 samples/sec Loss 6.3601 Epoch: 6 Global Step: 116400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:57:58,805-Speed 4469.80 samples/sec Loss 6.2915 Epoch: 6 Global Step: 116450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:58:10,211-Speed 4488.85 samples/sec Loss 6.2545 Epoch: 6 Global Step: 116500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:58:21,666-Speed 4470.15 samples/sec Loss 6.3318 Epoch: 6 Global Step: 116550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:58:33,081-Speed 4485.55 samples/sec Loss 6.2992 Epoch: 6 Global Step: 116600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:58:44,480-Speed 4491.62 samples/sec Loss 6.3426 Epoch: 6 Global Step: 116650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:58:55,954-Speed 4462.54 samples/sec Loss 6.3465 Epoch: 6 Global Step: 116700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:59:08,131-Speed 4204.76 samples/sec Loss 6.2776 Epoch: 6 Global Step: 116750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:59:19,328-Speed 4572.94 samples/sec Loss 6.3362 Epoch: 6 Global Step: 116800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:59:44,254-Speed 2054.18 samples/sec Loss 6.0597 Epoch: 7 Global Step: 116850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 06:59:56,302-Speed 4249.97 samples/sec Loss 4.7174 Epoch: 7 Global Step: 116900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:00:07,906-Speed 4412.77 samples/sec Loss 4.3780 Epoch: 7 Global Step: 116950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:00:19,473-Speed 4426.69 samples/sec Loss 4.1462 Epoch: 7 Global Step: 117000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:00:30,990-Speed 4445.81 samples/sec Loss 4.1000 Epoch: 7 Global Step: 117050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:00:42,365-Speed 4501.42 samples/sec Loss 3.9759 Epoch: 7 Global Step: 117100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:00:53,978-Speed 4409.01 samples/sec Loss 3.9187 Epoch: 7 Global Step: 117150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:01:06,079-Speed 4231.36 samples/sec Loss 3.8648 Epoch: 7 Global Step: 117200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:01:17,379-Speed 4531.31 samples/sec Loss 3.7959 Epoch: 7 Global Step: 117250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:01:28,853-Speed 4462.35 samples/sec Loss 3.7764 Epoch: 7 Global Step: 117300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:01:40,221-Speed 4504.24 samples/sec Loss 3.7194 Epoch: 7 Global Step: 117350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:01:51,414-Speed 4574.45 samples/sec Loss 3.7414 Epoch: 7 Global Step: 117400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:02:03,684-Speed 4173.21 samples/sec Loss 3.6665 Epoch: 7 Global Step: 117450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:02:15,022-Speed 4515.95 samples/sec Loss 3.6000 Epoch: 7 Global Step: 117500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:02:26,391-Speed 4503.67 samples/sec Loss 3.6075 Epoch: 7 Global Step: 117550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:02:39,153-Speed 4012.10 samples/sec Loss 3.5641 Epoch: 7 Global Step: 117600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:02:50,444-Speed 4534.79 samples/sec Loss 3.5459 Epoch: 7 Global Step: 117650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:03:01,849-Speed 4489.56 samples/sec Loss 3.5370 Epoch: 7 Global Step: 117700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:03:13,254-Speed 4489.53 samples/sec Loss 3.5227 Epoch: 7 Global Step: 117750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:03:24,509-Speed 4549.44 samples/sec Loss 3.4703 Epoch: 7 Global Step: 117800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:03:35,849-Speed 4515.34 samples/sec Loss 3.5152 Epoch: 7 Global Step: 117850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:03:47,061-Speed 4566.54 samples/sec Loss 3.4503 Epoch: 7 Global Step: 117900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:03:58,283-Speed 4562.77 samples/sec Loss 3.4603 Epoch: 7 Global Step: 117950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:04:09,686-Speed 4490.36 samples/sec Loss 3.3664 Epoch: 7 Global Step: 118000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:04:39,768-[lfw][118000]XNorm: 22.735896 Training: 2021-03-15 07:04:39,768-[lfw][118000]Accuracy-Flip: 0.99700+-0.00306 Training: 2021-03-15 07:04:39,768-[lfw][118000]Accuracy-Highest: 0.99700 Training: 2021-03-15 07:05:15,149-[cfp_fp][118000]XNorm: 19.943242 Training: 2021-03-15 07:05:15,150-[cfp_fp][118000]Accuracy-Flip: 0.97214+-0.00836 Training: 2021-03-15 07:05:15,150-[cfp_fp][118000]Accuracy-Highest: 0.97214 Training: 2021-03-15 07:05:45,380-[agedb_30][118000]XNorm: 22.258552 Training: 2021-03-15 07:05:45,381-[agedb_30][118000]Accuracy-Flip: 0.97467+-0.00427 Training: 2021-03-15 07:05:45,381-[agedb_30][118000]Accuracy-Highest: 0.97467 Training: 2021-03-15 07:05:56,594-Speed 478.92 samples/sec Loss 3.3853 Epoch: 7 Global Step: 118050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:06:07,987-Speed 4494.14 samples/sec Loss 3.3940 Epoch: 7 Global Step: 118100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:06:19,689-Speed 4375.78 samples/sec Loss 3.3651 Epoch: 7 Global Step: 118150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:06:31,101-Speed 4486.74 samples/sec Loss 3.3478 Epoch: 7 Global Step: 118200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:06:42,600-Speed 4452.63 samples/sec Loss 3.3135 Epoch: 7 Global Step: 118250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:06:54,786-Speed 4201.89 samples/sec Loss 3.3075 Epoch: 7 Global Step: 118300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:07:06,953-Speed 4208.20 samples/sec Loss 3.3161 Epoch: 7 Global Step: 118350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:07:18,325-Speed 4502.74 samples/sec Loss 3.2624 Epoch: 7 Global Step: 118400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:07:29,651-Speed 4520.74 samples/sec Loss 3.2664 Epoch: 7 Global Step: 118450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:07:41,067-Speed 4485.30 samples/sec Loss 3.2252 Epoch: 7 Global Step: 118500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:07:52,452-Speed 4497.42 samples/sec Loss 3.2360 Epoch: 7 Global Step: 118550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:08:03,682-Speed 4559.47 samples/sec Loss 3.2440 Epoch: 7 Global Step: 118600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:08:15,021-Speed 4515.41 samples/sec Loss 3.1980 Epoch: 7 Global Step: 118650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:08:26,447-Speed 4481.37 samples/sec Loss 3.1727 Epoch: 7 Global Step: 118700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:08:38,205-Speed 4354.80 samples/sec Loss 3.1813 Epoch: 7 Global Step: 118750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:08:49,435-Speed 4559.23 samples/sec Loss 3.1703 Epoch: 7 Global Step: 118800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:09:01,776-Speed 4148.96 samples/sec Loss 3.1554 Epoch: 7 Global Step: 118850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:09:13,083-Speed 4528.30 samples/sec Loss 3.1761 Epoch: 7 Global Step: 118900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:09:24,408-Speed 4521.34 samples/sec Loss 3.1308 Epoch: 7 Global Step: 118950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:09:36,552-Speed 4216.21 samples/sec Loss 3.1496 Epoch: 7 Global Step: 119000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:09:47,953-Speed 4491.19 samples/sec Loss 3.1343 Epoch: 7 Global Step: 119050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:09:59,285-Speed 4518.47 samples/sec Loss 3.1428 Epoch: 7 Global Step: 119100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:10:10,806-Speed 4444.38 samples/sec Loss 3.0998 Epoch: 7 Global Step: 119150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:10:22,233-Speed 4480.95 samples/sec Loss 3.0809 Epoch: 7 Global Step: 119200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:10:34,448-Speed 4191.74 samples/sec Loss 3.1104 Epoch: 7 Global Step: 119250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:10:45,704-Speed 4548.93 samples/sec Loss 3.0753 Epoch: 7 Global Step: 119300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:10:57,204-Speed 4452.10 samples/sec Loss 3.0754 Epoch: 7 Global Step: 119350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:11:08,535-Speed 4518.80 samples/sec Loss 3.0323 Epoch: 7 Global Step: 119400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:11:19,980-Speed 4473.75 samples/sec Loss 3.0608 Epoch: 7 Global Step: 119450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:11:31,465-Speed 4458.36 samples/sec Loss 3.0628 Epoch: 7 Global Step: 119500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:11:42,885-Speed 4483.72 samples/sec Loss 3.0319 Epoch: 7 Global Step: 119550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:11:54,240-Speed 4509.30 samples/sec Loss 3.0091 Epoch: 7 Global Step: 119600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:12:05,730-Speed 4456.12 samples/sec Loss 2.9927 Epoch: 7 Global Step: 119650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:12:17,793-Speed 4244.45 samples/sec Loss 3.0349 Epoch: 7 Global Step: 119700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:12:29,156-Speed 4506.31 samples/sec Loss 2.9903 Epoch: 7 Global Step: 119750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:12:40,513-Speed 4508.40 samples/sec Loss 3.0159 Epoch: 7 Global Step: 119800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:12:52,079-Speed 4427.19 samples/sec Loss 2.9806 Epoch: 7 Global Step: 119850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:13:03,480-Speed 4490.84 samples/sec Loss 2.9636 Epoch: 7 Global Step: 119900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:13:15,833-Speed 4145.06 samples/sec Loss 2.9634 Epoch: 7 Global Step: 119950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:13:27,307-Speed 4462.26 samples/sec Loss 2.9420 Epoch: 7 Global Step: 120000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:13:57,764-[lfw][120000]XNorm: 22.883300 Training: 2021-03-15 07:13:57,764-[lfw][120000]Accuracy-Flip: 0.99733+-0.00271 Training: 2021-03-15 07:13:57,764-[lfw][120000]Accuracy-Highest: 0.99733 Training: 2021-03-15 07:14:33,091-[cfp_fp][120000]XNorm: 20.219918 Training: 2021-03-15 07:14:33,091-[cfp_fp][120000]Accuracy-Flip: 0.97643+-0.00665 Training: 2021-03-15 07:14:33,091-[cfp_fp][120000]Accuracy-Highest: 0.97643 Training: 2021-03-15 07:15:03,593-[agedb_30][120000]XNorm: 22.471483 Training: 2021-03-15 07:15:03,593-[agedb_30][120000]Accuracy-Flip: 0.97617+-0.00749 Training: 2021-03-15 07:15:03,593-[agedb_30][120000]Accuracy-Highest: 0.97617 Training: 2021-03-15 07:15:14,867-Speed 476.02 samples/sec Loss 2.9490 Epoch: 7 Global Step: 120050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:15:26,234-Speed 4504.60 samples/sec Loss 2.9318 Epoch: 7 Global Step: 120100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:15:37,717-Speed 4459.28 samples/sec Loss 2.9364 Epoch: 7 Global Step: 120150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:15:49,003-Speed 4536.78 samples/sec Loss 2.9449 Epoch: 7 Global Step: 120200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:16:00,154-Speed 4591.83 samples/sec Loss 2.9213 Epoch: 7 Global Step: 120250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:16:11,650-Speed 4453.99 samples/sec Loss 2.9073 Epoch: 7 Global Step: 120300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:16:23,530-Speed 4309.65 samples/sec Loss 2.9264 Epoch: 7 Global Step: 120350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:16:34,851-Speed 4523.06 samples/sec Loss 2.9343 Epoch: 7 Global Step: 120400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:16:46,136-Speed 4537.08 samples/sec Loss 2.9118 Epoch: 7 Global Step: 120450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:16:57,674-Speed 4437.98 samples/sec Loss 2.8980 Epoch: 7 Global Step: 120500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:17:09,148-Speed 4462.08 samples/sec Loss 2.8942 Epoch: 7 Global Step: 120550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:17:20,411-Speed 4546.46 samples/sec Loss 2.9121 Epoch: 7 Global Step: 120600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:17:31,674-Speed 4545.88 samples/sec Loss 2.9127 Epoch: 7 Global Step: 120650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:17:42,962-Speed 4535.94 samples/sec Loss 2.8691 Epoch: 7 Global Step: 120700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:17:54,311-Speed 4511.91 samples/sec Loss 2.8501 Epoch: 7 Global Step: 120750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:18:05,701-Speed 4495.05 samples/sec Loss 2.8812 Epoch: 7 Global Step: 120800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:18:17,961-Speed 4176.60 samples/sec Loss 2.8427 Epoch: 7 Global Step: 120850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:18:29,346-Speed 4497.08 samples/sec Loss 2.8618 Epoch: 7 Global Step: 120900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:18:41,617-Speed 4172.72 samples/sec Loss 2.8073 Epoch: 7 Global Step: 120950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:18:53,076-Speed 4468.37 samples/sec Loss 2.8134 Epoch: 7 Global Step: 121000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:19:04,539-Speed 4466.72 samples/sec Loss 2.8284 Epoch: 7 Global Step: 121050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:19:15,680-Speed 4596.25 samples/sec Loss 2.8566 Epoch: 7 Global Step: 121100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:19:26,987-Speed 4528.41 samples/sec Loss 2.8334 Epoch: 7 Global Step: 121150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:19:38,429-Speed 4474.67 samples/sec Loss 2.8348 Epoch: 7 Global Step: 121200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:19:49,672-Speed 4554.16 samples/sec Loss 2.8303 Epoch: 7 Global Step: 121250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:20:01,106-Speed 4478.06 samples/sec Loss 2.7996 Epoch: 7 Global Step: 121300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:20:13,246-Speed 4217.79 samples/sec Loss 2.8611 Epoch: 7 Global Step: 121350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:20:24,607-Speed 4506.71 samples/sec Loss 2.8291 Epoch: 7 Global Step: 121400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:20:35,970-Speed 4506.20 samples/sec Loss 2.8087 Epoch: 7 Global Step: 121450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:20:47,246-Speed 4540.88 samples/sec Loss 2.8000 Epoch: 7 Global Step: 121500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:20:58,584-Speed 4515.91 samples/sec Loss 2.8041 Epoch: 7 Global Step: 121550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:21:10,726-Speed 4217.16 samples/sec Loss 2.7878 Epoch: 7 Global Step: 121600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:21:21,974-Speed 4552.13 samples/sec Loss 2.7962 Epoch: 7 Global Step: 121650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:21:33,292-Speed 4523.94 samples/sec Loss 2.7884 Epoch: 7 Global Step: 121700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:21:44,616-Speed 4521.86 samples/sec Loss 2.7648 Epoch: 7 Global Step: 121750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:21:56,017-Speed 4490.89 samples/sec Loss 2.7711 Epoch: 7 Global Step: 121800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:22:08,231-Speed 4192.09 samples/sec Loss 2.7696 Epoch: 7 Global Step: 121850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:22:19,686-Speed 4470.02 samples/sec Loss 2.7878 Epoch: 7 Global Step: 121900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:22:31,053-Speed 4504.45 samples/sec Loss 2.7768 Epoch: 7 Global Step: 121950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:22:42,411-Speed 4507.90 samples/sec Loss 2.7762 Epoch: 7 Global Step: 122000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:23:12,862-[lfw][122000]XNorm: 22.984638 Training: 2021-03-15 07:23:12,863-[lfw][122000]Accuracy-Flip: 0.99783+-0.00279 Training: 2021-03-15 07:23:12,863-[lfw][122000]Accuracy-Highest: 0.99783 Training: 2021-03-15 07:23:48,190-[cfp_fp][122000]XNorm: 20.434638 Training: 2021-03-15 07:23:48,190-[cfp_fp][122000]Accuracy-Flip: 0.97814+-0.00761 Training: 2021-03-15 07:23:48,190-[cfp_fp][122000]Accuracy-Highest: 0.97814 Training: 2021-03-15 07:24:18,646-[agedb_30][122000]XNorm: 22.486365 Training: 2021-03-15 07:24:18,647-[agedb_30][122000]Accuracy-Flip: 0.97500+-0.00796 Training: 2021-03-15 07:24:18,647-[agedb_30][122000]Accuracy-Highest: 0.97617 Training: 2021-03-15 07:24:29,845-Speed 476.58 samples/sec Loss 2.7808 Epoch: 7 Global Step: 122050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:24:41,366-Speed 4444.42 samples/sec Loss 2.7610 Epoch: 7 Global Step: 122100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:24:52,769-Speed 4490.10 samples/sec Loss 2.7809 Epoch: 7 Global Step: 122150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:25:04,152-Speed 4498.29 samples/sec Loss 2.7446 Epoch: 7 Global Step: 122200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:25:16,273-Speed 4224.30 samples/sec Loss 2.7793 Epoch: 7 Global Step: 122250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:25:27,591-Speed 4523.85 samples/sec Loss 2.7373 Epoch: 7 Global Step: 122300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:25:38,968-Speed 4500.45 samples/sec Loss 2.7544 Epoch: 7 Global Step: 122350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:25:50,587-Speed 4407.06 samples/sec Loss 2.7645 Epoch: 7 Global Step: 122400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:26:01,790-Speed 4570.59 samples/sec Loss 2.7536 Epoch: 7 Global Step: 122450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:26:13,143-Speed 4509.79 samples/sec Loss 2.7350 Epoch: 7 Global Step: 122500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:26:25,465-Speed 4155.32 samples/sec Loss 2.7555 Epoch: 7 Global Step: 122550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:26:36,823-Speed 4508.24 samples/sec Loss 2.7247 Epoch: 7 Global Step: 122600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:26:48,237-Speed 4486.07 samples/sec Loss 2.7585 Epoch: 7 Global Step: 122650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:26:59,626-Speed 4495.91 samples/sec Loss 2.7331 Epoch: 7 Global Step: 122700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:27:11,056-Speed 4479.45 samples/sec Loss 2.7313 Epoch: 7 Global Step: 122750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:27:22,410-Speed 4509.66 samples/sec Loss 2.7059 Epoch: 7 Global Step: 122800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:27:33,707-Speed 4532.61 samples/sec Loss 2.6994 Epoch: 7 Global Step: 122850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:27:45,002-Speed 4533.16 samples/sec Loss 2.7313 Epoch: 7 Global Step: 122900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:27:56,476-Speed 4462.45 samples/sec Loss 2.6992 Epoch: 7 Global Step: 122950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:28:07,848-Speed 4502.35 samples/sec Loss 2.7078 Epoch: 7 Global Step: 123000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:28:20,021-Speed 4206.43 samples/sec Loss 2.7198 Epoch: 7 Global Step: 123050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:28:31,470-Speed 4472.21 samples/sec Loss 2.7278 Epoch: 7 Global Step: 123100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-15 07:28:42,738-Speed 4544.06 samples/sec Loss 2.7230 Epoch: 7 Global Step: 123150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:28:54,119-Speed 4499.00 samples/sec Loss 2.7042 Epoch: 7 Global Step: 123200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:29:05,358-Speed 4555.83 samples/sec Loss 2.7060 Epoch: 7 Global Step: 123250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:29:16,796-Speed 4476.63 samples/sec Loss 2.6616 Epoch: 7 Global Step: 123300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:29:28,257-Speed 4467.61 samples/sec Loss 2.7286 Epoch: 7 Global Step: 123350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:29:40,355-Speed 4232.06 samples/sec Loss 2.6926 Epoch: 7 Global Step: 123400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:29:51,819-Speed 4466.49 samples/sec Loss 2.6972 Epoch: 7 Global Step: 123450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:30:03,109-Speed 4535.03 samples/sec Loss 2.6866 Epoch: 7 Global Step: 123500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:30:15,346-Speed 4184.29 samples/sec Loss 2.6741 Epoch: 7 Global Step: 123550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:30:26,832-Speed 4458.11 samples/sec Loss 2.6954 Epoch: 7 Global Step: 123600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:30:38,256-Speed 4482.06 samples/sec Loss 2.6939 Epoch: 7 Global Step: 123650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:30:49,630-Speed 4501.80 samples/sec Loss 2.7128 Epoch: 7 Global Step: 123700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:31:01,095-Speed 4466.24 samples/sec Loss 2.6932 Epoch: 7 Global Step: 123750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:31:12,373-Speed 4539.92 samples/sec Loss 2.6881 Epoch: 7 Global Step: 123800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:31:24,703-Speed 4152.88 samples/sec Loss 2.6978 Epoch: 7 Global Step: 123850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:31:36,147-Speed 4474.37 samples/sec Loss 2.6631 Epoch: 7 Global Step: 123900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:31:47,575-Speed 4480.61 samples/sec Loss 2.6576 Epoch: 7 Global Step: 123950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:31:58,966-Speed 4494.85 samples/sec Loss 2.6863 Epoch: 7 Global Step: 124000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:32:28,973-[lfw][124000]XNorm: 22.576331 Training: 2021-03-15 07:32:28,973-[lfw][124000]Accuracy-Flip: 0.99767+-0.00238 Training: 2021-03-15 07:32:28,973-[lfw][124000]Accuracy-Highest: 0.99783 Training: 2021-03-15 07:33:03,807-[cfp_fp][124000]XNorm: 19.697786 Training: 2021-03-15 07:33:03,807-[cfp_fp][124000]Accuracy-Flip: 0.97829+-0.00666 Training: 2021-03-15 07:33:03,807-[cfp_fp][124000]Accuracy-Highest: 0.97829 Training: 2021-03-15 07:33:33,829-[agedb_30][124000]XNorm: 22.183692 Training: 2021-03-15 07:33:33,830-[agedb_30][124000]Accuracy-Flip: 0.97417+-0.00680 Training: 2021-03-15 07:33:33,830-[agedb_30][124000]Accuracy-Highest: 0.97617 Training: 2021-03-15 07:33:45,341-Speed 481.32 samples/sec Loss 2.6752 Epoch: 7 Global Step: 124050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:33:56,667-Speed 4520.70 samples/sec Loss 2.6507 Epoch: 7 Global Step: 124100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:34:08,998-Speed 4152.35 samples/sec Loss 2.6809 Epoch: 7 Global Step: 124150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:34:20,244-Speed 4552.78 samples/sec Loss 2.6896 Epoch: 7 Global Step: 124200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:34:31,534-Speed 4535.04 samples/sec Loss 2.6542 Epoch: 7 Global Step: 124250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:34:43,085-Speed 4432.95 samples/sec Loss 2.6841 Epoch: 7 Global Step: 124300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:34:54,527-Speed 4474.98 samples/sec Loss 2.6771 Epoch: 7 Global Step: 124350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:35:06,643-Speed 4226.12 samples/sec Loss 2.6647 Epoch: 7 Global Step: 124400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:35:17,905-Speed 4546.27 samples/sec Loss 2.6892 Epoch: 7 Global Step: 124450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:35:29,305-Speed 4491.33 samples/sec Loss 2.6488 Epoch: 7 Global Step: 124500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:35:40,796-Speed 4455.97 samples/sec Loss 2.6357 Epoch: 7 Global Step: 124550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:35:52,228-Speed 4479.25 samples/sec Loss 2.6200 Epoch: 7 Global Step: 124600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:36:03,558-Speed 4519.08 samples/sec Loss 2.6872 Epoch: 7 Global Step: 124650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:36:14,793-Speed 4557.25 samples/sec Loss 2.6286 Epoch: 7 Global Step: 124700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:36:27,170-Speed 4136.98 samples/sec Loss 2.6438 Epoch: 7 Global Step: 124750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:36:38,522-Speed 4510.53 samples/sec Loss 2.6322 Epoch: 7 Global Step: 124800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:36:49,905-Speed 4498.24 samples/sec Loss 2.6261 Epoch: 7 Global Step: 124850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:37:01,595-Speed 4379.77 samples/sec Loss 2.6466 Epoch: 7 Global Step: 124900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:37:12,936-Speed 4514.85 samples/sec Loss 2.6230 Epoch: 7 Global Step: 124950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:37:24,347-Speed 4487.18 samples/sec Loss 2.6112 Epoch: 7 Global Step: 125000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:37:35,544-Speed 4572.71 samples/sec Loss 2.6698 Epoch: 7 Global Step: 125050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:37:47,660-Speed 4226.00 samples/sec Loss 2.6446 Epoch: 7 Global Step: 125100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:37:59,194-Speed 4439.53 samples/sec Loss 2.6700 Epoch: 7 Global Step: 125150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:38:10,653-Speed 4468.04 samples/sec Loss 2.6000 Epoch: 7 Global Step: 125200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:38:21,892-Speed 4555.95 samples/sec Loss 2.6396 Epoch: 7 Global Step: 125250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:38:33,332-Speed 4475.55 samples/sec Loss 2.6599 Epoch: 7 Global Step: 125300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:38:44,789-Speed 4469.30 samples/sec Loss 2.6401 Epoch: 7 Global Step: 125350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:38:56,107-Speed 4523.96 samples/sec Loss 2.6144 Epoch: 7 Global Step: 125400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:39:07,374-Speed 4544.58 samples/sec Loss 2.6085 Epoch: 7 Global Step: 125450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:39:18,685-Speed 4526.70 samples/sec Loss 2.5899 Epoch: 7 Global Step: 125500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:39:30,032-Speed 4512.54 samples/sec Loss 2.6089 Epoch: 7 Global Step: 125550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:39:41,435-Speed 4490.11 samples/sec Loss 2.6242 Epoch: 7 Global Step: 125600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:39:52,912-Speed 4461.69 samples/sec Loss 2.6507 Epoch: 7 Global Step: 125650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:40:04,293-Speed 4498.82 samples/sec Loss 2.5960 Epoch: 7 Global Step: 125700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:40:15,517-Speed 4561.78 samples/sec Loss 2.6011 Epoch: 7 Global Step: 125750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:40:26,768-Speed 4550.89 samples/sec Loss 2.5753 Epoch: 7 Global Step: 125800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:40:39,147-Speed 4136.40 samples/sec Loss 2.5959 Epoch: 7 Global Step: 125850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:40:50,448-Speed 4530.88 samples/sec Loss 2.6275 Epoch: 7 Global Step: 125900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:41:02,529-Speed 4238.03 samples/sec Loss 2.6520 Epoch: 7 Global Step: 125950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:41:13,822-Speed 4534.16 samples/sec Loss 2.6424 Epoch: 7 Global Step: 126000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:41:44,075-[lfw][126000]XNorm: 22.198106 Training: 2021-03-15 07:41:44,075-[lfw][126000]Accuracy-Flip: 0.99783+-0.00224 Training: 2021-03-15 07:41:44,075-[lfw][126000]Accuracy-Highest: 0.99783 Training: 2021-03-15 07:42:19,009-[cfp_fp][126000]XNorm: 20.004469 Training: 2021-03-15 07:42:19,010-[cfp_fp][126000]Accuracy-Flip: 0.97714+-0.00906 Training: 2021-03-15 07:42:19,010-[cfp_fp][126000]Accuracy-Highest: 0.97829 Training: 2021-03-15 07:42:49,220-[agedb_30][126000]XNorm: 21.736064 Training: 2021-03-15 07:42:49,221-[agedb_30][126000]Accuracy-Flip: 0.97483+-0.00861 Training: 2021-03-15 07:42:49,221-[agedb_30][126000]Accuracy-Highest: 0.97617 Training: 2021-03-15 07:43:00,597-Speed 479.51 samples/sec Loss 2.6096 Epoch: 7 Global Step: 126050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:43:12,877-Speed 4169.87 samples/sec Loss 2.6152 Epoch: 7 Global Step: 126100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:43:24,063-Speed 4577.30 samples/sec Loss 2.6044 Epoch: 7 Global Step: 126150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:43:35,450-Speed 4496.44 samples/sec Loss 2.6272 Epoch: 7 Global Step: 126200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:43:46,807-Speed 4508.56 samples/sec Loss 2.6192 Epoch: 7 Global Step: 126250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:43:58,217-Speed 4487.78 samples/sec Loss 2.6022 Epoch: 7 Global Step: 126300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:44:09,471-Speed 4549.72 samples/sec Loss 2.6031 Epoch: 7 Global Step: 126350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:44:21,726-Speed 4178.10 samples/sec Loss 2.6056 Epoch: 7 Global Step: 126400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:44:32,959-Speed 4558.36 samples/sec Loss 2.5878 Epoch: 7 Global Step: 126450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:44:44,184-Speed 4561.50 samples/sec Loss 2.6091 Epoch: 7 Global Step: 126500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:44:55,474-Speed 4535.37 samples/sec Loss 2.5669 Epoch: 7 Global Step: 126550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:45:06,804-Speed 4519.11 samples/sec Loss 2.6003 Epoch: 7 Global Step: 126600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:45:18,043-Speed 4556.05 samples/sec Loss 2.5843 Epoch: 7 Global Step: 126650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:45:29,435-Speed 4494.35 samples/sec Loss 2.5796 Epoch: 7 Global Step: 126700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:45:41,474-Speed 4253.18 samples/sec Loss 2.5863 Epoch: 7 Global Step: 126750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:45:52,765-Speed 4534.88 samples/sec Loss 2.5791 Epoch: 7 Global Step: 126800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:46:03,962-Speed 4572.71 samples/sec Loss 2.5916 Epoch: 7 Global Step: 126850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:46:15,609-Speed 4396.17 samples/sec Loss 2.5731 Epoch: 7 Global Step: 126900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:46:27,636-Speed 4257.37 samples/sec Loss 2.5713 Epoch: 7 Global Step: 126950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:46:38,888-Speed 4550.64 samples/sec Loss 2.5566 Epoch: 7 Global Step: 127000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:46:50,300-Speed 4486.52 samples/sec Loss 2.5720 Epoch: 7 Global Step: 127050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:47:01,537-Speed 4556.90 samples/sec Loss 2.5728 Epoch: 7 Global Step: 127100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:47:12,741-Speed 4570.19 samples/sec Loss 2.5681 Epoch: 7 Global Step: 127150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:47:24,107-Speed 4505.01 samples/sec Loss 2.6058 Epoch: 7 Global Step: 127200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:47:35,415-Speed 4527.72 samples/sec Loss 2.5550 Epoch: 7 Global Step: 127250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:47:47,705-Speed 4166.16 samples/sec Loss 2.5457 Epoch: 7 Global Step: 127300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:47:59,093-Speed 4496.38 samples/sec Loss 2.6025 Epoch: 7 Global Step: 127350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:48:10,514-Speed 4483.12 samples/sec Loss 2.5543 Epoch: 7 Global Step: 127400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:48:22,016-Speed 4451.70 samples/sec Loss 2.5449 Epoch: 7 Global Step: 127450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:48:33,334-Speed 4524.15 samples/sec Loss 2.5689 Epoch: 7 Global Step: 127500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:48:44,656-Speed 4522.29 samples/sec Loss 2.5404 Epoch: 7 Global Step: 127550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:48:56,224-Speed 4426.37 samples/sec Loss 2.5462 Epoch: 7 Global Step: 127600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:49:08,423-Speed 4197.12 samples/sec Loss 2.5574 Epoch: 7 Global Step: 127650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:49:19,771-Speed 4512.16 samples/sec Loss 2.5531 Epoch: 7 Global Step: 127700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:49:31,347-Speed 4423.06 samples/sec Loss 2.5707 Epoch: 7 Global Step: 127750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:49:42,885-Speed 4437.96 samples/sec Loss 2.5398 Epoch: 7 Global Step: 127800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:49:54,355-Speed 4463.94 samples/sec Loss 2.5434 Epoch: 7 Global Step: 127850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:50:05,623-Speed 4543.83 samples/sec Loss 2.5601 Epoch: 7 Global Step: 127900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:50:17,149-Speed 4442.47 samples/sec Loss 2.5392 Epoch: 7 Global Step: 127950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:50:28,369-Speed 4563.45 samples/sec Loss 2.5266 Epoch: 7 Global Step: 128000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:50:58,703-[lfw][128000]XNorm: 21.944622 Training: 2021-03-15 07:50:58,703-[lfw][128000]Accuracy-Flip: 0.99733+-0.00291 Training: 2021-03-15 07:50:58,703-[lfw][128000]Accuracy-Highest: 0.99783 Training: 2021-03-15 07:51:33,882-[cfp_fp][128000]XNorm: 19.619048 Training: 2021-03-15 07:51:33,882-[cfp_fp][128000]Accuracy-Flip: 0.97971+-0.00736 Training: 2021-03-15 07:51:33,882-[cfp_fp][128000]Accuracy-Highest: 0.97971 Training: 2021-03-15 07:52:04,222-[agedb_30][128000]XNorm: 21.851536 Training: 2021-03-15 07:52:04,222-[agedb_30][128000]Accuracy-Flip: 0.97583+-0.00761 Training: 2021-03-15 07:52:04,222-[agedb_30][128000]Accuracy-Highest: 0.97617 Training: 2021-03-15 07:52:15,513-Speed 477.87 samples/sec Loss 2.5171 Epoch: 7 Global Step: 128050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:52:26,849-Speed 4516.64 samples/sec Loss 2.5264 Epoch: 7 Global Step: 128100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:52:38,345-Speed 4453.89 samples/sec Loss 2.5399 Epoch: 7 Global Step: 128150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:52:49,923-Speed 4422.34 samples/sec Loss 2.5099 Epoch: 7 Global Step: 128200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:53:01,265-Speed 4514.55 samples/sec Loss 2.5776 Epoch: 7 Global Step: 128250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:53:12,581-Speed 4524.98 samples/sec Loss 2.5332 Epoch: 7 Global Step: 128300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:53:23,986-Speed 4489.51 samples/sec Loss 2.5203 Epoch: 7 Global Step: 128350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:53:35,504-Speed 4445.38 samples/sec Loss 2.5557 Epoch: 7 Global Step: 128400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:53:46,858-Speed 4509.73 samples/sec Loss 2.5252 Epoch: 7 Global Step: 128450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:53:59,071-Speed 4192.29 samples/sec Loss 2.5099 Epoch: 7 Global Step: 128500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:54:10,516-Speed 4473.68 samples/sec Loss 2.5252 Epoch: 7 Global Step: 128550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:54:22,717-Speed 4196.89 samples/sec Loss 2.5450 Epoch: 7 Global Step: 128600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:54:35,149-Speed 4118.76 samples/sec Loss 2.5028 Epoch: 7 Global Step: 128650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:54:46,436-Speed 4536.09 samples/sec Loss 2.5157 Epoch: 7 Global Step: 128700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:54:57,823-Speed 4496.75 samples/sec Loss 2.5298 Epoch: 7 Global Step: 128750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:55:09,079-Speed 4548.81 samples/sec Loss 2.5039 Epoch: 7 Global Step: 128800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:55:20,416-Speed 4516.40 samples/sec Loss 2.5052 Epoch: 7 Global Step: 128850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:55:31,840-Speed 4482.22 samples/sec Loss 2.5240 Epoch: 7 Global Step: 128900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:55:44,090-Speed 4179.73 samples/sec Loss 2.5358 Epoch: 7 Global Step: 128950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:55:55,544-Speed 4470.48 samples/sec Loss 2.4968 Epoch: 7 Global Step: 129000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:56:06,996-Speed 4470.75 samples/sec Loss 2.5137 Epoch: 7 Global Step: 129050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:56:18,433-Speed 4477.24 samples/sec Loss 2.5152 Epoch: 7 Global Step: 129100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:56:29,524-Speed 4616.62 samples/sec Loss 2.5143 Epoch: 7 Global Step: 129150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:56:41,025-Speed 4451.99 samples/sec Loss 2.4987 Epoch: 7 Global Step: 129200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:56:52,474-Speed 4472.47 samples/sec Loss 2.5234 Epoch: 7 Global Step: 129250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:57:03,888-Speed 4485.83 samples/sec Loss 2.5163 Epoch: 7 Global Step: 129300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:57:16,121-Speed 4185.60 samples/sec Loss 2.4926 Epoch: 7 Global Step: 129350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:57:27,588-Speed 4465.38 samples/sec Loss 2.5033 Epoch: 7 Global Step: 129400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:57:38,938-Speed 4510.94 samples/sec Loss 2.4907 Epoch: 7 Global Step: 129450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:57:50,359-Speed 4483.45 samples/sec Loss 2.4974 Epoch: 7 Global Step: 129500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:58:02,835-Speed 4103.97 samples/sec Loss 2.4848 Epoch: 7 Global Step: 129550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:58:14,437-Speed 4413.06 samples/sec Loss 2.4688 Epoch: 7 Global Step: 129600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:58:25,722-Speed 4537.32 samples/sec Loss 2.4703 Epoch: 7 Global Step: 129650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:58:37,074-Speed 4510.62 samples/sec Loss 2.4952 Epoch: 7 Global Step: 129700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:58:48,532-Speed 4468.70 samples/sec Loss 2.4888 Epoch: 7 Global Step: 129750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:58:59,863-Speed 4518.93 samples/sec Loss 2.4800 Epoch: 7 Global Step: 129800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:59:12,219-Speed 4143.66 samples/sec Loss 2.4821 Epoch: 7 Global Step: 129850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:59:23,537-Speed 4524.16 samples/sec Loss 2.4977 Epoch: 7 Global Step: 129900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:59:35,029-Speed 4455.51 samples/sec Loss 2.4612 Epoch: 7 Global Step: 129950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 07:59:46,398-Speed 4503.39 samples/sec Loss 2.4709 Epoch: 7 Global Step: 130000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:00:16,510-[lfw][130000]XNorm: 23.860656 Training: 2021-03-15 08:00:16,511-[lfw][130000]Accuracy-Flip: 0.99783+-0.00224 Training: 2021-03-15 08:00:16,511-[lfw][130000]Accuracy-Highest: 0.99783 Training: 2021-03-15 08:00:51,628-[cfp_fp][130000]XNorm: 21.574816 Training: 2021-03-15 08:00:51,629-[cfp_fp][130000]Accuracy-Flip: 0.97571+-0.00818 Training: 2021-03-15 08:00:51,629-[cfp_fp][130000]Accuracy-Highest: 0.97971 Training: 2021-03-15 08:01:22,087-[agedb_30][130000]XNorm: 23.380074 Training: 2021-03-15 08:01:22,088-[agedb_30][130000]Accuracy-Flip: 0.97750+-0.00797 Training: 2021-03-15 08:01:22,088-[agedb_30][130000]Accuracy-Highest: 0.97750 Training: 2021-03-15 08:01:33,440-Speed 478.32 samples/sec Loss 2.4598 Epoch: 7 Global Step: 130050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:01:44,867-Speed 4481.13 samples/sec Loss 2.4758 Epoch: 7 Global Step: 130100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:01:56,113-Speed 4553.00 samples/sec Loss 2.4571 Epoch: 7 Global Step: 130150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:02:07,715-Speed 4412.99 samples/sec Loss 2.4526 Epoch: 7 Global Step: 130200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:02:19,142-Speed 4481.11 samples/sec Loss 2.4591 Epoch: 7 Global Step: 130250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:02:31,452-Speed 4159.32 samples/sec Loss 2.4455 Epoch: 7 Global Step: 130300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:02:42,913-Speed 4467.42 samples/sec Loss 2.4349 Epoch: 7 Global Step: 130350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:02:54,357-Speed 4474.29 samples/sec Loss 2.4302 Epoch: 7 Global Step: 130400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:03:05,717-Speed 4507.55 samples/sec Loss 2.4557 Epoch: 7 Global Step: 130450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:03:17,082-Speed 4505.47 samples/sec Loss 2.4769 Epoch: 7 Global Step: 130500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:03:28,339-Speed 4548.12 samples/sec Loss 2.4340 Epoch: 7 Global Step: 130550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:03:39,714-Speed 4501.57 samples/sec Loss 2.4857 Epoch: 7 Global Step: 130600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:03:51,095-Speed 4498.94 samples/sec Loss 2.4441 Epoch: 7 Global Step: 130650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:04:02,750-Speed 4393.26 samples/sec Loss 2.4605 Epoch: 7 Global Step: 130700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:04:14,276-Speed 4442.17 samples/sec Loss 2.4552 Epoch: 7 Global Step: 130750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:04:25,775-Speed 4452.69 samples/sec Loss 2.4496 Epoch: 7 Global Step: 130800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:04:37,233-Speed 4468.87 samples/sec Loss 2.4510 Epoch: 7 Global Step: 130850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:04:48,602-Speed 4503.92 samples/sec Loss 2.4150 Epoch: 7 Global Step: 130900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:04:59,786-Speed 4577.88 samples/sec Loss 2.4229 Epoch: 7 Global Step: 130950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:05:11,418-Speed 4401.87 samples/sec Loss 2.4533 Epoch: 7 Global Step: 131000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:05:23,877-Speed 4109.88 samples/sec Loss 2.4186 Epoch: 7 Global Step: 131050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:05:35,401-Speed 4443.02 samples/sec Loss 2.4313 Epoch: 7 Global Step: 131100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:05:46,785-Speed 4498.00 samples/sec Loss 2.4240 Epoch: 7 Global Step: 131150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:05:59,925-Speed 3896.47 samples/sec Loss 2.4445 Epoch: 7 Global Step: 131200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:06:11,322-Speed 4492.88 samples/sec Loss 2.4062 Epoch: 7 Global Step: 131250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:06:22,813-Speed 4455.56 samples/sec Loss 2.4464 Epoch: 7 Global Step: 131300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:06:34,311-Speed 4453.31 samples/sec Loss 2.4476 Epoch: 7 Global Step: 131350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:06:46,605-Speed 4165.07 samples/sec Loss 2.4090 Epoch: 7 Global Step: 131400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:06:57,988-Speed 4497.84 samples/sec Loss 2.4297 Epoch: 7 Global Step: 131450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:07:09,564-Speed 4423.34 samples/sec Loss 2.4243 Epoch: 7 Global Step: 131500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:07:20,883-Speed 4523.42 samples/sec Loss 2.4347 Epoch: 7 Global Step: 131550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:07:32,411-Speed 4441.40 samples/sec Loss 2.4514 Epoch: 7 Global Step: 131600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:07:43,749-Speed 4516.07 samples/sec Loss 2.4250 Epoch: 7 Global Step: 131650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:07:55,201-Speed 4470.95 samples/sec Loss 2.4417 Epoch: 7 Global Step: 131700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:08:06,586-Speed 4497.48 samples/sec Loss 2.4285 Epoch: 7 Global Step: 131750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:08:18,035-Speed 4472.16 samples/sec Loss 2.4042 Epoch: 7 Global Step: 131800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:08:29,462-Speed 4481.14 samples/sec Loss 2.4212 Epoch: 7 Global Step: 131850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:08:40,973-Speed 4448.11 samples/sec Loss 2.4151 Epoch: 7 Global Step: 131900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:08:52,437-Speed 4466.28 samples/sec Loss 2.4421 Epoch: 7 Global Step: 131950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:09:04,723-Speed 4167.72 samples/sec Loss 2.4137 Epoch: 7 Global Step: 132000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:09:35,194-[lfw][132000]XNorm: 22.206704 Training: 2021-03-15 08:09:35,194-[lfw][132000]Accuracy-Flip: 0.99750+-0.00281 Training: 2021-03-15 08:09:35,194-[lfw][132000]Accuracy-Highest: 0.99783 Training: 2021-03-15 08:10:10,365-[cfp_fp][132000]XNorm: 19.960813 Training: 2021-03-15 08:10:10,365-[cfp_fp][132000]Accuracy-Flip: 0.97800+-0.00466 Training: 2021-03-15 08:10:10,365-[cfp_fp][132000]Accuracy-Highest: 0.97971 Training: 2021-03-15 08:10:40,722-[agedb_30][132000]XNorm: 22.050293 Training: 2021-03-15 08:10:40,723-[agedb_30][132000]Accuracy-Flip: 0.97417+-0.00775 Training: 2021-03-15 08:10:40,723-[agedb_30][132000]Accuracy-Highest: 0.97750 Training: 2021-03-15 08:10:52,870-Speed 473.43 samples/sec Loss 2.4081 Epoch: 7 Global Step: 132050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:11:04,291-Speed 4483.06 samples/sec Loss 2.4363 Epoch: 7 Global Step: 132100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:11:15,883-Speed 4417.26 samples/sec Loss 2.4041 Epoch: 7 Global Step: 132150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:11:27,178-Speed 4533.33 samples/sec Loss 2.4100 Epoch: 7 Global Step: 132200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:11:38,635-Speed 4469.01 samples/sec Loss 2.3790 Epoch: 7 Global Step: 132250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:11:49,999-Speed 4505.70 samples/sec Loss 2.4047 Epoch: 7 Global Step: 132300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:12:01,420-Speed 4483.12 samples/sec Loss 2.4160 Epoch: 7 Global Step: 132350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:12:12,799-Speed 4499.84 samples/sec Loss 2.4129 Epoch: 7 Global Step: 132400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:12:24,186-Speed 4496.36 samples/sec Loss 2.4034 Epoch: 7 Global Step: 132450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:12:36,370-Speed 4202.45 samples/sec Loss 2.3872 Epoch: 7 Global Step: 132500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:12:47,799-Speed 4480.28 samples/sec Loss 2.3870 Epoch: 7 Global Step: 132550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:12:59,184-Speed 4497.12 samples/sec Loss 2.4187 Epoch: 7 Global Step: 132600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:13:10,702-Speed 4445.47 samples/sec Loss 2.3864 Epoch: 7 Global Step: 132650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:13:22,066-Speed 4505.64 samples/sec Loss 2.4008 Epoch: 7 Global Step: 132700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:13:33,637-Speed 4425.16 samples/sec Loss 2.3882 Epoch: 7 Global Step: 132750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:13:45,008-Speed 4502.79 samples/sec Loss 2.3733 Epoch: 7 Global Step: 132800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:13:57,534-Speed 4087.83 samples/sec Loss 2.3939 Epoch: 7 Global Step: 132850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:14:08,986-Speed 4471.04 samples/sec Loss 2.3714 Epoch: 7 Global Step: 132900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:14:20,525-Speed 4437.05 samples/sec Loss 2.3985 Epoch: 7 Global Step: 132950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:14:32,011-Speed 4457.81 samples/sec Loss 2.3859 Epoch: 7 Global Step: 133000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:14:43,453-Speed 4475.07 samples/sec Loss 2.3501 Epoch: 7 Global Step: 133050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:14:54,930-Speed 4461.52 samples/sec Loss 2.3936 Epoch: 7 Global Step: 133100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:15:06,453-Speed 4443.25 samples/sec Loss 2.4335 Epoch: 7 Global Step: 133150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:15:17,989-Speed 4438.87 samples/sec Loss 2.3491 Epoch: 7 Global Step: 133200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:15:29,544-Speed 4431.01 samples/sec Loss 2.3822 Epoch: 7 Global Step: 133250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:15:40,849-Speed 4529.16 samples/sec Loss 2.3534 Epoch: 7 Global Step: 133300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:15:52,498-Speed 4395.37 samples/sec Loss 2.3754 Epoch: 7 Global Step: 133350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:16:03,782-Speed 4537.69 samples/sec Loss 2.3594 Epoch: 7 Global Step: 133400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:16:15,319-Speed 4438.24 samples/sec Loss 2.3737 Epoch: 7 Global Step: 133450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:16:26,792-Speed 4462.95 samples/sec Loss 2.3347 Epoch: 7 Global Step: 133500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:16:51,399-Speed 2080.79 samples/sec Loss 2.2476 Epoch: 8 Global Step: 133550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:17:03,223-Speed 4330.44 samples/sec Loss 2.0462 Epoch: 8 Global Step: 133600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:17:14,974-Speed 4357.78 samples/sec Loss 2.0491 Epoch: 8 Global Step: 133650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:17:28,895-Speed 3677.89 samples/sec Loss 2.0580 Epoch: 8 Global Step: 133700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:17:40,622-Speed 4366.40 samples/sec Loss 2.0413 Epoch: 8 Global Step: 133750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:17:53,239-Speed 4058.34 samples/sec Loss 2.0298 Epoch: 8 Global Step: 133800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:18:04,705-Speed 4465.25 samples/sec Loss 2.0383 Epoch: 8 Global Step: 133850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:18:16,274-Speed 4425.96 samples/sec Loss 2.0452 Epoch: 8 Global Step: 133900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:18:28,695-Speed 4122.30 samples/sec Loss 2.0516 Epoch: 8 Global Step: 133950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:18:39,883-Speed 4576.46 samples/sec Loss 2.0508 Epoch: 8 Global Step: 134000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:19:10,103-[lfw][134000]XNorm: 22.636900 Training: 2021-03-15 08:19:10,103-[lfw][134000]Accuracy-Flip: 0.99767+-0.00260 Training: 2021-03-15 08:19:10,104-[lfw][134000]Accuracy-Highest: 0.99783 Training: 2021-03-15 08:19:45,430-[cfp_fp][134000]XNorm: 20.093986 Training: 2021-03-15 08:19:45,430-[cfp_fp][134000]Accuracy-Flip: 0.97986+-0.00544 Training: 2021-03-15 08:19:45,430-[cfp_fp][134000]Accuracy-Highest: 0.97986 Training: 2021-03-15 08:20:15,853-[agedb_30][134000]XNorm: 22.410643 Training: 2021-03-15 08:20:15,853-[agedb_30][134000]Accuracy-Flip: 0.97567+-0.00863 Training: 2021-03-15 08:20:15,853-[agedb_30][134000]Accuracy-Highest: 0.97750 Training: 2021-03-15 08:20:27,262-Speed 476.82 samples/sec Loss 2.0183 Epoch: 8 Global Step: 134050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:20:38,553-Speed 4534.94 samples/sec Loss 2.0345 Epoch: 8 Global Step: 134100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:20:49,990-Speed 4476.75 samples/sec Loss 2.0751 Epoch: 8 Global Step: 134150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:21:01,410-Speed 4483.74 samples/sec Loss 2.0681 Epoch: 8 Global Step: 134200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:21:13,001-Speed 4417.38 samples/sec Loss 2.0602 Epoch: 8 Global Step: 134250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:21:24,567-Speed 4426.89 samples/sec Loss 2.0558 Epoch: 8 Global Step: 134300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:21:36,039-Speed 4463.46 samples/sec Loss 2.0626 Epoch: 8 Global Step: 134350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:21:47,424-Speed 4497.38 samples/sec Loss 2.0228 Epoch: 8 Global Step: 134400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:21:58,967-Speed 4435.56 samples/sec Loss 2.0359 Epoch: 8 Global Step: 134450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:22:10,354-Speed 4496.52 samples/sec Loss 2.0653 Epoch: 8 Global Step: 134500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:22:23,466-Speed 3905.33 samples/sec Loss 2.0692 Epoch: 8 Global Step: 134550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:22:34,739-Speed 4541.89 samples/sec Loss 2.0573 Epoch: 8 Global Step: 134600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:22:46,016-Speed 4540.38 samples/sec Loss 2.0754 Epoch: 8 Global Step: 134650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:22:57,495-Speed 4460.69 samples/sec Loss 2.0766 Epoch: 8 Global Step: 134700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:23:08,972-Speed 4461.41 samples/sec Loss 2.0380 Epoch: 8 Global Step: 134750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:23:20,233-Speed 4547.02 samples/sec Loss 2.0852 Epoch: 8 Global Step: 134800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:23:31,707-Speed 4462.45 samples/sec Loss 2.0775 Epoch: 8 Global Step: 134850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:23:43,290-Speed 4420.44 samples/sec Loss 2.0586 Epoch: 8 Global Step: 134900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:23:54,640-Speed 4511.09 samples/sec Loss 2.0610 Epoch: 8 Global Step: 134950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:24:06,119-Speed 4460.50 samples/sec Loss 2.0751 Epoch: 8 Global Step: 135000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:24:18,329-Speed 4193.67 samples/sec Loss 2.0393 Epoch: 8 Global Step: 135050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:24:29,613-Speed 4537.32 samples/sec Loss 2.0576 Epoch: 8 Global Step: 135100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:24:40,959-Speed 4512.96 samples/sec Loss 2.0767 Epoch: 8 Global Step: 135150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:24:52,346-Speed 4496.58 samples/sec Loss 2.1043 Epoch: 8 Global Step: 135200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:25:03,581-Speed 4557.57 samples/sec Loss 2.0948 Epoch: 8 Global Step: 135250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:25:15,144-Speed 4427.90 samples/sec Loss 2.0957 Epoch: 8 Global Step: 135300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:25:26,545-Speed 4491.17 samples/sec Loss 2.0752 Epoch: 8 Global Step: 135350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:25:38,805-Speed 4176.48 samples/sec Loss 2.0732 Epoch: 8 Global Step: 135400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:25:50,003-Speed 4572.51 samples/sec Loss 2.0723 Epoch: 8 Global Step: 135450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:26:01,317-Speed 4525.46 samples/sec Loss 2.0870 Epoch: 8 Global Step: 135500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:26:12,668-Speed 4510.89 samples/sec Loss 2.0570 Epoch: 8 Global Step: 135550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:26:24,071-Speed 4490.57 samples/sec Loss 2.0804 Epoch: 8 Global Step: 135600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-15 08:26:35,366-Speed 4533.24 samples/sec Loss 2.0913 Epoch: 8 Global Step: 135650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:26:46,822-Speed 4469.57 samples/sec Loss 2.1096 Epoch: 8 Global Step: 135700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:26:58,247-Speed 4481.40 samples/sec Loss 2.0816 Epoch: 8 Global Step: 135750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:27:09,712-Speed 4466.09 samples/sec Loss 2.0582 Epoch: 8 Global Step: 135800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:27:21,189-Speed 4461.30 samples/sec Loss 2.0849 Epoch: 8 Global Step: 135850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:27:32,645-Speed 4469.56 samples/sec Loss 2.0670 Epoch: 8 Global Step: 135900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:27:44,156-Speed 4447.85 samples/sec Loss 2.1014 Epoch: 8 Global Step: 135950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:27:55,687-Speed 4440.50 samples/sec Loss 2.0974 Epoch: 8 Global Step: 136000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:28:25,909-[lfw][136000]XNorm: 23.197267 Training: 2021-03-15 08:28:25,909-[lfw][136000]Accuracy-Flip: 0.99783+-0.00269 Training: 2021-03-15 08:28:25,909-[lfw][136000]Accuracy-Highest: 0.99783 Training: 2021-03-15 08:29:01,110-[cfp_fp][136000]XNorm: 20.698491 Training: 2021-03-15 08:29:01,110-[cfp_fp][136000]Accuracy-Flip: 0.97957+-0.00443 Training: 2021-03-15 08:29:01,110-[cfp_fp][136000]Accuracy-Highest: 0.97986 Training: 2021-03-15 08:29:31,384-[agedb_30][136000]XNorm: 22.938496 Training: 2021-03-15 08:29:31,384-[agedb_30][136000]Accuracy-Flip: 0.97350+-0.00724 Training: 2021-03-15 08:29:31,384-[agedb_30][136000]Accuracy-Highest: 0.97750 Training: 2021-03-15 08:29:42,801-Speed 478.00 samples/sec Loss 2.0824 Epoch: 8 Global Step: 136050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:29:54,040-Speed 4555.90 samples/sec Loss 2.0856 Epoch: 8 Global Step: 136100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:30:05,438-Speed 4492.46 samples/sec Loss 2.0937 Epoch: 8 Global Step: 136150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:30:16,882-Speed 4474.18 samples/sec Loss 2.0563 Epoch: 8 Global Step: 136200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:30:28,176-Speed 4533.41 samples/sec Loss 2.0813 Epoch: 8 Global Step: 136250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:30:40,802-Speed 4055.43 samples/sec Loss 2.0800 Epoch: 8 Global Step: 136300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:30:52,826-Speed 4258.13 samples/sec Loss 2.0466 Epoch: 8 Global Step: 136350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:31:05,014-Speed 4201.29 samples/sec Loss 2.0419 Epoch: 8 Global Step: 136400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:31:16,205-Speed 4575.30 samples/sec Loss 2.0685 Epoch: 8 Global Step: 136450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:31:27,498-Speed 4534.05 samples/sec Loss 2.0740 Epoch: 8 Global Step: 136500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:31:38,886-Speed 4496.27 samples/sec Loss 2.0647 Epoch: 8 Global Step: 136550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:31:51,082-Speed 4198.27 samples/sec Loss 2.0847 Epoch: 8 Global Step: 136600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:32:02,519-Speed 4476.88 samples/sec Loss 2.1079 Epoch: 8 Global Step: 136650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:32:14,080-Speed 4428.79 samples/sec Loss 2.0969 Epoch: 8 Global Step: 136700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:32:25,473-Speed 4494.12 samples/sec Loss 2.0683 Epoch: 8 Global Step: 136750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:32:36,860-Speed 4496.78 samples/sec Loss 2.0662 Epoch: 8 Global Step: 136800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:32:48,310-Speed 4471.87 samples/sec Loss 2.0918 Epoch: 8 Global Step: 136850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:32:59,772-Speed 4467.24 samples/sec Loss 2.0832 Epoch: 8 Global Step: 136900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:33:11,276-Speed 4450.95 samples/sec Loss 2.1065 Epoch: 8 Global Step: 136950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:33:22,739-Speed 4466.54 samples/sec Loss 2.1008 Epoch: 8 Global Step: 137000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:33:34,477-Speed 4361.98 samples/sec Loss 2.1244 Epoch: 8 Global Step: 137050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:33:46,755-Speed 4170.45 samples/sec Loss 2.0905 Epoch: 8 Global Step: 137100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:33:59,104-Speed 4146.17 samples/sec Loss 2.1180 Epoch: 8 Global Step: 137150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:34:10,449-Speed 4513.27 samples/sec Loss 2.0950 Epoch: 8 Global Step: 137200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:34:21,890-Speed 4475.42 samples/sec Loss 2.1131 Epoch: 8 Global Step: 137250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:34:33,312-Speed 4482.84 samples/sec Loss 2.0870 Epoch: 8 Global Step: 137300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:34:44,870-Speed 4429.84 samples/sec Loss 2.1053 Epoch: 8 Global Step: 137350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:34:56,437-Speed 4426.63 samples/sec Loss 2.0938 Epoch: 8 Global Step: 137400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:35:07,925-Speed 4457.12 samples/sec Loss 2.0837 Epoch: 8 Global Step: 137450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:35:19,554-Speed 4402.86 samples/sec Loss 2.0866 Epoch: 8 Global Step: 137500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:35:31,271-Speed 4370.17 samples/sec Loss 2.0858 Epoch: 8 Global Step: 137550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:35:42,742-Speed 4463.60 samples/sec Loss 2.0936 Epoch: 8 Global Step: 137600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:35:55,079-Speed 4150.42 samples/sec Loss 2.0632 Epoch: 8 Global Step: 137650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:36:06,376-Speed 4532.56 samples/sec Loss 2.0796 Epoch: 8 Global Step: 137700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:36:17,762-Speed 4496.64 samples/sec Loss 2.0813 Epoch: 8 Global Step: 137750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:36:29,136-Speed 4502.03 samples/sec Loss 2.0671 Epoch: 8 Global Step: 137800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:36:40,556-Speed 4483.45 samples/sec Loss 2.0749 Epoch: 8 Global Step: 137850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:36:51,957-Speed 4491.13 samples/sec Loss 2.0858 Epoch: 8 Global Step: 137900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:37:03,393-Speed 4477.40 samples/sec Loss 2.0983 Epoch: 8 Global Step: 137950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:37:15,792-Speed 4129.54 samples/sec Loss 2.0949 Epoch: 8 Global Step: 138000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:37:46,025-[lfw][138000]XNorm: 21.939230 Training: 2021-03-15 08:37:46,025-[lfw][138000]Accuracy-Flip: 0.99700+-0.00267 Training: 2021-03-15 08:37:46,026-[lfw][138000]Accuracy-Highest: 0.99783 Training: 2021-03-15 08:38:21,178-[cfp_fp][138000]XNorm: 19.572365 Training: 2021-03-15 08:38:21,179-[cfp_fp][138000]Accuracy-Flip: 0.98071+-0.00610 Training: 2021-03-15 08:38:21,179-[cfp_fp][138000]Accuracy-Highest: 0.98071 Training: 2021-03-15 08:38:51,520-[agedb_30][138000]XNorm: 21.558992 Training: 2021-03-15 08:38:51,520-[agedb_30][138000]Accuracy-Flip: 0.97667+-0.00810 Training: 2021-03-15 08:38:51,520-[agedb_30][138000]Accuracy-Highest: 0.97750 Training: 2021-03-15 08:39:02,899-Speed 478.03 samples/sec Loss 2.1069 Epoch: 8 Global Step: 138050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:39:14,272-Speed 4502.20 samples/sec Loss 2.0802 Epoch: 8 Global Step: 138100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:39:25,669-Speed 4492.54 samples/sec Loss 2.0821 Epoch: 8 Global Step: 138150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:39:37,096-Speed 4481.02 samples/sec Loss 2.0817 Epoch: 8 Global Step: 138200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:39:48,589-Speed 4454.91 samples/sec Loss 2.0936 Epoch: 8 Global Step: 138250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:39:59,940-Speed 4510.94 samples/sec Loss 2.0812 Epoch: 8 Global Step: 138300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:40:11,734-Speed 4341.29 samples/sec Loss 2.1406 Epoch: 8 Global Step: 138350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:40:23,271-Speed 4438.32 samples/sec Loss 2.0912 Epoch: 8 Global Step: 138400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:40:34,871-Speed 4413.80 samples/sec Loss 2.0777 Epoch: 8 Global Step: 138450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:40:46,489-Speed 4407.44 samples/sec Loss 2.0664 Epoch: 8 Global Step: 138500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:40:57,959-Speed 4463.70 samples/sec Loss 2.1041 Epoch: 8 Global Step: 138550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:41:09,518-Speed 4430.01 samples/sec Loss 2.0977 Epoch: 8 Global Step: 138600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:41:21,034-Speed 4446.12 samples/sec Loss 2.0454 Epoch: 8 Global Step: 138650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:41:32,308-Speed 4541.60 samples/sec Loss 2.1003 Epoch: 8 Global Step: 138700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:41:43,801-Speed 4455.22 samples/sec Loss 2.0889 Epoch: 8 Global Step: 138750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:41:55,119-Speed 4523.67 samples/sec Loss 2.0754 Epoch: 8 Global Step: 138800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:42:07,802-Speed 4037.10 samples/sec Loss 2.1006 Epoch: 8 Global Step: 138850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:42:19,159-Speed 4508.50 samples/sec Loss 2.0626 Epoch: 8 Global Step: 138900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:42:31,601-Speed 4115.44 samples/sec Loss 2.0516 Epoch: 8 Global Step: 138950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:42:43,920-Speed 4156.18 samples/sec Loss 2.0735 Epoch: 8 Global Step: 139000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:42:55,306-Speed 4497.04 samples/sec Loss 2.0931 Epoch: 8 Global Step: 139050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:43:06,875-Speed 4425.86 samples/sec Loss 2.0815 Epoch: 8 Global Step: 139100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:43:19,212-Speed 4150.41 samples/sec Loss 2.0823 Epoch: 8 Global Step: 139150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:43:30,792-Speed 4421.55 samples/sec Loss 2.0583 Epoch: 8 Global Step: 139200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:43:42,251-Speed 4468.36 samples/sec Loss 2.0914 Epoch: 8 Global Step: 139250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:43:53,705-Speed 4470.32 samples/sec Loss 2.1001 Epoch: 8 Global Step: 139300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:44:05,184-Speed 4460.47 samples/sec Loss 2.0719 Epoch: 8 Global Step: 139350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:44:16,564-Speed 4499.19 samples/sec Loss 2.0671 Epoch: 8 Global Step: 139400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:44:28,094-Speed 4441.09 samples/sec Loss 2.0657 Epoch: 8 Global Step: 139450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:44:39,542-Speed 4472.40 samples/sec Loss 2.1040 Epoch: 8 Global Step: 139500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:44:50,861-Speed 4523.76 samples/sec Loss 2.0795 Epoch: 8 Global Step: 139550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:45:02,485-Speed 4404.89 samples/sec Loss 2.1078 Epoch: 8 Global Step: 139600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:45:13,914-Speed 4480.11 samples/sec Loss 2.0742 Epoch: 8 Global Step: 139650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:45:26,151-Speed 4184.40 samples/sec Loss 2.0600 Epoch: 8 Global Step: 139700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:45:37,591-Speed 4475.52 samples/sec Loss 2.0630 Epoch: 8 Global Step: 139750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:45:50,141-Speed 4080.06 samples/sec Loss 2.0755 Epoch: 8 Global Step: 139800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:46:01,497-Speed 4508.84 samples/sec Loss 2.0666 Epoch: 8 Global Step: 139850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:46:13,071-Speed 4423.71 samples/sec Loss 2.0817 Epoch: 8 Global Step: 139900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:46:24,611-Speed 4437.16 samples/sec Loss 2.0694 Epoch: 8 Global Step: 139950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:46:36,140-Speed 4441.15 samples/sec Loss 2.0974 Epoch: 8 Global Step: 140000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:47:06,454-[lfw][140000]XNorm: 22.568684 Training: 2021-03-15 08:47:06,454-[lfw][140000]Accuracy-Flip: 0.99733+-0.00291 Training: 2021-03-15 08:47:06,454-[lfw][140000]Accuracy-Highest: 0.99783 Training: 2021-03-15 08:47:41,737-[cfp_fp][140000]XNorm: 20.209167 Training: 2021-03-15 08:47:41,737-[cfp_fp][140000]Accuracy-Flip: 0.97700+-0.00580 Training: 2021-03-15 08:47:41,737-[cfp_fp][140000]Accuracy-Highest: 0.98071 Training: 2021-03-15 08:48:12,099-[agedb_30][140000]XNorm: 21.942189 Training: 2021-03-15 08:48:12,099-[agedb_30][140000]Accuracy-Flip: 0.97750+-0.00743 Training: 2021-03-15 08:48:12,099-[agedb_30][140000]Accuracy-Highest: 0.97750 Training: 2021-03-15 08:48:23,435-Speed 477.19 samples/sec Loss 2.0880 Epoch: 8 Global Step: 140050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:48:34,785-Speed 4511.22 samples/sec Loss 2.0813 Epoch: 8 Global Step: 140100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:48:46,280-Speed 4454.45 samples/sec Loss 2.0952 Epoch: 8 Global Step: 140150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:48:58,630-Speed 4145.81 samples/sec Loss 2.1057 Epoch: 8 Global Step: 140200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:49:10,236-Speed 4411.99 samples/sec Loss 2.1204 Epoch: 8 Global Step: 140250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:49:21,818-Speed 4420.80 samples/sec Loss 2.1008 Epoch: 8 Global Step: 140300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:49:33,302-Speed 4458.71 samples/sec Loss 2.1112 Epoch: 8 Global Step: 140350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:49:44,844-Speed 4436.14 samples/sec Loss 2.0924 Epoch: 8 Global Step: 140400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:49:56,255-Speed 4487.03 samples/sec Loss 2.1224 Epoch: 8 Global Step: 140450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:50:07,698-Speed 4474.66 samples/sec Loss 2.0758 Epoch: 8 Global Step: 140500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:50:19,940-Speed 4182.50 samples/sec Loss 2.0938 Epoch: 8 Global Step: 140550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:50:31,496-Speed 4430.61 samples/sec Loss 2.0896 Epoch: 8 Global Step: 140600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:50:43,021-Speed 4442.99 samples/sec Loss 2.1109 Epoch: 8 Global Step: 140650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:50:54,488-Speed 4465.09 samples/sec Loss 2.0599 Epoch: 8 Global Step: 140700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:51:05,867-Speed 4499.84 samples/sec Loss 2.0717 Epoch: 8 Global Step: 140750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:51:17,273-Speed 4488.87 samples/sec Loss 2.0918 Epoch: 8 Global Step: 140800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:51:28,693-Speed 4483.87 samples/sec Loss 2.0552 Epoch: 8 Global Step: 140850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:51:40,130-Speed 4476.63 samples/sec Loss 2.0937 Epoch: 8 Global Step: 140900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:51:51,367-Speed 4556.88 samples/sec Loss 2.0517 Epoch: 8 Global Step: 140950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:52:02,944-Speed 4422.69 samples/sec Loss 2.0739 Epoch: 8 Global Step: 141000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:52:14,395-Speed 4471.26 samples/sec Loss 2.0851 Epoch: 8 Global Step: 141050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:52:25,828-Speed 4478.52 samples/sec Loss 2.0390 Epoch: 8 Global Step: 141100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:52:37,406-Speed 4422.49 samples/sec Loss 2.0776 Epoch: 8 Global Step: 141150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:52:49,220-Speed 4334.03 samples/sec Loss 2.0873 Epoch: 8 Global Step: 141200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:53:01,107-Speed 4307.41 samples/sec Loss 2.0839 Epoch: 8 Global Step: 141250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:53:12,587-Speed 4460.08 samples/sec Loss 2.0793 Epoch: 8 Global Step: 141300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:53:24,085-Speed 4453.10 samples/sec Loss 2.0805 Epoch: 8 Global Step: 141350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:53:35,460-Speed 4501.29 samples/sec Loss 2.0624 Epoch: 8 Global Step: 141400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:53:47,636-Speed 4205.17 samples/sec Loss 2.0612 Epoch: 8 Global Step: 141450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:53:59,767-Speed 4220.93 samples/sec Loss 2.0590 Epoch: 8 Global Step: 141500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:54:11,190-Speed 4482.57 samples/sec Loss 2.0888 Epoch: 8 Global Step: 141550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:54:23,527-Speed 4150.23 samples/sec Loss 2.1148 Epoch: 8 Global Step: 141600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:54:35,090-Speed 4428.22 samples/sec Loss 2.0708 Epoch: 8 Global Step: 141650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:54:46,589-Speed 4452.62 samples/sec Loss 2.0620 Epoch: 8 Global Step: 141700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:54:58,869-Speed 4169.81 samples/sec Loss 2.0675 Epoch: 8 Global Step: 141750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:55:10,428-Speed 4429.42 samples/sec Loss 2.0780 Epoch: 8 Global Step: 141800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:55:21,805-Speed 4500.63 samples/sec Loss 2.0760 Epoch: 8 Global Step: 141850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:55:33,330-Speed 4442.48 samples/sec Loss 2.0908 Epoch: 8 Global Step: 141900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:55:44,857-Speed 4442.26 samples/sec Loss 2.1048 Epoch: 8 Global Step: 141950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:55:56,448-Speed 4417.38 samples/sec Loss 2.0873 Epoch: 8 Global Step: 142000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:56:26,781-[lfw][142000]XNorm: 22.717417 Training: 2021-03-15 08:56:26,782-[lfw][142000]Accuracy-Flip: 0.99733+-0.00271 Training: 2021-03-15 08:56:26,782-[lfw][142000]Accuracy-Highest: 0.99783 Training: 2021-03-15 08:57:02,269-[cfp_fp][142000]XNorm: 20.716081 Training: 2021-03-15 08:57:02,269-[cfp_fp][142000]Accuracy-Flip: 0.97700+-0.00573 Training: 2021-03-15 08:57:02,269-[cfp_fp][142000]Accuracy-Highest: 0.98071 Training: 2021-03-15 08:57:32,601-[agedb_30][142000]XNorm: 22.463886 Training: 2021-03-15 08:57:32,601-[agedb_30][142000]Accuracy-Flip: 0.97450+-0.00738 Training: 2021-03-15 08:57:32,601-[agedb_30][142000]Accuracy-Highest: 0.97750 Training: 2021-03-15 08:57:44,042-Speed 475.86 samples/sec Loss 2.0871 Epoch: 8 Global Step: 142050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:57:55,522-Speed 4460.08 samples/sec Loss 2.0934 Epoch: 8 Global Step: 142100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:58:07,073-Speed 4432.80 samples/sec Loss 2.0767 Epoch: 8 Global Step: 142150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:58:18,340-Speed 4544.59 samples/sec Loss 2.0633 Epoch: 8 Global Step: 142200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:58:29,856-Speed 4446.08 samples/sec Loss 2.0744 Epoch: 8 Global Step: 142250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:58:41,517-Speed 4390.97 samples/sec Loss 2.0793 Epoch: 8 Global Step: 142300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:58:53,978-Speed 4109.07 samples/sec Loss 2.0809 Epoch: 8 Global Step: 142350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:59:06,482-Speed 4095.05 samples/sec Loss 2.0651 Epoch: 8 Global Step: 142400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:59:17,876-Speed 4493.81 samples/sec Loss 2.0765 Epoch: 8 Global Step: 142450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:59:29,394-Speed 4445.37 samples/sec Loss 2.0869 Epoch: 8 Global Step: 142500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:59:40,875-Speed 4459.57 samples/sec Loss 2.0851 Epoch: 8 Global Step: 142550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 08:59:52,387-Speed 4447.83 samples/sec Loss 2.0683 Epoch: 8 Global Step: 142600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:00:03,597-Speed 4567.46 samples/sec Loss 2.0729 Epoch: 8 Global Step: 142650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:00:15,841-Speed 4181.79 samples/sec Loss 2.0910 Epoch: 8 Global Step: 142700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:00:27,486-Speed 4397.00 samples/sec Loss 2.0600 Epoch: 8 Global Step: 142750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:00:39,090-Speed 4412.50 samples/sec Loss 2.0638 Epoch: 8 Global Step: 142800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:00:50,801-Speed 4372.39 samples/sec Loss 2.0877 Epoch: 8 Global Step: 142850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:01:02,436-Speed 4400.47 samples/sec Loss 2.0787 Epoch: 8 Global Step: 142900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:01:13,941-Speed 4450.47 samples/sec Loss 2.0835 Epoch: 8 Global Step: 142950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:01:25,328-Speed 4496.68 samples/sec Loss 2.0512 Epoch: 8 Global Step: 143000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:01:36,839-Speed 4448.00 samples/sec Loss 2.0703 Epoch: 8 Global Step: 143050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:01:49,170-Speed 4152.40 samples/sec Loss 2.0881 Epoch: 8 Global Step: 143100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:02:00,578-Speed 4488.40 samples/sec Loss 2.0560 Epoch: 8 Global Step: 143150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:02:12,110-Speed 4440.23 samples/sec Loss 2.0431 Epoch: 8 Global Step: 143200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:02:23,664-Speed 4431.75 samples/sec Loss 2.1059 Epoch: 8 Global Step: 143250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:02:35,320-Speed 4392.54 samples/sec Loss 2.0603 Epoch: 8 Global Step: 143300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:02:46,814-Speed 4454.91 samples/sec Loss 2.0815 Epoch: 8 Global Step: 143350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:02:58,342-Speed 4441.56 samples/sec Loss 2.0718 Epoch: 8 Global Step: 143400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:03:10,061-Speed 4369.24 samples/sec Loss 2.0734 Epoch: 8 Global Step: 143450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:03:21,455-Speed 4493.83 samples/sec Loss 2.0805 Epoch: 8 Global Step: 143500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:03:33,259-Speed 4337.88 samples/sec Loss 2.0870 Epoch: 8 Global Step: 143550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:03:44,744-Speed 4457.95 samples/sec Loss 2.0760 Epoch: 8 Global Step: 143600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:03:56,034-Speed 4535.41 samples/sec Loss 2.0746 Epoch: 8 Global Step: 143650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:04:07,605-Speed 4424.90 samples/sec Loss 2.0918 Epoch: 8 Global Step: 143700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:04:19,208-Speed 4412.73 samples/sec Loss 2.0991 Epoch: 8 Global Step: 143750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:04:30,781-Speed 4424.40 samples/sec Loss 2.0637 Epoch: 8 Global Step: 143800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:04:42,309-Speed 4441.58 samples/sec Loss 2.0561 Epoch: 8 Global Step: 143850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:04:53,824-Speed 4446.49 samples/sec Loss 2.0490 Epoch: 8 Global Step: 143900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:05:05,277-Speed 4470.69 samples/sec Loss 2.0468 Epoch: 8 Global Step: 143950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:05:16,749-Speed 4463.34 samples/sec Loss 2.0675 Epoch: 8 Global Step: 144000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:05:46,890-[lfw][144000]XNorm: 21.662423 Training: 2021-03-15 09:05:46,890-[lfw][144000]Accuracy-Flip: 0.99750+-0.00291 Training: 2021-03-15 09:05:46,890-[lfw][144000]Accuracy-Highest: 0.99783 Training: 2021-03-15 09:06:21,858-[cfp_fp][144000]XNorm: 19.268890 Training: 2021-03-15 09:06:21,859-[cfp_fp][144000]Accuracy-Flip: 0.97900+-0.00610 Training: 2021-03-15 09:06:21,859-[cfp_fp][144000]Accuracy-Highest: 0.98071 Training: 2021-03-15 09:06:52,017-[agedb_30][144000]XNorm: 21.357628 Training: 2021-03-15 09:06:52,018-[agedb_30][144000]Accuracy-Flip: 0.97917+-0.00597 Training: 2021-03-15 09:06:52,018-[agedb_30][144000]Accuracy-Highest: 0.97917 Training: 2021-03-15 09:07:04,454-Speed 475.37 samples/sec Loss 2.0756 Epoch: 8 Global Step: 144050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:07:15,813-Speed 4507.76 samples/sec Loss 2.0622 Epoch: 8 Global Step: 144100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:07:28,181-Speed 4139.87 samples/sec Loss 2.0801 Epoch: 8 Global Step: 144150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:07:40,757-Speed 4071.53 samples/sec Loss 2.0991 Epoch: 8 Global Step: 144200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:07:52,225-Speed 4464.74 samples/sec Loss 2.0391 Epoch: 8 Global Step: 144250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:08:04,107-Speed 4309.20 samples/sec Loss 2.0708 Epoch: 8 Global Step: 144300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:08:16,422-Speed 4157.54 samples/sec Loss 2.0869 Epoch: 8 Global Step: 144350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:08:27,954-Speed 4440.04 samples/sec Loss 2.0703 Epoch: 8 Global Step: 144400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:08:39,188-Speed 4557.96 samples/sec Loss 2.0736 Epoch: 8 Global Step: 144450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:08:50,549-Speed 4506.86 samples/sec Loss 2.0801 Epoch: 8 Global Step: 144500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:09:02,006-Speed 4468.94 samples/sec Loss 2.0840 Epoch: 8 Global Step: 144550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:09:13,455-Speed 4472.39 samples/sec Loss 2.0560 Epoch: 8 Global Step: 144600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:09:25,136-Speed 4383.35 samples/sec Loss 2.0775 Epoch: 8 Global Step: 144650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:09:36,715-Speed 4421.86 samples/sec Loss 2.0683 Epoch: 8 Global Step: 144700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:09:48,288-Speed 4424.45 samples/sec Loss 2.0535 Epoch: 8 Global Step: 144750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:09:59,784-Speed 4454.00 samples/sec Loss 2.0933 Epoch: 8 Global Step: 144800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:10:11,429-Speed 4396.98 samples/sec Loss 2.0647 Epoch: 8 Global Step: 144850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:10:23,940-Speed 4092.45 samples/sec Loss 2.0942 Epoch: 8 Global Step: 144900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:10:35,564-Speed 4405.02 samples/sec Loss 2.0684 Epoch: 8 Global Step: 144950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:10:47,755-Speed 4199.76 samples/sec Loss 2.0427 Epoch: 8 Global Step: 145000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:10:59,213-Speed 4468.87 samples/sec Loss 2.0595 Epoch: 8 Global Step: 145050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:11:10,862-Speed 4395.57 samples/sec Loss 2.0592 Epoch: 8 Global Step: 145100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:11:22,258-Speed 4492.84 samples/sec Loss 2.0641 Epoch: 8 Global Step: 145150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:11:34,565-Speed 4160.48 samples/sec Loss 2.0422 Epoch: 8 Global Step: 145200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:11:46,026-Speed 4467.67 samples/sec Loss 2.0896 Epoch: 8 Global Step: 145250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:11:57,518-Speed 4455.32 samples/sec Loss 2.0675 Epoch: 8 Global Step: 145300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:12:08,992-Speed 4462.58 samples/sec Loss 2.0578 Epoch: 8 Global Step: 145350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:12:20,506-Speed 4447.01 samples/sec Loss 2.0566 Epoch: 8 Global Step: 145400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:12:32,158-Speed 4394.03 samples/sec Loss 2.0669 Epoch: 8 Global Step: 145450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:12:43,644-Speed 4457.87 samples/sec Loss 2.0467 Epoch: 8 Global Step: 145500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:12:55,324-Speed 4383.88 samples/sec Loss 2.0730 Epoch: 8 Global Step: 145550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:13:06,837-Speed 4447.51 samples/sec Loss 2.0687 Epoch: 8 Global Step: 145600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:13:19,141-Speed 4161.21 samples/sec Loss 2.0622 Epoch: 8 Global Step: 145650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:13:30,753-Speed 4409.40 samples/sec Loss 2.0818 Epoch: 8 Global Step: 145700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:13:42,149-Speed 4493.26 samples/sec Loss 2.0306 Epoch: 8 Global Step: 145750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:13:53,762-Speed 4408.96 samples/sec Loss 2.0491 Epoch: 8 Global Step: 145800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:14:05,489-Speed 4366.33 samples/sec Loss 2.0611 Epoch: 8 Global Step: 145850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:14:16,924-Speed 4477.37 samples/sec Loss 2.0726 Epoch: 8 Global Step: 145900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:14:28,290-Speed 4504.88 samples/sec Loss 2.0732 Epoch: 8 Global Step: 145950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:14:39,691-Speed 4491.13 samples/sec Loss 2.0707 Epoch: 8 Global Step: 146000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:15:10,096-[lfw][146000]XNorm: 23.160083 Training: 2021-03-15 09:15:10,096-[lfw][146000]Accuracy-Flip: 0.99717+-0.00308 Training: 2021-03-15 09:15:10,096-[lfw][146000]Accuracy-Highest: 0.99783 Training: 2021-03-15 09:15:45,323-[cfp_fp][146000]XNorm: 21.124731 Training: 2021-03-15 09:15:45,324-[cfp_fp][146000]Accuracy-Flip: 0.97829+-0.00717 Training: 2021-03-15 09:15:45,324-[cfp_fp][146000]Accuracy-Highest: 0.98071 Training: 2021-03-15 09:16:15,694-[agedb_30][146000]XNorm: 22.688190 Training: 2021-03-15 09:16:15,694-[agedb_30][146000]Accuracy-Flip: 0.97533+-0.00795 Training: 2021-03-15 09:16:15,694-[agedb_30][146000]Accuracy-Highest: 0.97917 Training: 2021-03-15 09:16:27,175-Speed 476.35 samples/sec Loss 2.1269 Epoch: 8 Global Step: 146050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:16:38,908-Speed 4364.02 samples/sec Loss 2.0628 Epoch: 8 Global Step: 146100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:16:50,335-Speed 4481.01 samples/sec Loss 2.0561 Epoch: 8 Global Step: 146150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:17:02,122-Speed 4344.00 samples/sec Loss 2.0801 Epoch: 8 Global Step: 146200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:17:13,707-Speed 4419.85 samples/sec Loss 2.0766 Epoch: 8 Global Step: 146250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:17:25,257-Speed 4433.21 samples/sec Loss 2.0767 Epoch: 8 Global Step: 146300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:17:37,017-Speed 4353.87 samples/sec Loss 2.0479 Epoch: 8 Global Step: 146350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:17:48,607-Speed 4417.93 samples/sec Loss 2.0541 Epoch: 8 Global Step: 146400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:18:00,346-Speed 4361.67 samples/sec Loss 2.0880 Epoch: 8 Global Step: 146450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:18:11,946-Speed 4414.01 samples/sec Loss 2.0449 Epoch: 8 Global Step: 146500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:18:23,436-Speed 4456.18 samples/sec Loss 2.0503 Epoch: 8 Global Step: 146550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:18:35,905-Speed 4106.64 samples/sec Loss 2.0744 Epoch: 8 Global Step: 146600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:18:48,560-Speed 4045.94 samples/sec Loss 2.0693 Epoch: 8 Global Step: 146650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:19:00,209-Speed 4395.46 samples/sec Loss 2.0683 Epoch: 8 Global Step: 146700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:19:11,883-Speed 4385.74 samples/sec Loss 2.0682 Epoch: 8 Global Step: 146750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:19:24,421-Speed 4083.94 samples/sec Loss 2.0524 Epoch: 8 Global Step: 146800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:19:36,074-Speed 4393.94 samples/sec Loss 2.0441 Epoch: 8 Global Step: 146850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:19:47,413-Speed 4515.71 samples/sec Loss 2.0548 Epoch: 8 Global Step: 146900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:19:59,690-Speed 4170.36 samples/sec Loss 2.0736 Epoch: 8 Global Step: 146950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:20:11,132-Speed 4475.11 samples/sec Loss 2.0647 Epoch: 8 Global Step: 147000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:20:22,735-Speed 4412.78 samples/sec Loss 2.0510 Epoch: 8 Global Step: 147050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:20:34,238-Speed 4451.33 samples/sec Loss 2.0610 Epoch: 8 Global Step: 147100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:20:45,725-Speed 4457.29 samples/sec Loss 2.0640 Epoch: 8 Global Step: 147150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:20:57,240-Speed 4446.43 samples/sec Loss 2.0586 Epoch: 8 Global Step: 147200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:21:08,683-Speed 4474.90 samples/sec Loss 2.0908 Epoch: 8 Global Step: 147250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:21:20,129-Speed 4473.22 samples/sec Loss 2.0800 Epoch: 8 Global Step: 147300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:21:31,699-Speed 4425.58 samples/sec Loss 2.0644 Epoch: 8 Global Step: 147350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:21:44,364-Speed 4042.77 samples/sec Loss 2.0427 Epoch: 8 Global Step: 147400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:21:56,091-Speed 4366.17 samples/sec Loss 2.0800 Epoch: 8 Global Step: 147450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:22:07,614-Speed 4443.53 samples/sec Loss 2.0492 Epoch: 8 Global Step: 147500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:22:19,358-Speed 4359.76 samples/sec Loss 2.0591 Epoch: 8 Global Step: 147550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:22:31,790-Speed 4118.67 samples/sec Loss 2.0778 Epoch: 8 Global Step: 147600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:22:43,408-Speed 4407.08 samples/sec Loss 2.0751 Epoch: 8 Global Step: 147650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:22:54,824-Speed 4485.10 samples/sec Loss 2.0595 Epoch: 8 Global Step: 147700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:23:07,233-Speed 4126.18 samples/sec Loss 2.0441 Epoch: 8 Global Step: 147750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:23:18,922-Speed 4380.35 samples/sec Loss 2.0574 Epoch: 8 Global Step: 147800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:23:30,283-Speed 4507.08 samples/sec Loss 2.0456 Epoch: 8 Global Step: 147850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:23:41,943-Speed 4391.01 samples/sec Loss 2.0392 Epoch: 8 Global Step: 147900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:23:53,508-Speed 4427.37 samples/sec Loss 2.0533 Epoch: 8 Global Step: 147950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:24:04,976-Speed 4464.87 samples/sec Loss 2.0452 Epoch: 8 Global Step: 148000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:24:35,300-[lfw][148000]XNorm: 22.783906 Training: 2021-03-15 09:24:35,300-[lfw][148000]Accuracy-Flip: 0.99750+-0.00261 Training: 2021-03-15 09:24:35,300-[lfw][148000]Accuracy-Highest: 0.99783 Training: 2021-03-15 09:25:10,453-[cfp_fp][148000]XNorm: 20.396826 Training: 2021-03-15 09:25:10,453-[cfp_fp][148000]Accuracy-Flip: 0.97643+-0.00963 Training: 2021-03-15 09:25:10,453-[cfp_fp][148000]Accuracy-Highest: 0.98071 Training: 2021-03-15 09:25:40,522-[agedb_30][148000]XNorm: 22.527292 Training: 2021-03-15 09:25:40,523-[agedb_30][148000]Accuracy-Flip: 0.97717+-0.00641 Training: 2021-03-15 09:25:40,525-[agedb_30][148000]Accuracy-Highest: 0.97917 Training: 2021-03-15 09:25:52,177-Speed 477.61 samples/sec Loss 2.0354 Epoch: 8 Global Step: 148050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:26:03,910-Speed 4363.92 samples/sec Loss 2.0505 Epoch: 8 Global Step: 148100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:26:15,364-Speed 4470.35 samples/sec Loss 2.0425 Epoch: 8 Global Step: 148150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:26:27,011-Speed 4395.91 samples/sec Loss 2.0634 Epoch: 8 Global Step: 148200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:26:39,534-Speed 4088.60 samples/sec Loss 2.0529 Epoch: 8 Global Step: 148250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:26:50,999-Speed 4466.07 samples/sec Loss 2.0628 Epoch: 8 Global Step: 148300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:27:02,651-Speed 4394.41 samples/sec Loss 2.0524 Epoch: 8 Global Step: 148350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:27:14,343-Speed 4379.21 samples/sec Loss 2.0475 Epoch: 8 Global Step: 148400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:27:26,141-Speed 4340.01 samples/sec Loss 2.0638 Epoch: 8 Global Step: 148450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:27:37,734-Speed 4416.76 samples/sec Loss 2.0554 Epoch: 8 Global Step: 148500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:27:49,404-Speed 4387.54 samples/sec Loss 2.0724 Epoch: 8 Global Step: 148550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-15 09:28:00,820-Speed 4485.09 samples/sec Loss 2.0512 Epoch: 8 Global Step: 148600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:28:12,203-Speed 4498.00 samples/sec Loss 2.0385 Epoch: 8 Global Step: 148650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:28:23,687-Speed 4458.86 samples/sec Loss 2.0566 Epoch: 8 Global Step: 148700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:28:35,533-Speed 4322.41 samples/sec Loss 2.0224 Epoch: 8 Global Step: 148750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:28:47,385-Speed 4320.17 samples/sec Loss 2.0579 Epoch: 8 Global Step: 148800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:28:59,138-Speed 4356.45 samples/sec Loss 2.0786 Epoch: 8 Global Step: 148850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:29:11,004-Speed 4314.98 samples/sec Loss 2.0587 Epoch: 8 Global Step: 148900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:29:22,464-Speed 4467.91 samples/sec Loss 2.0508 Epoch: 8 Global Step: 148950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:29:33,927-Speed 4466.99 samples/sec Loss 2.0246 Epoch: 8 Global Step: 149000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:29:45,404-Speed 4461.14 samples/sec Loss 2.0343 Epoch: 8 Global Step: 149050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:29:56,926-Speed 4444.06 samples/sec Loss 2.0474 Epoch: 8 Global Step: 149100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:30:09,472-Speed 4081.04 samples/sec Loss 2.0608 Epoch: 8 Global Step: 149150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:30:21,832-Speed 4142.50 samples/sec Loss 2.0398 Epoch: 8 Global Step: 149200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:30:33,515-Speed 4382.58 samples/sec Loss 2.0468 Epoch: 8 Global Step: 149250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:30:45,066-Speed 4433.01 samples/sec Loss 2.0404 Epoch: 8 Global Step: 149300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:30:57,541-Speed 4104.13 samples/sec Loss 2.0676 Epoch: 8 Global Step: 149350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:31:09,333-Speed 4342.08 samples/sec Loss 2.0596 Epoch: 8 Global Step: 149400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:31:20,714-Speed 4499.02 samples/sec Loss 2.0446 Epoch: 8 Global Step: 149450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:31:32,155-Speed 4475.58 samples/sec Loss 2.0355 Epoch: 8 Global Step: 149500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:31:44,496-Speed 4149.03 samples/sec Loss 2.0511 Epoch: 8 Global Step: 149550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:31:56,280-Speed 4344.92 samples/sec Loss 2.0258 Epoch: 8 Global Step: 149600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:32:07,822-Speed 4436.18 samples/sec Loss 2.0265 Epoch: 8 Global Step: 149650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:32:19,333-Speed 4448.17 samples/sec Loss 2.0521 Epoch: 8 Global Step: 149700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:32:30,880-Speed 4434.05 samples/sec Loss 2.0640 Epoch: 8 Global Step: 149750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:32:42,402-Speed 4443.88 samples/sec Loss 2.0377 Epoch: 8 Global Step: 149800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:32:54,080-Speed 4384.52 samples/sec Loss 2.0708 Epoch: 8 Global Step: 149850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:33:05,802-Speed 4368.24 samples/sec Loss 2.0554 Epoch: 8 Global Step: 149900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:33:18,246-Speed 4114.63 samples/sec Loss 2.0652 Epoch: 8 Global Step: 149950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:33:29,692-Speed 4473.18 samples/sec Loss 2.0321 Epoch: 8 Global Step: 150000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:34:00,033-[lfw][150000]XNorm: 23.254894 Training: 2021-03-15 09:34:00,033-[lfw][150000]Accuracy-Flip: 0.99783+-0.00289 Training: 2021-03-15 09:34:00,033-[lfw][150000]Accuracy-Highest: 0.99783 Training: 2021-03-15 09:34:35,303-[cfp_fp][150000]XNorm: 20.873925 Training: 2021-03-15 09:34:35,303-[cfp_fp][150000]Accuracy-Flip: 0.97757+-0.00623 Training: 2021-03-15 09:34:35,303-[cfp_fp][150000]Accuracy-Highest: 0.98071 Training: 2021-03-15 09:35:05,580-[agedb_30][150000]XNorm: 22.920157 Training: 2021-03-15 09:35:05,581-[agedb_30][150000]Accuracy-Flip: 0.97550+-0.00850 Training: 2021-03-15 09:35:05,581-[agedb_30][150000]Accuracy-Highest: 0.97917 Training: 2021-03-15 09:35:17,138-Speed 476.53 samples/sec Loss 2.0499 Epoch: 8 Global Step: 150050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:35:28,640-Speed 4451.63 samples/sec Loss 2.0525 Epoch: 8 Global Step: 150100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:35:40,177-Speed 4438.24 samples/sec Loss 2.0289 Epoch: 8 Global Step: 150150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:35:52,626-Speed 4112.73 samples/sec Loss 2.0536 Epoch: 8 Global Step: 150200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:36:17,165-Speed 2086.58 samples/sec Loss 1.8782 Epoch: 9 Global Step: 150250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:36:29,411-Speed 4181.39 samples/sec Loss 1.7527 Epoch: 9 Global Step: 150300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:36:42,546-Speed 3898.17 samples/sec Loss 1.7531 Epoch: 9 Global Step: 150350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:36:54,051-Speed 4450.55 samples/sec Loss 1.7462 Epoch: 9 Global Step: 150400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:37:05,502-Speed 4471.31 samples/sec Loss 1.7533 Epoch: 9 Global Step: 150450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:37:17,025-Speed 4443.68 samples/sec Loss 1.7372 Epoch: 9 Global Step: 150500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:37:28,424-Speed 4492.13 samples/sec Loss 1.7656 Epoch: 9 Global Step: 150550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:37:39,998-Speed 4424.01 samples/sec Loss 1.7313 Epoch: 9 Global Step: 150600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:37:51,428-Speed 4479.62 samples/sec Loss 1.7695 Epoch: 9 Global Step: 150650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:38:02,789-Speed 4506.97 samples/sec Loss 1.7397 Epoch: 9 Global Step: 150700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:38:14,050-Speed 4546.92 samples/sec Loss 1.7825 Epoch: 9 Global Step: 150750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:38:25,446-Speed 4493.08 samples/sec Loss 1.7836 Epoch: 9 Global Step: 150800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:38:37,566-Speed 4224.37 samples/sec Loss 1.7824 Epoch: 9 Global Step: 150850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:38:48,840-Speed 4541.74 samples/sec Loss 1.8007 Epoch: 9 Global Step: 150900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:39:00,118-Speed 4539.80 samples/sec Loss 1.7736 Epoch: 9 Global Step: 150950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:39:11,513-Speed 4493.62 samples/sec Loss 1.7837 Epoch: 9 Global Step: 151000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:39:22,861-Speed 4511.88 samples/sec Loss 1.7663 Epoch: 9 Global Step: 151050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:39:34,296-Speed 4477.60 samples/sec Loss 1.7963 Epoch: 9 Global Step: 151100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:39:45,930-Speed 4401.28 samples/sec Loss 1.7741 Epoch: 9 Global Step: 151150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:39:57,390-Speed 4467.94 samples/sec Loss 1.8012 Epoch: 9 Global Step: 151200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:40:08,736-Speed 4512.65 samples/sec Loss 1.8022 Epoch: 9 Global Step: 151250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:40:20,231-Speed 4454.41 samples/sec Loss 1.7982 Epoch: 9 Global Step: 151300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:40:31,748-Speed 4445.68 samples/sec Loss 1.7705 Epoch: 9 Global Step: 151350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:40:43,242-Speed 4454.68 samples/sec Loss 1.8138 Epoch: 9 Global Step: 151400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:40:54,861-Speed 4407.03 samples/sec Loss 1.7808 Epoch: 9 Global Step: 151450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:41:06,452-Speed 4417.37 samples/sec Loss 1.7763 Epoch: 9 Global Step: 151500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:41:18,117-Speed 4389.53 samples/sec Loss 1.8012 Epoch: 9 Global Step: 151550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:41:29,508-Speed 4495.11 samples/sec Loss 1.7881 Epoch: 9 Global Step: 151600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:41:41,118-Speed 4410.31 samples/sec Loss 1.8039 Epoch: 9 Global Step: 151650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:41:52,390-Speed 4542.39 samples/sec Loss 1.8024 Epoch: 9 Global Step: 151700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:42:04,675-Speed 4167.79 samples/sec Loss 1.7737 Epoch: 9 Global Step: 151750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:42:16,859-Speed 4202.54 samples/sec Loss 1.7969 Epoch: 9 Global Step: 151800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:42:28,462-Speed 4413.04 samples/sec Loss 1.8279 Epoch: 9 Global Step: 151850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:42:40,034-Speed 4424.47 samples/sec Loss 1.7966 Epoch: 9 Global Step: 151900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:42:52,682-Speed 4048.21 samples/sec Loss 1.8048 Epoch: 9 Global Step: 151950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:43:04,176-Speed 4454.80 samples/sec Loss 1.8067 Epoch: 9 Global Step: 152000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:43:34,445-[lfw][152000]XNorm: 22.276440 Training: 2021-03-15 09:43:34,446-[lfw][152000]Accuracy-Flip: 0.99767+-0.00260 Training: 2021-03-15 09:43:34,446-[lfw][152000]Accuracy-Highest: 0.99783 Training: 2021-03-15 09:44:09,587-[cfp_fp][152000]XNorm: 20.195879 Training: 2021-03-15 09:44:09,587-[cfp_fp][152000]Accuracy-Flip: 0.97971+-0.00949 Training: 2021-03-15 09:44:09,587-[cfp_fp][152000]Accuracy-Highest: 0.98071 Training: 2021-03-15 09:44:40,030-[agedb_30][152000]XNorm: 21.960151 Training: 2021-03-15 09:44:40,030-[agedb_30][152000]Accuracy-Flip: 0.97583+-0.00664 Training: 2021-03-15 09:44:40,030-[agedb_30][152000]Accuracy-Highest: 0.97917 Training: 2021-03-15 09:44:51,385-Speed 477.57 samples/sec Loss 1.7812 Epoch: 9 Global Step: 152050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:45:04,004-Speed 4057.77 samples/sec Loss 1.8282 Epoch: 9 Global Step: 152100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:45:15,425-Speed 4482.93 samples/sec Loss 1.8064 Epoch: 9 Global Step: 152150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:45:26,977-Speed 4432.35 samples/sec Loss 1.8389 Epoch: 9 Global Step: 152200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:45:38,433-Speed 4469.42 samples/sec Loss 1.8186 Epoch: 9 Global Step: 152250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:45:50,090-Speed 4392.69 samples/sec Loss 1.8453 Epoch: 9 Global Step: 152300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:46:01,489-Speed 4491.89 samples/sec Loss 1.8042 Epoch: 9 Global Step: 152350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:46:12,929-Speed 4475.40 samples/sec Loss 1.8497 Epoch: 9 Global Step: 152400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:46:24,456-Speed 4442.10 samples/sec Loss 1.8334 Epoch: 9 Global Step: 152450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:46:35,950-Speed 4454.47 samples/sec Loss 1.8321 Epoch: 9 Global Step: 152500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:46:48,218-Speed 4173.77 samples/sec Loss 1.8015 Epoch: 9 Global Step: 152550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:46:59,719-Speed 4451.97 samples/sec Loss 1.8218 Epoch: 9 Global Step: 152600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:47:11,200-Speed 4459.80 samples/sec Loss 1.8202 Epoch: 9 Global Step: 152650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:47:22,569-Speed 4503.85 samples/sec Loss 1.8316 Epoch: 9 Global Step: 152700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:47:34,067-Speed 4453.11 samples/sec Loss 1.8481 Epoch: 9 Global Step: 152750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:47:46,250-Speed 4202.55 samples/sec Loss 1.8454 Epoch: 9 Global Step: 152800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:47:58,830-Speed 4070.16 samples/sec Loss 1.8208 Epoch: 9 Global Step: 152850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:48:10,271-Speed 4475.52 samples/sec Loss 1.8299 Epoch: 9 Global Step: 152900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:48:21,728-Speed 4469.07 samples/sec Loss 1.8172 Epoch: 9 Global Step: 152950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:48:33,147-Speed 4483.74 samples/sec Loss 1.8336 Epoch: 9 Global Step: 153000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:48:44,641-Speed 4454.75 samples/sec Loss 1.8745 Epoch: 9 Global Step: 153050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:48:56,307-Speed 4389.20 samples/sec Loss 1.8329 Epoch: 9 Global Step: 153100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:49:07,876-Speed 4425.74 samples/sec Loss 1.8429 Epoch: 9 Global Step: 153150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:49:19,543-Speed 4388.59 samples/sec Loss 1.8574 Epoch: 9 Global Step: 153200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:49:31,130-Speed 4418.86 samples/sec Loss 1.8733 Epoch: 9 Global Step: 153250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:49:42,736-Speed 4411.75 samples/sec Loss 1.8361 Epoch: 9 Global Step: 153300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:49:54,170-Speed 4478.13 samples/sec Loss 1.8525 Epoch: 9 Global Step: 153350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:50:05,688-Speed 4445.29 samples/sec Loss 1.8530 Epoch: 9 Global Step: 153400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:50:18,004-Speed 4157.50 samples/sec Loss 1.8536 Epoch: 9 Global Step: 153450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:50:29,536-Speed 4440.04 samples/sec Loss 1.8492 Epoch: 9 Global Step: 153500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:50:41,173-Speed 4399.93 samples/sec Loss 1.8454 Epoch: 9 Global Step: 153550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:50:52,542-Speed 4503.63 samples/sec Loss 1.8379 Epoch: 9 Global Step: 153600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:51:03,928-Speed 4497.14 samples/sec Loss 1.8583 Epoch: 9 Global Step: 153650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:51:15,538-Speed 4410.25 samples/sec Loss 1.8582 Epoch: 9 Global Step: 153700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:51:27,302-Speed 4352.44 samples/sec Loss 1.8637 Epoch: 9 Global Step: 153750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:51:38,854-Speed 4432.18 samples/sec Loss 1.8326 Epoch: 9 Global Step: 153800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:51:50,498-Speed 4397.46 samples/sec Loss 1.8444 Epoch: 9 Global Step: 153850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:52:02,099-Speed 4413.60 samples/sec Loss 1.8680 Epoch: 9 Global Step: 153900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:52:13,491-Speed 4494.48 samples/sec Loss 1.8528 Epoch: 9 Global Step: 153950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:52:24,962-Speed 4463.82 samples/sec Loss 1.8291 Epoch: 9 Global Step: 154000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:52:55,254-[lfw][154000]XNorm: 23.081702 Training: 2021-03-15 09:52:55,254-[lfw][154000]Accuracy-Flip: 0.99717+-0.00289 Training: 2021-03-15 09:52:55,254-[lfw][154000]Accuracy-Highest: 0.99783 Training: 2021-03-15 09:53:30,397-[cfp_fp][154000]XNorm: 21.242781 Training: 2021-03-15 09:53:30,397-[cfp_fp][154000]Accuracy-Flip: 0.97643+-0.00788 Training: 2021-03-15 09:53:30,397-[cfp_fp][154000]Accuracy-Highest: 0.98071 Training: 2021-03-15 09:54:00,655-[agedb_30][154000]XNorm: 22.674052 Training: 2021-03-15 09:54:00,655-[agedb_30][154000]Accuracy-Flip: 0.97667+-0.00683 Training: 2021-03-15 09:54:00,655-[agedb_30][154000]Accuracy-Highest: 0.97917 Training: 2021-03-15 09:54:12,144-Speed 477.69 samples/sec Loss 1.8831 Epoch: 9 Global Step: 154050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:54:23,556-Speed 4487.04 samples/sec Loss 1.8401 Epoch: 9 Global Step: 154100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:54:34,870-Speed 4525.63 samples/sec Loss 1.8560 Epoch: 9 Global Step: 154150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:54:46,273-Speed 4490.19 samples/sec Loss 1.8450 Epoch: 9 Global Step: 154200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:54:57,717-Speed 4474.31 samples/sec Loss 1.8498 Epoch: 9 Global Step: 154250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:55:10,109-Speed 4131.77 samples/sec Loss 1.8383 Epoch: 9 Global Step: 154300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:55:21,716-Speed 4411.22 samples/sec Loss 1.8480 Epoch: 9 Global Step: 154350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:55:33,304-Speed 4418.63 samples/sec Loss 1.8539 Epoch: 9 Global Step: 154400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:55:44,889-Speed 4419.59 samples/sec Loss 1.8492 Epoch: 9 Global Step: 154450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:55:56,454-Speed 4427.51 samples/sec Loss 1.8597 Epoch: 9 Global Step: 154500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:56:08,871-Speed 4123.83 samples/sec Loss 1.8501 Epoch: 9 Global Step: 154550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:56:20,157-Speed 4536.75 samples/sec Loss 1.8604 Epoch: 9 Global Step: 154600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:56:31,556-Speed 4491.83 samples/sec Loss 1.8648 Epoch: 9 Global Step: 154650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:56:43,879-Speed 4155.25 samples/sec Loss 1.9000 Epoch: 9 Global Step: 154700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:56:55,335-Speed 4469.22 samples/sec Loss 1.8644 Epoch: 9 Global Step: 154750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:57:06,736-Speed 4491.02 samples/sec Loss 1.8848 Epoch: 9 Global Step: 154800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:57:18,159-Speed 4482.49 samples/sec Loss 1.8719 Epoch: 9 Global Step: 154850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:57:29,685-Speed 4442.46 samples/sec Loss 1.8766 Epoch: 9 Global Step: 154900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:57:41,373-Speed 4380.72 samples/sec Loss 1.8709 Epoch: 9 Global Step: 154950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:57:52,767-Speed 4493.84 samples/sec Loss 1.8643 Epoch: 9 Global Step: 155000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:58:04,353-Speed 4419.51 samples/sec Loss 1.8841 Epoch: 9 Global Step: 155050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:58:15,705-Speed 4510.29 samples/sec Loss 1.8702 Epoch: 9 Global Step: 155100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:58:28,436-Speed 4021.71 samples/sec Loss 1.9018 Epoch: 9 Global Step: 155150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:58:39,921-Speed 4458.37 samples/sec Loss 1.8749 Epoch: 9 Global Step: 155200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:58:51,204-Speed 4537.87 samples/sec Loss 1.8730 Epoch: 9 Global Step: 155250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:59:03,660-Speed 4110.49 samples/sec Loss 1.8732 Epoch: 9 Global Step: 155300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:59:15,239-Speed 4422.05 samples/sec Loss 1.8921 Epoch: 9 Global Step: 155350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:59:27,646-Speed 4127.00 samples/sec Loss 1.8787 Epoch: 9 Global Step: 155400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:59:39,433-Speed 4344.04 samples/sec Loss 1.8926 Epoch: 9 Global Step: 155450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 09:59:50,887-Speed 4470.29 samples/sec Loss 1.8887 Epoch: 9 Global Step: 155500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:00:02,540-Speed 4393.88 samples/sec Loss 1.8644 Epoch: 9 Global Step: 155550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:00:13,983-Speed 4474.32 samples/sec Loss 1.8698 Epoch: 9 Global Step: 155600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:00:25,422-Speed 4476.07 samples/sec Loss 1.8802 Epoch: 9 Global Step: 155650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:00:37,165-Speed 4360.45 samples/sec Loss 1.8941 Epoch: 9 Global Step: 155700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:00:48,687-Speed 4443.97 samples/sec Loss 1.8844 Epoch: 9 Global Step: 155750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:01:00,087-Speed 4491.23 samples/sec Loss 1.8826 Epoch: 9 Global Step: 155800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:01:11,679-Speed 4417.11 samples/sec Loss 1.8892 Epoch: 9 Global Step: 155850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:01:23,236-Speed 4430.59 samples/sec Loss 1.8727 Epoch: 9 Global Step: 155900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:01:34,808-Speed 4424.84 samples/sec Loss 1.8795 Epoch: 9 Global Step: 155950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:01:47,176-Speed 4139.74 samples/sec Loss 1.8849 Epoch: 9 Global Step: 156000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:02:17,519-[lfw][156000]XNorm: 22.722088 Training: 2021-03-15 10:02:17,519-[lfw][156000]Accuracy-Flip: 0.99783+-0.00269 Training: 2021-03-15 10:02:17,521-[lfw][156000]Accuracy-Highest: 0.99783 Training: 2021-03-15 10:02:52,608-[cfp_fp][156000]XNorm: 20.689050 Training: 2021-03-15 10:02:52,608-[cfp_fp][156000]Accuracy-Flip: 0.98200+-0.00558 Training: 2021-03-15 10:02:52,608-[cfp_fp][156000]Accuracy-Highest: 0.98200 Training: 2021-03-15 10:03:22,898-[agedb_30][156000]XNorm: 22.635247 Training: 2021-03-15 10:03:22,898-[agedb_30][156000]Accuracy-Flip: 0.97600+-0.00593 Training: 2021-03-15 10:03:22,898-[agedb_30][156000]Accuracy-Highest: 0.97917 Training: 2021-03-15 10:03:34,502-Speed 477.05 samples/sec Loss 1.8883 Epoch: 9 Global Step: 156050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:03:45,832-Speed 4519.39 samples/sec Loss 1.8875 Epoch: 9 Global Step: 156100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:03:57,402-Speed 4425.16 samples/sec Loss 1.8987 Epoch: 9 Global Step: 156150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:04:09,033-Speed 4402.57 samples/sec Loss 1.8942 Epoch: 9 Global Step: 156200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:04:20,401-Speed 4504.05 samples/sec Loss 1.8961 Epoch: 9 Global Step: 156250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:04:31,863-Speed 4467.01 samples/sec Loss 1.8858 Epoch: 9 Global Step: 156300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:04:43,467-Speed 4412.62 samples/sec Loss 1.9012 Epoch: 9 Global Step: 156350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:04:54,941-Speed 4462.21 samples/sec Loss 1.9083 Epoch: 9 Global Step: 156400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:05:06,452-Speed 4448.11 samples/sec Loss 1.8795 Epoch: 9 Global Step: 156450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:05:17,922-Speed 4464.10 samples/sec Loss 1.8696 Epoch: 9 Global Step: 156500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:05:29,503-Speed 4421.11 samples/sec Loss 1.8814 Epoch: 9 Global Step: 156550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:05:41,090-Speed 4418.99 samples/sec Loss 1.9023 Epoch: 9 Global Step: 156600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:05:52,561-Speed 4463.95 samples/sec Loss 1.8776 Epoch: 9 Global Step: 156650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:06:04,403-Speed 4323.89 samples/sec Loss 1.9160 Epoch: 9 Global Step: 156700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:06:15,834-Speed 4479.17 samples/sec Loss 1.8935 Epoch: 9 Global Step: 156750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:06:27,458-Speed 4404.69 samples/sec Loss 1.8885 Epoch: 9 Global Step: 156800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:06:39,126-Speed 4388.47 samples/sec Loss 1.9003 Epoch: 9 Global Step: 156850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:06:51,361-Speed 4184.79 samples/sec Loss 1.9047 Epoch: 9 Global Step: 156900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:07:03,554-Speed 4199.37 samples/sec Loss 1.8786 Epoch: 9 Global Step: 156950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:07:15,279-Speed 4366.72 samples/sec Loss 1.9156 Epoch: 9 Global Step: 157000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:07:27,564-Speed 4167.86 samples/sec Loss 1.8639 Epoch: 9 Global Step: 157050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:07:39,163-Speed 4414.52 samples/sec Loss 1.9304 Epoch: 9 Global Step: 157100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:07:50,634-Speed 4463.47 samples/sec Loss 1.9308 Epoch: 9 Global Step: 157150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:08:02,024-Speed 4495.78 samples/sec Loss 1.9004 Epoch: 9 Global Step: 157200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:08:13,371-Speed 4512.13 samples/sec Loss 1.8959 Epoch: 9 Global Step: 157250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:08:24,776-Speed 4489.72 samples/sec Loss 1.9160 Epoch: 9 Global Step: 157300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:08:37,408-Speed 4053.18 samples/sec Loss 1.9019 Epoch: 9 Global Step: 157350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:08:49,008-Speed 4413.98 samples/sec Loss 1.8773 Epoch: 9 Global Step: 157400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:09:00,694-Speed 4381.43 samples/sec Loss 1.9107 Epoch: 9 Global Step: 157450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:09:12,317-Speed 4405.29 samples/sec Loss 1.8946 Epoch: 9 Global Step: 157500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:09:23,648-Speed 4518.90 samples/sec Loss 1.8713 Epoch: 9 Global Step: 157550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:09:35,093-Speed 4474.02 samples/sec Loss 1.9153 Epoch: 9 Global Step: 157600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:09:46,544-Speed 4471.30 samples/sec Loss 1.9073 Epoch: 9 Global Step: 157650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:09:58,246-Speed 4375.38 samples/sec Loss 1.9159 Epoch: 9 Global Step: 157700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:10:09,882-Speed 4400.47 samples/sec Loss 1.8970 Epoch: 9 Global Step: 157750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:10:21,442-Speed 4429.40 samples/sec Loss 1.9251 Epoch: 9 Global Step: 157800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:10:34,649-Speed 3876.83 samples/sec Loss 1.9078 Epoch: 9 Global Step: 157850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:10:46,052-Speed 4490.24 samples/sec Loss 1.9235 Epoch: 9 Global Step: 157900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:10:57,500-Speed 4472.57 samples/sec Loss 1.9054 Epoch: 9 Global Step: 157950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:11:09,093-Speed 4416.69 samples/sec Loss 1.8925 Epoch: 9 Global Step: 158000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:11:39,502-[lfw][158000]XNorm: 22.726895 Training: 2021-03-15 10:11:39,502-[lfw][158000]Accuracy-Flip: 0.99750+-0.00261 Training: 2021-03-15 10:11:39,502-[lfw][158000]Accuracy-Highest: 0.99783 Training: 2021-03-15 10:12:14,664-[cfp_fp][158000]XNorm: 20.785575 Training: 2021-03-15 10:12:14,665-[cfp_fp][158000]Accuracy-Flip: 0.97857+-0.00912 Training: 2021-03-15 10:12:14,665-[cfp_fp][158000]Accuracy-Highest: 0.98200 Training: 2021-03-15 10:12:44,873-[agedb_30][158000]XNorm: 22.678845 Training: 2021-03-15 10:12:44,873-[agedb_30][158000]Accuracy-Flip: 0.97433+-0.00810 Training: 2021-03-15 10:12:44,873-[agedb_30][158000]Accuracy-Highest: 0.97917 Training: 2021-03-15 10:12:56,321-Speed 477.49 samples/sec Loss 1.9081 Epoch: 9 Global Step: 158050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:13:08,735-Speed 4124.62 samples/sec Loss 1.8983 Epoch: 9 Global Step: 158100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:13:20,318-Speed 4420.44 samples/sec Loss 1.8958 Epoch: 9 Global Step: 158150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:13:31,924-Speed 4411.58 samples/sec Loss 1.9204 Epoch: 9 Global Step: 158200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:13:43,528-Speed 4412.69 samples/sec Loss 1.8952 Epoch: 9 Global Step: 158250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:13:54,966-Speed 4476.39 samples/sec Loss 1.9216 Epoch: 9 Global Step: 158300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:14:06,636-Speed 4387.50 samples/sec Loss 1.9115 Epoch: 9 Global Step: 158350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:14:18,169-Speed 4439.52 samples/sec Loss 1.9040 Epoch: 9 Global Step: 158400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:14:29,840-Speed 4387.27 samples/sec Loss 1.9326 Epoch: 9 Global Step: 158450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:14:41,567-Speed 4366.35 samples/sec Loss 1.9504 Epoch: 9 Global Step: 158500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:14:53,103-Speed 4438.32 samples/sec Loss 1.9187 Epoch: 9 Global Step: 158550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:15:05,662-Speed 4076.76 samples/sec Loss 1.8975 Epoch: 9 Global Step: 158600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:15:17,095-Speed 4478.53 samples/sec Loss 1.8938 Epoch: 9 Global Step: 158650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:15:28,704-Speed 4410.84 samples/sec Loss 1.9315 Epoch: 9 Global Step: 158700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:15:40,370-Speed 4388.88 samples/sec Loss 1.9167 Epoch: 9 Global Step: 158750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:15:51,952-Speed 4420.84 samples/sec Loss 1.9108 Epoch: 9 Global Step: 158800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:16:03,707-Speed 4355.61 samples/sec Loss 1.9113 Epoch: 9 Global Step: 158850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:16:15,286-Speed 4422.06 samples/sec Loss 1.9154 Epoch: 9 Global Step: 158900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:16:26,861-Speed 4423.75 samples/sec Loss 1.9214 Epoch: 9 Global Step: 158950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:16:38,484-Speed 4405.20 samples/sec Loss 1.9255 Epoch: 9 Global Step: 159000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:16:49,976-Speed 4455.55 samples/sec Loss 1.8948 Epoch: 9 Global Step: 159050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:17:01,611-Speed 4400.53 samples/sec Loss 1.9116 Epoch: 9 Global Step: 159100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:17:12,990-Speed 4499.78 samples/sec Loss 1.9282 Epoch: 9 Global Step: 159150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:17:24,824-Speed 4326.86 samples/sec Loss 1.9127 Epoch: 9 Global Step: 159200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:17:36,285-Speed 4467.37 samples/sec Loss 1.9213 Epoch: 9 Global Step: 159250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:17:47,575-Speed 4535.14 samples/sec Loss 1.9406 Epoch: 9 Global Step: 159300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:17:59,099-Speed 4443.19 samples/sec Loss 1.9501 Epoch: 9 Global Step: 159350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:18:10,698-Speed 4414.45 samples/sec Loss 1.9201 Epoch: 9 Global Step: 159400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:18:22,207-Speed 4449.00 samples/sec Loss 1.9231 Epoch: 9 Global Step: 159450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:18:34,722-Speed 4091.21 samples/sec Loss 1.9072 Epoch: 9 Global Step: 159500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:18:46,309-Speed 4418.84 samples/sec Loss 1.9201 Epoch: 9 Global Step: 159550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:18:58,754-Speed 4114.31 samples/sec Loss 1.9447 Epoch: 9 Global Step: 159600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:19:10,289-Speed 4438.79 samples/sec Loss 1.9210 Epoch: 9 Global Step: 159650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:19:22,521-Speed 4185.96 samples/sec Loss 1.9198 Epoch: 9 Global Step: 159700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:19:34,084-Speed 4428.18 samples/sec Loss 1.9072 Epoch: 9 Global Step: 159750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:19:45,645-Speed 4428.86 samples/sec Loss 1.9222 Epoch: 9 Global Step: 159800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:19:57,041-Speed 4493.40 samples/sec Loss 1.9157 Epoch: 9 Global Step: 159850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:20:08,403-Speed 4506.18 samples/sec Loss 1.9363 Epoch: 9 Global Step: 159900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:20:19,997-Speed 4416.56 samples/sec Loss 1.9244 Epoch: 9 Global Step: 159950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:20:32,656-Speed 4044.66 samples/sec Loss 1.9147 Epoch: 9 Global Step: 160000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:21:02,983-[lfw][160000]XNorm: 22.921040 Training: 2021-03-15 10:21:02,983-[lfw][160000]Accuracy-Flip: 0.99717+-0.00279 Training: 2021-03-15 10:21:02,983-[lfw][160000]Accuracy-Highest: 0.99783 Training: 2021-03-15 10:21:38,159-[cfp_fp][160000]XNorm: 20.561026 Training: 2021-03-15 10:21:38,159-[cfp_fp][160000]Accuracy-Flip: 0.97929+-0.00737 Training: 2021-03-15 10:21:38,159-[cfp_fp][160000]Accuracy-Highest: 0.98200 Training: 2021-03-15 10:22:08,561-[agedb_30][160000]XNorm: 22.676037 Training: 2021-03-15 10:22:08,561-[agedb_30][160000]Accuracy-Flip: 0.97600+-0.00735 Training: 2021-03-15 10:22:08,561-[agedb_30][160000]Accuracy-Highest: 0.97917 Training: 2021-03-15 10:22:20,135-Speed 476.37 samples/sec Loss 1.9262 Epoch: 9 Global Step: 160050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:22:31,773-Speed 4399.60 samples/sec Loss 1.9478 Epoch: 9 Global Step: 160100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:22:43,213-Speed 4475.74 samples/sec Loss 1.9193 Epoch: 9 Global Step: 160150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:22:54,917-Speed 4374.89 samples/sec Loss 1.9278 Epoch: 9 Global Step: 160200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:23:06,549-Speed 4401.91 samples/sec Loss 1.9444 Epoch: 9 Global Step: 160250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:23:18,127-Speed 4422.25 samples/sec Loss 1.9232 Epoch: 9 Global Step: 160300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:23:29,656-Speed 4441.01 samples/sec Loss 1.9491 Epoch: 9 Global Step: 160350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:23:42,056-Speed 4129.53 samples/sec Loss 1.9413 Epoch: 9 Global Step: 160400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:23:53,615-Speed 4429.52 samples/sec Loss 1.9388 Epoch: 9 Global Step: 160450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:24:06,110-Speed 4098.13 samples/sec Loss 1.9302 Epoch: 9 Global Step: 160500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:24:17,501-Speed 4495.04 samples/sec Loss 1.9208 Epoch: 9 Global Step: 160550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:24:28,974-Speed 4462.88 samples/sec Loss 1.9100 Epoch: 9 Global Step: 160600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:24:40,546-Speed 4424.49 samples/sec Loss 1.9593 Epoch: 9 Global Step: 160650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:24:52,994-Speed 4113.33 samples/sec Loss 1.9362 Epoch: 9 Global Step: 160700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:25:04,430-Speed 4477.18 samples/sec Loss 1.9333 Epoch: 9 Global Step: 160750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:25:15,957-Speed 4442.05 samples/sec Loss 1.9376 Epoch: 9 Global Step: 160800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:25:27,566-Speed 4410.47 samples/sec Loss 1.8822 Epoch: 9 Global Step: 160850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:25:39,194-Speed 4403.34 samples/sec Loss 1.9387 Epoch: 9 Global Step: 160900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:25:50,813-Speed 4406.84 samples/sec Loss 1.9270 Epoch: 9 Global Step: 160950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:26:02,460-Speed 4396.42 samples/sec Loss 1.9176 Epoch: 9 Global Step: 161000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:26:14,123-Speed 4390.06 samples/sec Loss 1.9406 Epoch: 9 Global Step: 161050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:26:25,728-Speed 4412.05 samples/sec Loss 1.9189 Epoch: 9 Global Step: 161100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:26:37,431-Speed 4375.10 samples/sec Loss 1.9406 Epoch: 9 Global Step: 161150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:26:49,712-Speed 4169.17 samples/sec Loss 1.9532 Epoch: 9 Global Step: 161200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:27:01,363-Speed 4394.81 samples/sec Loss 1.9250 Epoch: 9 Global Step: 161250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-15 10:27:12,829-Speed 4465.76 samples/sec Loss 1.9493 Epoch: 9 Global Step: 161300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:27:24,566-Speed 4362.28 samples/sec Loss 1.9457 Epoch: 9 Global Step: 161350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:27:36,194-Speed 4403.46 samples/sec Loss 1.9441 Epoch: 9 Global Step: 161400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:27:47,963-Speed 4350.56 samples/sec Loss 1.9307 Epoch: 9 Global Step: 161450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:27:59,735-Speed 4349.61 samples/sec Loss 1.9211 Epoch: 9 Global Step: 161500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:28:11,542-Speed 4336.30 samples/sec Loss 1.9301 Epoch: 9 Global Step: 161550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:28:23,135-Speed 4416.87 samples/sec Loss 1.9477 Epoch: 9 Global Step: 161600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:28:34,702-Speed 4426.59 samples/sec Loss 1.9461 Epoch: 9 Global Step: 161650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:28:46,442-Speed 4361.54 samples/sec Loss 1.9171 Epoch: 9 Global Step: 161700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:28:57,920-Speed 4460.93 samples/sec Loss 1.9380 Epoch: 9 Global Step: 161750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:29:09,478-Speed 4430.10 samples/sec Loss 1.9255 Epoch: 9 Global Step: 161800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:29:21,044-Speed 4426.82 samples/sec Loss 1.9738 Epoch: 9 Global Step: 161850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:29:32,622-Speed 4422.25 samples/sec Loss 1.9400 Epoch: 9 Global Step: 161900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:29:44,032-Speed 4487.92 samples/sec Loss 1.9460 Epoch: 9 Global Step: 161950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:29:56,559-Speed 4087.17 samples/sec Loss 1.9626 Epoch: 9 Global Step: 162000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:30:27,133-[lfw][162000]XNorm: 20.861393 Training: 2021-03-15 10:30:27,133-[lfw][162000]Accuracy-Flip: 0.99783+-0.00224 Training: 2021-03-15 10:30:27,133-[lfw][162000]Accuracy-Highest: 0.99783 Training: 2021-03-15 10:31:02,353-[cfp_fp][162000]XNorm: 19.545134 Training: 2021-03-15 10:31:02,353-[cfp_fp][162000]Accuracy-Flip: 0.97729+-0.00781 Training: 2021-03-15 10:31:02,353-[cfp_fp][162000]Accuracy-Highest: 0.98200 Training: 2021-03-15 10:31:32,648-[agedb_30][162000]XNorm: 20.729202 Training: 2021-03-15 10:31:32,649-[agedb_30][162000]Accuracy-Flip: 0.97600+-0.00708 Training: 2021-03-15 10:31:32,649-[agedb_30][162000]Accuracy-Highest: 0.97917 Training: 2021-03-15 10:31:44,289-Speed 475.27 samples/sec Loss 1.9372 Epoch: 9 Global Step: 162050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:31:55,939-Speed 4395.00 samples/sec Loss 1.9422 Epoch: 9 Global Step: 162100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:32:07,368-Speed 4479.89 samples/sec Loss 1.9233 Epoch: 9 Global Step: 162150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:32:19,667-Speed 4163.28 samples/sec Loss 1.9336 Epoch: 9 Global Step: 162200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:32:31,419-Speed 4356.78 samples/sec Loss 1.9499 Epoch: 9 Global Step: 162250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:32:43,941-Speed 4088.91 samples/sec Loss 1.9414 Epoch: 9 Global Step: 162300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:32:55,554-Speed 4409.32 samples/sec Loss 1.9245 Epoch: 9 Global Step: 162350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:33:07,080-Speed 4442.07 samples/sec Loss 1.9242 Epoch: 9 Global Step: 162400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:33:18,475-Speed 4493.53 samples/sec Loss 1.9332 Epoch: 9 Global Step: 162450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:33:30,055-Speed 4421.66 samples/sec Loss 1.9712 Epoch: 9 Global Step: 162500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:33:42,396-Speed 4149.06 samples/sec Loss 1.9459 Epoch: 9 Global Step: 162550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:33:53,715-Speed 4523.57 samples/sec Loss 1.9308 Epoch: 9 Global Step: 162600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:34:05,172-Speed 4468.93 samples/sec Loss 1.9290 Epoch: 9 Global Step: 162650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:34:16,807-Speed 4400.80 samples/sec Loss 1.9726 Epoch: 9 Global Step: 162700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:34:28,439-Speed 4401.84 samples/sec Loss 1.9585 Epoch: 9 Global Step: 162750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:34:39,949-Speed 4448.39 samples/sec Loss 1.9204 Epoch: 9 Global Step: 162800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:34:51,617-Speed 4388.26 samples/sec Loss 1.9629 Epoch: 9 Global Step: 162850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:35:03,158-Speed 4436.77 samples/sec Loss 1.9579 Epoch: 9 Global Step: 162900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:35:15,398-Speed 4183.00 samples/sec Loss 1.9429 Epoch: 9 Global Step: 162950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:35:27,025-Speed 4403.81 samples/sec Loss 1.9451 Epoch: 9 Global Step: 163000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:35:38,505-Speed 4460.32 samples/sec Loss 1.9572 Epoch: 9 Global Step: 163050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:35:50,221-Speed 4370.18 samples/sec Loss 1.9608 Epoch: 9 Global Step: 163100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:36:01,779-Speed 4430.14 samples/sec Loss 1.9542 Epoch: 9 Global Step: 163150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:36:14,402-Speed 4056.25 samples/sec Loss 1.9351 Epoch: 9 Global Step: 163200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:36:25,892-Speed 4456.57 samples/sec Loss 1.9753 Epoch: 9 Global Step: 163250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:36:38,425-Speed 4085.31 samples/sec Loss 1.9326 Epoch: 9 Global Step: 163300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:36:49,850-Speed 4481.36 samples/sec Loss 1.9709 Epoch: 9 Global Step: 163350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:37:01,459-Speed 4410.67 samples/sec Loss 1.9299 Epoch: 9 Global Step: 163400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:37:12,853-Speed 4493.92 samples/sec Loss 1.9335 Epoch: 9 Global Step: 163450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:37:24,592-Speed 4361.81 samples/sec Loss 1.9555 Epoch: 9 Global Step: 163500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:37:36,164-Speed 4424.63 samples/sec Loss 1.9648 Epoch: 9 Global Step: 163550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:37:47,740-Speed 4423.18 samples/sec Loss 1.9173 Epoch: 9 Global Step: 163600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:37:59,085-Speed 4513.04 samples/sec Loss 1.9390 Epoch: 9 Global Step: 163650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:38:10,948-Speed 4316.12 samples/sec Loss 1.9624 Epoch: 9 Global Step: 163700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:38:22,400-Speed 4471.14 samples/sec Loss 1.9322 Epoch: 9 Global Step: 163750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:38:34,810-Speed 4126.01 samples/sec Loss 1.9414 Epoch: 9 Global Step: 163800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:38:46,265-Speed 4469.66 samples/sec Loss 1.9393 Epoch: 9 Global Step: 163850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:38:58,023-Speed 4354.88 samples/sec Loss 1.9563 Epoch: 9 Global Step: 163900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:39:09,792-Speed 4350.54 samples/sec Loss 1.9389 Epoch: 9 Global Step: 163950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:39:21,459-Speed 4388.68 samples/sec Loss 1.9844 Epoch: 9 Global Step: 164000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:39:51,729-[lfw][164000]XNorm: 22.977893 Training: 2021-03-15 10:39:51,729-[lfw][164000]Accuracy-Flip: 0.99733+-0.00281 Training: 2021-03-15 10:39:51,729-[lfw][164000]Accuracy-Highest: 0.99783 Training: 2021-03-15 10:40:27,039-[cfp_fp][164000]XNorm: 20.695161 Training: 2021-03-15 10:40:27,039-[cfp_fp][164000]Accuracy-Flip: 0.98186+-0.00557 Training: 2021-03-15 10:40:27,039-[cfp_fp][164000]Accuracy-Highest: 0.98200 Training: 2021-03-15 10:40:57,469-[agedb_30][164000]XNorm: 22.657531 Training: 2021-03-15 10:40:57,470-[agedb_30][164000]Accuracy-Flip: 0.97683+-0.00828 Training: 2021-03-15 10:40:57,470-[agedb_30][164000]Accuracy-Highest: 0.97917 Training: 2021-03-15 10:41:08,863-Speed 476.71 samples/sec Loss 1.9656 Epoch: 9 Global Step: 164050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:41:20,539-Speed 4384.99 samples/sec Loss 1.9497 Epoch: 9 Global Step: 164100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:41:32,158-Speed 4406.69 samples/sec Loss 1.9360 Epoch: 9 Global Step: 164150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:41:43,676-Speed 4445.66 samples/sec Loss 1.9507 Epoch: 9 Global Step: 164200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:41:55,036-Speed 4507.04 samples/sec Loss 1.9505 Epoch: 9 Global Step: 164250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:42:06,745-Speed 4373.00 samples/sec Loss 1.9541 Epoch: 9 Global Step: 164300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:42:18,119-Speed 4501.78 samples/sec Loss 1.9834 Epoch: 9 Global Step: 164350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:42:29,715-Speed 4415.48 samples/sec Loss 1.9796 Epoch: 9 Global Step: 164400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:42:41,159-Speed 4474.09 samples/sec Loss 1.9898 Epoch: 9 Global Step: 164450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:42:52,703-Speed 4435.50 samples/sec Loss 1.9590 Epoch: 9 Global Step: 164500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:43:05,003-Speed 4162.80 samples/sec Loss 1.9443 Epoch: 9 Global Step: 164550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:43:16,364-Speed 4506.74 samples/sec Loss 1.9624 Epoch: 9 Global Step: 164600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:43:27,844-Speed 4460.33 samples/sec Loss 1.9347 Epoch: 9 Global Step: 164650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:43:39,374-Speed 4440.60 samples/sec Loss 1.9516 Epoch: 9 Global Step: 164700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:43:50,814-Speed 4475.66 samples/sec Loss 1.9568 Epoch: 9 Global Step: 164750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:44:02,281-Speed 4465.32 samples/sec Loss 1.9757 Epoch: 9 Global Step: 164800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:44:14,524-Speed 4182.09 samples/sec Loss 1.9468 Epoch: 9 Global Step: 164850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:44:26,127-Speed 4412.73 samples/sec Loss 1.9526 Epoch: 9 Global Step: 164900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:44:38,708-Speed 4070.05 samples/sec Loss 1.9443 Epoch: 9 Global Step: 164950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:44:50,225-Speed 4445.58 samples/sec Loss 1.9455 Epoch: 9 Global Step: 165000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:45:01,911-Speed 4381.62 samples/sec Loss 1.9514 Epoch: 9 Global Step: 165050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:45:13,467-Speed 4430.74 samples/sec Loss 1.9685 Epoch: 9 Global Step: 165100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:45:24,979-Speed 4447.74 samples/sec Loss 1.9482 Epoch: 9 Global Step: 165150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:45:36,527-Speed 4434.07 samples/sec Loss 1.9500 Epoch: 9 Global Step: 165200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:45:49,060-Speed 4085.41 samples/sec Loss 1.9688 Epoch: 9 Global Step: 165250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:46:00,545-Speed 4458.08 samples/sec Loss 1.9369 Epoch: 9 Global Step: 165300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:46:12,070-Speed 4442.78 samples/sec Loss 1.9732 Epoch: 9 Global Step: 165350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:46:23,578-Speed 4449.10 samples/sec Loss 1.9609 Epoch: 9 Global Step: 165400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:46:34,997-Speed 4484.13 samples/sec Loss 1.9477 Epoch: 9 Global Step: 165450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:46:46,488-Speed 4456.05 samples/sec Loss 1.9393 Epoch: 9 Global Step: 165500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:46:58,028-Speed 4436.73 samples/sec Loss 1.9500 Epoch: 9 Global Step: 165550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:47:10,404-Speed 4137.16 samples/sec Loss 1.9262 Epoch: 9 Global Step: 165600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:47:22,016-Speed 4409.51 samples/sec Loss 1.9652 Epoch: 9 Global Step: 165650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:47:33,559-Speed 4435.62 samples/sec Loss 1.9578 Epoch: 9 Global Step: 165700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:47:45,155-Speed 4415.61 samples/sec Loss 1.9352 Epoch: 9 Global Step: 165750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:47:56,578-Speed 4482.49 samples/sec Loss 1.9486 Epoch: 9 Global Step: 165800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:48:08,963-Speed 4134.14 samples/sec Loss 1.9822 Epoch: 9 Global Step: 165850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:48:20,470-Speed 4449.59 samples/sec Loss 1.9529 Epoch: 9 Global Step: 165900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:48:32,989-Speed 4089.90 samples/sec Loss 1.9771 Epoch: 9 Global Step: 165950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:48:44,668-Speed 4384.31 samples/sec Loss 1.9691 Epoch: 9 Global Step: 166000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:49:14,953-[lfw][166000]XNorm: 22.700923 Training: 2021-03-15 10:49:14,954-[lfw][166000]Accuracy-Flip: 0.99717+-0.00279 Training: 2021-03-15 10:49:14,954-[lfw][166000]Accuracy-Highest: 0.99783 Training: 2021-03-15 10:49:50,159-[cfp_fp][166000]XNorm: 20.437963 Training: 2021-03-15 10:49:50,159-[cfp_fp][166000]Accuracy-Flip: 0.97714+-0.00596 Training: 2021-03-15 10:49:50,159-[cfp_fp][166000]Accuracy-Highest: 0.98200 Training: 2021-03-15 10:50:20,906-[agedb_30][166000]XNorm: 22.221261 Training: 2021-03-15 10:50:20,907-[agedb_30][166000]Accuracy-Flip: 0.97467+-0.00645 Training: 2021-03-15 10:50:20,907-[agedb_30][166000]Accuracy-Highest: 0.97917 Training: 2021-03-15 10:50:32,432-Speed 475.12 samples/sec Loss 1.9600 Epoch: 9 Global Step: 166050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:50:44,057-Speed 4404.46 samples/sec Loss 1.9485 Epoch: 9 Global Step: 166100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:50:55,669-Speed 4409.12 samples/sec Loss 1.9524 Epoch: 9 Global Step: 166150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:51:07,176-Speed 4449.70 samples/sec Loss 1.9750 Epoch: 9 Global Step: 166200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:51:18,855-Speed 4384.25 samples/sec Loss 1.9613 Epoch: 9 Global Step: 166250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:51:30,353-Speed 4453.17 samples/sec Loss 1.9724 Epoch: 9 Global Step: 166300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:51:42,140-Speed 4343.88 samples/sec Loss 1.9588 Epoch: 9 Global Step: 166350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:51:53,959-Speed 4332.18 samples/sec Loss 1.9708 Epoch: 9 Global Step: 166400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:52:06,170-Speed 4193.25 samples/sec Loss 1.9791 Epoch: 9 Global Step: 166450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:52:17,622-Speed 4470.92 samples/sec Loss 1.9619 Epoch: 9 Global Step: 166500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:52:29,407-Speed 4344.80 samples/sec Loss 1.9518 Epoch: 9 Global Step: 166550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:52:41,100-Speed 4379.08 samples/sec Loss 1.9539 Epoch: 9 Global Step: 166600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:52:52,721-Speed 4405.98 samples/sec Loss 1.9559 Epoch: 9 Global Step: 166650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:53:04,450-Speed 4365.42 samples/sec Loss 1.9516 Epoch: 9 Global Step: 166700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:53:16,189-Speed 4361.89 samples/sec Loss 1.9532 Epoch: 9 Global Step: 166750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:53:27,720-Speed 4440.37 samples/sec Loss 1.9820 Epoch: 9 Global Step: 166800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:53:39,354-Speed 4401.13 samples/sec Loss 1.9320 Epoch: 9 Global Step: 166850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:53:51,153-Speed 4339.32 samples/sec Loss 1.9559 Epoch: 9 Global Step: 166900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:54:15,730-Speed 2083.28 samples/sec Loss 1.7042 Epoch: 10 Global Step: 166950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:54:27,446-Speed 4370.46 samples/sec Loss 1.6601 Epoch: 10 Global Step: 167000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:54:39,255-Speed 4336.24 samples/sec Loss 1.6566 Epoch: 10 Global Step: 167050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:54:51,866-Speed 4059.95 samples/sec Loss 1.6698 Epoch: 10 Global Step: 167100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:55:03,267-Speed 4491.16 samples/sec Loss 1.6659 Epoch: 10 Global Step: 167150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:55:14,571-Speed 4529.86 samples/sec Loss 1.6659 Epoch: 10 Global Step: 167200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:55:25,765-Speed 4573.83 samples/sec Loss 1.6578 Epoch: 10 Global Step: 167250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:55:37,176-Speed 4487.19 samples/sec Loss 1.6760 Epoch: 10 Global Step: 167300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:55:48,911-Speed 4363.20 samples/sec Loss 1.6656 Epoch: 10 Global Step: 167350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:56:00,525-Speed 4408.80 samples/sec Loss 1.6847 Epoch: 10 Global Step: 167400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:56:11,998-Speed 4463.12 samples/sec Loss 1.6893 Epoch: 10 Global Step: 167450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:56:23,566-Speed 4426.17 samples/sec Loss 1.6737 Epoch: 10 Global Step: 167500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:56:36,843-Speed 3856.52 samples/sec Loss 1.7056 Epoch: 10 Global Step: 167550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:56:48,346-Speed 4450.89 samples/sec Loss 1.6951 Epoch: 10 Global Step: 167600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:56:59,623-Speed 4540.74 samples/sec Loss 1.6932 Epoch: 10 Global Step: 167650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:57:11,197-Speed 4423.72 samples/sec Loss 1.7087 Epoch: 10 Global Step: 167700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:57:22,672-Speed 4462.09 samples/sec Loss 1.6777 Epoch: 10 Global Step: 167750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:57:34,158-Speed 4457.91 samples/sec Loss 1.6976 Epoch: 10 Global Step: 167800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:57:46,628-Speed 4105.92 samples/sec Loss 1.7013 Epoch: 10 Global Step: 167850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:57:57,983-Speed 4509.31 samples/sec Loss 1.6964 Epoch: 10 Global Step: 167900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:58:09,303-Speed 4523.14 samples/sec Loss 1.7181 Epoch: 10 Global Step: 167950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:58:20,726-Speed 4482.26 samples/sec Loss 1.7094 Epoch: 10 Global Step: 168000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 10:58:50,936-[lfw][168000]XNorm: 22.945861 Training: 2021-03-15 10:58:50,936-[lfw][168000]Accuracy-Flip: 0.99767+-0.00213 Training: 2021-03-15 10:58:50,936-[lfw][168000]Accuracy-Highest: 0.99783 Training: 2021-03-15 10:59:25,855-[cfp_fp][168000]XNorm: 20.621362 Training: 2021-03-15 10:59:25,855-[cfp_fp][168000]Accuracy-Flip: 0.97871+-0.00692 Training: 2021-03-15 10:59:25,855-[cfp_fp][168000]Accuracy-Highest: 0.98200 Training: 2021-03-15 10:59:56,055-[agedb_30][168000]XNorm: 22.910198 Training: 2021-03-15 10:59:56,055-[agedb_30][168000]Accuracy-Flip: 0.97867+-0.00690 Training: 2021-03-15 10:59:56,055-[agedb_30][168000]Accuracy-Highest: 0.97917 Training: 2021-03-15 11:00:07,542-Speed 479.33 samples/sec Loss 1.7108 Epoch: 10 Global Step: 168050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:00:19,016-Speed 4462.61 samples/sec Loss 1.6919 Epoch: 10 Global Step: 168100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:00:30,539-Speed 4443.46 samples/sec Loss 1.7068 Epoch: 10 Global Step: 168150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:00:42,944-Speed 4127.61 samples/sec Loss 1.6880 Epoch: 10 Global Step: 168200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:00:54,333-Speed 4495.58 samples/sec Loss 1.7117 Epoch: 10 Global Step: 168250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:01:06,039-Speed 4374.03 samples/sec Loss 1.7006 Epoch: 10 Global Step: 168300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:01:17,439-Speed 4491.62 samples/sec Loss 1.7019 Epoch: 10 Global Step: 168350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:01:29,174-Speed 4363.19 samples/sec Loss 1.7251 Epoch: 10 Global Step: 168400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:01:40,548-Speed 4501.60 samples/sec Loss 1.7319 Epoch: 10 Global Step: 168450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:01:53,210-Speed 4043.85 samples/sec Loss 1.7283 Epoch: 10 Global Step: 168500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:02:04,571-Speed 4506.60 samples/sec Loss 1.7499 Epoch: 10 Global Step: 168550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:02:16,804-Speed 4185.62 samples/sec Loss 1.7243 Epoch: 10 Global Step: 168600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:02:28,130-Speed 4521.13 samples/sec Loss 1.7396 Epoch: 10 Global Step: 168650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:02:39,490-Speed 4507.21 samples/sec Loss 1.7390 Epoch: 10 Global Step: 168700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:02:51,000-Speed 4448.56 samples/sec Loss 1.7435 Epoch: 10 Global Step: 168750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:03:02,518-Speed 4445.33 samples/sec Loss 1.7723 Epoch: 10 Global Step: 168800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:03:13,726-Speed 4568.56 samples/sec Loss 1.7373 Epoch: 10 Global Step: 168850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:03:25,278-Speed 4432.13 samples/sec Loss 1.7234 Epoch: 10 Global Step: 168900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:03:36,782-Speed 4450.81 samples/sec Loss 1.7472 Epoch: 10 Global Step: 168950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:03:49,166-Speed 4134.80 samples/sec Loss 1.7399 Epoch: 10 Global Step: 169000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:04:00,533-Speed 4504.16 samples/sec Loss 1.7457 Epoch: 10 Global Step: 169050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:04:12,019-Speed 4458.00 samples/sec Loss 1.7687 Epoch: 10 Global Step: 169100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:04:23,386-Speed 4504.23 samples/sec Loss 1.7874 Epoch: 10 Global Step: 169150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:04:34,795-Speed 4488.07 samples/sec Loss 1.7546 Epoch: 10 Global Step: 169200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:04:46,423-Speed 4403.32 samples/sec Loss 1.7741 Epoch: 10 Global Step: 169250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:04:58,123-Speed 4376.32 samples/sec Loss 1.7386 Epoch: 10 Global Step: 169300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:05:09,499-Speed 4500.88 samples/sec Loss 1.7772 Epoch: 10 Global Step: 169350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:05:20,950-Speed 4471.47 samples/sec Loss 1.7905 Epoch: 10 Global Step: 169400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:05:32,249-Speed 4531.50 samples/sec Loss 1.7479 Epoch: 10 Global Step: 169450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:05:43,515-Speed 4544.68 samples/sec Loss 1.7615 Epoch: 10 Global Step: 169500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:05:55,820-Speed 4161.35 samples/sec Loss 1.7516 Epoch: 10 Global Step: 169550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:06:07,371-Speed 4432.48 samples/sec Loss 1.7498 Epoch: 10 Global Step: 169600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:06:18,948-Speed 4422.92 samples/sec Loss 1.7866 Epoch: 10 Global Step: 169650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:06:30,519-Speed 4424.95 samples/sec Loss 1.7736 Epoch: 10 Global Step: 169700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:06:42,245-Speed 4366.55 samples/sec Loss 1.7807 Epoch: 10 Global Step: 169750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:06:53,905-Speed 4391.39 samples/sec Loss 1.7665 Epoch: 10 Global Step: 169800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:07:05,620-Speed 4370.46 samples/sec Loss 1.7843 Epoch: 10 Global Step: 169850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:07:17,279-Speed 4391.64 samples/sec Loss 1.8062 Epoch: 10 Global Step: 169900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:07:28,759-Speed 4460.18 samples/sec Loss 1.7828 Epoch: 10 Global Step: 169950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:07:40,204-Speed 4473.74 samples/sec Loss 1.7730 Epoch: 10 Global Step: 170000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:08:10,547-[lfw][170000]XNorm: 23.643765 Training: 2021-03-15 11:08:10,547-[lfw][170000]Accuracy-Flip: 0.99717+-0.00279 Training: 2021-03-15 11:08:10,547-[lfw][170000]Accuracy-Highest: 0.99783 Training: 2021-03-15 11:08:45,773-[cfp_fp][170000]XNorm: 21.626319 Training: 2021-03-15 11:08:45,773-[cfp_fp][170000]Accuracy-Flip: 0.97786+-0.00809 Training: 2021-03-15 11:08:45,773-[cfp_fp][170000]Accuracy-Highest: 0.98200 Training: 2021-03-15 11:09:16,016-[agedb_30][170000]XNorm: 23.568233 Training: 2021-03-15 11:09:16,016-[agedb_30][170000]Accuracy-Flip: 0.97650+-0.00643 Training: 2021-03-15 11:09:16,016-[agedb_30][170000]Accuracy-Highest: 0.97917 Training: 2021-03-15 11:09:27,669-Speed 476.44 samples/sec Loss 1.7947 Epoch: 10 Global Step: 170050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:09:39,171-Speed 4451.56 samples/sec Loss 1.8064 Epoch: 10 Global Step: 170100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:09:52,447-Speed 3856.74 samples/sec Loss 1.7778 Epoch: 10 Global Step: 170150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:10:03,985-Speed 4437.41 samples/sec Loss 1.7830 Epoch: 10 Global Step: 170200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:10:15,440-Speed 4469.92 samples/sec Loss 1.7748 Epoch: 10 Global Step: 170250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:10:27,050-Speed 4410.15 samples/sec Loss 1.7753 Epoch: 10 Global Step: 170300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:10:38,600-Speed 4433.21 samples/sec Loss 1.7856 Epoch: 10 Global Step: 170350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:10:51,007-Speed 4126.86 samples/sec Loss 1.8137 Epoch: 10 Global Step: 170400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:11:02,467-Speed 4468.06 samples/sec Loss 1.7814 Epoch: 10 Global Step: 170450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:11:13,919-Speed 4471.19 samples/sec Loss 1.7906 Epoch: 10 Global Step: 170500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:11:25,321-Speed 4490.75 samples/sec Loss 1.8046 Epoch: 10 Global Step: 170550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:11:36,859-Speed 4437.71 samples/sec Loss 1.8017 Epoch: 10 Global Step: 170600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:11:48,279-Speed 4483.27 samples/sec Loss 1.8154 Epoch: 10 Global Step: 170650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:11:59,695-Speed 4485.44 samples/sec Loss 1.8321 Epoch: 10 Global Step: 170700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:12:11,431-Speed 4362.63 samples/sec Loss 1.8155 Epoch: 10 Global Step: 170750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:12:23,983-Speed 4079.14 samples/sec Loss 1.7915 Epoch: 10 Global Step: 170800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:12:35,529-Speed 4435.00 samples/sec Loss 1.7832 Epoch: 10 Global Step: 170850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:12:46,960-Speed 4479.18 samples/sec Loss 1.7883 Epoch: 10 Global Step: 170900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:12:58,495-Speed 4438.99 samples/sec Loss 1.8012 Epoch: 10 Global Step: 170950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:13:09,847-Speed 4510.08 samples/sec Loss 1.7923 Epoch: 10 Global Step: 171000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:13:21,343-Speed 4454.01 samples/sec Loss 1.7954 Epoch: 10 Global Step: 171050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:13:33,808-Speed 4107.84 samples/sec Loss 1.8141 Epoch: 10 Global Step: 171100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:13:46,132-Speed 4154.79 samples/sec Loss 1.8313 Epoch: 10 Global Step: 171150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:13:57,676-Speed 4435.33 samples/sec Loss 1.8078 Epoch: 10 Global Step: 171200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:14:09,166-Speed 4456.42 samples/sec Loss 1.8190 Epoch: 10 Global Step: 171250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:14:20,730-Speed 4427.80 samples/sec Loss 1.8154 Epoch: 10 Global Step: 171300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:14:32,145-Speed 4485.26 samples/sec Loss 1.8109 Epoch: 10 Global Step: 171350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:14:43,875-Speed 4365.26 samples/sec Loss 1.7974 Epoch: 10 Global Step: 171400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:14:55,359-Speed 4458.53 samples/sec Loss 1.8103 Epoch: 10 Global Step: 171450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:15:06,984-Speed 4404.42 samples/sec Loss 1.8277 Epoch: 10 Global Step: 171500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:15:19,202-Speed 4190.68 samples/sec Loss 1.7910 Epoch: 10 Global Step: 171550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:15:30,757-Speed 4431.21 samples/sec Loss 1.8249 Epoch: 10 Global Step: 171600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:15:42,335-Speed 4422.72 samples/sec Loss 1.7799 Epoch: 10 Global Step: 171650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:15:53,994-Speed 4391.57 samples/sec Loss 1.8181 Epoch: 10 Global Step: 171700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:16:05,449-Speed 4469.93 samples/sec Loss 1.7964 Epoch: 10 Global Step: 171750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:16:16,794-Speed 4512.98 samples/sec Loss 1.8186 Epoch: 10 Global Step: 171800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:16:28,560-Speed 4351.84 samples/sec Loss 1.8237 Epoch: 10 Global Step: 171850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:16:40,171-Speed 4409.83 samples/sec Loss 1.8398 Epoch: 10 Global Step: 171900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:16:51,868-Speed 4377.32 samples/sec Loss 1.8583 Epoch: 10 Global Step: 171950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:17:03,354-Speed 4457.81 samples/sec Loss 1.8430 Epoch: 10 Global Step: 172000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:17:33,757-[lfw][172000]XNorm: 22.109946 Training: 2021-03-15 11:17:33,758-[lfw][172000]Accuracy-Flip: 0.99767+-0.00291 Training: 2021-03-15 11:17:33,758-[lfw][172000]Accuracy-Highest: 0.99783 Training: 2021-03-15 11:18:08,800-[cfp_fp][172000]XNorm: 20.123505 Training: 2021-03-15 11:18:08,800-[cfp_fp][172000]Accuracy-Flip: 0.97729+-0.00907 Training: 2021-03-15 11:18:08,800-[cfp_fp][172000]Accuracy-Highest: 0.98200 Training: 2021-03-15 11:18:39,086-[agedb_30][172000]XNorm: 22.181378 Training: 2021-03-15 11:18:39,087-[agedb_30][172000]Accuracy-Flip: 0.97550+-0.00746 Training: 2021-03-15 11:18:39,087-[agedb_30][172000]Accuracy-Highest: 0.97917 Training: 2021-03-15 11:18:50,713-Speed 476.91 samples/sec Loss 1.8475 Epoch: 10 Global Step: 172050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:19:02,348-Speed 4401.07 samples/sec Loss 1.8325 Epoch: 10 Global Step: 172100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:19:14,679-Speed 4152.09 samples/sec Loss 1.8324 Epoch: 10 Global Step: 172150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:19:26,486-Speed 4336.70 samples/sec Loss 1.8234 Epoch: 10 Global Step: 172200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:19:38,310-Speed 4330.44 samples/sec Loss 1.8188 Epoch: 10 Global Step: 172250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:19:49,922-Speed 4409.36 samples/sec Loss 1.8173 Epoch: 10 Global Step: 172300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:20:01,586-Speed 4389.96 samples/sec Loss 1.8140 Epoch: 10 Global Step: 172350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:20:13,153-Speed 4426.52 samples/sec Loss 1.8189 Epoch: 10 Global Step: 172400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:20:24,489-Speed 4517.05 samples/sec Loss 1.8223 Epoch: 10 Global Step: 172450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:20:35,877-Speed 4496.07 samples/sec Loss 1.8281 Epoch: 10 Global Step: 172500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:20:47,737-Speed 4317.35 samples/sec Loss 1.8461 Epoch: 10 Global Step: 172550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:20:59,277-Speed 4436.64 samples/sec Loss 1.8243 Epoch: 10 Global Step: 172600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:21:11,060-Speed 4345.67 samples/sec Loss 1.8354 Epoch: 10 Global Step: 172650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:21:22,631-Speed 4424.81 samples/sec Loss 1.8312 Epoch: 10 Global Step: 172700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:21:34,205-Speed 4424.15 samples/sec Loss 1.8696 Epoch: 10 Global Step: 172750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:21:47,428-Speed 3872.30 samples/sec Loss 1.8369 Epoch: 10 Global Step: 172800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:21:59,040-Speed 4409.37 samples/sec Loss 1.8559 Epoch: 10 Global Step: 172850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:22:10,651-Speed 4409.69 samples/sec Loss 1.8471 Epoch: 10 Global Step: 172900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:22:22,115-Speed 4466.61 samples/sec Loss 1.8749 Epoch: 10 Global Step: 172950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:22:34,582-Speed 4106.84 samples/sec Loss 1.8345 Epoch: 10 Global Step: 173000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:22:46,059-Speed 4461.40 samples/sec Loss 1.8343 Epoch: 10 Global Step: 173050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:22:57,503-Speed 4474.21 samples/sec Loss 1.8408 Epoch: 10 Global Step: 173100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:23:08,990-Speed 4457.52 samples/sec Loss 1.8823 Epoch: 10 Global Step: 173150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:23:20,304-Speed 4525.32 samples/sec Loss 1.8496 Epoch: 10 Global Step: 173200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:23:31,878-Speed 4423.86 samples/sec Loss 1.8602 Epoch: 10 Global Step: 173250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:23:43,316-Speed 4476.58 samples/sec Loss 1.8389 Epoch: 10 Global Step: 173300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:23:54,677-Speed 4506.81 samples/sec Loss 1.8591 Epoch: 10 Global Step: 173350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:24:06,611-Speed 4290.44 samples/sec Loss 1.8526 Epoch: 10 Global Step: 173400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:24:18,954-Speed 4148.48 samples/sec Loss 1.8558 Epoch: 10 Global Step: 173450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:24:30,649-Speed 4377.89 samples/sec Loss 1.8707 Epoch: 10 Global Step: 173500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:24:42,183-Speed 4439.54 samples/sec Loss 1.8401 Epoch: 10 Global Step: 173550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:24:53,653-Speed 4463.72 samples/sec Loss 1.8793 Epoch: 10 Global Step: 173600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:25:05,173-Speed 4444.96 samples/sec Loss 1.8659 Epoch: 10 Global Step: 173650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:25:16,555-Speed 4498.51 samples/sec Loss 1.8363 Epoch: 10 Global Step: 173700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:25:28,304-Speed 4357.99 samples/sec Loss 1.8630 Epoch: 10 Global Step: 173750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:25:41,680-Speed 3828.24 samples/sec Loss 1.9003 Epoch: 10 Global Step: 173800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:25:53,461-Speed 4345.95 samples/sec Loss 1.8807 Epoch: 10 Global Step: 173850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:26:04,962-Speed 4452.19 samples/sec Loss 1.8614 Epoch: 10 Global Step: 173900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:26:16,594-Speed 4401.85 samples/sec Loss 1.8755 Epoch: 10 Global Step: 173950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:26:28,157-Speed 4428.17 samples/sec Loss 1.8495 Epoch: 10 Global Step: 174000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:26:58,524-[lfw][174000]XNorm: 23.541269 Training: 2021-03-15 11:26:58,525-[lfw][174000]Accuracy-Flip: 0.99783+-0.00269 Training: 2021-03-15 11:26:58,525-[lfw][174000]Accuracy-Highest: 0.99783 Training: 2021-03-15 11:27:33,497-[cfp_fp][174000]XNorm: 21.506431 Training: 2021-03-15 11:27:33,498-[cfp_fp][174000]Accuracy-Flip: 0.98114+-0.00679 Training: 2021-03-15 11:27:33,498-[cfp_fp][174000]Accuracy-Highest: 0.98200 Training: 2021-03-15 11:28:03,730-[agedb_30][174000]XNorm: 23.693969 Training: 2021-03-15 11:28:03,730-[agedb_30][174000]Accuracy-Flip: 0.97650+-0.00797 Training: 2021-03-15 11:28:03,730-[agedb_30][174000]Accuracy-Highest: 0.97917 Training: 2021-03-15 11:28:15,229-Speed 478.18 samples/sec Loss 1.8821 Epoch: 10 Global Step: 174050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:28:26,832-Speed 4412.94 samples/sec Loss 1.8401 Epoch: 10 Global Step: 174100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:28:39,018-Speed 4201.87 samples/sec Loss 1.8888 Epoch: 10 Global Step: 174150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:28:50,475-Speed 4468.96 samples/sec Loss 1.8860 Epoch: 10 Global Step: 174200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-15 11:29:02,000-Speed 4442.73 samples/sec Loss 1.8670 Epoch: 10 Global Step: 174250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:29:13,587-Speed 4419.09 samples/sec Loss 1.8640 Epoch: 10 Global Step: 174300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:29:25,034-Speed 4472.94 samples/sec Loss 1.8566 Epoch: 10 Global Step: 174350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:29:36,787-Speed 4356.42 samples/sec Loss 1.8428 Epoch: 10 Global Step: 174400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:29:48,523-Speed 4362.95 samples/sec Loss 1.8964 Epoch: 10 Global Step: 174450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:29:59,983-Speed 4467.89 samples/sec Loss 1.8758 Epoch: 10 Global Step: 174500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:30:11,443-Speed 4467.86 samples/sec Loss 1.8798 Epoch: 10 Global Step: 174550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:30:22,900-Speed 4469.14 samples/sec Loss 1.8743 Epoch: 10 Global Step: 174600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:30:34,555-Speed 4393.12 samples/sec Loss 1.8758 Epoch: 10 Global Step: 174650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:30:46,121-Speed 4427.08 samples/sec Loss 1.8667 Epoch: 10 Global Step: 174700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:30:58,473-Speed 4145.25 samples/sec Loss 1.8861 Epoch: 10 Global Step: 174750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:31:09,917-Speed 4474.01 samples/sec Loss 1.8546 Epoch: 10 Global Step: 174800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:31:21,408-Speed 4456.02 samples/sec Loss 1.8755 Epoch: 10 Global Step: 174850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:31:32,981-Speed 4424.34 samples/sec Loss 1.8884 Epoch: 10 Global Step: 174900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:31:44,546-Speed 4427.41 samples/sec Loss 1.8600 Epoch: 10 Global Step: 174950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:31:56,097-Speed 4432.55 samples/sec Loss 1.8677 Epoch: 10 Global Step: 175000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:32:07,838-Speed 4361.21 samples/sec Loss 1.8811 Epoch: 10 Global Step: 175050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:32:19,283-Speed 4473.71 samples/sec Loss 1.8946 Epoch: 10 Global Step: 175100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:32:30,714-Speed 4479.04 samples/sec Loss 1.8755 Epoch: 10 Global Step: 175150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:32:42,202-Speed 4457.51 samples/sec Loss 1.8898 Epoch: 10 Global Step: 175200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:32:53,854-Speed 4394.26 samples/sec Loss 1.8840 Epoch: 10 Global Step: 175250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:33:05,487-Speed 4401.32 samples/sec Loss 1.8841 Epoch: 10 Global Step: 175300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:33:16,951-Speed 4466.65 samples/sec Loss 1.8729 Epoch: 10 Global Step: 175350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:33:28,451-Speed 4452.28 samples/sec Loss 1.8864 Epoch: 10 Global Step: 175400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:33:41,777-Speed 3842.22 samples/sec Loss 1.8988 Epoch: 10 Global Step: 175450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:33:53,506-Speed 4365.60 samples/sec Loss 1.9051 Epoch: 10 Global Step: 175500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:34:05,109-Speed 4412.77 samples/sec Loss 1.8719 Epoch: 10 Global Step: 175550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:34:16,614-Speed 4450.33 samples/sec Loss 1.8981 Epoch: 10 Global Step: 175600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:34:28,077-Speed 4466.95 samples/sec Loss 1.8859 Epoch: 10 Global Step: 175650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:34:40,539-Speed 4108.68 samples/sec Loss 1.8750 Epoch: 10 Global Step: 175700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:34:52,113-Speed 4423.95 samples/sec Loss 1.8863 Epoch: 10 Global Step: 175750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:35:03,573-Speed 4468.12 samples/sec Loss 1.9075 Epoch: 10 Global Step: 175800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:35:15,213-Speed 4398.61 samples/sec Loss 1.8894 Epoch: 10 Global Step: 175850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:35:26,775-Speed 4428.51 samples/sec Loss 1.9043 Epoch: 10 Global Step: 175900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:35:38,431-Speed 4392.73 samples/sec Loss 1.8787 Epoch: 10 Global Step: 175950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:35:50,066-Speed 4400.96 samples/sec Loss 1.8815 Epoch: 10 Global Step: 176000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:36:20,226-[lfw][176000]XNorm: 22.473668 Training: 2021-03-15 11:36:20,226-[lfw][176000]Accuracy-Flip: 0.99717+-0.00279 Training: 2021-03-15 11:36:20,227-[lfw][176000]Accuracy-Highest: 0.99783 Training: 2021-03-15 11:36:55,168-[cfp_fp][176000]XNorm: 20.583012 Training: 2021-03-15 11:36:55,169-[cfp_fp][176000]Accuracy-Flip: 0.97800+-0.00775 Training: 2021-03-15 11:36:55,169-[cfp_fp][176000]Accuracy-Highest: 0.98200 Training: 2021-03-15 11:37:25,265-[agedb_30][176000]XNorm: 22.156981 Training: 2021-03-15 11:37:25,265-[agedb_30][176000]Accuracy-Flip: 0.97667+-0.00699 Training: 2021-03-15 11:37:25,265-[agedb_30][176000]Accuracy-Highest: 0.97917 Training: 2021-03-15 11:37:37,679-Speed 475.78 samples/sec Loss 1.9052 Epoch: 10 Global Step: 176050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:37:49,418-Speed 4361.97 samples/sec Loss 1.8794 Epoch: 10 Global Step: 176100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:38:00,974-Speed 4430.52 samples/sec Loss 1.8927 Epoch: 10 Global Step: 176150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:38:12,457-Speed 4459.21 samples/sec Loss 1.9005 Epoch: 10 Global Step: 176200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:38:24,078-Speed 4405.79 samples/sec Loss 1.9204 Epoch: 10 Global Step: 176250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:38:35,760-Speed 4383.26 samples/sec Loss 1.8734 Epoch: 10 Global Step: 176300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:38:47,325-Speed 4427.34 samples/sec Loss 1.8793 Epoch: 10 Global Step: 176350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:38:58,847-Speed 4443.74 samples/sec Loss 1.8917 Epoch: 10 Global Step: 176400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:39:12,032-Speed 3883.55 samples/sec Loss 1.9025 Epoch: 10 Global Step: 176450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:39:23,506-Speed 4462.38 samples/sec Loss 1.9023 Epoch: 10 Global Step: 176500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:39:34,849-Speed 4513.94 samples/sec Loss 1.9101 Epoch: 10 Global Step: 176550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:39:46,341-Speed 4455.49 samples/sec Loss 1.8845 Epoch: 10 Global Step: 176600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:39:57,729-Speed 4496.46 samples/sec Loss 1.8898 Epoch: 10 Global Step: 176650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:40:09,304-Speed 4423.50 samples/sec Loss 1.8778 Epoch: 10 Global Step: 176700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:40:21,682-Speed 4136.47 samples/sec Loss 1.9081 Epoch: 10 Global Step: 176750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:40:33,199-Speed 4445.86 samples/sec Loss 1.9260 Epoch: 10 Global Step: 176800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:40:44,610-Speed 4487.23 samples/sec Loss 1.8892 Epoch: 10 Global Step: 176850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:40:56,247-Speed 4399.98 samples/sec Loss 1.8993 Epoch: 10 Global Step: 176900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:41:07,853-Speed 4411.73 samples/sec Loss 1.9208 Epoch: 10 Global Step: 176950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:41:19,594-Speed 4360.86 samples/sec Loss 1.8996 Epoch: 10 Global Step: 177000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:41:31,389-Speed 4341.00 samples/sec Loss 1.9002 Epoch: 10 Global Step: 177050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:41:42,993-Speed 4412.31 samples/sec Loss 1.9240 Epoch: 10 Global Step: 177100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:41:54,838-Speed 4322.80 samples/sec Loss 1.9187 Epoch: 10 Global Step: 177150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:42:06,287-Speed 4472.45 samples/sec Loss 1.9034 Epoch: 10 Global Step: 177200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:42:17,742-Speed 4469.87 samples/sec Loss 1.8904 Epoch: 10 Global Step: 177250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:42:29,252-Speed 4448.27 samples/sec Loss 1.8880 Epoch: 10 Global Step: 177300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:42:41,642-Speed 4132.75 samples/sec Loss 1.9358 Epoch: 10 Global Step: 177350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:42:53,319-Speed 4384.57 samples/sec Loss 1.9157 Epoch: 10 Global Step: 177400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:43:04,855-Speed 4438.59 samples/sec Loss 1.9195 Epoch: 10 Global Step: 177450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:43:16,695-Speed 4324.54 samples/sec Loss 1.9130 Epoch: 10 Global Step: 177500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:43:28,063-Speed 4504.01 samples/sec Loss 1.9286 Epoch: 10 Global Step: 177550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:43:39,510-Speed 4472.94 samples/sec Loss 1.9202 Epoch: 10 Global Step: 177600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:43:50,912-Speed 4490.89 samples/sec Loss 1.9494 Epoch: 10 Global Step: 177650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:44:02,470-Speed 4430.08 samples/sec Loss 1.9198 Epoch: 10 Global Step: 177700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:44:13,839-Speed 4503.38 samples/sec Loss 1.8965 Epoch: 10 Global Step: 177750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:44:24,953-Speed 4607.10 samples/sec Loss 1.9123 Epoch: 10 Global Step: 177800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:44:36,499-Speed 4434.79 samples/sec Loss 1.9158 Epoch: 10 Global Step: 177850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:44:48,045-Speed 4434.58 samples/sec Loss 1.9402 Epoch: 10 Global Step: 177900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:44:59,769-Speed 4367.66 samples/sec Loss 1.9168 Epoch: 10 Global Step: 177950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:45:11,298-Speed 4441.15 samples/sec Loss 1.9047 Epoch: 10 Global Step: 178000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:45:41,507-[lfw][178000]XNorm: 23.000276 Training: 2021-03-15 11:45:41,507-[lfw][178000]Accuracy-Flip: 0.99833+-0.00236 Training: 2021-03-15 11:45:41,507-[lfw][178000]Accuracy-Highest: 0.99833 Training: 2021-03-15 11:46:16,669-[cfp_fp][178000]XNorm: 20.670480 Training: 2021-03-15 11:46:16,669-[cfp_fp][178000]Accuracy-Flip: 0.98186+-0.00821 Training: 2021-03-15 11:46:16,669-[cfp_fp][178000]Accuracy-Highest: 0.98200 Training: 2021-03-15 11:46:47,009-[agedb_30][178000]XNorm: 22.784863 Training: 2021-03-15 11:46:47,009-[agedb_30][178000]Accuracy-Flip: 0.97700+-0.00702 Training: 2021-03-15 11:46:47,009-[agedb_30][178000]Accuracy-Highest: 0.97917 Training: 2021-03-15 11:46:59,478-Speed 473.28 samples/sec Loss 1.9270 Epoch: 10 Global Step: 178050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:47:11,824-Speed 4147.33 samples/sec Loss 1.9414 Epoch: 10 Global Step: 178100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:47:23,535-Speed 4372.06 samples/sec Loss 1.9175 Epoch: 10 Global Step: 178150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:47:35,228-Speed 4378.87 samples/sec Loss 1.9006 Epoch: 10 Global Step: 178200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:47:46,923-Speed 4378.08 samples/sec Loss 1.8967 Epoch: 10 Global Step: 178250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:47:58,458-Speed 4438.88 samples/sec Loss 1.9433 Epoch: 10 Global Step: 178300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:48:10,956-Speed 4096.93 samples/sec Loss 1.9081 Epoch: 10 Global Step: 178350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:48:22,433-Speed 4461.34 samples/sec Loss 1.9285 Epoch: 10 Global Step: 178400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:48:33,864-Speed 4479.23 samples/sec Loss 1.9228 Epoch: 10 Global Step: 178450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:48:45,325-Speed 4467.46 samples/sec Loss 1.9119 Epoch: 10 Global Step: 178500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:48:56,972-Speed 4396.17 samples/sec Loss 1.9164 Epoch: 10 Global Step: 178550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:49:08,326-Speed 4509.48 samples/sec Loss 1.9118 Epoch: 10 Global Step: 178600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:49:19,688-Speed 4506.53 samples/sec Loss 1.9623 Epoch: 10 Global Step: 178650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:49:32,151-Speed 4108.31 samples/sec Loss 1.9077 Epoch: 10 Global Step: 178700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:49:43,841-Speed 4380.27 samples/sec Loss 1.8966 Epoch: 10 Global Step: 178750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:49:55,283-Speed 4474.93 samples/sec Loss 1.9157 Epoch: 10 Global Step: 178800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:50:06,643-Speed 4507.13 samples/sec Loss 1.9309 Epoch: 10 Global Step: 178850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:50:18,172-Speed 4441.27 samples/sec Loss 1.9434 Epoch: 10 Global Step: 178900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:50:29,768-Speed 4415.44 samples/sec Loss 1.9324 Epoch: 10 Global Step: 178950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:50:41,340-Speed 4424.86 samples/sec Loss 1.9374 Epoch: 10 Global Step: 179000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:50:52,703-Speed 4506.33 samples/sec Loss 1.9230 Epoch: 10 Global Step: 179050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:51:05,930-Speed 3871.02 samples/sec Loss 1.9062 Epoch: 10 Global Step: 179100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:51:17,501-Speed 4425.10 samples/sec Loss 1.9390 Epoch: 10 Global Step: 179150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:51:29,074-Speed 4423.98 samples/sec Loss 1.9451 Epoch: 10 Global Step: 179200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:51:40,592-Speed 4445.61 samples/sec Loss 1.9310 Epoch: 10 Global Step: 179250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:51:52,127-Speed 4438.66 samples/sec Loss 1.9108 Epoch: 10 Global Step: 179300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:52:03,787-Speed 4391.58 samples/sec Loss 1.9422 Epoch: 10 Global Step: 179350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:52:16,134-Speed 4146.82 samples/sec Loss 1.9444 Epoch: 10 Global Step: 179400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:52:27,891-Speed 4354.96 samples/sec Loss 1.9479 Epoch: 10 Global Step: 179450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:52:39,419-Speed 4441.65 samples/sec Loss 1.9050 Epoch: 10 Global Step: 179500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:52:50,920-Speed 4452.17 samples/sec Loss 1.9404 Epoch: 10 Global Step: 179550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:53:02,411-Speed 4455.69 samples/sec Loss 1.9011 Epoch: 10 Global Step: 179600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:53:13,927-Speed 4446.23 samples/sec Loss 1.9473 Epoch: 10 Global Step: 179650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:53:25,277-Speed 4511.39 samples/sec Loss 1.9573 Epoch: 10 Global Step: 179700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:53:36,549-Speed 4542.41 samples/sec Loss 1.9417 Epoch: 10 Global Step: 179750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:53:48,022-Speed 4462.84 samples/sec Loss 1.9571 Epoch: 10 Global Step: 179800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:53:59,692-Speed 4387.41 samples/sec Loss 1.9511 Epoch: 10 Global Step: 179850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:54:11,368-Speed 4385.37 samples/sec Loss 1.9336 Epoch: 10 Global Step: 179900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:54:23,022-Speed 4393.42 samples/sec Loss 1.9486 Epoch: 10 Global Step: 179950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:54:35,373-Speed 4145.65 samples/sec Loss 1.9565 Epoch: 10 Global Step: 180000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:55:05,988-[lfw][180000]XNorm: 22.384606 Training: 2021-03-15 11:55:05,988-[lfw][180000]Accuracy-Flip: 0.99683+-0.00353 Training: 2021-03-15 11:55:05,988-[lfw][180000]Accuracy-Highest: 0.99833 Training: 2021-03-15 11:55:41,162-[cfp_fp][180000]XNorm: 20.559518 Training: 2021-03-15 11:55:41,162-[cfp_fp][180000]Accuracy-Flip: 0.97757+-0.00864 Training: 2021-03-15 11:55:41,162-[cfp_fp][180000]Accuracy-Highest: 0.98200 Training: 2021-03-15 11:56:11,513-[agedb_30][180000]XNorm: 22.273604 Training: 2021-03-15 11:56:11,513-[agedb_30][180000]Accuracy-Flip: 0.97617+-0.00723 Training: 2021-03-15 11:56:11,513-[agedb_30][180000]Accuracy-Highest: 0.97917 Training: 2021-03-15 11:56:23,000-Speed 475.72 samples/sec Loss 1.9489 Epoch: 10 Global Step: 180050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:56:34,438-Speed 4476.58 samples/sec Loss 1.9565 Epoch: 10 Global Step: 180100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:56:46,004-Speed 4426.78 samples/sec Loss 1.9288 Epoch: 10 Global Step: 180150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:56:57,628-Speed 4404.92 samples/sec Loss 1.9405 Epoch: 10 Global Step: 180200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:57:09,156-Speed 4441.51 samples/sec Loss 1.9320 Epoch: 10 Global Step: 180250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:57:20,697-Speed 4436.88 samples/sec Loss 1.9258 Epoch: 10 Global Step: 180300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:57:32,280-Speed 4420.34 samples/sec Loss 1.9414 Epoch: 10 Global Step: 180350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:57:43,891-Speed 4409.87 samples/sec Loss 1.9234 Epoch: 10 Global Step: 180400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:57:55,362-Speed 4463.69 samples/sec Loss 1.9321 Epoch: 10 Global Step: 180450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:58:06,847-Speed 4458.17 samples/sec Loss 1.9570 Epoch: 10 Global Step: 180500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:58:18,306-Speed 4468.24 samples/sec Loss 1.9414 Epoch: 10 Global Step: 180550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:58:29,873-Speed 4426.61 samples/sec Loss 1.9596 Epoch: 10 Global Step: 180600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:58:41,246-Speed 4502.06 samples/sec Loss 1.9380 Epoch: 10 Global Step: 180650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:58:52,769-Speed 4443.23 samples/sec Loss 1.9425 Epoch: 10 Global Step: 180700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:59:06,025-Speed 3862.78 samples/sec Loss 1.9602 Epoch: 10 Global Step: 180750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:59:17,422-Speed 4492.43 samples/sec Loss 1.9346 Epoch: 10 Global Step: 180800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:59:28,789-Speed 4504.62 samples/sec Loss 1.9247 Epoch: 10 Global Step: 180850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:59:40,197-Speed 4488.38 samples/sec Loss 1.9563 Epoch: 10 Global Step: 180900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 11:59:51,828-Speed 4402.03 samples/sec Loss 1.9696 Epoch: 10 Global Step: 180950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:00:03,428-Speed 4414.26 samples/sec Loss 1.9659 Epoch: 10 Global Step: 181000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:00:15,651-Speed 4188.92 samples/sec Loss 1.9404 Epoch: 10 Global Step: 181050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:00:27,118-Speed 4465.33 samples/sec Loss 1.9307 Epoch: 10 Global Step: 181100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:00:38,684-Speed 4426.81 samples/sec Loss 1.9257 Epoch: 10 Global Step: 181150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:00:50,109-Speed 4481.55 samples/sec Loss 1.9018 Epoch: 10 Global Step: 181200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:01:01,583-Speed 4462.67 samples/sec Loss 1.9734 Epoch: 10 Global Step: 181250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:01:13,274-Speed 4379.40 samples/sec Loss 1.9346 Epoch: 10 Global Step: 181300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:01:25,616-Speed 4148.68 samples/sec Loss 1.9634 Epoch: 10 Global Step: 181350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:01:37,244-Speed 4403.36 samples/sec Loss 1.9495 Epoch: 10 Global Step: 181400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:01:48,708-Speed 4466.53 samples/sec Loss 1.9376 Epoch: 10 Global Step: 181450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:02:00,085-Speed 4500.43 samples/sec Loss 1.9550 Epoch: 10 Global Step: 181500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:02:11,486-Speed 4490.92 samples/sec Loss 1.9404 Epoch: 10 Global Step: 181550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:02:22,868-Speed 4498.41 samples/sec Loss 1.9566 Epoch: 10 Global Step: 181600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:02:34,343-Speed 4462.46 samples/sec Loss 1.9003 Epoch: 10 Global Step: 181650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:02:47,507-Speed 3889.60 samples/sec Loss 1.9753 Epoch: 10 Global Step: 181700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:02:58,993-Speed 4457.78 samples/sec Loss 1.9433 Epoch: 10 Global Step: 181750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:03:10,626-Speed 4401.35 samples/sec Loss 1.9456 Epoch: 10 Global Step: 181800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:03:22,289-Speed 4390.02 samples/sec Loss 1.9396 Epoch: 10 Global Step: 181850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:03:33,730-Speed 4475.46 samples/sec Loss 1.9544 Epoch: 10 Global Step: 181900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:03:45,244-Speed 4446.98 samples/sec Loss 1.9361 Epoch: 10 Global Step: 181950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:03:57,556-Speed 4158.49 samples/sec Loss 1.9414 Epoch: 10 Global Step: 182000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:04:27,798-[lfw][182000]XNorm: 22.213985 Training: 2021-03-15 12:04:27,798-[lfw][182000]Accuracy-Flip: 0.99683+-0.00293 Training: 2021-03-15 12:04:27,798-[lfw][182000]Accuracy-Highest: 0.99833 Training: 2021-03-15 12:05:02,892-[cfp_fp][182000]XNorm: 20.215941 Training: 2021-03-15 12:05:02,893-[cfp_fp][182000]Accuracy-Flip: 0.98100+-0.00726 Training: 2021-03-15 12:05:02,893-[cfp_fp][182000]Accuracy-Highest: 0.98200 Training: 2021-03-15 12:05:33,059-[agedb_30][182000]XNorm: 22.512578 Training: 2021-03-15 12:05:33,060-[agedb_30][182000]Accuracy-Flip: 0.97800+-0.00686 Training: 2021-03-15 12:05:33,060-[agedb_30][182000]Accuracy-Highest: 0.97917 Training: 2021-03-15 12:05:44,776-Speed 477.53 samples/sec Loss 1.9580 Epoch: 10 Global Step: 182050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:05:56,156-Speed 4499.58 samples/sec Loss 1.9600 Epoch: 10 Global Step: 182100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:06:07,579-Speed 4482.27 samples/sec Loss 1.9622 Epoch: 10 Global Step: 182150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:06:19,123-Speed 4435.49 samples/sec Loss 1.9506 Epoch: 10 Global Step: 182200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:06:30,494-Speed 4502.82 samples/sec Loss 1.9521 Epoch: 10 Global Step: 182250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:06:42,089-Speed 4415.65 samples/sec Loss 1.9680 Epoch: 10 Global Step: 182300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:06:53,673-Speed 4420.11 samples/sec Loss 1.9714 Epoch: 10 Global Step: 182350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:07:05,109-Speed 4477.32 samples/sec Loss 1.9630 Epoch: 10 Global Step: 182400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:07:16,637-Speed 4441.77 samples/sec Loss 1.9371 Epoch: 10 Global Step: 182450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:07:28,440-Speed 4338.08 samples/sec Loss 1.9536 Epoch: 10 Global Step: 182500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:07:39,933-Speed 4455.04 samples/sec Loss 1.9294 Epoch: 10 Global Step: 182550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:07:51,316-Speed 4498.35 samples/sec Loss 1.9540 Epoch: 10 Global Step: 182600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:08:02,886-Speed 4425.23 samples/sec Loss 1.9732 Epoch: 10 Global Step: 182650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:08:14,611-Speed 4367.11 samples/sec Loss 1.9377 Epoch: 10 Global Step: 182700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:08:26,988-Speed 4136.79 samples/sec Loss 1.9709 Epoch: 10 Global Step: 182750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:08:38,443-Speed 4469.85 samples/sec Loss 1.9579 Epoch: 10 Global Step: 182800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:08:49,932-Speed 4456.79 samples/sec Loss 1.9615 Epoch: 10 Global Step: 182850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:09:01,414-Speed 4459.22 samples/sec Loss 1.9671 Epoch: 10 Global Step: 182900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:09:12,821-Speed 4488.95 samples/sec Loss 1.9491 Epoch: 10 Global Step: 182950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:09:24,324-Speed 4451.05 samples/sec Loss 1.9431 Epoch: 10 Global Step: 183000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:09:35,794-Speed 4464.10 samples/sec Loss 1.9592 Epoch: 10 Global Step: 183050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:09:47,250-Speed 4469.51 samples/sec Loss 1.9621 Epoch: 10 Global Step: 183100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:09:58,953-Speed 4375.17 samples/sec Loss 1.9741 Epoch: 10 Global Step: 183150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:10:10,438-Speed 4458.26 samples/sec Loss 1.9660 Epoch: 10 Global Step: 183200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:10:21,793-Speed 4509.28 samples/sec Loss 1.9708 Epoch: 10 Global Step: 183250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:10:33,272-Speed 4460.58 samples/sec Loss 1.9777 Epoch: 10 Global Step: 183300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:10:44,734-Speed 4466.93 samples/sec Loss 1.9583 Epoch: 10 Global Step: 183350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:10:58,057-Speed 3843.26 samples/sec Loss 1.9597 Epoch: 10 Global Step: 183400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:11:09,416-Speed 4507.65 samples/sec Loss 1.9395 Epoch: 10 Global Step: 183450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:11:21,020-Speed 4412.29 samples/sec Loss 1.9636 Epoch: 10 Global Step: 183500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:11:32,466-Speed 4473.39 samples/sec Loss 1.9660 Epoch: 10 Global Step: 183550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:11:44,219-Speed 4356.45 samples/sec Loss 1.9310 Epoch: 10 Global Step: 183600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:12:08,663-Speed 2094.69 samples/sec Loss 1.4916 Epoch: 11 Global Step: 183650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:12:20,267-Speed 4412.36 samples/sec Loss 1.3656 Epoch: 11 Global Step: 183700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:12:32,659-Speed 4132.00 samples/sec Loss 1.3364 Epoch: 11 Global Step: 183750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:12:44,277-Speed 4407.20 samples/sec Loss 1.3101 Epoch: 11 Global Step: 183800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:12:55,832-Speed 4431.07 samples/sec Loss 1.2642 Epoch: 11 Global Step: 183850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:13:07,598-Speed 4351.55 samples/sec Loss 1.2614 Epoch: 11 Global Step: 183900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:13:20,163-Speed 4075.03 samples/sec Loss 1.2314 Epoch: 11 Global Step: 183950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:13:31,674-Speed 4448.13 samples/sec Loss 1.2268 Epoch: 11 Global Step: 184000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:14:02,004-[lfw][184000]XNorm: 22.716408 Training: 2021-03-15 12:14:02,004-[lfw][184000]Accuracy-Flip: 0.99750+-0.00291 Training: 2021-03-15 12:14:02,004-[lfw][184000]Accuracy-Highest: 0.99833 Training: 2021-03-15 12:14:37,190-[cfp_fp][184000]XNorm: 21.194559 Training: 2021-03-15 12:14:37,190-[cfp_fp][184000]Accuracy-Flip: 0.98386+-0.00550 Training: 2021-03-15 12:14:37,190-[cfp_fp][184000]Accuracy-Highest: 0.98386 Training: 2021-03-15 12:15:07,406-[agedb_30][184000]XNorm: 22.747642 Training: 2021-03-15 12:15:07,407-[agedb_30][184000]Accuracy-Flip: 0.97817+-0.00697 Training: 2021-03-15 12:15:07,407-[agedb_30][184000]Accuracy-Highest: 0.97917 Training: 2021-03-15 12:15:19,073-Speed 476.73 samples/sec Loss 1.2261 Epoch: 11 Global Step: 184050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:15:30,737-Speed 4389.88 samples/sec Loss 1.2253 Epoch: 11 Global Step: 184100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:15:42,310-Speed 4424.09 samples/sec Loss 1.2035 Epoch: 11 Global Step: 184150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:15:54,091-Speed 4346.08 samples/sec Loss 1.2004 Epoch: 11 Global Step: 184200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:16:05,652-Speed 4429.03 samples/sec Loss 1.1901 Epoch: 11 Global Step: 184250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:16:19,144-Speed 3795.04 samples/sec Loss 1.1741 Epoch: 11 Global Step: 184300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:16:30,461-Speed 4524.14 samples/sec Loss 1.1834 Epoch: 11 Global Step: 184350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:16:42,097-Speed 4400.55 samples/sec Loss 1.1707 Epoch: 11 Global Step: 184400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:16:53,492-Speed 4493.28 samples/sec Loss 1.1740 Epoch: 11 Global Step: 184450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:17:04,809-Speed 4524.51 samples/sec Loss 1.1781 Epoch: 11 Global Step: 184500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:17:16,261-Speed 4470.69 samples/sec Loss 1.1372 Epoch: 11 Global Step: 184550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:17:27,805-Speed 4435.52 samples/sec Loss 1.1623 Epoch: 11 Global Step: 184600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:17:40,473-Speed 4041.98 samples/sec Loss 1.1274 Epoch: 11 Global Step: 184650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:17:52,042-Speed 4425.86 samples/sec Loss 1.1268 Epoch: 11 Global Step: 184700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:18:03,547-Speed 4450.34 samples/sec Loss 1.1482 Epoch: 11 Global Step: 184750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:18:15,129-Speed 4420.86 samples/sec Loss 1.1051 Epoch: 11 Global Step: 184800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:18:26,602-Speed 4462.68 samples/sec Loss 1.1187 Epoch: 11 Global Step: 184850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:18:38,218-Speed 4407.87 samples/sec Loss 1.1235 Epoch: 11 Global Step: 184900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:18:49,817-Speed 4414.50 samples/sec Loss 1.1297 Epoch: 11 Global Step: 184950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:19:01,534-Speed 4369.87 samples/sec Loss 1.1299 Epoch: 11 Global Step: 185000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:19:13,029-Speed 4454.32 samples/sec Loss 1.1047 Epoch: 11 Global Step: 185050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:19:24,571-Speed 4436.09 samples/sec Loss 1.1163 Epoch: 11 Global Step: 185100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:19:35,911-Speed 4515.18 samples/sec Loss 1.1122 Epoch: 11 Global Step: 185150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:19:47,573-Speed 4390.58 samples/sec Loss 1.1087 Epoch: 11 Global Step: 185200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:19:59,222-Speed 4395.24 samples/sec Loss 1.1108 Epoch: 11 Global Step: 185250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:20:11,567-Speed 4147.61 samples/sec Loss 1.1004 Epoch: 11 Global Step: 185300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:20:23,171-Speed 4412.49 samples/sec Loss 1.0965 Epoch: 11 Global Step: 185350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:20:34,549-Speed 4499.91 samples/sec Loss 1.0779 Epoch: 11 Global Step: 185400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:20:45,935-Speed 4497.15 samples/sec Loss 1.0781 Epoch: 11 Global Step: 185450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:20:57,386-Speed 4471.38 samples/sec Loss 1.0868 Epoch: 11 Global Step: 185500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:21:08,925-Speed 4437.29 samples/sec Loss 1.0891 Epoch: 11 Global Step: 185550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:21:20,630-Speed 4374.49 samples/sec Loss 1.0982 Epoch: 11 Global Step: 185600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:21:32,061-Speed 4479.02 samples/sec Loss 1.0840 Epoch: 11 Global Step: 185650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:21:43,556-Speed 4454.57 samples/sec Loss 1.0933 Epoch: 11 Global Step: 185700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:21:55,420-Speed 4315.48 samples/sec Loss 1.0728 Epoch: 11 Global Step: 185750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:22:06,915-Speed 4454.48 samples/sec Loss 1.0690 Epoch: 11 Global Step: 185800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:22:18,393-Speed 4460.66 samples/sec Loss 1.0778 Epoch: 11 Global Step: 185850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:22:30,151-Speed 4354.96 samples/sec Loss 1.0820 Epoch: 11 Global Step: 185900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:22:42,685-Speed 4085.03 samples/sec Loss 1.0763 Epoch: 11 Global Step: 185950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:22:54,337-Speed 4394.34 samples/sec Loss 1.0710 Epoch: 11 Global Step: 186000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:23:24,336-[lfw][186000]XNorm: 22.542352 Training: 2021-03-15 12:23:24,336-[lfw][186000]Accuracy-Flip: 0.99767+-0.00291 Training: 2021-03-15 12:23:24,336-[lfw][186000]Accuracy-Highest: 0.99833 Training: 2021-03-15 12:23:59,326-[cfp_fp][186000]XNorm: 21.349713 Training: 2021-03-15 12:23:59,326-[cfp_fp][186000]Accuracy-Flip: 0.98671+-0.00478 Training: 2021-03-15 12:23:59,327-[cfp_fp][186000]Accuracy-Highest: 0.98671 Training: 2021-03-15 12:24:29,472-[agedb_30][186000]XNorm: 22.764433 Training: 2021-03-15 12:24:29,473-[agedb_30][186000]Accuracy-Flip: 0.98033+-0.00702 Training: 2021-03-15 12:24:29,473-[agedb_30][186000]Accuracy-Highest: 0.98033 Training: 2021-03-15 12:24:42,079-Speed 475.21 samples/sec Loss 1.0891 Epoch: 11 Global Step: 186050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:24:53,773-Speed 4378.45 samples/sec Loss 1.0720 Epoch: 11 Global Step: 186100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:25:06,192-Speed 4123.10 samples/sec Loss 1.0768 Epoch: 11 Global Step: 186150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:25:17,904-Speed 4371.78 samples/sec Loss 1.0739 Epoch: 11 Global Step: 186200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:25:29,400-Speed 4453.76 samples/sec Loss 1.0647 Epoch: 11 Global Step: 186250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:25:40,944-Speed 4435.49 samples/sec Loss 1.0532 Epoch: 11 Global Step: 186300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:25:52,520-Speed 4423.10 samples/sec Loss 1.0613 Epoch: 11 Global Step: 186350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:26:04,294-Speed 4348.83 samples/sec Loss 1.0581 Epoch: 11 Global Step: 186400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:26:16,690-Speed 4130.27 samples/sec Loss 1.0370 Epoch: 11 Global Step: 186450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:26:28,111-Speed 4483.46 samples/sec Loss 1.0851 Epoch: 11 Global Step: 186500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:26:39,733-Speed 4405.46 samples/sec Loss 1.0626 Epoch: 11 Global Step: 186550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:26:51,205-Speed 4463.33 samples/sec Loss 1.0330 Epoch: 11 Global Step: 186600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:27:02,855-Speed 4394.92 samples/sec Loss 1.0394 Epoch: 11 Global Step: 186650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:27:14,337-Speed 4459.23 samples/sec Loss 1.0601 Epoch: 11 Global Step: 186700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:27:25,844-Speed 4449.83 samples/sec Loss 1.0600 Epoch: 11 Global Step: 186750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:27:38,057-Speed 4192.46 samples/sec Loss 1.0591 Epoch: 11 Global Step: 186800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:27:50,493-Speed 4117.18 samples/sec Loss 1.0446 Epoch: 11 Global Step: 186850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:28:02,020-Speed 4441.91 samples/sec Loss 1.0443 Epoch: 11 Global Step: 186900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-15 12:28:13,570-Speed 4433.10 samples/sec Loss 1.0539 Epoch: 11 Global Step: 186950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:28:25,296-Speed 4366.41 samples/sec Loss 1.0474 Epoch: 11 Global Step: 187000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:28:36,619-Speed 4522.20 samples/sec Loss 1.0355 Epoch: 11 Global Step: 187050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:28:48,123-Speed 4450.77 samples/sec Loss 1.0361 Epoch: 11 Global Step: 187100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:28:59,535-Speed 4486.49 samples/sec Loss 1.0268 Epoch: 11 Global Step: 187150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:29:10,974-Speed 4476.25 samples/sec Loss 1.0313 Epoch: 11 Global Step: 187200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:29:23,485-Speed 4092.66 samples/sec Loss 1.0249 Epoch: 11 Global Step: 187250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:29:35,200-Speed 4370.56 samples/sec Loss 1.0322 Epoch: 11 Global Step: 187300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:29:46,591-Speed 4495.18 samples/sec Loss 1.0341 Epoch: 11 Global Step: 187350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:29:58,368-Speed 4347.55 samples/sec Loss 1.0220 Epoch: 11 Global Step: 187400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:30:10,033-Speed 4389.21 samples/sec Loss 1.0314 Epoch: 11 Global Step: 187450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:30:21,758-Speed 4366.91 samples/sec Loss 1.0165 Epoch: 11 Global Step: 187500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:30:33,473-Speed 4370.95 samples/sec Loss 1.0351 Epoch: 11 Global Step: 187550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:30:45,054-Speed 4421.04 samples/sec Loss 1.0327 Epoch: 11 Global Step: 187600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:30:56,547-Speed 4455.18 samples/sec Loss 1.0311 Epoch: 11 Global Step: 187650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:31:08,971-Speed 4121.15 samples/sec Loss 1.0106 Epoch: 11 Global Step: 187700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:31:20,628-Speed 4392.47 samples/sec Loss 1.0100 Epoch: 11 Global Step: 187750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:31:32,184-Speed 4430.74 samples/sec Loss 1.0330 Epoch: 11 Global Step: 187800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:31:43,630-Speed 4473.23 samples/sec Loss 1.0182 Epoch: 11 Global Step: 187850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:31:55,239-Speed 4410.62 samples/sec Loss 1.0232 Epoch: 11 Global Step: 187900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:32:06,718-Speed 4460.39 samples/sec Loss 1.0229 Epoch: 11 Global Step: 187950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:32:18,193-Speed 4462.27 samples/sec Loss 1.0330 Epoch: 11 Global Step: 188000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:32:48,492-[lfw][188000]XNorm: 22.539879 Training: 2021-03-15 12:32:48,492-[lfw][188000]Accuracy-Flip: 0.99783+-0.00289 Training: 2021-03-15 12:32:48,493-[lfw][188000]Accuracy-Highest: 0.99833 Training: 2021-03-15 12:33:23,794-[cfp_fp][188000]XNorm: 21.335900 Training: 2021-03-15 12:33:23,794-[cfp_fp][188000]Accuracy-Flip: 0.98757+-0.00404 Training: 2021-03-15 12:33:23,795-[cfp_fp][188000]Accuracy-Highest: 0.98757 Training: 2021-03-15 12:33:54,178-[agedb_30][188000]XNorm: 22.636841 Training: 2021-03-15 12:33:54,179-[agedb_30][188000]Accuracy-Flip: 0.98000+-0.00641 Training: 2021-03-15 12:33:54,179-[agedb_30][188000]Accuracy-Highest: 0.98033 Training: 2021-03-15 12:34:05,457-Speed 477.33 samples/sec Loss 1.0247 Epoch: 11 Global Step: 188050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:34:16,865-Speed 4488.29 samples/sec Loss 1.0277 Epoch: 11 Global Step: 188100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:34:28,209-Speed 4513.62 samples/sec Loss 1.0175 Epoch: 11 Global Step: 188150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:34:39,758-Speed 4433.47 samples/sec Loss 1.0292 Epoch: 11 Global Step: 188200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:34:51,479-Speed 4368.20 samples/sec Loss 1.0135 Epoch: 11 Global Step: 188250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:35:03,174-Speed 4378.32 samples/sec Loss 1.0188 Epoch: 11 Global Step: 188300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:35:14,630-Speed 4469.48 samples/sec Loss 1.0055 Epoch: 11 Global Step: 188350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:35:25,902-Speed 4542.37 samples/sec Loss 1.0091 Epoch: 11 Global Step: 188400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:35:37,326-Speed 4482.03 samples/sec Loss 1.0048 Epoch: 11 Global Step: 188450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:35:48,974-Speed 4395.79 samples/sec Loss 0.9967 Epoch: 11 Global Step: 188500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:36:03,431-Speed 3541.46 samples/sec Loss 1.0077 Epoch: 11 Global Step: 188550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:36:14,836-Speed 4489.56 samples/sec Loss 1.0135 Epoch: 11 Global Step: 188600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:36:26,207-Speed 4503.11 samples/sec Loss 1.0050 Epoch: 11 Global Step: 188650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:36:37,670-Speed 4466.40 samples/sec Loss 0.9897 Epoch: 11 Global Step: 188700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:36:49,094-Speed 4481.98 samples/sec Loss 1.0060 Epoch: 11 Global Step: 188750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:37:00,331-Speed 4556.59 samples/sec Loss 1.0236 Epoch: 11 Global Step: 188800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:37:11,801-Speed 4463.99 samples/sec Loss 1.0090 Epoch: 11 Global Step: 188850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:37:23,325-Speed 4443.21 samples/sec Loss 1.0076 Epoch: 11 Global Step: 188900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:37:35,224-Speed 4302.97 samples/sec Loss 1.0119 Epoch: 11 Global Step: 188950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:37:47,675-Speed 4112.41 samples/sec Loss 1.0047 Epoch: 11 Global Step: 189000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:37:59,155-Speed 4460.26 samples/sec Loss 1.0019 Epoch: 11 Global Step: 189050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:38:10,640-Speed 4457.89 samples/sec Loss 0.9954 Epoch: 11 Global Step: 189100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:38:22,269-Speed 4403.02 samples/sec Loss 1.0031 Epoch: 11 Global Step: 189150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:38:33,625-Speed 4508.93 samples/sec Loss 1.0037 Epoch: 11 Global Step: 189200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:38:45,193-Speed 4426.33 samples/sec Loss 1.0196 Epoch: 11 Global Step: 189250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:38:58,455-Speed 3860.65 samples/sec Loss 1.0151 Epoch: 11 Global Step: 189300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:39:09,870-Speed 4485.42 samples/sec Loss 1.0003 Epoch: 11 Global Step: 189350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:39:21,430-Speed 4429.53 samples/sec Loss 0.9964 Epoch: 11 Global Step: 189400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:39:33,025-Speed 4415.56 samples/sec Loss 0.9984 Epoch: 11 Global Step: 189450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:39:44,547-Speed 4443.82 samples/sec Loss 1.0037 Epoch: 11 Global Step: 189500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:39:56,237-Speed 4380.06 samples/sec Loss 1.0186 Epoch: 11 Global Step: 189550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:40:07,818-Speed 4421.12 samples/sec Loss 0.9992 Epoch: 11 Global Step: 189600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:40:19,402-Speed 4420.21 samples/sec Loss 0.9838 Epoch: 11 Global Step: 189650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:40:31,310-Speed 4299.85 samples/sec Loss 0.9990 Epoch: 11 Global Step: 189700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:40:42,878-Speed 4426.07 samples/sec Loss 0.9915 Epoch: 11 Global Step: 189750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:40:55,165-Speed 4167.23 samples/sec Loss 0.9839 Epoch: 11 Global Step: 189800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:41:06,706-Speed 4436.71 samples/sec Loss 0.9950 Epoch: 11 Global Step: 189850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:41:18,428-Speed 4368.03 samples/sec Loss 1.0007 Epoch: 11 Global Step: 189900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:41:29,916-Speed 4456.74 samples/sec Loss 0.9843 Epoch: 11 Global Step: 189950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:41:41,517-Speed 4413.73 samples/sec Loss 0.9909 Epoch: 11 Global Step: 190000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:42:11,652-[lfw][190000]XNorm: 22.364760 Training: 2021-03-15 12:42:11,652-[lfw][190000]Accuracy-Flip: 0.99783+-0.00289 Training: 2021-03-15 12:42:11,653-[lfw][190000]Accuracy-Highest: 0.99833 Training: 2021-03-15 12:42:46,585-[cfp_fp][190000]XNorm: 21.214999 Training: 2021-03-15 12:42:46,586-[cfp_fp][190000]Accuracy-Flip: 0.98857+-0.00495 Training: 2021-03-15 12:42:46,586-[cfp_fp][190000]Accuracy-Highest: 0.98857 Training: 2021-03-15 12:43:16,668-[agedb_30][190000]XNorm: 22.602927 Training: 2021-03-15 12:43:16,668-[agedb_30][190000]Accuracy-Flip: 0.98217+-0.00803 Training: 2021-03-15 12:43:16,668-[agedb_30][190000]Accuracy-Highest: 0.98217 Training: 2021-03-15 12:43:28,158-Speed 480.12 samples/sec Loss 1.0003 Epoch: 11 Global Step: 190050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:43:39,657-Speed 4452.67 samples/sec Loss 0.9907 Epoch: 11 Global Step: 190100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:43:51,955-Speed 4163.54 samples/sec Loss 0.9748 Epoch: 11 Global Step: 190150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:44:03,770-Speed 4333.47 samples/sec Loss 0.9805 Epoch: 11 Global Step: 190200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:44:15,355-Speed 4419.81 samples/sec Loss 0.9962 Epoch: 11 Global Step: 190250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:44:27,154-Speed 4339.71 samples/sec Loss 0.9916 Epoch: 11 Global Step: 190300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:44:38,645-Speed 4455.66 samples/sec Loss 1.0044 Epoch: 11 Global Step: 190350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:44:50,090-Speed 4473.84 samples/sec Loss 0.9926 Epoch: 11 Global Step: 190400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:45:01,432-Speed 4514.48 samples/sec Loss 0.9758 Epoch: 11 Global Step: 190450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:45:13,016-Speed 4419.98 samples/sec Loss 0.9692 Epoch: 11 Global Step: 190500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:45:24,365-Speed 4511.70 samples/sec Loss 0.9883 Epoch: 11 Global Step: 190550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:45:35,880-Speed 4446.65 samples/sec Loss 0.9682 Epoch: 11 Global Step: 190600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:45:47,371-Speed 4455.53 samples/sec Loss 0.9903 Epoch: 11 Global Step: 190650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:45:58,803-Speed 4479.17 samples/sec Loss 0.9816 Epoch: 11 Global Step: 190700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:46:10,332-Speed 4441.02 samples/sec Loss 0.9928 Epoch: 11 Global Step: 190750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:46:21,839-Speed 4449.60 samples/sec Loss 0.9798 Epoch: 11 Global Step: 190800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:46:33,401-Speed 4428.63 samples/sec Loss 0.9803 Epoch: 11 Global Step: 190850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:46:44,890-Speed 4456.76 samples/sec Loss 0.9840 Epoch: 11 Global Step: 190900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:46:56,362-Speed 4463.14 samples/sec Loss 0.9804 Epoch: 11 Global Step: 190950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:47:07,885-Speed 4443.54 samples/sec Loss 0.9784 Epoch: 11 Global Step: 191000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:47:19,441-Speed 4430.95 samples/sec Loss 0.9863 Epoch: 11 Global Step: 191050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:47:32,786-Speed 3836.80 samples/sec Loss 0.9738 Epoch: 11 Global Step: 191100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:47:45,244-Speed 4109.90 samples/sec Loss 0.9850 Epoch: 11 Global Step: 191150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:47:56,642-Speed 4492.26 samples/sec Loss 0.9785 Epoch: 11 Global Step: 191200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:48:08,144-Speed 4451.42 samples/sec Loss 0.9588 Epoch: 11 Global Step: 191250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:48:19,858-Speed 4370.93 samples/sec Loss 0.9638 Epoch: 11 Global Step: 191300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:48:31,363-Speed 4450.50 samples/sec Loss 0.9618 Epoch: 11 Global Step: 191350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:48:42,752-Speed 4495.85 samples/sec Loss 0.9907 Epoch: 11 Global Step: 191400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:48:55,399-Speed 4048.49 samples/sec Loss 0.9945 Epoch: 11 Global Step: 191450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:49:07,030-Speed 4402.14 samples/sec Loss 0.9679 Epoch: 11 Global Step: 191500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:49:18,686-Speed 4392.90 samples/sec Loss 0.9784 Epoch: 11 Global Step: 191550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:49:30,184-Speed 4453.09 samples/sec Loss 0.9770 Epoch: 11 Global Step: 191600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:49:41,596-Speed 4486.90 samples/sec Loss 0.9659 Epoch: 11 Global Step: 191650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:49:53,227-Speed 4401.89 samples/sec Loss 0.9560 Epoch: 11 Global Step: 191700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:50:05,576-Speed 4146.49 samples/sec Loss 0.9694 Epoch: 11 Global Step: 191750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:50:17,154-Speed 4422.17 samples/sec Loss 0.9696 Epoch: 11 Global Step: 191800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:50:29,396-Speed 4182.43 samples/sec Loss 0.9776 Epoch: 11 Global Step: 191850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:50:40,775-Speed 4499.80 samples/sec Loss 0.9711 Epoch: 11 Global Step: 191900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:50:52,400-Speed 4404.32 samples/sec Loss 0.9652 Epoch: 11 Global Step: 191950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:51:03,864-Speed 4466.54 samples/sec Loss 0.9756 Epoch: 11 Global Step: 192000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:51:34,104-[lfw][192000]XNorm: 22.431943 Training: 2021-03-15 12:51:34,105-[lfw][192000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-15 12:51:34,105-[lfw][192000]Accuracy-Highest: 0.99833 Training: 2021-03-15 12:52:09,266-[cfp_fp][192000]XNorm: 21.586501 Training: 2021-03-15 12:52:09,266-[cfp_fp][192000]Accuracy-Flip: 0.98700+-0.00513 Training: 2021-03-15 12:52:09,266-[cfp_fp][192000]Accuracy-Highest: 0.98857 Training: 2021-03-15 12:52:39,580-[agedb_30][192000]XNorm: 22.948293 Training: 2021-03-15 12:52:39,581-[agedb_30][192000]Accuracy-Flip: 0.98217+-0.00533 Training: 2021-03-15 12:52:39,581-[agedb_30][192000]Accuracy-Highest: 0.98217 Training: 2021-03-15 12:52:50,938-Speed 478.18 samples/sec Loss 0.9764 Epoch: 11 Global Step: 192050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:53:02,328-Speed 4495.35 samples/sec Loss 0.9603 Epoch: 11 Global Step: 192100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:53:13,823-Speed 4454.09 samples/sec Loss 0.9787 Epoch: 11 Global Step: 192150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:53:25,396-Speed 4424.54 samples/sec Loss 0.9874 Epoch: 11 Global Step: 192200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:53:37,080-Speed 4382.23 samples/sec Loss 0.9747 Epoch: 11 Global Step: 192250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:53:48,604-Speed 4443.07 samples/sec Loss 0.9784 Epoch: 11 Global Step: 192300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:54:01,337-Speed 4021.13 samples/sec Loss 0.9762 Epoch: 11 Global Step: 192350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:54:12,780-Speed 4474.44 samples/sec Loss 0.9542 Epoch: 11 Global Step: 192400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:54:24,421-Speed 4398.61 samples/sec Loss 0.9595 Epoch: 11 Global Step: 192450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:54:35,919-Speed 4453.07 samples/sec Loss 0.9678 Epoch: 11 Global Step: 192500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:54:47,448-Speed 4441.08 samples/sec Loss 0.9679 Epoch: 11 Global Step: 192550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:54:59,833-Speed 4134.10 samples/sec Loss 0.9537 Epoch: 11 Global Step: 192600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:55:11,311-Speed 4460.86 samples/sec Loss 0.9605 Epoch: 11 Global Step: 192650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:55:22,744-Speed 4478.61 samples/sec Loss 0.9635 Epoch: 11 Global Step: 192700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:55:34,306-Speed 4428.45 samples/sec Loss 0.9666 Epoch: 11 Global Step: 192750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:55:45,746-Speed 4475.73 samples/sec Loss 0.9549 Epoch: 11 Global Step: 192800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:55:57,500-Speed 4356.21 samples/sec Loss 0.9645 Epoch: 11 Global Step: 192850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:56:09,161-Speed 4390.73 samples/sec Loss 0.9822 Epoch: 11 Global Step: 192900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:56:20,922-Speed 4353.54 samples/sec Loss 0.9724 Epoch: 11 Global Step: 192950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:56:32,604-Speed 4382.90 samples/sec Loss 0.9544 Epoch: 11 Global Step: 193000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:56:44,396-Speed 4342.30 samples/sec Loss 0.9611 Epoch: 11 Global Step: 193050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:56:55,983-Speed 4418.84 samples/sec Loss 0.9557 Epoch: 11 Global Step: 193100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:57:07,326-Speed 4514.11 samples/sec Loss 0.9854 Epoch: 11 Global Step: 193150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:57:18,567-Speed 4554.80 samples/sec Loss 0.9541 Epoch: 11 Global Step: 193200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:57:30,048-Speed 4459.74 samples/sec Loss 0.9548 Epoch: 11 Global Step: 193250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:57:41,718-Speed 4387.64 samples/sec Loss 0.9537 Epoch: 11 Global Step: 193300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:57:53,309-Speed 4417.38 samples/sec Loss 0.9482 Epoch: 11 Global Step: 193350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:58:04,708-Speed 4491.60 samples/sec Loss 0.9484 Epoch: 11 Global Step: 193400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:58:16,236-Speed 4441.59 samples/sec Loss 0.9605 Epoch: 11 Global Step: 193450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:58:27,960-Speed 4367.29 samples/sec Loss 0.9554 Epoch: 11 Global Step: 193500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:58:40,297-Speed 4150.26 samples/sec Loss 0.9563 Epoch: 11 Global Step: 193550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:58:51,741-Speed 4474.20 samples/sec Loss 0.9666 Epoch: 11 Global Step: 193600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:59:05,132-Speed 3823.75 samples/sec Loss 0.9363 Epoch: 11 Global Step: 193650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:59:16,439-Speed 4528.27 samples/sec Loss 0.9502 Epoch: 11 Global Step: 193700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:59:27,871-Speed 4478.62 samples/sec Loss 0.9505 Epoch: 11 Global Step: 193750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:59:39,403-Speed 4440.23 samples/sec Loss 0.9455 Epoch: 11 Global Step: 193800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 12:59:50,944-Speed 4436.48 samples/sec Loss 0.9603 Epoch: 11 Global Step: 193850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:00:03,659-Speed 4026.78 samples/sec Loss 0.9289 Epoch: 11 Global Step: 193900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:00:14,965-Speed 4528.80 samples/sec Loss 0.9565 Epoch: 11 Global Step: 193950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:00:26,646-Speed 4383.46 samples/sec Loss 0.9636 Epoch: 11 Global Step: 194000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:00:56,987-[lfw][194000]XNorm: 21.976471 Training: 2021-03-15 13:00:56,987-[lfw][194000]Accuracy-Flip: 0.99783+-0.00289 Training: 2021-03-15 13:00:56,987-[lfw][194000]Accuracy-Highest: 0.99833 Training: 2021-03-15 13:01:32,287-[cfp_fp][194000]XNorm: 21.265495 Training: 2021-03-15 13:01:32,287-[cfp_fp][194000]Accuracy-Flip: 0.98786+-0.00488 Training: 2021-03-15 13:01:32,287-[cfp_fp][194000]Accuracy-Highest: 0.98857 Training: 2021-03-15 13:02:02,684-[agedb_30][194000]XNorm: 22.474744 Training: 2021-03-15 13:02:02,684-[agedb_30][194000]Accuracy-Flip: 0.98133+-0.00694 Training: 2021-03-15 13:02:02,684-[agedb_30][194000]Accuracy-Highest: 0.98217 Training: 2021-03-15 13:02:14,152-Speed 476.25 samples/sec Loss 0.9513 Epoch: 11 Global Step: 194050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:02:25,709-Speed 4430.58 samples/sec Loss 0.9435 Epoch: 11 Global Step: 194100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:02:36,960-Speed 4551.01 samples/sec Loss 0.9314 Epoch: 11 Global Step: 194150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:02:48,813-Speed 4319.55 samples/sec Loss 0.9551 Epoch: 11 Global Step: 194200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:03:00,207-Speed 4493.91 samples/sec Loss 0.9592 Epoch: 11 Global Step: 194250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:03:12,616-Speed 4126.13 samples/sec Loss 0.9366 Epoch: 11 Global Step: 194300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:03:24,140-Speed 4443.27 samples/sec Loss 0.9402 Epoch: 11 Global Step: 194350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:03:36,537-Speed 4130.01 samples/sec Loss 0.9523 Epoch: 11 Global Step: 194400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:03:47,930-Speed 4494.32 samples/sec Loss 0.9627 Epoch: 11 Global Step: 194450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:03:59,656-Speed 4366.71 samples/sec Loss 0.9421 Epoch: 11 Global Step: 194500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:04:11,207-Speed 4432.53 samples/sec Loss 0.9626 Epoch: 11 Global Step: 194550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:04:22,744-Speed 4438.25 samples/sec Loss 0.9596 Epoch: 11 Global Step: 194600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:04:34,158-Speed 4485.66 samples/sec Loss 0.9500 Epoch: 11 Global Step: 194650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:04:45,749-Speed 4417.39 samples/sec Loss 0.9430 Epoch: 11 Global Step: 194700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:04:57,313-Speed 4427.95 samples/sec Loss 0.9600 Epoch: 11 Global Step: 194750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:05:08,659-Speed 4512.73 samples/sec Loss 0.9464 Epoch: 11 Global Step: 194800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:05:20,029-Speed 4503.16 samples/sec Loss 0.9425 Epoch: 11 Global Step: 194850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:05:31,577-Speed 4434.08 samples/sec Loss 0.9530 Epoch: 11 Global Step: 194900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:05:43,827-Speed 4179.73 samples/sec Loss 0.9531 Epoch: 11 Global Step: 194950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:05:55,387-Speed 4429.03 samples/sec Loss 0.9485 Epoch: 11 Global Step: 195000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:06:06,990-Speed 4413.02 samples/sec Loss 0.9543 Epoch: 11 Global Step: 195050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:06:19,358-Speed 4139.69 samples/sec Loss 0.9584 Epoch: 11 Global Step: 195100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:06:30,885-Speed 4442.22 samples/sec Loss 0.9310 Epoch: 11 Global Step: 195150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:06:42,231-Speed 4512.85 samples/sec Loss 0.9483 Epoch: 11 Global Step: 195200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:06:53,771-Speed 4436.76 samples/sec Loss 0.9373 Epoch: 11 Global Step: 195250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:07:05,399-Speed 4403.28 samples/sec Loss 0.9655 Epoch: 11 Global Step: 195300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:07:17,120-Speed 4368.53 samples/sec Loss 0.9361 Epoch: 11 Global Step: 195350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:07:28,666-Speed 4434.71 samples/sec Loss 0.9412 Epoch: 11 Global Step: 195400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:07:40,082-Speed 4485.17 samples/sec Loss 0.9443 Epoch: 11 Global Step: 195450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:07:51,489-Speed 4488.51 samples/sec Loss 0.9338 Epoch: 11 Global Step: 195500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:08:03,004-Speed 4446.50 samples/sec Loss 0.9438 Epoch: 11 Global Step: 195550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:08:14,432-Speed 4480.42 samples/sec Loss 0.9342 Epoch: 11 Global Step: 195600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:08:26,174-Speed 4360.53 samples/sec Loss 0.9374 Epoch: 11 Global Step: 195650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:08:37,528-Speed 4509.72 samples/sec Loss 0.9200 Epoch: 11 Global Step: 195700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:08:49,012-Speed 4458.46 samples/sec Loss 0.9486 Epoch: 11 Global Step: 195750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:09:00,543-Speed 4440.37 samples/sec Loss 0.9323 Epoch: 11 Global Step: 195800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:09:12,005-Speed 4467.31 samples/sec Loss 0.9216 Epoch: 11 Global Step: 195850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:09:23,542-Speed 4437.88 samples/sec Loss 0.9285 Epoch: 11 Global Step: 195900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:09:34,987-Speed 4473.66 samples/sec Loss 0.9353 Epoch: 11 Global Step: 195950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:09:47,254-Speed 4174.10 samples/sec Loss 0.9316 Epoch: 11 Global Step: 196000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:10:17,439-[lfw][196000]XNorm: 22.280926 Training: 2021-03-15 13:10:17,439-[lfw][196000]Accuracy-Flip: 0.99783+-0.00259 Training: 2021-03-15 13:10:17,439-[lfw][196000]Accuracy-Highest: 0.99833 Training: 2021-03-15 13:10:52,467-[cfp_fp][196000]XNorm: 21.535415 Training: 2021-03-15 13:10:52,468-[cfp_fp][196000]Accuracy-Flip: 0.98986+-0.00435 Training: 2021-03-15 13:10:52,468-[cfp_fp][196000]Accuracy-Highest: 0.98986 Training: 2021-03-15 13:11:22,759-[agedb_30][196000]XNorm: 22.725126 Training: 2021-03-15 13:11:22,759-[agedb_30][196000]Accuracy-Flip: 0.98033+-0.00714 Training: 2021-03-15 13:11:22,759-[agedb_30][196000]Accuracy-Highest: 0.98217 Training: 2021-03-15 13:11:34,442-Speed 477.67 samples/sec Loss 0.9413 Epoch: 11 Global Step: 196050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:11:46,046-Speed 4412.47 samples/sec Loss 0.9541 Epoch: 11 Global Step: 196100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:11:57,836-Speed 4342.84 samples/sec Loss 0.9272 Epoch: 11 Global Step: 196150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:12:10,296-Speed 4109.36 samples/sec Loss 0.9368 Epoch: 11 Global Step: 196200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:12:22,532-Speed 4184.60 samples/sec Loss 0.9310 Epoch: 11 Global Step: 196250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:12:33,978-Speed 4473.05 samples/sec Loss 0.9320 Epoch: 11 Global Step: 196300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:12:45,541-Speed 4428.21 samples/sec Loss 0.9082 Epoch: 11 Global Step: 196350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:12:57,986-Speed 4114.22 samples/sec Loss 0.9399 Epoch: 11 Global Step: 196400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:13:09,588-Speed 4413.30 samples/sec Loss 0.9473 Epoch: 11 Global Step: 196450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:13:21,144-Speed 4430.63 samples/sec Loss 0.9314 Epoch: 11 Global Step: 196500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:13:32,631-Speed 4457.47 samples/sec Loss 0.9373 Epoch: 11 Global Step: 196550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:13:44,335-Speed 4374.62 samples/sec Loss 0.9153 Epoch: 11 Global Step: 196600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:13:55,833-Speed 4453.30 samples/sec Loss 0.9324 Epoch: 11 Global Step: 196650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:14:07,200-Speed 4504.63 samples/sec Loss 0.9496 Epoch: 11 Global Step: 196700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:14:19,440-Speed 4183.01 samples/sec Loss 0.9327 Epoch: 11 Global Step: 196750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:14:31,092-Speed 4394.17 samples/sec Loss 0.9273 Epoch: 11 Global Step: 196800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:14:42,648-Speed 4431.02 samples/sec Loss 0.9542 Epoch: 11 Global Step: 196850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:14:54,189-Speed 4436.65 samples/sec Loss 0.9394 Epoch: 11 Global Step: 196900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:15:06,552-Speed 4141.57 samples/sec Loss 0.9420 Epoch: 11 Global Step: 196950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:15:17,935-Speed 4498.11 samples/sec Loss 0.9461 Epoch: 11 Global Step: 197000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:15:29,637-Speed 4375.28 samples/sec Loss 0.9403 Epoch: 11 Global Step: 197050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:15:41,458-Speed 4331.35 samples/sec Loss 0.9470 Epoch: 11 Global Step: 197100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:15:53,096-Speed 4399.85 samples/sec Loss 0.9277 Epoch: 11 Global Step: 197150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:16:04,722-Speed 4403.89 samples/sec Loss 0.9249 Epoch: 11 Global Step: 197200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:16:16,078-Speed 4508.93 samples/sec Loss 0.9329 Epoch: 11 Global Step: 197250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:16:27,467-Speed 4495.71 samples/sec Loss 0.9379 Epoch: 11 Global Step: 197300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:16:38,974-Speed 4449.54 samples/sec Loss 0.9429 Epoch: 11 Global Step: 197350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:16:50,629-Speed 4393.44 samples/sec Loss 0.9335 Epoch: 11 Global Step: 197400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:17:03,158-Speed 4086.57 samples/sec Loss 0.9340 Epoch: 11 Global Step: 197450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:17:14,916-Speed 4354.70 samples/sec Loss 0.9289 Epoch: 11 Global Step: 197500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:17:26,550-Speed 4400.92 samples/sec Loss 0.9278 Epoch: 11 Global Step: 197550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:17:38,738-Speed 4201.19 samples/sec Loss 0.9315 Epoch: 11 Global Step: 197600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:17:49,988-Speed 4551.00 samples/sec Loss 0.9276 Epoch: 11 Global Step: 197650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:18:01,355-Speed 4504.62 samples/sec Loss 0.9176 Epoch: 11 Global Step: 197700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:18:13,024-Speed 4387.82 samples/sec Loss 0.9326 Epoch: 11 Global Step: 197750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:18:24,729-Speed 4374.48 samples/sec Loss 0.9474 Epoch: 11 Global Step: 197800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:18:36,121-Speed 4494.69 samples/sec Loss 0.9221 Epoch: 11 Global Step: 197850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:18:47,612-Speed 4455.64 samples/sec Loss 0.9403 Epoch: 11 Global Step: 197900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:18:59,033-Speed 4483.16 samples/sec Loss 0.9354 Epoch: 11 Global Step: 197950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:19:10,758-Speed 4366.87 samples/sec Loss 0.9403 Epoch: 11 Global Step: 198000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:19:41,059-[lfw][198000]XNorm: 22.100007 Training: 2021-03-15 13:19:41,059-[lfw][198000]Accuracy-Flip: 0.99783+-0.00259 Training: 2021-03-15 13:19:41,059-[lfw][198000]Accuracy-Highest: 0.99833 Training: 2021-03-15 13:20:16,266-[cfp_fp][198000]XNorm: 21.357132 Training: 2021-03-15 13:20:16,267-[cfp_fp][198000]Accuracy-Flip: 0.98943+-0.00420 Training: 2021-03-15 13:20:16,267-[cfp_fp][198000]Accuracy-Highest: 0.98986 Training: 2021-03-15 13:20:46,681-[agedb_30][198000]XNorm: 22.590082 Training: 2021-03-15 13:20:46,681-[agedb_30][198000]Accuracy-Flip: 0.98200+-0.00674 Training: 2021-03-15 13:20:46,681-[agedb_30][198000]Accuracy-Highest: 0.98217 Training: 2021-03-15 13:20:58,187-Speed 476.60 samples/sec Loss 0.9118 Epoch: 11 Global Step: 198050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:21:10,095-Speed 4299.84 samples/sec Loss 0.9221 Epoch: 11 Global Step: 198100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:21:21,642-Speed 4433.91 samples/sec Loss 0.9331 Epoch: 11 Global Step: 198150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:21:33,091-Speed 4472.52 samples/sec Loss 0.9341 Epoch: 11 Global Step: 198200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:21:44,630-Speed 4437.07 samples/sec Loss 0.9225 Epoch: 11 Global Step: 198250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:21:56,185-Speed 4431.22 samples/sec Loss 0.9123 Epoch: 11 Global Step: 198300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:22:07,836-Speed 4394.79 samples/sec Loss 0.9400 Epoch: 11 Global Step: 198350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:22:19,622-Speed 4344.19 samples/sec Loss 0.9232 Epoch: 11 Global Step: 198400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:22:32,130-Speed 4093.46 samples/sec Loss 0.9113 Epoch: 11 Global Step: 198450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:22:43,573-Speed 4474.76 samples/sec Loss 0.9153 Epoch: 11 Global Step: 198500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:22:55,092-Speed 4444.88 samples/sec Loss 0.9207 Epoch: 11 Global Step: 198550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:23:06,579-Speed 4457.22 samples/sec Loss 0.9276 Epoch: 11 Global Step: 198600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:23:18,391-Speed 4334.79 samples/sec Loss 0.9207 Epoch: 11 Global Step: 198650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:23:30,820-Speed 4119.50 samples/sec Loss 0.9304 Epoch: 11 Global Step: 198700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:23:42,398-Speed 4422.35 samples/sec Loss 0.9170 Epoch: 11 Global Step: 198750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:23:54,780-Speed 4135.24 samples/sec Loss 0.9141 Epoch: 11 Global Step: 198800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:24:06,324-Speed 4435.47 samples/sec Loss 0.9310 Epoch: 11 Global Step: 198850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:24:18,820-Speed 4097.39 samples/sec Loss 0.9381 Epoch: 11 Global Step: 198900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:24:30,496-Speed 4385.20 samples/sec Loss 0.9272 Epoch: 11 Global Step: 198950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:24:41,987-Speed 4456.14 samples/sec Loss 0.9068 Epoch: 11 Global Step: 199000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:24:53,443-Speed 4469.35 samples/sec Loss 0.9077 Epoch: 11 Global Step: 199050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:25:05,072-Speed 4403.12 samples/sec Loss 0.9282 Epoch: 11 Global Step: 199100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:25:16,659-Speed 4418.65 samples/sec Loss 0.9156 Epoch: 11 Global Step: 199150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:25:28,129-Speed 4464.20 samples/sec Loss 0.9361 Epoch: 11 Global Step: 199200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:25:40,424-Speed 4164.28 samples/sec Loss 0.9156 Epoch: 11 Global Step: 199250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:25:52,139-Speed 4370.84 samples/sec Loss 0.9420 Epoch: 11 Global Step: 199300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:26:03,997-Speed 4317.96 samples/sec Loss 0.9153 Epoch: 11 Global Step: 199350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:26:15,499-Speed 4451.38 samples/sec Loss 0.9265 Epoch: 11 Global Step: 199400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:26:27,841-Speed 4148.48 samples/sec Loss 0.9244 Epoch: 11 Global Step: 199450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:26:39,458-Speed 4407.61 samples/sec Loss 0.9310 Epoch: 11 Global Step: 199500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:26:51,506-Speed 4249.85 samples/sec Loss 0.9289 Epoch: 11 Global Step: 199550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:27:03,251-Speed 4359.41 samples/sec Loss 0.9164 Epoch: 11 Global Step: 199600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-15 13:27:14,749-Speed 4453.24 samples/sec Loss 0.9322 Epoch: 11 Global Step: 199650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:27:26,490-Speed 4360.97 samples/sec Loss 0.9046 Epoch: 11 Global Step: 199700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:27:38,333-Speed 4323.35 samples/sec Loss 0.9412 Epoch: 11 Global Step: 199750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:27:49,975-Speed 4398.20 samples/sec Loss 0.9086 Epoch: 11 Global Step: 199800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:28:01,639-Speed 4389.58 samples/sec Loss 0.9338 Epoch: 11 Global Step: 199850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:28:13,368-Speed 4365.31 samples/sec Loss 0.9052 Epoch: 11 Global Step: 199900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:28:24,885-Speed 4446.14 samples/sec Loss 0.9188 Epoch: 11 Global Step: 199950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:28:37,432-Speed 4080.56 samples/sec Loss 0.9394 Epoch: 11 Global Step: 200000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:29:07,643-[lfw][200000]XNorm: 22.281824 Training: 2021-03-15 13:29:07,644-[lfw][200000]Accuracy-Flip: 0.99783+-0.00279 Training: 2021-03-15 13:29:07,644-[lfw][200000]Accuracy-Highest: 0.99833 Training: 2021-03-15 13:29:42,854-[cfp_fp][200000]XNorm: 21.645849 Training: 2021-03-15 13:29:42,855-[cfp_fp][200000]Accuracy-Flip: 0.98786+-0.00483 Training: 2021-03-15 13:29:42,855-[cfp_fp][200000]Accuracy-Highest: 0.98986 Training: 2021-03-15 13:30:13,189-[agedb_30][200000]XNorm: 22.730642 Training: 2021-03-15 13:30:13,189-[agedb_30][200000]Accuracy-Flip: 0.98083+-0.00757 Training: 2021-03-15 13:30:13,189-[agedb_30][200000]Accuracy-Highest: 0.98217 Training: 2021-03-15 13:30:24,765-Speed 477.02 samples/sec Loss 0.9272 Epoch: 11 Global Step: 200050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:30:37,149-Speed 4134.35 samples/sec Loss 0.9043 Epoch: 11 Global Step: 200100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:30:48,820-Speed 4387.40 samples/sec Loss 0.9102 Epoch: 11 Global Step: 200150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:31:00,587-Speed 4351.09 samples/sec Loss 0.9349 Epoch: 11 Global Step: 200200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:31:12,462-Speed 4311.85 samples/sec Loss 0.9063 Epoch: 11 Global Step: 200250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:31:36,872-Speed 2097.55 samples/sec Loss 0.9171 Epoch: 12 Global Step: 200300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:31:49,083-Speed 4193.09 samples/sec Loss 0.8556 Epoch: 12 Global Step: 200350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:32:01,154-Speed 4241.89 samples/sec Loss 0.8441 Epoch: 12 Global Step: 200400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:32:12,861-Speed 4373.53 samples/sec Loss 0.8411 Epoch: 12 Global Step: 200450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:32:24,502-Speed 4398.64 samples/sec Loss 0.8346 Epoch: 12 Global Step: 200500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:32:36,074-Speed 4424.61 samples/sec Loss 0.8587 Epoch: 12 Global Step: 200550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:32:47,710-Speed 4400.23 samples/sec Loss 0.8539 Epoch: 12 Global Step: 200600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:32:59,399-Speed 4380.48 samples/sec Loss 0.8457 Epoch: 12 Global Step: 200650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:33:11,218-Speed 4331.99 samples/sec Loss 0.8399 Epoch: 12 Global Step: 200700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:33:22,837-Speed 4406.86 samples/sec Loss 0.8525 Epoch: 12 Global Step: 200750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:33:34,367-Speed 4440.99 samples/sec Loss 0.8390 Epoch: 12 Global Step: 200800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:33:46,074-Speed 4373.57 samples/sec Loss 0.8453 Epoch: 12 Global Step: 200850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:33:57,840-Speed 4351.77 samples/sec Loss 0.8422 Epoch: 12 Global Step: 200900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:34:09,575-Speed 4363.09 samples/sec Loss 0.8463 Epoch: 12 Global Step: 200950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:34:22,087-Speed 4091.96 samples/sec Loss 0.8466 Epoch: 12 Global Step: 201000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:34:33,802-Speed 4370.76 samples/sec Loss 0.8360 Epoch: 12 Global Step: 201050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:34:45,489-Speed 4381.01 samples/sec Loss 0.8325 Epoch: 12 Global Step: 201100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:34:57,005-Speed 4446.42 samples/sec Loss 0.8348 Epoch: 12 Global Step: 201150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:35:08,665-Speed 4391.24 samples/sec Loss 0.8610 Epoch: 12 Global Step: 201200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:35:21,243-Speed 4070.76 samples/sec Loss 0.8328 Epoch: 12 Global Step: 201250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:35:32,812-Speed 4425.66 samples/sec Loss 0.8424 Epoch: 12 Global Step: 201300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:35:44,496-Speed 4382.22 samples/sec Loss 0.8368 Epoch: 12 Global Step: 201350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:35:56,282-Speed 4344.32 samples/sec Loss 0.8564 Epoch: 12 Global Step: 201400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:36:09,756-Speed 3800.09 samples/sec Loss 0.8457 Epoch: 12 Global Step: 201450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:36:21,653-Speed 4303.85 samples/sec Loss 0.8381 Epoch: 12 Global Step: 201500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:36:33,643-Speed 4270.32 samples/sec Loss 0.8537 Epoch: 12 Global Step: 201550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:36:45,336-Speed 4378.73 samples/sec Loss 0.8409 Epoch: 12 Global Step: 201600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:36:57,784-Speed 4113.34 samples/sec Loss 0.8468 Epoch: 12 Global Step: 201650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:37:09,550-Speed 4351.80 samples/sec Loss 0.8375 Epoch: 12 Global Step: 201700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:37:21,196-Speed 4396.57 samples/sec Loss 0.8584 Epoch: 12 Global Step: 201750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:37:33,158-Speed 4280.47 samples/sec Loss 0.8382 Epoch: 12 Global Step: 201800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:37:45,069-Speed 4298.58 samples/sec Loss 0.8509 Epoch: 12 Global Step: 201850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:37:56,734-Speed 4389.52 samples/sec Loss 0.8385 Epoch: 12 Global Step: 201900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:38:08,454-Speed 4368.73 samples/sec Loss 0.8365 Epoch: 12 Global Step: 201950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:38:20,741-Speed 4167.04 samples/sec Loss 0.8298 Epoch: 12 Global Step: 202000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:38:51,129-[lfw][202000]XNorm: 22.652140 Training: 2021-03-15 13:38:51,129-[lfw][202000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 13:38:51,129-[lfw][202000]Accuracy-Highest: 0.99833 Training: 2021-03-15 13:39:26,350-[cfp_fp][202000]XNorm: 21.722959 Training: 2021-03-15 13:39:26,351-[cfp_fp][202000]Accuracy-Flip: 0.98929+-0.00512 Training: 2021-03-15 13:39:26,351-[cfp_fp][202000]Accuracy-Highest: 0.98986 Training: 2021-03-15 13:39:56,737-[agedb_30][202000]XNorm: 22.961673 Training: 2021-03-15 13:39:56,737-[agedb_30][202000]Accuracy-Flip: 0.98200+-0.00670 Training: 2021-03-15 13:39:56,737-[agedb_30][202000]Accuracy-Highest: 0.98217 Training: 2021-03-15 13:40:08,535-Speed 474.98 samples/sec Loss 0.8645 Epoch: 12 Global Step: 202050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:40:20,296-Speed 4353.51 samples/sec Loss 0.8417 Epoch: 12 Global Step: 202100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:40:31,882-Speed 4419.20 samples/sec Loss 0.8575 Epoch: 12 Global Step: 202150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:40:43,633-Speed 4357.47 samples/sec Loss 0.8673 Epoch: 12 Global Step: 202200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:40:55,149-Speed 4445.95 samples/sec Loss 0.8437 Epoch: 12 Global Step: 202250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:41:06,952-Speed 4338.29 samples/sec Loss 0.8413 Epoch: 12 Global Step: 202300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:41:18,565-Speed 4408.90 samples/sec Loss 0.8328 Epoch: 12 Global Step: 202350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:41:30,383-Speed 4332.67 samples/sec Loss 0.8446 Epoch: 12 Global Step: 202400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:41:42,062-Speed 4384.14 samples/sec Loss 0.8494 Epoch: 12 Global Step: 202450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:41:54,624-Speed 4075.93 samples/sec Loss 0.8397 Epoch: 12 Global Step: 202500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:42:07,388-Speed 4011.48 samples/sec Loss 0.8363 Epoch: 12 Global Step: 202550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:42:19,100-Speed 4371.40 samples/sec Loss 0.8600 Epoch: 12 Global Step: 202600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:42:30,760-Speed 4391.52 samples/sec Loss 0.8335 Epoch: 12 Global Step: 202650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:42:42,477-Speed 4369.94 samples/sec Loss 0.8688 Epoch: 12 Global Step: 202700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:42:54,425-Speed 4285.22 samples/sec Loss 0.8568 Epoch: 12 Global Step: 202750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:43:06,070-Speed 4396.97 samples/sec Loss 0.8559 Epoch: 12 Global Step: 202800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:43:17,950-Speed 4309.74 samples/sec Loss 0.8294 Epoch: 12 Global Step: 202850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:43:29,672-Speed 4368.10 samples/sec Loss 0.8407 Epoch: 12 Global Step: 202900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:43:41,621-Speed 4285.08 samples/sec Loss 0.8638 Epoch: 12 Global Step: 202950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:43:53,223-Speed 4413.29 samples/sec Loss 0.8542 Epoch: 12 Global Step: 203000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:44:05,164-Speed 4287.82 samples/sec Loss 0.8472 Epoch: 12 Global Step: 203050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:44:16,646-Speed 4459.46 samples/sec Loss 0.8345 Epoch: 12 Global Step: 203100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:44:28,346-Speed 4376.08 samples/sec Loss 0.8362 Epoch: 12 Global Step: 203150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:44:39,852-Speed 4450.20 samples/sec Loss 0.8479 Epoch: 12 Global Step: 203200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:44:51,515-Speed 4389.95 samples/sec Loss 0.8466 Epoch: 12 Global Step: 203250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:45:03,274-Speed 4354.22 samples/sec Loss 0.8567 Epoch: 12 Global Step: 203300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:45:14,992-Speed 4369.75 samples/sec Loss 0.8393 Epoch: 12 Global Step: 203350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:45:26,702-Speed 4372.34 samples/sec Loss 0.8575 Epoch: 12 Global Step: 203400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:45:38,451-Speed 4358.27 samples/sec Loss 0.8508 Epoch: 12 Global Step: 203450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:45:51,095-Speed 4049.23 samples/sec Loss 0.8423 Epoch: 12 Global Step: 203500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:46:02,850-Speed 4356.06 samples/sec Loss 0.8484 Epoch: 12 Global Step: 203550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:46:14,420-Speed 4425.18 samples/sec Loss 0.8386 Epoch: 12 Global Step: 203600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:46:26,045-Speed 4404.42 samples/sec Loss 0.8588 Epoch: 12 Global Step: 203650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:46:37,983-Speed 4289.01 samples/sec Loss 0.8600 Epoch: 12 Global Step: 203700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:46:49,624-Speed 4398.65 samples/sec Loss 0.8458 Epoch: 12 Global Step: 203750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:47:01,596-Speed 4276.74 samples/sec Loss 0.8361 Epoch: 12 Global Step: 203800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:47:13,591-Speed 4268.41 samples/sec Loss 0.8490 Epoch: 12 Global Step: 203850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:47:25,232-Speed 4398.57 samples/sec Loss 0.8502 Epoch: 12 Global Step: 203900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:47:38,066-Speed 3989.40 samples/sec Loss 0.8483 Epoch: 12 Global Step: 203950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:47:51,385-Speed 3844.37 samples/sec Loss 0.8343 Epoch: 12 Global Step: 204000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:48:21,533-[lfw][204000]XNorm: 22.092595 Training: 2021-03-15 13:48:21,533-[lfw][204000]Accuracy-Flip: 0.99800+-0.00287 Training: 2021-03-15 13:48:21,533-[lfw][204000]Accuracy-Highest: 0.99833 Training: 2021-03-15 13:48:56,497-[cfp_fp][204000]XNorm: 21.335390 Training: 2021-03-15 13:48:56,497-[cfp_fp][204000]Accuracy-Flip: 0.98843+-0.00493 Training: 2021-03-15 13:48:56,497-[cfp_fp][204000]Accuracy-Highest: 0.98986 Training: 2021-03-15 13:49:26,677-[agedb_30][204000]XNorm: 22.467572 Training: 2021-03-15 13:49:26,677-[agedb_30][204000]Accuracy-Flip: 0.98067+-0.00663 Training: 2021-03-15 13:49:26,677-[agedb_30][204000]Accuracy-Highest: 0.98217 Training: 2021-03-15 13:49:38,363-Speed 478.60 samples/sec Loss 0.8505 Epoch: 12 Global Step: 204050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:49:50,921-Speed 4077.22 samples/sec Loss 0.8447 Epoch: 12 Global Step: 204100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:50:02,564-Speed 4397.91 samples/sec Loss 0.8680 Epoch: 12 Global Step: 204150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:50:14,137-Speed 4424.27 samples/sec Loss 0.8403 Epoch: 12 Global Step: 204200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:50:26,246-Speed 4228.39 samples/sec Loss 0.8551 Epoch: 12 Global Step: 204250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:50:38,155-Speed 4299.25 samples/sec Loss 0.8588 Epoch: 12 Global Step: 204300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:50:49,923-Speed 4351.20 samples/sec Loss 0.8555 Epoch: 12 Global Step: 204350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:51:01,539-Speed 4407.92 samples/sec Loss 0.8268 Epoch: 12 Global Step: 204400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:51:13,440-Speed 4302.29 samples/sec Loss 0.8453 Epoch: 12 Global Step: 204450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:51:25,010-Speed 4425.36 samples/sec Loss 0.8444 Epoch: 12 Global Step: 204500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:51:37,506-Speed 4097.29 samples/sec Loss 0.8633 Epoch: 12 Global Step: 204550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:51:49,126-Speed 4406.61 samples/sec Loss 0.8513 Epoch: 12 Global Step: 204600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:52:00,862-Speed 4362.91 samples/sec Loss 0.8381 Epoch: 12 Global Step: 204650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:52:12,433-Speed 4424.73 samples/sec Loss 0.8635 Epoch: 12 Global Step: 204700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:52:24,020-Speed 4418.91 samples/sec Loss 0.8535 Epoch: 12 Global Step: 204750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:52:35,805-Speed 4344.87 samples/sec Loss 0.8272 Epoch: 12 Global Step: 204800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:52:47,481-Speed 4385.25 samples/sec Loss 0.8445 Epoch: 12 Global Step: 204850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:53:00,141-Speed 4044.28 samples/sec Loss 0.8369 Epoch: 12 Global Step: 204900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:53:11,854-Speed 4371.53 samples/sec Loss 0.8565 Epoch: 12 Global Step: 204950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:53:23,567-Speed 4371.50 samples/sec Loss 0.8363 Epoch: 12 Global Step: 205000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:53:35,522-Speed 4282.86 samples/sec Loss 0.8513 Epoch: 12 Global Step: 205050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:53:47,330-Speed 4336.24 samples/sec Loss 0.8589 Epoch: 12 Global Step: 205100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:53:59,169-Speed 4324.65 samples/sec Loss 0.8500 Epoch: 12 Global Step: 205150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:54:10,905-Speed 4362.75 samples/sec Loss 0.8443 Epoch: 12 Global Step: 205200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:54:23,822-Speed 3964.03 samples/sec Loss 0.8594 Epoch: 12 Global Step: 205250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:54:35,425-Speed 4412.86 samples/sec Loss 0.8403 Epoch: 12 Global Step: 205300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:54:47,060-Speed 4400.69 samples/sec Loss 0.8420 Epoch: 12 Global Step: 205350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:54:58,733-Speed 4386.22 samples/sec Loss 0.8351 Epoch: 12 Global Step: 205400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:55:10,400-Speed 4388.67 samples/sec Loss 0.8488 Epoch: 12 Global Step: 205450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:55:22,039-Speed 4399.32 samples/sec Loss 0.8330 Epoch: 12 Global Step: 205500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:55:33,799-Speed 4353.88 samples/sec Loss 0.8462 Epoch: 12 Global Step: 205550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:55:45,372-Speed 4424.17 samples/sec Loss 0.8561 Epoch: 12 Global Step: 205600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:55:57,200-Speed 4328.81 samples/sec Loss 0.8492 Epoch: 12 Global Step: 205650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:56:09,173-Speed 4276.50 samples/sec Loss 0.8578 Epoch: 12 Global Step: 205700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:56:20,852-Speed 4383.96 samples/sec Loss 0.8186 Epoch: 12 Global Step: 205750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:56:32,664-Speed 4334.84 samples/sec Loss 0.8351 Epoch: 12 Global Step: 205800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:56:44,255-Speed 4417.60 samples/sec Loss 0.8472 Epoch: 12 Global Step: 205850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:56:56,165-Speed 4299.18 samples/sec Loss 0.8472 Epoch: 12 Global Step: 205900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:57:08,754-Speed 4066.95 samples/sec Loss 0.8492 Epoch: 12 Global Step: 205950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:57:20,621-Speed 4314.75 samples/sec Loss 0.8292 Epoch: 12 Global Step: 206000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:57:50,880-[lfw][206000]XNorm: 22.028599 Training: 2021-03-15 13:57:50,880-[lfw][206000]Accuracy-Flip: 0.99767+-0.00281 Training: 2021-03-15 13:57:50,881-[lfw][206000]Accuracy-Highest: 0.99833 Training: 2021-03-15 13:58:25,959-[cfp_fp][206000]XNorm: 21.411596 Training: 2021-03-15 13:58:25,960-[cfp_fp][206000]Accuracy-Flip: 0.98871+-0.00467 Training: 2021-03-15 13:58:25,960-[cfp_fp][206000]Accuracy-Highest: 0.98986 Training: 2021-03-15 13:58:56,268-[agedb_30][206000]XNorm: 22.516709 Training: 2021-03-15 13:58:56,268-[agedb_30][206000]Accuracy-Flip: 0.98250+-0.00672 Training: 2021-03-15 13:58:56,268-[agedb_30][206000]Accuracy-Highest: 0.98250 Training: 2021-03-15 13:59:07,870-Speed 477.40 samples/sec Loss 0.8392 Epoch: 12 Global Step: 206050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:59:19,505-Speed 4400.50 samples/sec Loss 0.8391 Epoch: 12 Global Step: 206100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:59:31,375-Speed 4313.77 samples/sec Loss 0.8457 Epoch: 12 Global Step: 206150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:59:43,100-Speed 4367.06 samples/sec Loss 0.8530 Epoch: 12 Global Step: 206200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 13:59:54,891-Speed 4342.28 samples/sec Loss 0.8378 Epoch: 12 Global Step: 206250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:00:06,683-Speed 4342.04 samples/sec Loss 0.8251 Epoch: 12 Global Step: 206300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:00:18,255-Speed 4424.62 samples/sec Loss 0.8445 Epoch: 12 Global Step: 206350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:00:31,108-Speed 3983.82 samples/sec Loss 0.8447 Epoch: 12 Global Step: 206400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:00:42,746-Speed 4399.46 samples/sec Loss 0.8381 Epoch: 12 Global Step: 206450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:00:54,367-Speed 4406.07 samples/sec Loss 0.8346 Epoch: 12 Global Step: 206500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:01:06,143-Speed 4347.71 samples/sec Loss 0.8483 Epoch: 12 Global Step: 206550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:01:19,027-Speed 3974.17 samples/sec Loss 0.8578 Epoch: 12 Global Step: 206600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:01:31,672-Speed 4049.23 samples/sec Loss 0.8374 Epoch: 12 Global Step: 206650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:01:43,639-Speed 4278.77 samples/sec Loss 0.8602 Epoch: 12 Global Step: 206700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:01:56,416-Speed 4007.16 samples/sec Loss 0.8614 Epoch: 12 Global Step: 206750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:02:08,051-Speed 4400.88 samples/sec Loss 0.8422 Epoch: 12 Global Step: 206800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:02:19,854-Speed 4337.98 samples/sec Loss 0.8386 Epoch: 12 Global Step: 206850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:02:31,619-Speed 4351.91 samples/sec Loss 0.8374 Epoch: 12 Global Step: 206900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:02:43,242-Speed 4405.47 samples/sec Loss 0.8350 Epoch: 12 Global Step: 206950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:02:54,898-Speed 4392.82 samples/sec Loss 0.8456 Epoch: 12 Global Step: 207000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:03:06,587-Speed 4380.32 samples/sec Loss 0.8558 Epoch: 12 Global Step: 207050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:03:18,479-Speed 4305.38 samples/sec Loss 0.8368 Epoch: 12 Global Step: 207100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:03:31,177-Speed 4032.35 samples/sec Loss 0.8324 Epoch: 12 Global Step: 207150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:03:43,081-Speed 4301.14 samples/sec Loss 0.8409 Epoch: 12 Global Step: 207200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:03:54,815-Speed 4363.84 samples/sec Loss 0.8379 Epoch: 12 Global Step: 207250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:04:06,260-Speed 4473.76 samples/sec Loss 0.8544 Epoch: 12 Global Step: 207300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:04:18,097-Speed 4325.51 samples/sec Loss 0.8425 Epoch: 12 Global Step: 207350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:04:30,746-Speed 4047.95 samples/sec Loss 0.8471 Epoch: 12 Global Step: 207400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:04:42,722-Speed 4275.27 samples/sec Loss 0.8375 Epoch: 12 Global Step: 207450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:04:54,652-Speed 4291.85 samples/sec Loss 0.8345 Epoch: 12 Global Step: 207500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:05:06,456-Speed 4337.63 samples/sec Loss 0.8406 Epoch: 12 Global Step: 207550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:05:18,452-Speed 4268.28 samples/sec Loss 0.8493 Epoch: 12 Global Step: 207600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:05:30,112-Speed 4391.29 samples/sec Loss 0.8363 Epoch: 12 Global Step: 207650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:05:41,770-Speed 4392.31 samples/sec Loss 0.8524 Epoch: 12 Global Step: 207700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:05:53,431-Speed 4390.59 samples/sec Loss 0.8437 Epoch: 12 Global Step: 207750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:06:05,441-Speed 4263.38 samples/sec Loss 0.8551 Epoch: 12 Global Step: 207800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:06:18,322-Speed 3974.97 samples/sec Loss 0.8363 Epoch: 12 Global Step: 207850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:06:30,075-Speed 4356.40 samples/sec Loss 0.8365 Epoch: 12 Global Step: 207900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:06:41,812-Speed 4362.68 samples/sec Loss 0.8563 Epoch: 12 Global Step: 207950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:06:53,580-Speed 4350.77 samples/sec Loss 0.8343 Epoch: 12 Global Step: 208000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:07:23,818-[lfw][208000]XNorm: 22.233194 Training: 2021-03-15 14:07:23,818-[lfw][208000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-15 14:07:23,818-[lfw][208000]Accuracy-Highest: 0.99833 Training: 2021-03-15 14:07:58,981-[cfp_fp][208000]XNorm: 21.671178 Training: 2021-03-15 14:07:58,982-[cfp_fp][208000]Accuracy-Flip: 0.98943+-0.00492 Training: 2021-03-15 14:07:58,982-[cfp_fp][208000]Accuracy-Highest: 0.98986 Training: 2021-03-15 14:08:29,295-[agedb_30][208000]XNorm: 22.631738 Training: 2021-03-15 14:08:29,295-[agedb_30][208000]Accuracy-Flip: 0.98200+-0.00632 Training: 2021-03-15 14:08:29,295-[agedb_30][208000]Accuracy-Highest: 0.98250 Training: 2021-03-15 14:08:40,903-Speed 477.07 samples/sec Loss 0.8412 Epoch: 12 Global Step: 208050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:08:52,755-Speed 4320.08 samples/sec Loss 0.8507 Epoch: 12 Global Step: 208100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:09:04,416-Speed 4390.64 samples/sec Loss 0.8391 Epoch: 12 Global Step: 208150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:09:16,278-Speed 4316.53 samples/sec Loss 0.8391 Epoch: 12 Global Step: 208200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:09:28,103-Speed 4330.18 samples/sec Loss 0.8392 Epoch: 12 Global Step: 208250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:09:39,806-Speed 4375.09 samples/sec Loss 0.8446 Epoch: 12 Global Step: 208300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:09:52,304-Speed 4096.87 samples/sec Loss 0.8507 Epoch: 12 Global Step: 208350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:10:04,417-Speed 4226.69 samples/sec Loss 0.8560 Epoch: 12 Global Step: 208400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:10:16,001-Speed 4420.25 samples/sec Loss 0.8431 Epoch: 12 Global Step: 208450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:10:28,000-Speed 4267.31 samples/sec Loss 0.8523 Epoch: 12 Global Step: 208500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:10:39,761-Speed 4353.66 samples/sec Loss 0.8401 Epoch: 12 Global Step: 208550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:10:51,607-Speed 4322.15 samples/sec Loss 0.8428 Epoch: 12 Global Step: 208600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:11:03,319-Speed 4371.65 samples/sec Loss 0.8288 Epoch: 12 Global Step: 208650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:11:15,066-Speed 4358.69 samples/sec Loss 0.8524 Epoch: 12 Global Step: 208700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:11:26,654-Speed 4418.76 samples/sec Loss 0.8264 Epoch: 12 Global Step: 208750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:11:38,615-Speed 4280.75 samples/sec Loss 0.8282 Epoch: 12 Global Step: 208800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:11:50,561-Speed 4285.96 samples/sec Loss 0.8536 Epoch: 12 Global Step: 208850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:12:02,769-Speed 4194.30 samples/sec Loss 0.8504 Epoch: 12 Global Step: 208900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:12:14,388-Speed 4406.61 samples/sec Loss 0.8470 Epoch: 12 Global Step: 208950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:12:26,217-Speed 4328.63 samples/sec Loss 0.8285 Epoch: 12 Global Step: 209000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:12:37,769-Speed 4432.14 samples/sec Loss 0.8405 Epoch: 12 Global Step: 209050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:12:49,608-Speed 4325.03 samples/sec Loss 0.8294 Epoch: 12 Global Step: 209100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:13:01,452-Speed 4322.91 samples/sec Loss 0.8205 Epoch: 12 Global Step: 209150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:13:13,056-Speed 4412.58 samples/sec Loss 0.8334 Epoch: 12 Global Step: 209200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:13:26,424-Speed 3830.05 samples/sec Loss 0.8327 Epoch: 12 Global Step: 209250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:13:38,223-Speed 4339.68 samples/sec Loss 0.8322 Epoch: 12 Global Step: 209300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:13:50,138-Speed 4297.34 samples/sec Loss 0.8307 Epoch: 12 Global Step: 209350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:14:01,986-Speed 4321.67 samples/sec Loss 0.8480 Epoch: 12 Global Step: 209400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:14:14,667-Speed 4037.55 samples/sec Loss 0.8326 Epoch: 12 Global Step: 209450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:14:26,669-Speed 4266.13 samples/sec Loss 0.8322 Epoch: 12 Global Step: 209500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:14:38,533-Speed 4315.71 samples/sec Loss 0.8373 Epoch: 12 Global Step: 209550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:14:50,505-Speed 4276.81 samples/sec Loss 0.8510 Epoch: 12 Global Step: 209600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:15:02,449-Speed 4286.91 samples/sec Loss 0.8417 Epoch: 12 Global Step: 209650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:15:15,223-Speed 4008.31 samples/sec Loss 0.8408 Epoch: 12 Global Step: 209700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:15:27,015-Speed 4341.96 samples/sec Loss 0.8302 Epoch: 12 Global Step: 209750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:15:38,589-Speed 4424.07 samples/sec Loss 0.8517 Epoch: 12 Global Step: 209800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:15:51,162-Speed 4072.28 samples/sec Loss 0.8386 Epoch: 12 Global Step: 209850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:16:03,033-Speed 4313.31 samples/sec Loss 0.8269 Epoch: 12 Global Step: 209900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:16:14,830-Speed 4340.29 samples/sec Loss 0.8386 Epoch: 12 Global Step: 209950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:16:26,750-Speed 4295.16 samples/sec Loss 0.8447 Epoch: 12 Global Step: 210000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:16:57,168-[lfw][210000]XNorm: 22.603418 Training: 2021-03-15 14:16:57,168-[lfw][210000]Accuracy-Flip: 0.99767+-0.00291 Training: 2021-03-15 14:16:57,168-[lfw][210000]Accuracy-Highest: 0.99833 Training: 2021-03-15 14:17:32,401-[cfp_fp][210000]XNorm: 21.817374 Training: 2021-03-15 14:17:32,402-[cfp_fp][210000]Accuracy-Flip: 0.98900+-0.00503 Training: 2021-03-15 14:17:32,402-[cfp_fp][210000]Accuracy-Highest: 0.98986 Training: 2021-03-15 14:18:02,621-[agedb_30][210000]XNorm: 22.974966 Training: 2021-03-15 14:18:02,622-[agedb_30][210000]Accuracy-Flip: 0.98167+-0.00742 Training: 2021-03-15 14:18:02,622-[agedb_30][210000]Accuracy-Highest: 0.98250 Training: 2021-03-15 14:18:14,259-Speed 476.24 samples/sec Loss 0.8271 Epoch: 12 Global Step: 210050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:18:25,862-Speed 4412.89 samples/sec Loss 0.8379 Epoch: 12 Global Step: 210100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:18:37,666-Speed 4337.57 samples/sec Loss 0.8287 Epoch: 12 Global Step: 210150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:18:49,372-Speed 4373.94 samples/sec Loss 0.8524 Epoch: 12 Global Step: 210200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:19:01,117-Speed 4359.66 samples/sec Loss 0.8346 Epoch: 12 Global Step: 210250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:19:12,887-Speed 4350.00 samples/sec Loss 0.8508 Epoch: 12 Global Step: 210300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:19:24,788-Speed 4302.46 samples/sec Loss 0.8455 Epoch: 12 Global Step: 210350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:19:36,421-Speed 4401.43 samples/sec Loss 0.8256 Epoch: 12 Global Step: 210400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:19:48,169-Speed 4358.32 samples/sec Loss 0.8404 Epoch: 12 Global Step: 210450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:20:00,016-Speed 4322.07 samples/sec Loss 0.8523 Epoch: 12 Global Step: 210500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:20:12,858-Speed 3987.06 samples/sec Loss 0.8555 Epoch: 12 Global Step: 210550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:20:24,474-Speed 4407.80 samples/sec Loss 0.8452 Epoch: 12 Global Step: 210600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:20:36,261-Speed 4343.99 samples/sec Loss 0.8216 Epoch: 12 Global Step: 210650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:20:48,419-Speed 4211.33 samples/sec Loss 0.8480 Epoch: 12 Global Step: 210700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:21:00,220-Speed 4338.83 samples/sec Loss 0.8353 Epoch: 12 Global Step: 210750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:21:11,937-Speed 4369.85 samples/sec Loss 0.8341 Epoch: 12 Global Step: 210800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:21:23,633-Speed 4377.66 samples/sec Loss 0.8552 Epoch: 12 Global Step: 210850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:21:36,081-Speed 4113.17 samples/sec Loss 0.8443 Epoch: 12 Global Step: 210900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:21:47,917-Speed 4326.03 samples/sec Loss 0.8486 Epoch: 12 Global Step: 210950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:21:59,889-Speed 4277.01 samples/sec Loss 0.8460 Epoch: 12 Global Step: 211000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:22:11,735-Speed 4322.17 samples/sec Loss 0.8325 Epoch: 12 Global Step: 211050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:22:23,440-Speed 4374.38 samples/sec Loss 0.8495 Epoch: 12 Global Step: 211100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:22:35,172-Speed 4364.42 samples/sec Loss 0.8256 Epoch: 12 Global Step: 211150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:22:46,888-Speed 4370.16 samples/sec Loss 0.8190 Epoch: 12 Global Step: 211200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:22:58,678-Speed 4342.72 samples/sec Loss 0.8309 Epoch: 12 Global Step: 211250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:23:10,869-Speed 4200.29 samples/sec Loss 0.8211 Epoch: 12 Global Step: 211300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:23:22,578-Speed 4372.93 samples/sec Loss 0.8377 Epoch: 12 Global Step: 211350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:23:34,204-Speed 4403.88 samples/sec Loss 0.8458 Epoch: 12 Global Step: 211400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:23:46,750-Speed 4081.02 samples/sec Loss 0.8393 Epoch: 12 Global Step: 211450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:23:58,473-Speed 4367.82 samples/sec Loss 0.8383 Epoch: 12 Global Step: 211500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:24:10,063-Speed 4417.90 samples/sec Loss 0.8409 Epoch: 12 Global Step: 211550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:24:21,779-Speed 4369.98 samples/sec Loss 0.8149 Epoch: 12 Global Step: 211600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:24:33,657-Speed 4310.75 samples/sec Loss 0.8412 Epoch: 12 Global Step: 211650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:24:45,312-Speed 4393.32 samples/sec Loss 0.8362 Epoch: 12 Global Step: 211700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:24:57,046-Speed 4363.57 samples/sec Loss 0.8419 Epoch: 12 Global Step: 211750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:25:08,906-Speed 4316.95 samples/sec Loss 0.8336 Epoch: 12 Global Step: 211800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:25:21,553-Speed 4048.57 samples/sec Loss 0.8424 Epoch: 12 Global Step: 211850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:25:34,012-Speed 4109.73 samples/sec Loss 0.8475 Epoch: 12 Global Step: 211900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:25:45,691-Speed 4384.29 samples/sec Loss 0.8403 Epoch: 12 Global Step: 211950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:25:57,285-Speed 4416.16 samples/sec Loss 0.8483 Epoch: 12 Global Step: 212000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:26:27,712-[lfw][212000]XNorm: 22.079634 Training: 2021-03-15 14:26:27,712-[lfw][212000]Accuracy-Flip: 0.99783+-0.00289 Training: 2021-03-15 14:26:27,712-[lfw][212000]Accuracy-Highest: 0.99833 Training: 2021-03-15 14:27:02,617-[cfp_fp][212000]XNorm: 21.376624 Training: 2021-03-15 14:27:02,617-[cfp_fp][212000]Accuracy-Flip: 0.98871+-0.00445 Training: 2021-03-15 14:27:02,617-[cfp_fp][212000]Accuracy-Highest: 0.98986 Training: 2021-03-15 14:27:32,735-[agedb_30][212000]XNorm: 22.759623 Training: 2021-03-15 14:27:32,735-[agedb_30][212000]Accuracy-Flip: 0.98167+-0.00671 Training: 2021-03-15 14:27:32,735-[agedb_30][212000]Accuracy-Highest: 0.98250 Training: 2021-03-15 14:27:44,476-Speed 477.65 samples/sec Loss 0.8496 Epoch: 12 Global Step: 212050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:27:56,010-Speed 4439.25 samples/sec Loss 0.8504 Epoch: 12 Global Step: 212100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:28:08,575-Speed 4075.09 samples/sec Loss 0.8469 Epoch: 12 Global Step: 212150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:28:20,222-Speed 4396.28 samples/sec Loss 0.8401 Epoch: 12 Global Step: 212200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:28:31,832-Speed 4409.89 samples/sec Loss 0.8455 Epoch: 12 Global Step: 212250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:28:43,637-Speed 4337.32 samples/sec Loss 0.8224 Epoch: 12 Global Step: 212300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:28:55,459-Speed 4331.33 samples/sec Loss 0.8361 Epoch: 12 Global Step: 212350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:29:08,850-Speed 3823.40 samples/sec Loss 0.8547 Epoch: 12 Global Step: 212400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:29:20,640-Speed 4343.05 samples/sec Loss 0.8371 Epoch: 12 Global Step: 212450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:29:32,436-Speed 4340.43 samples/sec Loss 0.8352 Epoch: 12 Global Step: 212500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:29:44,029-Speed 4416.94 samples/sec Loss 0.8443 Epoch: 12 Global Step: 212550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:29:55,832-Speed 4337.74 samples/sec Loss 0.8601 Epoch: 12 Global Step: 212600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-15 14:30:07,677-Speed 4322.73 samples/sec Loss 0.8306 Epoch: 12 Global Step: 212650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:30:19,468-Speed 4342.61 samples/sec Loss 0.8362 Epoch: 12 Global Step: 212700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:30:30,922-Speed 4470.06 samples/sec Loss 0.8563 Epoch: 12 Global Step: 212750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:30:42,726-Speed 4337.67 samples/sec Loss 0.8348 Epoch: 12 Global Step: 212800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:30:54,471-Speed 4359.59 samples/sec Loss 0.8310 Epoch: 12 Global Step: 212850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:31:06,230-Speed 4354.29 samples/sec Loss 0.8342 Epoch: 12 Global Step: 212900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:31:17,959-Speed 4365.38 samples/sec Loss 0.8407 Epoch: 12 Global Step: 212950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:31:29,736-Speed 4347.57 samples/sec Loss 0.8403 Epoch: 12 Global Step: 213000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:31:41,369-Speed 4401.70 samples/sec Loss 0.8377 Epoch: 12 Global Step: 213050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:31:53,074-Speed 4374.12 samples/sec Loss 0.8369 Epoch: 12 Global Step: 213100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:32:04,960-Speed 4307.90 samples/sec Loss 0.8453 Epoch: 12 Global Step: 213150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:32:16,655-Speed 4378.16 samples/sec Loss 0.8298 Epoch: 12 Global Step: 213200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:32:29,176-Speed 4089.22 samples/sec Loss 0.8349 Epoch: 12 Global Step: 213250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:32:40,909-Speed 4363.78 samples/sec Loss 0.8274 Epoch: 12 Global Step: 213300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:32:53,566-Speed 4045.39 samples/sec Loss 0.8254 Epoch: 12 Global Step: 213350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:33:05,036-Speed 4463.94 samples/sec Loss 0.8140 Epoch: 12 Global Step: 213400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:33:16,606-Speed 4425.37 samples/sec Loss 0.8280 Epoch: 12 Global Step: 213450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:33:28,323-Speed 4369.91 samples/sec Loss 0.8486 Epoch: 12 Global Step: 213500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:33:40,128-Speed 4337.62 samples/sec Loss 0.8249 Epoch: 12 Global Step: 213550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:33:51,811-Speed 4382.59 samples/sec Loss 0.8523 Epoch: 12 Global Step: 213600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:34:03,402-Speed 4417.27 samples/sec Loss 0.8392 Epoch: 12 Global Step: 213650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:34:15,249-Speed 4321.80 samples/sec Loss 0.8381 Epoch: 12 Global Step: 213700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:34:27,176-Speed 4293.00 samples/sec Loss 0.8319 Epoch: 12 Global Step: 213750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:34:38,880-Speed 4374.70 samples/sec Loss 0.8235 Epoch: 12 Global Step: 213800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:34:50,690-Speed 4335.52 samples/sec Loss 0.8391 Epoch: 12 Global Step: 213850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:35:02,247-Speed 4430.47 samples/sec Loss 0.8396 Epoch: 12 Global Step: 213900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:35:13,867-Speed 4406.37 samples/sec Loss 0.8343 Epoch: 12 Global Step: 213950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:35:26,538-Speed 4040.96 samples/sec Loss 0.8441 Epoch: 12 Global Step: 214000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:35:56,557-[lfw][214000]XNorm: 21.673466 Training: 2021-03-15 14:35:56,558-[lfw][214000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 14:35:56,558-[lfw][214000]Accuracy-Highest: 0.99833 Training: 2021-03-15 14:36:31,549-[cfp_fp][214000]XNorm: 21.367498 Training: 2021-03-15 14:36:31,550-[cfp_fp][214000]Accuracy-Flip: 0.98943+-0.00466 Training: 2021-03-15 14:36:31,551-[cfp_fp][214000]Accuracy-Highest: 0.98986 Training: 2021-03-15 14:37:01,745-[agedb_30][214000]XNorm: 22.228655 Training: 2021-03-15 14:37:01,745-[agedb_30][214000]Accuracy-Flip: 0.98100+-0.00824 Training: 2021-03-15 14:37:01,745-[agedb_30][214000]Accuracy-Highest: 0.98250 Training: 2021-03-15 14:37:13,434-Speed 478.97 samples/sec Loss 0.8452 Epoch: 12 Global Step: 214050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:37:25,128-Speed 4378.55 samples/sec Loss 0.8177 Epoch: 12 Global Step: 214100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:37:36,664-Speed 4438.52 samples/sec Loss 0.8326 Epoch: 12 Global Step: 214150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:37:48,432-Speed 4351.12 samples/sec Loss 0.8434 Epoch: 12 Global Step: 214200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:38:00,212-Speed 4346.54 samples/sec Loss 0.8310 Epoch: 12 Global Step: 214250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:38:11,771-Speed 4429.47 samples/sec Loss 0.8295 Epoch: 12 Global Step: 214300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:38:23,357-Speed 4419.14 samples/sec Loss 0.8406 Epoch: 12 Global Step: 214350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:38:35,051-Speed 4378.70 samples/sec Loss 0.8326 Epoch: 12 Global Step: 214400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:38:46,638-Speed 4418.93 samples/sec Loss 0.8382 Epoch: 12 Global Step: 214450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:38:59,359-Speed 4025.00 samples/sec Loss 0.8327 Epoch: 12 Global Step: 214500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:39:11,084-Speed 4366.77 samples/sec Loss 0.8251 Epoch: 12 Global Step: 214550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:39:23,556-Speed 4105.32 samples/sec Loss 0.8329 Epoch: 12 Global Step: 214600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:39:35,125-Speed 4425.99 samples/sec Loss 0.8445 Epoch: 12 Global Step: 214650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:39:46,857-Speed 4364.08 samples/sec Loss 0.8235 Epoch: 12 Global Step: 214700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:39:58,801-Speed 4286.78 samples/sec Loss 0.8448 Epoch: 12 Global Step: 214750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:40:11,338-Speed 4084.17 samples/sec Loss 0.8352 Epoch: 12 Global Step: 214800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:40:23,795-Speed 4110.24 samples/sec Loss 0.8430 Epoch: 12 Global Step: 214850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:40:35,634-Speed 4325.03 samples/sec Loss 0.8347 Epoch: 12 Global Step: 214900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:40:47,367-Speed 4363.78 samples/sec Loss 0.8292 Epoch: 12 Global Step: 214950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:41:00,024-Speed 4045.29 samples/sec Loss 0.8281 Epoch: 12 Global Step: 215000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:41:11,744-Speed 4368.94 samples/sec Loss 0.8235 Epoch: 12 Global Step: 215050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:41:23,362-Speed 4407.15 samples/sec Loss 0.8496 Epoch: 12 Global Step: 215100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:41:35,007-Speed 4396.78 samples/sec Loss 0.8381 Epoch: 12 Global Step: 215150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:41:46,757-Speed 4357.69 samples/sec Loss 0.8098 Epoch: 12 Global Step: 215200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:41:58,725-Speed 4278.38 samples/sec Loss 0.8375 Epoch: 12 Global Step: 215250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:42:10,472-Speed 4358.45 samples/sec Loss 0.8306 Epoch: 12 Global Step: 215300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:42:22,196-Speed 4367.24 samples/sec Loss 0.8209 Epoch: 12 Global Step: 215350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:42:33,763-Speed 4426.81 samples/sec Loss 0.8361 Epoch: 12 Global Step: 215400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:42:45,447-Speed 4382.13 samples/sec Loss 0.8475 Epoch: 12 Global Step: 215450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:42:57,269-Speed 4331.12 samples/sec Loss 0.8243 Epoch: 12 Global Step: 215500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:43:09,097-Speed 4328.67 samples/sec Loss 0.8425 Epoch: 12 Global Step: 215550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:43:20,963-Speed 4315.19 samples/sec Loss 0.8418 Epoch: 12 Global Step: 215600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:43:32,844-Speed 4309.41 samples/sec Loss 0.8047 Epoch: 12 Global Step: 215650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:43:44,796-Speed 4284.16 samples/sec Loss 0.8324 Epoch: 12 Global Step: 215700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:43:57,378-Speed 4069.39 samples/sec Loss 0.8245 Epoch: 12 Global Step: 215750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:44:08,926-Speed 4434.02 samples/sec Loss 0.8340 Epoch: 12 Global Step: 215800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:44:20,441-Speed 4446.60 samples/sec Loss 0.8463 Epoch: 12 Global Step: 215850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:44:32,289-Speed 4321.25 samples/sec Loss 0.8379 Epoch: 12 Global Step: 215900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:44:45,091-Speed 3999.74 samples/sec Loss 0.8282 Epoch: 12 Global Step: 215950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:44:56,775-Speed 4382.25 samples/sec Loss 0.8367 Epoch: 12 Global Step: 216000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:45:27,044-[lfw][216000]XNorm: 22.311088 Training: 2021-03-15 14:45:27,045-[lfw][216000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 14:45:27,045-[lfw][216000]Accuracy-Highest: 0.99833 Training: 2021-03-15 14:46:02,054-[cfp_fp][216000]XNorm: 21.679396 Training: 2021-03-15 14:46:02,054-[cfp_fp][216000]Accuracy-Flip: 0.98971+-0.00518 Training: 2021-03-15 14:46:02,054-[cfp_fp][216000]Accuracy-Highest: 0.98986 Training: 2021-03-15 14:46:32,305-[agedb_30][216000]XNorm: 22.877891 Training: 2021-03-15 14:46:32,305-[agedb_30][216000]Accuracy-Flip: 0.98183+-0.00693 Training: 2021-03-15 14:46:32,305-[agedb_30][216000]Accuracy-Highest: 0.98250 Training: 2021-03-15 14:46:44,019-Speed 477.42 samples/sec Loss 0.8241 Epoch: 12 Global Step: 216050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:46:55,740-Speed 4368.27 samples/sec Loss 0.8313 Epoch: 12 Global Step: 216100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:47:07,327-Speed 4418.79 samples/sec Loss 0.8253 Epoch: 12 Global Step: 216150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:47:19,148-Speed 4331.62 samples/sec Loss 0.8283 Epoch: 12 Global Step: 216200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:47:30,879-Speed 4364.71 samples/sec Loss 0.8529 Epoch: 12 Global Step: 216250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:47:42,879-Speed 4266.61 samples/sec Loss 0.8451 Epoch: 12 Global Step: 216300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:47:54,680-Speed 4339.06 samples/sec Loss 0.8306 Epoch: 12 Global Step: 216350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:48:07,146-Speed 4107.11 samples/sec Loss 0.8288 Epoch: 12 Global Step: 216400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:48:18,892-Speed 4359.23 samples/sec Loss 0.8259 Epoch: 12 Global Step: 216450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:48:30,460-Speed 4426.11 samples/sec Loss 0.8337 Epoch: 12 Global Step: 216500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:48:42,221-Speed 4353.86 samples/sec Loss 0.8421 Epoch: 12 Global Step: 216550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:48:53,942-Speed 4368.07 samples/sec Loss 0.8160 Epoch: 12 Global Step: 216600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:49:05,683-Speed 4361.00 samples/sec Loss 0.8532 Epoch: 12 Global Step: 216650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:49:17,324-Speed 4398.58 samples/sec Loss 0.8398 Epoch: 12 Global Step: 216700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:49:29,051-Speed 4366.17 samples/sec Loss 0.8242 Epoch: 12 Global Step: 216750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:49:41,201-Speed 4214.08 samples/sec Loss 0.8121 Epoch: 12 Global Step: 216800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:49:52,922-Speed 4368.37 samples/sec Loss 0.8425 Epoch: 12 Global Step: 216850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:50:04,741-Speed 4332.28 samples/sec Loss 0.8106 Epoch: 12 Global Step: 216900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:50:16,448-Speed 4373.50 samples/sec Loss 0.8273 Epoch: 12 Global Step: 216950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:50:40,742-Speed 2107.62 samples/sec Loss 0.8161 Epoch: 13 Global Step: 217000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:50:53,093-Speed 4145.50 samples/sec Loss 0.7597 Epoch: 13 Global Step: 217050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:51:05,095-Speed 4266.17 samples/sec Loss 0.7678 Epoch: 13 Global Step: 217100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:51:17,324-Speed 4186.90 samples/sec Loss 0.7691 Epoch: 13 Global Step: 217150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:51:31,575-Speed 3592.98 samples/sec Loss 0.7462 Epoch: 13 Global Step: 217200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:51:43,531-Speed 4282.50 samples/sec Loss 0.7655 Epoch: 13 Global Step: 217250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:51:55,292-Speed 4353.66 samples/sec Loss 0.7667 Epoch: 13 Global Step: 217300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:52:08,245-Speed 3952.98 samples/sec Loss 0.7653 Epoch: 13 Global Step: 217350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:52:19,954-Speed 4372.87 samples/sec Loss 0.7581 Epoch: 13 Global Step: 217400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:52:31,868-Speed 4297.58 samples/sec Loss 0.7534 Epoch: 13 Global Step: 217450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:52:43,483-Speed 4408.06 samples/sec Loss 0.7533 Epoch: 13 Global Step: 217500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:52:56,162-Speed 4038.49 samples/sec Loss 0.7594 Epoch: 13 Global Step: 217550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:53:08,761-Speed 4063.77 samples/sec Loss 0.7502 Epoch: 13 Global Step: 217600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:53:20,448-Speed 4381.37 samples/sec Loss 0.7573 Epoch: 13 Global Step: 217650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:53:32,165-Speed 4369.78 samples/sec Loss 0.7765 Epoch: 13 Global Step: 217700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:53:44,061-Speed 4304.29 samples/sec Loss 0.7573 Epoch: 13 Global Step: 217750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:53:55,910-Speed 4320.94 samples/sec Loss 0.7618 Epoch: 13 Global Step: 217800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:54:07,630-Speed 4368.98 samples/sec Loss 0.7896 Epoch: 13 Global Step: 217850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:54:19,441-Speed 4334.91 samples/sec Loss 0.7566 Epoch: 13 Global Step: 217900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:54:31,074-Speed 4401.45 samples/sec Loss 0.7681 Epoch: 13 Global Step: 217950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:54:42,726-Speed 4394.38 samples/sec Loss 0.7641 Epoch: 13 Global Step: 218000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:55:12,952-[lfw][218000]XNorm: 22.221668 Training: 2021-03-15 14:55:12,952-[lfw][218000]Accuracy-Flip: 0.99783+-0.00259 Training: 2021-03-15 14:55:12,952-[lfw][218000]Accuracy-Highest: 0.99833 Training: 2021-03-15 14:55:48,066-[cfp_fp][218000]XNorm: 21.516119 Training: 2021-03-15 14:55:48,066-[cfp_fp][218000]Accuracy-Flip: 0.98857+-0.00443 Training: 2021-03-15 14:55:48,066-[cfp_fp][218000]Accuracy-Highest: 0.98986 Training: 2021-03-15 14:56:18,398-[agedb_30][218000]XNorm: 22.702756 Training: 2021-03-15 14:56:18,399-[agedb_30][218000]Accuracy-Flip: 0.98300+-0.00666 Training: 2021-03-15 14:56:18,399-[agedb_30][218000]Accuracy-Highest: 0.98300 Training: 2021-03-15 14:56:30,049-Speed 477.07 samples/sec Loss 0.7660 Epoch: 13 Global Step: 218050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:56:41,864-Speed 4333.61 samples/sec Loss 0.7661 Epoch: 13 Global Step: 218100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:56:54,599-Speed 4020.50 samples/sec Loss 0.7660 Epoch: 13 Global Step: 218150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:57:06,267-Speed 4388.40 samples/sec Loss 0.7774 Epoch: 13 Global Step: 218200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:57:18,139-Speed 4312.95 samples/sec Loss 0.7514 Epoch: 13 Global Step: 218250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:57:29,792-Speed 4393.82 samples/sec Loss 0.7767 Epoch: 13 Global Step: 218300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:57:41,438-Speed 4396.45 samples/sec Loss 0.7684 Epoch: 13 Global Step: 218350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:57:53,149-Speed 4371.92 samples/sec Loss 0.7546 Epoch: 13 Global Step: 218400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:58:06,226-Speed 3915.61 samples/sec Loss 0.7569 Epoch: 13 Global Step: 218450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:58:17,851-Speed 4404.58 samples/sec Loss 0.7637 Epoch: 13 Global Step: 218500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:58:29,630-Speed 4346.92 samples/sec Loss 0.7726 Epoch: 13 Global Step: 218550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:58:41,452-Speed 4330.79 samples/sec Loss 0.7568 Epoch: 13 Global Step: 218600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:58:53,491-Speed 4253.15 samples/sec Loss 0.7669 Epoch: 13 Global Step: 218650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:59:05,173-Speed 4383.03 samples/sec Loss 0.7608 Epoch: 13 Global Step: 218700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:59:16,794-Speed 4405.99 samples/sec Loss 0.7749 Epoch: 13 Global Step: 218750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:59:29,169-Speed 4137.49 samples/sec Loss 0.7694 Epoch: 13 Global Step: 218800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:59:40,926-Speed 4355.07 samples/sec Loss 0.7512 Epoch: 13 Global Step: 218850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 14:59:52,662-Speed 4362.74 samples/sec Loss 0.7583 Epoch: 13 Global Step: 218900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:00:04,465-Speed 4337.94 samples/sec Loss 0.7630 Epoch: 13 Global Step: 218950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:00:16,140-Speed 4385.73 samples/sec Loss 0.7764 Epoch: 13 Global Step: 219000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:00:27,761-Speed 4405.90 samples/sec Loss 0.7692 Epoch: 13 Global Step: 219050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:00:39,474-Speed 4371.27 samples/sec Loss 0.7769 Epoch: 13 Global Step: 219100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:00:51,857-Speed 4134.93 samples/sec Loss 0.7686 Epoch: 13 Global Step: 219150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:01:03,658-Speed 4338.79 samples/sec Loss 0.7616 Epoch: 13 Global Step: 219200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:01:15,565-Speed 4300.16 samples/sec Loss 0.7576 Epoch: 13 Global Step: 219250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:01:27,070-Speed 4450.31 samples/sec Loss 0.7624 Epoch: 13 Global Step: 219300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:01:38,863-Speed 4341.87 samples/sec Loss 0.7758 Epoch: 13 Global Step: 219350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:01:50,467-Speed 4412.52 samples/sec Loss 0.7685 Epoch: 13 Global Step: 219400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:02:02,143-Speed 4385.38 samples/sec Loss 0.7530 Epoch: 13 Global Step: 219450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:02:13,887-Speed 4359.60 samples/sec Loss 0.7683 Epoch: 13 Global Step: 219500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:02:25,803-Speed 4297.15 samples/sec Loss 0.7815 Epoch: 13 Global Step: 219550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:02:37,324-Speed 4443.88 samples/sec Loss 0.7747 Epoch: 13 Global Step: 219600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:02:49,361-Speed 4253.99 samples/sec Loss 0.7804 Epoch: 13 Global Step: 219650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:03:00,950-Speed 4418.21 samples/sec Loss 0.7840 Epoch: 13 Global Step: 219700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:03:13,532-Speed 4069.29 samples/sec Loss 0.7689 Epoch: 13 Global Step: 219750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:03:26,299-Speed 4010.55 samples/sec Loss 0.7746 Epoch: 13 Global Step: 219800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:03:38,001-Speed 4375.37 samples/sec Loss 0.7502 Epoch: 13 Global Step: 219850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:03:49,716-Speed 4370.61 samples/sec Loss 0.7668 Epoch: 13 Global Step: 219900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:04:01,434-Speed 4369.58 samples/sec Loss 0.7720 Epoch: 13 Global Step: 219950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:04:13,032-Speed 4414.92 samples/sec Loss 0.7504 Epoch: 13 Global Step: 220000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:04:43,285-[lfw][220000]XNorm: 22.444381 Training: 2021-03-15 15:04:43,285-[lfw][220000]Accuracy-Flip: 0.99783+-0.00279 Training: 2021-03-15 15:04:43,285-[lfw][220000]Accuracy-Highest: 0.99833 Training: 2021-03-15 15:05:18,147-[cfp_fp][220000]XNorm: 21.925014 Training: 2021-03-15 15:05:18,148-[cfp_fp][220000]Accuracy-Flip: 0.99043+-0.00483 Training: 2021-03-15 15:05:18,148-[cfp_fp][220000]Accuracy-Highest: 0.99043 Training: 2021-03-15 15:05:48,190-[agedb_30][220000]XNorm: 23.040033 Training: 2021-03-15 15:05:48,190-[agedb_30][220000]Accuracy-Flip: 0.98250+-0.00750 Training: 2021-03-15 15:05:48,190-[agedb_30][220000]Accuracy-Highest: 0.98300 Training: 2021-03-15 15:06:00,852-Speed 474.86 samples/sec Loss 0.7750 Epoch: 13 Global Step: 220050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:06:12,627-Speed 4348.38 samples/sec Loss 0.7692 Epoch: 13 Global Step: 220100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:06:24,591-Speed 4279.77 samples/sec Loss 0.7729 Epoch: 13 Global Step: 220150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:06:37,208-Speed 4058.14 samples/sec Loss 0.7739 Epoch: 13 Global Step: 220200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:06:49,573-Speed 4140.94 samples/sec Loss 0.7707 Epoch: 13 Global Step: 220250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:07:01,442-Speed 4313.92 samples/sec Loss 0.7756 Epoch: 13 Global Step: 220300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:07:13,258-Speed 4333.36 samples/sec Loss 0.7750 Epoch: 13 Global Step: 220350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:07:25,152-Speed 4304.67 samples/sec Loss 0.7748 Epoch: 13 Global Step: 220400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:07:36,847-Speed 4378.07 samples/sec Loss 0.7706 Epoch: 13 Global Step: 220450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:07:48,590-Speed 4360.48 samples/sec Loss 0.7674 Epoch: 13 Global Step: 220500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:08:00,173-Speed 4420.24 samples/sec Loss 0.7726 Epoch: 13 Global Step: 220550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:08:12,698-Speed 4088.08 samples/sec Loss 0.7628 Epoch: 13 Global Step: 220600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:08:24,556-Speed 4317.79 samples/sec Loss 0.7619 Epoch: 13 Global Step: 220650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:08:36,050-Speed 4454.85 samples/sec Loss 0.7878 Epoch: 13 Global Step: 220700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:08:48,044-Speed 4268.98 samples/sec Loss 0.7686 Epoch: 13 Global Step: 220750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:08:59,589-Speed 4434.99 samples/sec Loss 0.7682 Epoch: 13 Global Step: 220800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:09:11,377-Speed 4343.62 samples/sec Loss 0.7786 Epoch: 13 Global Step: 220850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:09:23,045-Speed 4388.21 samples/sec Loss 0.7516 Epoch: 13 Global Step: 220900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:09:34,706-Speed 4390.73 samples/sec Loss 0.7724 Epoch: 13 Global Step: 220950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:09:46,443-Speed 4362.37 samples/sec Loss 0.7741 Epoch: 13 Global Step: 221000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:09:58,152-Speed 4373.09 samples/sec Loss 0.7761 Epoch: 13 Global Step: 221050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:10:10,561-Speed 4125.98 samples/sec Loss 0.7744 Epoch: 13 Global Step: 221100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:10:22,322-Speed 4353.81 samples/sec Loss 0.7756 Epoch: 13 Global Step: 221150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:10:34,931-Speed 4060.56 samples/sec Loss 0.7798 Epoch: 13 Global Step: 221200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:10:46,573-Speed 4398.33 samples/sec Loss 0.7603 Epoch: 13 Global Step: 221250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:10:58,351-Speed 4347.07 samples/sec Loss 0.7915 Epoch: 13 Global Step: 221300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:11:10,198-Speed 4322.01 samples/sec Loss 0.7596 Epoch: 13 Global Step: 221350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:11:21,839-Speed 4398.56 samples/sec Loss 0.7696 Epoch: 13 Global Step: 221400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:11:33,552-Speed 4371.37 samples/sec Loss 0.7704 Epoch: 13 Global Step: 221450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:11:45,077-Speed 4442.54 samples/sec Loss 0.7690 Epoch: 13 Global Step: 221500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:11:56,589-Speed 4447.92 samples/sec Loss 0.7636 Epoch: 13 Global Step: 221550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:12:08,364-Speed 4348.35 samples/sec Loss 0.7736 Epoch: 13 Global Step: 221600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:12:20,064-Speed 4375.98 samples/sec Loss 0.7815 Epoch: 13 Global Step: 221650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:12:31,851-Speed 4343.99 samples/sec Loss 0.7725 Epoch: 13 Global Step: 221700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:12:43,602-Speed 4357.19 samples/sec Loss 0.7706 Epoch: 13 Global Step: 221750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:12:55,437-Speed 4326.35 samples/sec Loss 0.7623 Epoch: 13 Global Step: 221800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:13:06,906-Speed 4464.42 samples/sec Loss 0.7697 Epoch: 13 Global Step: 221850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:13:18,638-Speed 4364.55 samples/sec Loss 0.7838 Epoch: 13 Global Step: 221900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:13:30,282-Speed 4397.06 samples/sec Loss 0.7864 Epoch: 13 Global Step: 221950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:13:41,864-Speed 4421.04 samples/sec Loss 0.7882 Epoch: 13 Global Step: 222000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:14:12,127-[lfw][222000]XNorm: 22.786120 Training: 2021-03-15 15:14:12,127-[lfw][222000]Accuracy-Flip: 0.99783+-0.00289 Training: 2021-03-15 15:14:12,127-[lfw][222000]Accuracy-Highest: 0.99833 Training: 2021-03-15 15:14:47,306-[cfp_fp][222000]XNorm: 22.220134 Training: 2021-03-15 15:14:47,306-[cfp_fp][222000]Accuracy-Flip: 0.99086+-0.00470 Training: 2021-03-15 15:14:47,306-[cfp_fp][222000]Accuracy-Highest: 0.99086 Training: 2021-03-15 15:15:17,516-[agedb_30][222000]XNorm: 23.395599 Training: 2021-03-15 15:15:17,516-[agedb_30][222000]Accuracy-Flip: 0.98133+-0.00737 Training: 2021-03-15 15:15:17,516-[agedb_30][222000]Accuracy-Highest: 0.98300 Training: 2021-03-15 15:15:29,272-Speed 476.69 samples/sec Loss 0.7647 Epoch: 13 Global Step: 222050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:15:41,740-Speed 4106.71 samples/sec Loss 0.7683 Epoch: 13 Global Step: 222100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:15:53,528-Speed 4343.49 samples/sec Loss 0.7822 Epoch: 13 Global Step: 222150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:16:05,464-Speed 4289.60 samples/sec Loss 0.7703 Epoch: 13 Global Step: 222200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:16:17,333-Speed 4314.24 samples/sec Loss 0.7616 Epoch: 13 Global Step: 222250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:16:29,304-Speed 4277.14 samples/sec Loss 0.7626 Epoch: 13 Global Step: 222300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:16:41,017-Speed 4371.46 samples/sec Loss 0.7635 Epoch: 13 Global Step: 222350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:16:52,914-Speed 4303.72 samples/sec Loss 0.7683 Epoch: 13 Global Step: 222400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:17:04,596-Speed 4382.93 samples/sec Loss 0.7790 Epoch: 13 Global Step: 222450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:17:17,447-Speed 3984.19 samples/sec Loss 0.8058 Epoch: 13 Global Step: 222500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:17:29,065-Speed 4407.10 samples/sec Loss 0.7735 Epoch: 13 Global Step: 222550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:17:40,792-Speed 4366.43 samples/sec Loss 0.7876 Epoch: 13 Global Step: 222600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:17:52,426-Speed 4400.88 samples/sec Loss 0.7721 Epoch: 13 Global Step: 222650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:18:05,065-Speed 4051.16 samples/sec Loss 0.7851 Epoch: 13 Global Step: 222700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:18:16,600-Speed 4438.71 samples/sec Loss 0.7841 Epoch: 13 Global Step: 222750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:18:28,087-Speed 4457.37 samples/sec Loss 0.7594 Epoch: 13 Global Step: 222800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:18:39,727-Speed 4398.86 samples/sec Loss 0.7680 Epoch: 13 Global Step: 222850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:18:52,230-Speed 4095.31 samples/sec Loss 0.7768 Epoch: 13 Global Step: 222900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:19:04,100-Speed 4313.58 samples/sec Loss 0.7803 Epoch: 13 Global Step: 222950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:19:15,753-Speed 4393.73 samples/sec Loss 0.7810 Epoch: 13 Global Step: 223000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:19:28,231-Speed 4103.24 samples/sec Loss 0.7890 Epoch: 13 Global Step: 223050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:19:40,406-Speed 4205.55 samples/sec Loss 0.7926 Epoch: 13 Global Step: 223100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:19:52,205-Speed 4339.69 samples/sec Loss 0.7720 Epoch: 13 Global Step: 223150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:20:04,079-Speed 4311.89 samples/sec Loss 0.7761 Epoch: 13 Global Step: 223200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:20:15,734-Speed 4393.26 samples/sec Loss 0.7700 Epoch: 13 Global Step: 223250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:20:27,487-Speed 4356.40 samples/sec Loss 0.7709 Epoch: 13 Global Step: 223300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:20:39,190-Speed 4375.25 samples/sec Loss 0.7815 Epoch: 13 Global Step: 223350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:20:50,847-Speed 4392.21 samples/sec Loss 0.7832 Epoch: 13 Global Step: 223400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:21:02,640-Speed 4341.95 samples/sec Loss 0.7527 Epoch: 13 Global Step: 223450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:21:14,511-Speed 4313.21 samples/sec Loss 0.7778 Epoch: 13 Global Step: 223500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:21:26,152-Speed 4398.14 samples/sec Loss 0.7670 Epoch: 13 Global Step: 223550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:21:37,849-Speed 4377.44 samples/sec Loss 0.7680 Epoch: 13 Global Step: 223600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:21:50,390-Speed 4082.71 samples/sec Loss 0.7780 Epoch: 13 Global Step: 223650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:22:03,050-Speed 4044.31 samples/sec Loss 0.7785 Epoch: 13 Global Step: 223700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:22:14,661-Speed 4409.75 samples/sec Loss 0.7883 Epoch: 13 Global Step: 223750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:22:26,307-Speed 4396.82 samples/sec Loss 0.7860 Epoch: 13 Global Step: 223800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:22:38,148-Speed 4324.18 samples/sec Loss 0.7835 Epoch: 13 Global Step: 223850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:22:49,776-Speed 4403.18 samples/sec Loss 0.7800 Epoch: 13 Global Step: 223900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:23:01,522-Speed 4359.07 samples/sec Loss 0.7874 Epoch: 13 Global Step: 223950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:23:13,183-Speed 4390.94 samples/sec Loss 0.7670 Epoch: 13 Global Step: 224000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:23:43,435-[lfw][224000]XNorm: 22.117829 Training: 2021-03-15 15:23:43,435-[lfw][224000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 15:23:43,435-[lfw][224000]Accuracy-Highest: 0.99833 Training: 2021-03-15 15:24:18,559-[cfp_fp][224000]XNorm: 21.734804 Training: 2021-03-15 15:24:18,559-[cfp_fp][224000]Accuracy-Flip: 0.98971+-0.00460 Training: 2021-03-15 15:24:18,559-[cfp_fp][224000]Accuracy-Highest: 0.99086 Training: 2021-03-15 15:24:48,869-[agedb_30][224000]XNorm: 22.625930 Training: 2021-03-15 15:24:48,869-[agedb_30][224000]Accuracy-Flip: 0.98250+-0.00800 Training: 2021-03-15 15:24:48,870-[agedb_30][224000]Accuracy-Highest: 0.98300 Training: 2021-03-15 15:25:00,402-Speed 477.53 samples/sec Loss 0.7679 Epoch: 13 Global Step: 224050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:25:12,080-Speed 4384.62 samples/sec Loss 0.7873 Epoch: 13 Global Step: 224100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:25:23,790-Speed 4372.50 samples/sec Loss 0.7781 Epoch: 13 Global Step: 224150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:25:35,467-Speed 4384.65 samples/sec Loss 0.7908 Epoch: 13 Global Step: 224200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:25:46,997-Speed 4440.80 samples/sec Loss 0.7816 Epoch: 13 Global Step: 224250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:25:58,739-Speed 4360.75 samples/sec Loss 0.7791 Epoch: 13 Global Step: 224300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:26:10,415-Speed 4385.29 samples/sec Loss 0.7769 Epoch: 13 Global Step: 224350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:26:21,974-Speed 4429.88 samples/sec Loss 0.7714 Epoch: 13 Global Step: 224400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:26:33,750-Speed 4347.78 samples/sec Loss 0.7745 Epoch: 13 Global Step: 224450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:26:45,532-Speed 4345.75 samples/sec Loss 0.7607 Epoch: 13 Global Step: 224500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:26:57,164-Speed 4401.80 samples/sec Loss 0.7883 Epoch: 13 Global Step: 224550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:27:09,539-Speed 4137.59 samples/sec Loss 0.7657 Epoch: 13 Global Step: 224600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:27:21,089-Speed 4433.12 samples/sec Loss 0.7790 Epoch: 13 Global Step: 224650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:27:32,801-Speed 4371.93 samples/sec Loss 0.7762 Epoch: 13 Global Step: 224700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:27:44,496-Speed 4377.82 samples/sec Loss 0.7622 Epoch: 13 Global Step: 224750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:27:56,347-Speed 4320.61 samples/sec Loss 0.7716 Epoch: 13 Global Step: 224800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:28:08,235-Speed 4307.22 samples/sec Loss 0.7904 Epoch: 13 Global Step: 224850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:28:19,932-Speed 4377.28 samples/sec Loss 0.7811 Epoch: 13 Global Step: 224900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:28:31,696-Speed 4352.37 samples/sec Loss 0.7805 Epoch: 13 Global Step: 224950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:28:43,412-Speed 4370.29 samples/sec Loss 0.7741 Epoch: 13 Global Step: 225000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:28:54,955-Speed 4435.89 samples/sec Loss 0.7657 Epoch: 13 Global Step: 225050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:29:06,850-Speed 4304.42 samples/sec Loss 0.7954 Epoch: 13 Global Step: 225100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:29:19,376-Speed 4087.54 samples/sec Loss 0.7770 Epoch: 13 Global Step: 225150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:29:31,106-Speed 4365.28 samples/sec Loss 0.7865 Epoch: 13 Global Step: 225200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:29:42,884-Speed 4347.17 samples/sec Loss 0.7669 Epoch: 13 Global Step: 225250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:29:54,593-Speed 4372.91 samples/sec Loss 0.7830 Epoch: 13 Global Step: 225300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:30:06,374-Speed 4346.00 samples/sec Loss 0.7745 Epoch: 13 Global Step: 225350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-15 15:30:18,764-Speed 4132.45 samples/sec Loss 0.7700 Epoch: 13 Global Step: 225400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:30:30,456-Speed 4379.44 samples/sec Loss 0.7771 Epoch: 13 Global Step: 225450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:30:42,992-Speed 4084.38 samples/sec Loss 0.7817 Epoch: 13 Global Step: 225500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:30:54,599-Speed 4411.23 samples/sec Loss 0.7875 Epoch: 13 Global Step: 225550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:31:07,359-Speed 4012.68 samples/sec Loss 0.7658 Epoch: 13 Global Step: 225600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:31:19,007-Speed 4395.79 samples/sec Loss 0.7768 Epoch: 13 Global Step: 225650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:31:31,601-Speed 4065.47 samples/sec Loss 0.7693 Epoch: 13 Global Step: 225700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:31:43,255-Speed 4393.80 samples/sec Loss 0.7811 Epoch: 13 Global Step: 225750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:31:55,021-Speed 4351.34 samples/sec Loss 0.7697 Epoch: 13 Global Step: 225800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:32:06,602-Speed 4421.43 samples/sec Loss 0.7816 Epoch: 13 Global Step: 225850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:32:18,282-Speed 4383.63 samples/sec Loss 0.7757 Epoch: 13 Global Step: 225900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:32:29,998-Speed 4370.35 samples/sec Loss 0.7742 Epoch: 13 Global Step: 225950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:32:41,631-Speed 4401.44 samples/sec Loss 0.7800 Epoch: 13 Global Step: 226000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:33:11,931-[lfw][226000]XNorm: 22.262658 Training: 2021-03-15 15:33:11,931-[lfw][226000]Accuracy-Flip: 0.99750+-0.00261 Training: 2021-03-15 15:33:11,931-[lfw][226000]Accuracy-Highest: 0.99833 Training: 2021-03-15 15:33:47,091-[cfp_fp][226000]XNorm: 21.817506 Training: 2021-03-15 15:33:47,091-[cfp_fp][226000]Accuracy-Flip: 0.99057+-0.00363 Training: 2021-03-15 15:33:47,091-[cfp_fp][226000]Accuracy-Highest: 0.99086 Training: 2021-03-15 15:34:17,299-[agedb_30][226000]XNorm: 22.724451 Training: 2021-03-15 15:34:17,299-[agedb_30][226000]Accuracy-Flip: 0.98217+-0.00628 Training: 2021-03-15 15:34:17,299-[agedb_30][226000]Accuracy-Highest: 0.98300 Training: 2021-03-15 15:34:28,982-Speed 476.94 samples/sec Loss 0.7760 Epoch: 13 Global Step: 226050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:34:40,536-Speed 4431.30 samples/sec Loss 0.7799 Epoch: 13 Global Step: 226100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:34:53,170-Speed 4052.75 samples/sec Loss 0.7746 Epoch: 13 Global Step: 226150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:35:04,882-Speed 4371.87 samples/sec Loss 0.7790 Epoch: 13 Global Step: 226200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:35:16,812-Speed 4291.82 samples/sec Loss 0.7841 Epoch: 13 Global Step: 226250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:35:28,472-Speed 4391.50 samples/sec Loss 0.7969 Epoch: 13 Global Step: 226300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:35:41,135-Speed 4043.26 samples/sec Loss 0.7712 Epoch: 13 Global Step: 226350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:35:52,844-Speed 4373.04 samples/sec Loss 0.7745 Epoch: 13 Global Step: 226400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:36:04,853-Speed 4263.40 samples/sec Loss 0.7761 Epoch: 13 Global Step: 226450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:36:16,582-Speed 4365.45 samples/sec Loss 0.7732 Epoch: 13 Global Step: 226500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:36:28,321-Speed 4361.85 samples/sec Loss 0.7782 Epoch: 13 Global Step: 226550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:36:39,997-Speed 4385.16 samples/sec Loss 0.7818 Epoch: 13 Global Step: 226600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:36:51,729-Speed 4364.26 samples/sec Loss 0.7745 Epoch: 13 Global Step: 226650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:37:03,203-Speed 4462.39 samples/sec Loss 0.7848 Epoch: 13 Global Step: 226700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:37:14,713-Speed 4448.50 samples/sec Loss 0.7763 Epoch: 13 Global Step: 226750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:37:26,590-Speed 4310.99 samples/sec Loss 0.7917 Epoch: 13 Global Step: 226800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:37:38,363-Speed 4349.17 samples/sec Loss 0.7905 Epoch: 13 Global Step: 226850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:37:50,185-Speed 4331.24 samples/sec Loss 0.7676 Epoch: 13 Global Step: 226900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:38:01,986-Speed 4338.78 samples/sec Loss 0.7847 Epoch: 13 Global Step: 226950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:38:13,982-Speed 4268.18 samples/sec Loss 0.7808 Epoch: 13 Global Step: 227000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:38:25,718-Speed 4362.67 samples/sec Loss 0.7853 Epoch: 13 Global Step: 227050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:38:38,194-Speed 4104.12 samples/sec Loss 0.7734 Epoch: 13 Global Step: 227100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:38:49,905-Speed 4372.01 samples/sec Loss 0.7938 Epoch: 13 Global Step: 227150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:39:01,520-Speed 4408.63 samples/sec Loss 0.7826 Epoch: 13 Global Step: 227200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:39:13,225-Speed 4374.16 samples/sec Loss 0.7773 Epoch: 13 Global Step: 227250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:39:25,005-Speed 4346.66 samples/sec Loss 0.7814 Epoch: 13 Global Step: 227300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:39:36,677-Speed 4386.78 samples/sec Loss 0.7766 Epoch: 13 Global Step: 227350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:39:48,457-Speed 4346.40 samples/sec Loss 0.7711 Epoch: 13 Global Step: 227400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:40:00,425-Speed 4278.05 samples/sec Loss 0.7882 Epoch: 13 Global Step: 227450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:40:12,243-Speed 4332.54 samples/sec Loss 0.7700 Epoch: 13 Global Step: 227500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:40:23,940-Speed 4377.63 samples/sec Loss 0.7742 Epoch: 13 Global Step: 227550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:40:35,597-Speed 4392.33 samples/sec Loss 0.7915 Epoch: 13 Global Step: 227600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:40:47,219-Speed 4405.72 samples/sec Loss 0.7948 Epoch: 13 Global Step: 227650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:40:58,975-Speed 4355.28 samples/sec Loss 0.7890 Epoch: 13 Global Step: 227700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:41:11,713-Speed 4019.74 samples/sec Loss 0.7869 Epoch: 13 Global Step: 227750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:41:23,717-Speed 4265.20 samples/sec Loss 0.7801 Epoch: 13 Global Step: 227800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:41:35,420-Speed 4375.12 samples/sec Loss 0.7839 Epoch: 13 Global Step: 227850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:41:47,356-Speed 4289.81 samples/sec Loss 0.7757 Epoch: 13 Global Step: 227900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:41:59,141-Speed 4344.74 samples/sec Loss 0.7933 Epoch: 13 Global Step: 227950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:42:10,908-Speed 4351.19 samples/sec Loss 0.7822 Epoch: 13 Global Step: 228000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:42:40,976-[lfw][228000]XNorm: 22.465443 Training: 2021-03-15 15:42:40,977-[lfw][228000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-15 15:42:40,977-[lfw][228000]Accuracy-Highest: 0.99833 Training: 2021-03-15 15:43:16,063-[cfp_fp][228000]XNorm: 21.804154 Training: 2021-03-15 15:43:16,063-[cfp_fp][228000]Accuracy-Flip: 0.99086+-0.00508 Training: 2021-03-15 15:43:16,063-[cfp_fp][228000]Accuracy-Highest: 0.99086 Training: 2021-03-15 15:43:46,113-[agedb_30][228000]XNorm: 22.845194 Training: 2021-03-15 15:43:46,114-[agedb_30][228000]Accuracy-Flip: 0.98250+-0.00712 Training: 2021-03-15 15:43:46,114-[agedb_30][228000]Accuracy-Highest: 0.98300 Training: 2021-03-15 15:43:58,622-Speed 475.33 samples/sec Loss 0.7753 Epoch: 13 Global Step: 228050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:44:10,266-Speed 4397.57 samples/sec Loss 0.7910 Epoch: 13 Global Step: 228100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:44:22,741-Speed 4104.14 samples/sec Loss 0.7970 Epoch: 13 Global Step: 228150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:44:34,564-Speed 4330.72 samples/sec Loss 0.7758 Epoch: 13 Global Step: 228200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:44:46,991-Speed 4120.18 samples/sec Loss 0.7940 Epoch: 13 Global Step: 228250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:44:58,753-Speed 4353.41 samples/sec Loss 0.7754 Epoch: 13 Global Step: 228300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:45:10,537-Speed 4345.08 samples/sec Loss 0.7899 Epoch: 13 Global Step: 228350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:45:23,031-Speed 4097.91 samples/sec Loss 0.7638 Epoch: 13 Global Step: 228400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:45:34,840-Speed 4335.86 samples/sec Loss 0.7849 Epoch: 13 Global Step: 228450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:45:46,640-Speed 4339.28 samples/sec Loss 0.7648 Epoch: 13 Global Step: 228500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:45:58,234-Speed 4416.26 samples/sec Loss 0.7760 Epoch: 13 Global Step: 228550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:46:09,861-Speed 4403.63 samples/sec Loss 0.7714 Epoch: 13 Global Step: 228600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:46:22,458-Speed 4064.78 samples/sec Loss 0.7738 Epoch: 13 Global Step: 228650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:46:34,262-Speed 4337.72 samples/sec Loss 0.7714 Epoch: 13 Global Step: 228700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:46:46,184-Speed 4294.57 samples/sec Loss 0.7767 Epoch: 13 Global Step: 228750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:46:58,082-Speed 4303.38 samples/sec Loss 0.7701 Epoch: 13 Global Step: 228800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:47:09,933-Speed 4320.67 samples/sec Loss 0.7801 Epoch: 13 Global Step: 228850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:47:21,538-Speed 4412.02 samples/sec Loss 0.7737 Epoch: 13 Global Step: 228900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:47:33,370-Speed 4327.28 samples/sec Loss 0.7795 Epoch: 13 Global Step: 228950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:47:45,153-Speed 4345.36 samples/sec Loss 0.7796 Epoch: 13 Global Step: 229000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:47:57,706-Speed 4078.90 samples/sec Loss 0.7622 Epoch: 13 Global Step: 229050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:48:09,471-Speed 4352.24 samples/sec Loss 0.7898 Epoch: 13 Global Step: 229100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:48:21,512-Speed 4252.35 samples/sec Loss 0.7901 Epoch: 13 Global Step: 229150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:48:33,243-Speed 4364.47 samples/sec Loss 0.7832 Epoch: 13 Global Step: 229200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:48:44,883-Speed 4398.94 samples/sec Loss 0.7679 Epoch: 13 Global Step: 229250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:48:56,623-Speed 4361.11 samples/sec Loss 0.7830 Epoch: 13 Global Step: 229300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:49:08,358-Speed 4363.27 samples/sec Loss 0.7887 Epoch: 13 Global Step: 229350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:49:20,056-Speed 4377.13 samples/sec Loss 0.7833 Epoch: 13 Global Step: 229400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:49:31,799-Speed 4360.28 samples/sec Loss 0.7779 Epoch: 13 Global Step: 229450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:49:43,653-Speed 4319.40 samples/sec Loss 0.7813 Epoch: 13 Global Step: 229500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:49:56,496-Speed 3986.61 samples/sec Loss 0.7966 Epoch: 13 Global Step: 229550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:50:08,391-Speed 4304.44 samples/sec Loss 0.7855 Epoch: 13 Global Step: 229600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:50:20,218-Speed 4329.28 samples/sec Loss 0.7723 Epoch: 13 Global Step: 229650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:50:31,879-Speed 4390.89 samples/sec Loss 0.7856 Epoch: 13 Global Step: 229700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:50:43,622-Speed 4360.23 samples/sec Loss 0.7920 Epoch: 13 Global Step: 229750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:50:55,401-Speed 4346.99 samples/sec Loss 0.7889 Epoch: 13 Global Step: 229800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:51:07,111-Speed 4372.29 samples/sec Loss 0.7650 Epoch: 13 Global Step: 229850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:51:18,899-Speed 4343.86 samples/sec Loss 0.7755 Epoch: 13 Global Step: 229900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:51:30,690-Speed 4342.31 samples/sec Loss 0.7740 Epoch: 13 Global Step: 229950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:51:42,397-Speed 4373.82 samples/sec Loss 0.7785 Epoch: 13 Global Step: 230000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:52:12,576-[lfw][230000]XNorm: 21.857025 Training: 2021-03-15 15:52:12,576-[lfw][230000]Accuracy-Flip: 0.99800+-0.00287 Training: 2021-03-15 15:52:12,576-[lfw][230000]Accuracy-Highest: 0.99833 Training: 2021-03-15 15:52:47,566-[cfp_fp][230000]XNorm: 20.988020 Training: 2021-03-15 15:52:47,567-[cfp_fp][230000]Accuracy-Flip: 0.99043+-0.00456 Training: 2021-03-15 15:52:47,567-[cfp_fp][230000]Accuracy-Highest: 0.99086 Training: 2021-03-15 15:53:17,781-[agedb_30][230000]XNorm: 22.131515 Training: 2021-03-15 15:53:17,781-[agedb_30][230000]Accuracy-Flip: 0.98133+-0.00698 Training: 2021-03-15 15:53:17,781-[agedb_30][230000]Accuracy-Highest: 0.98300 Training: 2021-03-15 15:53:29,656-Speed 477.35 samples/sec Loss 0.7683 Epoch: 13 Global Step: 230050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:53:41,402-Speed 4359.21 samples/sec Loss 0.7718 Epoch: 13 Global Step: 230100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:53:53,206-Speed 4337.63 samples/sec Loss 0.7754 Epoch: 13 Global Step: 230150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:54:05,074-Speed 4314.11 samples/sec Loss 0.7865 Epoch: 13 Global Step: 230200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:54:16,868-Speed 4341.69 samples/sec Loss 0.7809 Epoch: 13 Global Step: 230250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:54:28,550-Speed 4382.86 samples/sec Loss 0.7923 Epoch: 13 Global Step: 230300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:54:40,355-Speed 4337.22 samples/sec Loss 0.7762 Epoch: 13 Global Step: 230350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:54:52,821-Speed 4107.37 samples/sec Loss 0.7789 Epoch: 13 Global Step: 230400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:55:05,785-Speed 3949.49 samples/sec Loss 0.7839 Epoch: 13 Global Step: 230450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:55:17,448-Speed 4390.27 samples/sec Loss 0.7889 Epoch: 13 Global Step: 230500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:55:29,224-Speed 4347.88 samples/sec Loss 0.7866 Epoch: 13 Global Step: 230550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:55:40,805-Speed 4421.25 samples/sec Loss 0.7805 Epoch: 13 Global Step: 230600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:55:52,550-Speed 4359.67 samples/sec Loss 0.7794 Epoch: 13 Global Step: 230650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:56:04,528-Speed 4274.61 samples/sec Loss 0.7912 Epoch: 13 Global Step: 230700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:56:16,084-Speed 4430.67 samples/sec Loss 0.7918 Epoch: 13 Global Step: 230750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:56:27,839-Speed 4355.90 samples/sec Loss 0.7905 Epoch: 13 Global Step: 230800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:56:40,589-Speed 4015.91 samples/sec Loss 0.7861 Epoch: 13 Global Step: 230850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:56:52,274-Speed 4381.88 samples/sec Loss 0.7959 Epoch: 13 Global Step: 230900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:57:05,026-Speed 4015.08 samples/sec Loss 0.7933 Epoch: 13 Global Step: 230950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:57:17,564-Speed 4083.75 samples/sec Loss 0.7865 Epoch: 13 Global Step: 231000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:57:29,453-Speed 4306.78 samples/sec Loss 0.7901 Epoch: 13 Global Step: 231050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:57:41,462-Speed 4263.50 samples/sec Loss 0.7768 Epoch: 13 Global Step: 231100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:57:54,067-Speed 4061.89 samples/sec Loss 0.8008 Epoch: 13 Global Step: 231150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:58:05,935-Speed 4314.50 samples/sec Loss 0.7639 Epoch: 13 Global Step: 231200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:58:17,911-Speed 4275.42 samples/sec Loss 0.7711 Epoch: 13 Global Step: 231250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:58:29,600-Speed 4380.10 samples/sec Loss 0.7908 Epoch: 13 Global Step: 231300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:58:41,446-Speed 4322.50 samples/sec Loss 0.7674 Epoch: 13 Global Step: 231350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:58:53,278-Speed 4327.45 samples/sec Loss 0.7779 Epoch: 13 Global Step: 231400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:59:05,063-Speed 4344.45 samples/sec Loss 0.7849 Epoch: 13 Global Step: 231450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:59:16,894-Speed 4328.10 samples/sec Loss 0.7899 Epoch: 13 Global Step: 231500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:59:28,621-Speed 4365.89 samples/sec Loss 0.7831 Epoch: 13 Global Step: 231550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:59:40,385-Speed 4352.53 samples/sec Loss 0.7858 Epoch: 13 Global Step: 231600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 15:59:52,036-Speed 4394.68 samples/sec Loss 0.7884 Epoch: 13 Global Step: 231650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:00:04,673-Speed 4051.66 samples/sec Loss 0.7819 Epoch: 13 Global Step: 231700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:00:16,352-Speed 4384.08 samples/sec Loss 0.7768 Epoch: 13 Global Step: 231750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:00:28,065-Speed 4371.49 samples/sec Loss 0.7869 Epoch: 13 Global Step: 231800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:00:39,815-Speed 4357.65 samples/sec Loss 0.7891 Epoch: 13 Global Step: 231850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:00:51,682-Speed 4314.64 samples/sec Loss 0.7887 Epoch: 13 Global Step: 231900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:01:04,302-Speed 4057.17 samples/sec Loss 0.7761 Epoch: 13 Global Step: 231950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:01:16,086-Speed 4344.90 samples/sec Loss 0.8029 Epoch: 13 Global Step: 232000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:01:46,280-[lfw][232000]XNorm: 22.044576 Training: 2021-03-15 16:01:46,280-[lfw][232000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 16:01:46,280-[lfw][232000]Accuracy-Highest: 0.99833 Training: 2021-03-15 16:02:21,243-[cfp_fp][232000]XNorm: 21.690421 Training: 2021-03-15 16:02:21,243-[cfp_fp][232000]Accuracy-Flip: 0.98971+-0.00481 Training: 2021-03-15 16:02:21,243-[cfp_fp][232000]Accuracy-Highest: 0.99086 Training: 2021-03-15 16:02:51,438-[agedb_30][232000]XNorm: 22.546963 Training: 2021-03-15 16:02:51,438-[agedb_30][232000]Accuracy-Flip: 0.98250+-0.00638 Training: 2021-03-15 16:02:51,438-[agedb_30][232000]Accuracy-Highest: 0.98300 Training: 2021-03-15 16:03:03,201-Speed 478.00 samples/sec Loss 0.7927 Epoch: 13 Global Step: 232050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:03:15,063-Speed 4316.30 samples/sec Loss 0.7672 Epoch: 13 Global Step: 232100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:03:26,843-Speed 4346.59 samples/sec Loss 0.8071 Epoch: 13 Global Step: 232150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:03:38,380-Speed 4438.15 samples/sec Loss 0.7827 Epoch: 13 Global Step: 232200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:03:49,789-Speed 4487.96 samples/sec Loss 0.7792 Epoch: 13 Global Step: 232250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:04:01,929-Speed 4217.41 samples/sec Loss 0.7898 Epoch: 13 Global Step: 232300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:04:13,759-Speed 4328.27 samples/sec Loss 0.7739 Epoch: 13 Global Step: 232350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:04:25,507-Speed 4358.25 samples/sec Loss 0.7829 Epoch: 13 Global Step: 232400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:04:37,176-Speed 4387.88 samples/sec Loss 0.8058 Epoch: 13 Global Step: 232450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:04:48,894-Speed 4369.58 samples/sec Loss 0.7898 Epoch: 13 Global Step: 232500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:05:00,740-Speed 4322.39 samples/sec Loss 0.7758 Epoch: 13 Global Step: 232550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:05:12,783-Speed 4251.50 samples/sec Loss 0.7787 Epoch: 13 Global Step: 232600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:05:24,524-Speed 4360.82 samples/sec Loss 0.7847 Epoch: 13 Global Step: 232650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:05:36,470-Speed 4286.11 samples/sec Loss 0.7898 Epoch: 13 Global Step: 232700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:05:48,182-Speed 4371.93 samples/sec Loss 0.7959 Epoch: 13 Global Step: 232750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:06:00,902-Speed 4025.20 samples/sec Loss 0.7846 Epoch: 13 Global Step: 232800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:06:12,729-Speed 4329.36 samples/sec Loss 0.7854 Epoch: 13 Global Step: 232850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:06:24,551-Speed 4331.20 samples/sec Loss 0.7809 Epoch: 13 Global Step: 232900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:06:36,149-Speed 4414.80 samples/sec Loss 0.7687 Epoch: 13 Global Step: 232950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:06:48,055-Speed 4300.29 samples/sec Loss 0.7897 Epoch: 13 Global Step: 233000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:07:00,778-Speed 4024.46 samples/sec Loss 0.7965 Epoch: 13 Global Step: 233050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:07:12,506-Speed 4365.57 samples/sec Loss 0.7911 Epoch: 13 Global Step: 233100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:07:24,126-Speed 4406.70 samples/sec Loss 0.7692 Epoch: 13 Global Step: 233150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:07:36,125-Speed 4267.03 samples/sec Loss 0.7863 Epoch: 13 Global Step: 233200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:07:47,861-Speed 4362.84 samples/sec Loss 0.7898 Epoch: 13 Global Step: 233250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:07:59,449-Speed 4418.48 samples/sec Loss 0.7710 Epoch: 13 Global Step: 233300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:08:11,214-Speed 4352.02 samples/sec Loss 0.7890 Epoch: 13 Global Step: 233350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:08:23,013-Speed 4339.71 samples/sec Loss 0.7769 Epoch: 13 Global Step: 233400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:08:34,728-Speed 4370.72 samples/sec Loss 0.7959 Epoch: 13 Global Step: 233450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:08:46,736-Speed 4263.91 samples/sec Loss 0.7798 Epoch: 13 Global Step: 233500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:09:00,116-Speed 3826.67 samples/sec Loss 0.7903 Epoch: 13 Global Step: 233550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:09:12,062-Speed 4286.34 samples/sec Loss 0.7885 Epoch: 13 Global Step: 233600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:09:24,571-Speed 4093.22 samples/sec Loss 0.7789 Epoch: 13 Global Step: 233650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:09:48,975-Speed 2098.03 samples/sec Loss 0.7501 Epoch: 14 Global Step: 233700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:10:02,546-Speed 3772.87 samples/sec Loss 0.6881 Epoch: 14 Global Step: 233750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:10:14,517-Speed 4277.15 samples/sec Loss 0.6844 Epoch: 14 Global Step: 233800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:10:26,725-Speed 4194.22 samples/sec Loss 0.6879 Epoch: 14 Global Step: 233850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:10:38,594-Speed 4313.90 samples/sec Loss 0.6762 Epoch: 14 Global Step: 233900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:10:50,441-Speed 4322.10 samples/sec Loss 0.6814 Epoch: 14 Global Step: 233950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:11:02,291-Speed 4320.97 samples/sec Loss 0.6933 Epoch: 14 Global Step: 234000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:11:32,712-[lfw][234000]XNorm: 22.093678 Training: 2021-03-15 16:11:32,713-[lfw][234000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 16:11:32,713-[lfw][234000]Accuracy-Highest: 0.99833 Training: 2021-03-15 16:12:07,942-[cfp_fp][234000]XNorm: 21.674746 Training: 2021-03-15 16:12:07,943-[cfp_fp][234000]Accuracy-Flip: 0.99029+-0.00455 Training: 2021-03-15 16:12:07,943-[cfp_fp][234000]Accuracy-Highest: 0.99086 Training: 2021-03-15 16:12:38,290-[agedb_30][234000]XNorm: 22.627067 Training: 2021-03-15 16:12:38,291-[agedb_30][234000]Accuracy-Flip: 0.98300+-0.00682 Training: 2021-03-15 16:12:38,291-[agedb_30][234000]Accuracy-Highest: 0.98300 Training: 2021-03-15 16:12:50,008-Speed 475.32 samples/sec Loss 0.6624 Epoch: 14 Global Step: 234050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:13:01,852-Speed 4323.16 samples/sec Loss 0.6664 Epoch: 14 Global Step: 234100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:13:13,668-Speed 4333.32 samples/sec Loss 0.6644 Epoch: 14 Global Step: 234150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:13:25,275-Speed 4411.10 samples/sec Loss 0.6619 Epoch: 14 Global Step: 234200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:13:37,014-Speed 4361.86 samples/sec Loss 0.6662 Epoch: 14 Global Step: 234250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:13:48,643-Speed 4402.83 samples/sec Loss 0.6798 Epoch: 14 Global Step: 234300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:14:02,005-Speed 3831.87 samples/sec Loss 0.6598 Epoch: 14 Global Step: 234350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:14:13,506-Speed 4452.01 samples/sec Loss 0.6660 Epoch: 14 Global Step: 234400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:14:25,438-Speed 4291.20 samples/sec Loss 0.6518 Epoch: 14 Global Step: 234450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:14:37,287-Speed 4321.33 samples/sec Loss 0.6678 Epoch: 14 Global Step: 234500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:14:48,987-Speed 4376.26 samples/sec Loss 0.6581 Epoch: 14 Global Step: 234550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:15:00,608-Speed 4405.87 samples/sec Loss 0.6363 Epoch: 14 Global Step: 234600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:15:12,519-Speed 4298.62 samples/sec Loss 0.6474 Epoch: 14 Global Step: 234650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:15:24,171-Speed 4394.50 samples/sec Loss 0.6437 Epoch: 14 Global Step: 234700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:15:35,856-Speed 4381.74 samples/sec Loss 0.6493 Epoch: 14 Global Step: 234750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:15:47,548-Speed 4379.40 samples/sec Loss 0.6585 Epoch: 14 Global Step: 234800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:15:59,105-Speed 4430.28 samples/sec Loss 0.6576 Epoch: 14 Global Step: 234850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:16:10,930-Speed 4329.85 samples/sec Loss 0.6423 Epoch: 14 Global Step: 234900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:16:22,567-Speed 4400.17 samples/sec Loss 0.6303 Epoch: 14 Global Step: 234950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:16:34,153-Speed 4419.04 samples/sec Loss 0.6490 Epoch: 14 Global Step: 235000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:16:45,891-Speed 4361.99 samples/sec Loss 0.6566 Epoch: 14 Global Step: 235050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:16:57,590-Speed 4376.86 samples/sec Loss 0.6454 Epoch: 14 Global Step: 235100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:17:09,686-Speed 4232.87 samples/sec Loss 0.6555 Epoch: 14 Global Step: 235150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:17:21,268-Speed 4421.09 samples/sec Loss 0.6459 Epoch: 14 Global Step: 235200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:17:33,774-Speed 4093.93 samples/sec Loss 0.6430 Epoch: 14 Global Step: 235250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:17:45,494-Speed 4368.82 samples/sec Loss 0.6455 Epoch: 14 Global Step: 235300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:17:57,086-Speed 4417.22 samples/sec Loss 0.6505 Epoch: 14 Global Step: 235350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:18:08,877-Speed 4342.26 samples/sec Loss 0.6408 Epoch: 14 Global Step: 235400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:18:20,561-Speed 4382.40 samples/sec Loss 0.6544 Epoch: 14 Global Step: 235450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:18:32,280-Speed 4368.98 samples/sec Loss 0.6387 Epoch: 14 Global Step: 235500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:18:43,989-Speed 4373.06 samples/sec Loss 0.6459 Epoch: 14 Global Step: 235550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:18:55,724-Speed 4363.09 samples/sec Loss 0.6449 Epoch: 14 Global Step: 235600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:19:08,262-Speed 4083.66 samples/sec Loss 0.6548 Epoch: 14 Global Step: 235650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:19:20,036-Speed 4348.82 samples/sec Loss 0.6590 Epoch: 14 Global Step: 235700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:19:31,771-Speed 4363.28 samples/sec Loss 0.6463 Epoch: 14 Global Step: 235750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:19:43,356-Speed 4419.49 samples/sec Loss 0.6542 Epoch: 14 Global Step: 235800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:19:55,054-Speed 4376.92 samples/sec Loss 0.6358 Epoch: 14 Global Step: 235850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:20:06,864-Speed 4335.57 samples/sec Loss 0.6356 Epoch: 14 Global Step: 235900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:20:18,571-Speed 4373.68 samples/sec Loss 0.6408 Epoch: 14 Global Step: 235950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:20:30,411-Speed 4324.57 samples/sec Loss 0.6334 Epoch: 14 Global Step: 236000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:21:00,614-[lfw][236000]XNorm: 22.063157 Training: 2021-03-15 16:21:00,615-[lfw][236000]Accuracy-Flip: 0.99783+-0.00289 Training: 2021-03-15 16:21:00,615-[lfw][236000]Accuracy-Highest: 0.99833 Training: 2021-03-15 16:21:35,751-[cfp_fp][236000]XNorm: 21.864097 Training: 2021-03-15 16:21:35,751-[cfp_fp][236000]Accuracy-Flip: 0.99100+-0.00409 Training: 2021-03-15 16:21:35,752-[cfp_fp][236000]Accuracy-Highest: 0.99100 Training: 2021-03-15 16:22:06,135-[agedb_30][236000]XNorm: 22.685214 Training: 2021-03-15 16:22:06,136-[agedb_30][236000]Accuracy-Flip: 0.98283+-0.00703 Training: 2021-03-15 16:22:06,136-[agedb_30][236000]Accuracy-Highest: 0.98300 Training: 2021-03-15 16:22:17,962-Speed 476.05 samples/sec Loss 0.6477 Epoch: 14 Global Step: 236050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:22:29,658-Speed 4377.62 samples/sec Loss 0.6384 Epoch: 14 Global Step: 236100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:22:42,067-Speed 4126.32 samples/sec Loss 0.6584 Epoch: 14 Global Step: 236150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:22:54,885-Speed 3994.40 samples/sec Loss 0.6432 Epoch: 14 Global Step: 236200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:23:07,574-Speed 4035.41 samples/sec Loss 0.6578 Epoch: 14 Global Step: 236250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:23:19,416-Speed 4323.47 samples/sec Loss 0.6350 Epoch: 14 Global Step: 236300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:23:31,862-Speed 4114.10 samples/sec Loss 0.6396 Epoch: 14 Global Step: 236350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:23:43,605-Speed 4360.11 samples/sec Loss 0.6424 Epoch: 14 Global Step: 236400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:23:55,481-Speed 4311.59 samples/sec Loss 0.6358 Epoch: 14 Global Step: 236450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:24:07,523-Speed 4251.75 samples/sec Loss 0.6347 Epoch: 14 Global Step: 236500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:24:19,187-Speed 4389.63 samples/sec Loss 0.6394 Epoch: 14 Global Step: 236550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:24:31,019-Speed 4327.78 samples/sec Loss 0.6274 Epoch: 14 Global Step: 236600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:24:42,685-Speed 4389.06 samples/sec Loss 0.6243 Epoch: 14 Global Step: 236650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:24:54,313-Speed 4403.11 samples/sec Loss 0.6467 Epoch: 14 Global Step: 236700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:25:05,853-Speed 4436.87 samples/sec Loss 0.6456 Epoch: 14 Global Step: 236750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:25:17,540-Speed 4381.28 samples/sec Loss 0.6263 Epoch: 14 Global Step: 236800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:25:30,054-Speed 4091.63 samples/sec Loss 0.6431 Epoch: 14 Global Step: 236850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:25:41,844-Speed 4342.87 samples/sec Loss 0.6296 Epoch: 14 Global Step: 236900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:25:54,304-Speed 4109.12 samples/sec Loss 0.6476 Epoch: 14 Global Step: 236950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:26:06,020-Speed 4370.12 samples/sec Loss 0.6334 Epoch: 14 Global Step: 237000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:26:17,737-Speed 4369.87 samples/sec Loss 0.6279 Epoch: 14 Global Step: 237050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:26:29,711-Speed 4276.19 samples/sec Loss 0.6354 Epoch: 14 Global Step: 237100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:26:41,495-Speed 4345.18 samples/sec Loss 0.6515 Epoch: 14 Global Step: 237150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:26:53,386-Speed 4305.79 samples/sec Loss 0.6438 Epoch: 14 Global Step: 237200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:27:05,004-Speed 4407.33 samples/sec Loss 0.6302 Epoch: 14 Global Step: 237250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:27:16,855-Speed 4320.31 samples/sec Loss 0.6232 Epoch: 14 Global Step: 237300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:27:28,390-Speed 4438.94 samples/sec Loss 0.6397 Epoch: 14 Global Step: 237350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:27:40,050-Speed 4391.44 samples/sec Loss 0.6272 Epoch: 14 Global Step: 237400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:27:51,513-Speed 4466.63 samples/sec Loss 0.6526 Epoch: 14 Global Step: 237450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:28:02,967-Speed 4470.46 samples/sec Loss 0.6325 Epoch: 14 Global Step: 237500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:28:14,494-Speed 4441.58 samples/sec Loss 0.6278 Epoch: 14 Global Step: 237550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:28:26,161-Speed 4388.76 samples/sec Loss 0.6247 Epoch: 14 Global Step: 237600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:28:37,672-Speed 4448.21 samples/sec Loss 0.6424 Epoch: 14 Global Step: 237650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:28:49,595-Speed 4294.35 samples/sec Loss 0.6314 Epoch: 14 Global Step: 237700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:29:01,928-Speed 4151.71 samples/sec Loss 0.6339 Epoch: 14 Global Step: 237750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:29:13,600-Speed 4386.43 samples/sec Loss 0.6341 Epoch: 14 Global Step: 237800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:29:25,305-Speed 4374.34 samples/sec Loss 0.6407 Epoch: 14 Global Step: 237850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:29:37,112-Speed 4336.65 samples/sec Loss 0.6267 Epoch: 14 Global Step: 237900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:29:48,957-Speed 4322.93 samples/sec Loss 0.6313 Epoch: 14 Global Step: 237950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:30:00,638-Speed 4383.33 samples/sec Loss 0.6428 Epoch: 14 Global Step: 238000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:30:30,928-[lfw][238000]XNorm: 22.121038 Training: 2021-03-15 16:30:30,929-[lfw][238000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-15 16:30:30,929-[lfw][238000]Accuracy-Highest: 0.99833 Training: 2021-03-15 16:31:06,210-[cfp_fp][238000]XNorm: 21.862534 Training: 2021-03-15 16:31:06,210-[cfp_fp][238000]Accuracy-Flip: 0.99129+-0.00401 Training: 2021-03-15 16:31:06,210-[cfp_fp][238000]Accuracy-Highest: 0.99129 Training: 2021-03-15 16:31:36,631-[agedb_30][238000]XNorm: 22.756461 Training: 2021-03-15 16:31:36,631-[agedb_30][238000]Accuracy-Flip: 0.98283+-0.00760 Training: 2021-03-15 16:31:36,631-[agedb_30][238000]Accuracy-Highest: 0.98300 Training: 2021-03-15 16:31:48,272-Speed 475.69 samples/sec Loss 0.6273 Epoch: 14 Global Step: 238050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:32:00,075-Speed 4337.79 samples/sec Loss 0.6278 Epoch: 14 Global Step: 238100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:32:12,000-Speed 4293.83 samples/sec Loss 0.6350 Epoch: 14 Global Step: 238150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:32:23,639-Speed 4399.16 samples/sec Loss 0.6332 Epoch: 14 Global Step: 238200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:32:36,096-Speed 4110.37 samples/sec Loss 0.6470 Epoch: 14 Global Step: 238250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:32:47,722-Speed 4404.18 samples/sec Loss 0.6233 Epoch: 14 Global Step: 238300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-15 16:32:59,361-Speed 4399.09 samples/sec Loss 0.6282 Epoch: 14 Global Step: 238350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:33:11,090-Speed 4365.18 samples/sec Loss 0.6191 Epoch: 14 Global Step: 238400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:33:22,806-Speed 4370.37 samples/sec Loss 0.6283 Epoch: 14 Global Step: 238450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:33:34,924-Speed 4225.19 samples/sec Loss 0.6291 Epoch: 14 Global Step: 238500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:33:46,511-Speed 4419.26 samples/sec Loss 0.6421 Epoch: 14 Global Step: 238550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:33:59,136-Speed 4055.59 samples/sec Loss 0.6211 Epoch: 14 Global Step: 238600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:34:10,792-Speed 4392.69 samples/sec Loss 0.6274 Epoch: 14 Global Step: 238650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:34:22,670-Speed 4310.71 samples/sec Loss 0.6354 Epoch: 14 Global Step: 238700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:34:34,472-Speed 4338.34 samples/sec Loss 0.6411 Epoch: 14 Global Step: 238750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:34:46,345-Speed 4312.56 samples/sec Loss 0.6302 Epoch: 14 Global Step: 238800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:34:58,917-Speed 4072.39 samples/sec Loss 0.6341 Epoch: 14 Global Step: 238850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:35:10,583-Speed 4389.08 samples/sec Loss 0.6294 Epoch: 14 Global Step: 238900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:35:22,355-Speed 4349.51 samples/sec Loss 0.6364 Epoch: 14 Global Step: 238950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:35:35,084-Speed 4022.58 samples/sec Loss 0.6296 Epoch: 14 Global Step: 239000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:35:47,644-Speed 4076.55 samples/sec Loss 0.6407 Epoch: 14 Global Step: 239050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:35:59,389-Speed 4359.45 samples/sec Loss 0.6327 Epoch: 14 Global Step: 239100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:36:11,275-Speed 4307.70 samples/sec Loss 0.6218 Epoch: 14 Global Step: 239150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:36:23,328-Speed 4248.05 samples/sec Loss 0.6352 Epoch: 14 Global Step: 239200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:36:35,176-Speed 4321.61 samples/sec Loss 0.6230 Epoch: 14 Global Step: 239250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:36:46,901-Speed 4366.92 samples/sec Loss 0.6334 Epoch: 14 Global Step: 239300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:36:59,266-Speed 4140.88 samples/sec Loss 0.6260 Epoch: 14 Global Step: 239350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:37:11,086-Speed 4331.71 samples/sec Loss 0.6342 Epoch: 14 Global Step: 239400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:37:22,772-Speed 4381.79 samples/sec Loss 0.6264 Epoch: 14 Global Step: 239450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:37:34,527-Speed 4355.42 samples/sec Loss 0.6255 Epoch: 14 Global Step: 239500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:37:46,190-Speed 4390.22 samples/sec Loss 0.6313 Epoch: 14 Global Step: 239550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:37:57,962-Speed 4349.60 samples/sec Loss 0.6412 Epoch: 14 Global Step: 239600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:38:09,726-Speed 4352.36 samples/sec Loss 0.6223 Epoch: 14 Global Step: 239650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:38:22,468-Speed 4018.27 samples/sec Loss 0.6227 Epoch: 14 Global Step: 239700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:38:34,166-Speed 4377.24 samples/sec Loss 0.6291 Epoch: 14 Global Step: 239750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:38:46,131-Speed 4279.22 samples/sec Loss 0.6238 Epoch: 14 Global Step: 239800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:38:57,858-Speed 4366.11 samples/sec Loss 0.6180 Epoch: 14 Global Step: 239850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:39:09,623-Speed 4352.05 samples/sec Loss 0.6293 Epoch: 14 Global Step: 239900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:39:21,356-Speed 4364.05 samples/sec Loss 0.6345 Epoch: 14 Global Step: 239950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:39:33,385-Speed 4256.56 samples/sec Loss 0.6242 Epoch: 14 Global Step: 240000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:40:03,389-[lfw][240000]XNorm: 21.966635 Training: 2021-03-15 16:40:03,389-[lfw][240000]Accuracy-Flip: 0.99800+-0.00287 Training: 2021-03-15 16:40:03,389-[lfw][240000]Accuracy-Highest: 0.99833 Training: 2021-03-15 16:40:38,376-[cfp_fp][240000]XNorm: 21.758907 Training: 2021-03-15 16:40:38,376-[cfp_fp][240000]Accuracy-Flip: 0.99171+-0.00377 Training: 2021-03-15 16:40:38,376-[cfp_fp][240000]Accuracy-Highest: 0.99171 Training: 2021-03-15 16:41:08,532-[agedb_30][240000]XNorm: 22.626100 Training: 2021-03-15 16:41:08,533-[agedb_30][240000]Accuracy-Flip: 0.98333+-0.00691 Training: 2021-03-15 16:41:08,533-[agedb_30][240000]Accuracy-Highest: 0.98333 Training: 2021-03-15 16:41:20,243-Speed 479.14 samples/sec Loss 0.6262 Epoch: 14 Global Step: 240050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:41:31,852-Speed 4410.67 samples/sec Loss 0.6336 Epoch: 14 Global Step: 240100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:41:43,398-Speed 4434.42 samples/sec Loss 0.6290 Epoch: 14 Global Step: 240150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:41:55,260-Speed 4316.80 samples/sec Loss 0.6272 Epoch: 14 Global Step: 240200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:42:07,885-Speed 4055.29 samples/sec Loss 0.6388 Epoch: 14 Global Step: 240250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:42:19,628-Speed 4360.26 samples/sec Loss 0.6084 Epoch: 14 Global Step: 240300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:42:31,361-Speed 4363.99 samples/sec Loss 0.6236 Epoch: 14 Global Step: 240350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:42:43,072-Speed 4372.08 samples/sec Loss 0.6191 Epoch: 14 Global Step: 240400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:42:54,820-Speed 4358.50 samples/sec Loss 0.6192 Epoch: 14 Global Step: 240450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:43:06,436-Speed 4407.75 samples/sec Loss 0.6257 Epoch: 14 Global Step: 240500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:43:18,035-Speed 4414.34 samples/sec Loss 0.6270 Epoch: 14 Global Step: 240550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:43:29,740-Speed 4374.33 samples/sec Loss 0.6322 Epoch: 14 Global Step: 240600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:43:41,659-Speed 4295.96 samples/sec Loss 0.6349 Epoch: 14 Global Step: 240650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:43:53,299-Speed 4398.88 samples/sec Loss 0.6201 Epoch: 14 Global Step: 240700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:44:05,064-Speed 4351.91 samples/sec Loss 0.6110 Epoch: 14 Global Step: 240750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:44:16,895-Speed 4327.96 samples/sec Loss 0.6233 Epoch: 14 Global Step: 240800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:44:28,675-Speed 4346.53 samples/sec Loss 0.6373 Epoch: 14 Global Step: 240850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:44:41,108-Speed 4118.23 samples/sec Loss 0.6233 Epoch: 14 Global Step: 240900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:44:52,934-Speed 4329.75 samples/sec Loss 0.6152 Epoch: 14 Global Step: 240950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:45:05,366-Speed 4118.55 samples/sec Loss 0.6228 Epoch: 14 Global Step: 241000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:45:17,025-Speed 4391.64 samples/sec Loss 0.6296 Epoch: 14 Global Step: 241050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:45:28,733-Speed 4372.99 samples/sec Loss 0.6220 Epoch: 14 Global Step: 241100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:45:40,522-Speed 4343.35 samples/sec Loss 0.6137 Epoch: 14 Global Step: 241150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:45:52,095-Speed 4424.05 samples/sec Loss 0.6329 Epoch: 14 Global Step: 241200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:46:03,838-Speed 4360.47 samples/sec Loss 0.6285 Epoch: 14 Global Step: 241250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:46:15,714-Speed 4311.33 samples/sec Loss 0.6185 Epoch: 14 Global Step: 241300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:46:27,619-Speed 4300.91 samples/sec Loss 0.6276 Epoch: 14 Global Step: 241350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:46:39,385-Speed 4351.42 samples/sec Loss 0.6246 Epoch: 14 Global Step: 241400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:46:51,057-Speed 4386.94 samples/sec Loss 0.6482 Epoch: 14 Global Step: 241450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:47:03,847-Speed 4003.16 samples/sec Loss 0.6260 Epoch: 14 Global Step: 241500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:47:15,644-Speed 4340.36 samples/sec Loss 0.6212 Epoch: 14 Global Step: 241550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:47:27,059-Speed 4485.62 samples/sec Loss 0.6150 Epoch: 14 Global Step: 241600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:47:38,916-Speed 4318.23 samples/sec Loss 0.6377 Epoch: 14 Global Step: 241650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:47:51,415-Speed 4096.63 samples/sec Loss 0.6224 Epoch: 14 Global Step: 241700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:48:04,032-Speed 4058.09 samples/sec Loss 0.6301 Epoch: 14 Global Step: 241750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:48:16,673-Speed 4050.32 samples/sec Loss 0.6284 Epoch: 14 Global Step: 241800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:48:28,690-Speed 4260.99 samples/sec Loss 0.6399 Epoch: 14 Global Step: 241850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:48:40,320-Speed 4402.37 samples/sec Loss 0.6295 Epoch: 14 Global Step: 241900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:48:51,899-Speed 4422.18 samples/sec Loss 0.6235 Epoch: 14 Global Step: 241950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:49:03,695-Speed 4340.67 samples/sec Loss 0.6271 Epoch: 14 Global Step: 242000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:49:33,797-[lfw][242000]XNorm: 22.060869 Training: 2021-03-15 16:49:33,797-[lfw][242000]Accuracy-Flip: 0.99800+-0.00296 Training: 2021-03-15 16:49:33,797-[lfw][242000]Accuracy-Highest: 0.99833 Training: 2021-03-15 16:50:08,889-[cfp_fp][242000]XNorm: 21.884803 Training: 2021-03-15 16:50:08,889-[cfp_fp][242000]Accuracy-Flip: 0.99157+-0.00391 Training: 2021-03-15 16:50:08,889-[cfp_fp][242000]Accuracy-Highest: 0.99171 Training: 2021-03-15 16:50:39,158-[agedb_30][242000]XNorm: 22.721823 Training: 2021-03-15 16:50:39,159-[agedb_30][242000]Accuracy-Flip: 0.98267+-0.00727 Training: 2021-03-15 16:50:39,159-[agedb_30][242000]Accuracy-Highest: 0.98333 Training: 2021-03-15 16:50:50,813-Speed 477.98 samples/sec Loss 0.6255 Epoch: 14 Global Step: 242050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:51:02,550-Speed 4362.38 samples/sec Loss 0.6168 Epoch: 14 Global Step: 242100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:51:14,166-Speed 4407.93 samples/sec Loss 0.6322 Epoch: 14 Global Step: 242150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:51:25,891-Speed 4367.16 samples/sec Loss 0.6348 Epoch: 14 Global Step: 242200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:51:37,766-Speed 4311.64 samples/sec Loss 0.6194 Epoch: 14 Global Step: 242250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:51:49,383-Speed 4407.32 samples/sec Loss 0.6264 Epoch: 14 Global Step: 242300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:52:01,145-Speed 4353.18 samples/sec Loss 0.6224 Epoch: 14 Global Step: 242350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:52:12,920-Speed 4348.49 samples/sec Loss 0.6314 Epoch: 14 Global Step: 242400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:52:25,515-Speed 4065.36 samples/sec Loss 0.6252 Epoch: 14 Global Step: 242450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:52:37,202-Speed 4380.93 samples/sec Loss 0.6401 Epoch: 14 Global Step: 242500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:52:49,180-Speed 4274.64 samples/sec Loss 0.6282 Epoch: 14 Global Step: 242550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:53:00,947-Speed 4351.25 samples/sec Loss 0.6311 Epoch: 14 Global Step: 242600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:53:12,672-Speed 4367.16 samples/sec Loss 0.6354 Epoch: 14 Global Step: 242650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:53:24,366-Speed 4378.27 samples/sec Loss 0.6235 Epoch: 14 Global Step: 242700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:53:36,138-Speed 4349.62 samples/sec Loss 0.6203 Epoch: 14 Global Step: 242750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:53:48,588-Speed 4112.66 samples/sec Loss 0.6107 Epoch: 14 Global Step: 242800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:54:00,346-Speed 4354.51 samples/sec Loss 0.6136 Epoch: 14 Global Step: 242850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:54:12,181-Speed 4326.40 samples/sec Loss 0.6217 Epoch: 14 Global Step: 242900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:54:24,299-Speed 4225.10 samples/sec Loss 0.6324 Epoch: 14 Global Step: 242950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:54:36,125-Speed 4329.77 samples/sec Loss 0.6085 Epoch: 14 Global Step: 243000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:54:47,604-Speed 4460.27 samples/sec Loss 0.6149 Epoch: 14 Global Step: 243050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:54:59,366-Speed 4353.46 samples/sec Loss 0.6301 Epoch: 14 Global Step: 243100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:55:11,272-Speed 4300.32 samples/sec Loss 0.6363 Epoch: 14 Global Step: 243150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:55:23,109-Speed 4325.57 samples/sec Loss 0.6212 Epoch: 14 Global Step: 243200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:55:34,790-Speed 4383.51 samples/sec Loss 0.6337 Epoch: 14 Global Step: 243250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:55:46,694-Speed 4301.31 samples/sec Loss 0.6327 Epoch: 14 Global Step: 243300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:55:58,865-Speed 4206.69 samples/sec Loss 0.6245 Epoch: 14 Global Step: 243350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:56:10,658-Speed 4341.86 samples/sec Loss 0.6136 Epoch: 14 Global Step: 243400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:56:22,280-Speed 4405.48 samples/sec Loss 0.6355 Epoch: 14 Global Step: 243450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:56:35,755-Speed 3799.98 samples/sec Loss 0.6271 Epoch: 14 Global Step: 243500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:56:47,425-Speed 4387.27 samples/sec Loss 0.6327 Epoch: 14 Global Step: 243550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:56:59,162-Speed 4362.51 samples/sec Loss 0.6341 Epoch: 14 Global Step: 243600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:57:10,921-Speed 4354.12 samples/sec Loss 0.6192 Epoch: 14 Global Step: 243650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:57:22,633-Speed 4371.75 samples/sec Loss 0.6121 Epoch: 14 Global Step: 243700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:57:34,353-Speed 4369.11 samples/sec Loss 0.6344 Epoch: 14 Global Step: 243750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:57:46,200-Speed 4321.80 samples/sec Loss 0.6221 Epoch: 14 Global Step: 243800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:57:58,002-Speed 4338.36 samples/sec Loss 0.6197 Epoch: 14 Global Step: 243850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:58:09,616-Speed 4408.58 samples/sec Loss 0.6230 Epoch: 14 Global Step: 243900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:58:21,471-Speed 4319.27 samples/sec Loss 0.6158 Epoch: 14 Global Step: 243950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:58:33,360-Speed 4306.44 samples/sec Loss 0.6349 Epoch: 14 Global Step: 244000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 16:59:03,638-[lfw][244000]XNorm: 22.084265 Training: 2021-03-15 16:59:03,638-[lfw][244000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 16:59:03,638-[lfw][244000]Accuracy-Highest: 0.99833 Training: 2021-03-15 16:59:38,803-[cfp_fp][244000]XNorm: 21.858084 Training: 2021-03-15 16:59:38,804-[cfp_fp][244000]Accuracy-Flip: 0.99129+-0.00401 Training: 2021-03-15 16:59:38,804-[cfp_fp][244000]Accuracy-Highest: 0.99171 Training: 2021-03-15 17:00:09,091-[agedb_30][244000]XNorm: 22.671280 Training: 2021-03-15 17:00:09,091-[agedb_30][244000]Accuracy-Flip: 0.98250+-0.00757 Training: 2021-03-15 17:00:09,091-[agedb_30][244000]Accuracy-Highest: 0.98333 Training: 2021-03-15 17:00:21,090-Speed 475.27 samples/sec Loss 0.6202 Epoch: 14 Global Step: 244050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:00:32,659-Speed 4425.76 samples/sec Loss 0.6311 Epoch: 14 Global Step: 244100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:00:44,343-Speed 4382.09 samples/sec Loss 0.6297 Epoch: 14 Global Step: 244150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:00:57,927-Speed 3769.24 samples/sec Loss 0.6171 Epoch: 14 Global Step: 244200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:01:09,427-Speed 4452.48 samples/sec Loss 0.6261 Epoch: 14 Global Step: 244250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:01:21,888-Speed 4109.01 samples/sec Loss 0.6185 Epoch: 14 Global Step: 244300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:01:33,513-Speed 4404.23 samples/sec Loss 0.6268 Epoch: 14 Global Step: 244350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:01:45,205-Speed 4379.21 samples/sec Loss 0.6250 Epoch: 14 Global Step: 244400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:01:56,901-Speed 4378.06 samples/sec Loss 0.6269 Epoch: 14 Global Step: 244450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:02:08,655-Speed 4356.06 samples/sec Loss 0.6207 Epoch: 14 Global Step: 244500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:02:21,176-Speed 4089.32 samples/sec Loss 0.6329 Epoch: 14 Global Step: 244550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:02:33,063-Speed 4307.23 samples/sec Loss 0.6212 Epoch: 14 Global Step: 244600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:02:44,839-Speed 4347.94 samples/sec Loss 0.6175 Epoch: 14 Global Step: 244650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:02:56,813-Speed 4276.36 samples/sec Loss 0.6189 Epoch: 14 Global Step: 244700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:03:08,621-Speed 4335.94 samples/sec Loss 0.6238 Epoch: 14 Global Step: 244750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:03:20,360-Speed 4362.00 samples/sec Loss 0.6302 Epoch: 14 Global Step: 244800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:03:32,138-Speed 4347.13 samples/sec Loss 0.6299 Epoch: 14 Global Step: 244850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:03:43,924-Speed 4344.17 samples/sec Loss 0.6213 Epoch: 14 Global Step: 244900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:03:55,667-Speed 4360.31 samples/sec Loss 0.6309 Epoch: 14 Global Step: 244950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:04:07,216-Speed 4433.44 samples/sec Loss 0.6279 Epoch: 14 Global Step: 245000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:04:19,020-Speed 4337.73 samples/sec Loss 0.6358 Epoch: 14 Global Step: 245050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:04:30,884-Speed 4315.80 samples/sec Loss 0.6175 Epoch: 14 Global Step: 245100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:04:43,600-Speed 4026.50 samples/sec Loss 0.6216 Epoch: 14 Global Step: 245150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:04:55,101-Speed 4451.98 samples/sec Loss 0.6260 Epoch: 14 Global Step: 245200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:05:06,724-Speed 4405.34 samples/sec Loss 0.6260 Epoch: 14 Global Step: 245250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:05:18,403-Speed 4383.91 samples/sec Loss 0.6162 Epoch: 14 Global Step: 245300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:05:31,070-Speed 4042.08 samples/sec Loss 0.6210 Epoch: 14 Global Step: 245350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:05:42,878-Speed 4336.32 samples/sec Loss 0.6251 Epoch: 14 Global Step: 245400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:05:54,746-Speed 4314.35 samples/sec Loss 0.6256 Epoch: 14 Global Step: 245450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:06:06,437-Speed 4379.45 samples/sec Loss 0.6194 Epoch: 14 Global Step: 245500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:06:18,372-Speed 4290.33 samples/sec Loss 0.6199 Epoch: 14 Global Step: 245550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:06:30,316-Speed 4286.63 samples/sec Loss 0.6207 Epoch: 14 Global Step: 245600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:06:41,983-Speed 4388.89 samples/sec Loss 0.6370 Epoch: 14 Global Step: 245650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:06:53,752-Speed 4350.20 samples/sec Loss 0.6056 Epoch: 14 Global Step: 245700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:07:05,536-Speed 4345.27 samples/sec Loss 0.6140 Epoch: 14 Global Step: 245750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:07:17,166-Speed 4402.56 samples/sec Loss 0.6121 Epoch: 14 Global Step: 245800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:07:28,812-Speed 4396.46 samples/sec Loss 0.6193 Epoch: 14 Global Step: 245850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:07:40,482-Speed 4387.63 samples/sec Loss 0.6165 Epoch: 14 Global Step: 245900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:07:51,971-Speed 4456.48 samples/sec Loss 0.6290 Epoch: 14 Global Step: 245950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:08:04,092-Speed 4224.12 samples/sec Loss 0.6255 Epoch: 14 Global Step: 246000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:08:34,354-[lfw][246000]XNorm: 21.901094 Training: 2021-03-15 17:08:34,354-[lfw][246000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 17:08:34,354-[lfw][246000]Accuracy-Highest: 0.99833 Training: 2021-03-15 17:09:09,462-[cfp_fp][246000]XNorm: 21.805831 Training: 2021-03-15 17:09:09,463-[cfp_fp][246000]Accuracy-Flip: 0.99100+-0.00474 Training: 2021-03-15 17:09:09,463-[cfp_fp][246000]Accuracy-Highest: 0.99171 Training: 2021-03-15 17:09:39,791-[agedb_30][246000]XNorm: 22.627938 Training: 2021-03-15 17:09:39,791-[agedb_30][246000]Accuracy-Flip: 0.98250+-0.00821 Training: 2021-03-15 17:09:39,791-[agedb_30][246000]Accuracy-Highest: 0.98333 Training: 2021-03-15 17:09:52,281-Speed 473.25 samples/sec Loss 0.6181 Epoch: 14 Global Step: 246050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:10:04,180-Speed 4303.05 samples/sec Loss 0.6145 Epoch: 14 Global Step: 246100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:10:15,788-Speed 4410.70 samples/sec Loss 0.6182 Epoch: 14 Global Step: 246150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:10:28,349-Speed 4076.28 samples/sec Loss 0.6282 Epoch: 14 Global Step: 246200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:10:40,024-Speed 4385.85 samples/sec Loss 0.6146 Epoch: 14 Global Step: 246250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:10:51,683-Speed 4391.30 samples/sec Loss 0.6299 Epoch: 14 Global Step: 246300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:11:03,195-Speed 4447.97 samples/sec Loss 0.6195 Epoch: 14 Global Step: 246350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:11:15,019-Speed 4330.45 samples/sec Loss 0.6209 Epoch: 14 Global Step: 246400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:11:26,795-Speed 4347.91 samples/sec Loss 0.6147 Epoch: 14 Global Step: 246450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:11:38,615-Speed 4331.73 samples/sec Loss 0.6208 Epoch: 14 Global Step: 246500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:11:50,371-Speed 4355.31 samples/sec Loss 0.6209 Epoch: 14 Global Step: 246550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:12:02,436-Speed 4244.00 samples/sec Loss 0.6136 Epoch: 14 Global Step: 246600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:12:14,899-Speed 4108.26 samples/sec Loss 0.6100 Epoch: 14 Global Step: 246650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:12:26,676-Speed 4347.62 samples/sec Loss 0.6201 Epoch: 14 Global Step: 246700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:12:39,436-Speed 4012.67 samples/sec Loss 0.6143 Epoch: 14 Global Step: 246750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:12:51,441-Speed 4265.09 samples/sec Loss 0.6217 Epoch: 14 Global Step: 246800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:13:03,057-Speed 4407.84 samples/sec Loss 0.6364 Epoch: 14 Global Step: 246850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:13:14,744-Speed 4381.00 samples/sec Loss 0.6222 Epoch: 14 Global Step: 246900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:13:26,757-Speed 4262.18 samples/sec Loss 0.6219 Epoch: 14 Global Step: 246950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:13:38,333-Speed 4423.47 samples/sec Loss 0.6291 Epoch: 14 Global Step: 247000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:13:50,824-Speed 4099.06 samples/sec Loss 0.6267 Epoch: 14 Global Step: 247050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:14:02,837-Speed 4262.05 samples/sec Loss 0.6210 Epoch: 14 Global Step: 247100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:14:14,419-Speed 4420.65 samples/sec Loss 0.6233 Epoch: 14 Global Step: 247150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:14:26,525-Speed 4229.74 samples/sec Loss 0.6321 Epoch: 14 Global Step: 247200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:14:38,960-Speed 4117.45 samples/sec Loss 0.6104 Epoch: 14 Global Step: 247250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:14:50,924-Speed 4279.68 samples/sec Loss 0.6289 Epoch: 14 Global Step: 247300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:15:02,696-Speed 4349.34 samples/sec Loss 0.6290 Epoch: 14 Global Step: 247350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:15:14,356-Speed 4391.40 samples/sec Loss 0.6173 Epoch: 14 Global Step: 247400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:15:25,989-Speed 4401.35 samples/sec Loss 0.6141 Epoch: 14 Global Step: 247450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:15:38,055-Speed 4243.73 samples/sec Loss 0.6267 Epoch: 14 Global Step: 247500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:15:49,775-Speed 4368.73 samples/sec Loss 0.6181 Epoch: 14 Global Step: 247550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:16:01,745-Speed 4277.48 samples/sec Loss 0.6149 Epoch: 14 Global Step: 247600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:16:13,329-Speed 4419.83 samples/sec Loss 0.6216 Epoch: 14 Global Step: 247650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:16:25,008-Speed 4384.16 samples/sec Loss 0.6266 Epoch: 14 Global Step: 247700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:16:36,812-Speed 4337.85 samples/sec Loss 0.6207 Epoch: 14 Global Step: 247750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:16:49,732-Speed 3962.94 samples/sec Loss 0.6316 Epoch: 14 Global Step: 247800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:17:02,403-Speed 4041.08 samples/sec Loss 0.6291 Epoch: 14 Global Step: 247850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:17:14,137-Speed 4363.25 samples/sec Loss 0.6021 Epoch: 14 Global Step: 247900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:17:25,654-Speed 4446.10 samples/sec Loss 0.6210 Epoch: 14 Global Step: 247950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:17:37,427-Speed 4348.93 samples/sec Loss 0.6144 Epoch: 14 Global Step: 248000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:18:07,477-[lfw][248000]XNorm: 22.013938 Training: 2021-03-15 17:18:07,478-[lfw][248000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 17:18:07,478-[lfw][248000]Accuracy-Highest: 0.99833 Training: 2021-03-15 17:18:42,407-[cfp_fp][248000]XNorm: 21.805950 Training: 2021-03-15 17:18:42,408-[cfp_fp][248000]Accuracy-Flip: 0.99186+-0.00367 Training: 2021-03-15 17:18:42,408-[cfp_fp][248000]Accuracy-Highest: 0.99186 Training: 2021-03-15 17:19:12,499-[agedb_30][248000]XNorm: 22.700745 Training: 2021-03-15 17:19:12,499-[agedb_30][248000]Accuracy-Flip: 0.98233+-0.00772 Training: 2021-03-15 17:19:12,499-[agedb_30][248000]Accuracy-Highest: 0.98333 Training: 2021-03-15 17:19:24,033-Speed 480.27 samples/sec Loss 0.6266 Epoch: 14 Global Step: 248050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:19:36,172-Speed 4218.13 samples/sec Loss 0.6185 Epoch: 14 Global Step: 248100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:19:47,932-Speed 4353.77 samples/sec Loss 0.6203 Epoch: 14 Global Step: 248150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:19:59,555-Speed 4405.22 samples/sec Loss 0.6323 Epoch: 14 Global Step: 248200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:20:11,226-Speed 4387.30 samples/sec Loss 0.6056 Epoch: 14 Global Step: 248250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:20:22,980-Speed 4356.07 samples/sec Loss 0.6316 Epoch: 14 Global Step: 248300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:20:34,974-Speed 4268.85 samples/sec Loss 0.6169 Epoch: 14 Global Step: 248350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:20:46,687-Speed 4371.56 samples/sec Loss 0.6046 Epoch: 14 Global Step: 248400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:20:58,426-Speed 4361.58 samples/sec Loss 0.6080 Epoch: 14 Global Step: 248450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:21:10,840-Speed 4124.72 samples/sec Loss 0.6267 Epoch: 14 Global Step: 248500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:21:22,573-Speed 4363.78 samples/sec Loss 0.6230 Epoch: 14 Global Step: 248550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:21:34,447-Speed 4312.18 samples/sec Loss 0.6223 Epoch: 14 Global Step: 248600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:21:46,329-Speed 4309.11 samples/sec Loss 0.6215 Epoch: 14 Global Step: 248650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:21:57,813-Speed 4458.41 samples/sec Loss 0.6272 Epoch: 14 Global Step: 248700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:22:09,476-Speed 4390.35 samples/sec Loss 0.6134 Epoch: 14 Global Step: 248750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:22:21,230-Speed 4355.94 samples/sec Loss 0.6178 Epoch: 14 Global Step: 248800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:22:33,801-Speed 4073.05 samples/sec Loss 0.6284 Epoch: 14 Global Step: 248850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:22:45,393-Speed 4417.25 samples/sec Loss 0.6165 Epoch: 14 Global Step: 248900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:22:56,930-Speed 4438.17 samples/sec Loss 0.6178 Epoch: 14 Global Step: 248950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:23:08,648-Speed 4369.23 samples/sec Loss 0.6245 Epoch: 14 Global Step: 249000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:23:21,330-Speed 4037.50 samples/sec Loss 0.6109 Epoch: 14 Global Step: 249050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:23:33,094-Speed 4352.46 samples/sec Loss 0.6114 Epoch: 14 Global Step: 249100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:23:44,860-Speed 4351.85 samples/sec Loss 0.6339 Epoch: 14 Global Step: 249150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:23:56,527-Speed 4388.59 samples/sec Loss 0.6097 Epoch: 14 Global Step: 249200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:24:08,184-Speed 4392.23 samples/sec Loss 0.6136 Epoch: 14 Global Step: 249250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:24:20,036-Speed 4319.99 samples/sec Loss 0.6246 Epoch: 14 Global Step: 249300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:24:31,972-Speed 4289.82 samples/sec Loss 0.6159 Epoch: 14 Global Step: 249350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:24:44,682-Speed 4028.63 samples/sec Loss 0.6196 Epoch: 14 Global Step: 249400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:24:56,451-Speed 4350.31 samples/sec Loss 0.6303 Epoch: 14 Global Step: 249450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:25:08,303-Speed 4320.17 samples/sec Loss 0.6047 Epoch: 14 Global Step: 249500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:25:19,891-Speed 4418.55 samples/sec Loss 0.6280 Epoch: 14 Global Step: 249550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:25:31,568-Speed 4385.15 samples/sec Loss 0.6274 Epoch: 14 Global Step: 249600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:25:43,479-Speed 4298.59 samples/sec Loss 0.6237 Epoch: 14 Global Step: 249650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:25:56,094-Speed 4058.86 samples/sec Loss 0.6061 Epoch: 14 Global Step: 249700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:26:08,013-Speed 4295.73 samples/sec Loss 0.6335 Epoch: 14 Global Step: 249750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:26:19,869-Speed 4318.76 samples/sec Loss 0.6177 Epoch: 14 Global Step: 249800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:26:31,694-Speed 4329.77 samples/sec Loss 0.6236 Epoch: 14 Global Step: 249850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:26:43,461-Speed 4351.51 samples/sec Loss 0.6141 Epoch: 14 Global Step: 249900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:26:55,202-Speed 4360.70 samples/sec Loss 0.6126 Epoch: 14 Global Step: 249950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:27:07,862-Speed 4044.40 samples/sec Loss 0.6277 Epoch: 14 Global Step: 250000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:27:38,045-[lfw][250000]XNorm: 21.971225 Training: 2021-03-15 17:27:38,045-[lfw][250000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 17:27:38,045-[lfw][250000]Accuracy-Highest: 0.99833 Training: 2021-03-15 17:28:13,084-[cfp_fp][250000]XNorm: 21.872680 Training: 2021-03-15 17:28:13,085-[cfp_fp][250000]Accuracy-Flip: 0.99143+-0.00414 Training: 2021-03-15 17:28:13,085-[cfp_fp][250000]Accuracy-Highest: 0.99186 Training: 2021-03-15 17:28:43,384-[agedb_30][250000]XNorm: 22.718656 Training: 2021-03-15 17:28:43,385-[agedb_30][250000]Accuracy-Flip: 0.98317+-0.00705 Training: 2021-03-15 17:28:43,385-[agedb_30][250000]Accuracy-Highest: 0.98333 Training: 2021-03-15 17:28:55,049-Speed 477.67 samples/sec Loss 0.6088 Epoch: 14 Global Step: 250050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:29:07,038-Speed 4270.83 samples/sec Loss 0.6151 Epoch: 14 Global Step: 250100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:29:18,915-Speed 4311.00 samples/sec Loss 0.6294 Epoch: 14 Global Step: 250150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:29:30,658-Speed 4360.11 samples/sec Loss 0.6181 Epoch: 14 Global Step: 250200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:29:42,297-Speed 4399.38 samples/sec Loss 0.6161 Epoch: 14 Global Step: 250250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:29:54,076-Speed 4346.71 samples/sec Loss 0.6097 Epoch: 14 Global Step: 250300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:30:06,615-Speed 4083.41 samples/sec Loss 0.6259 Epoch: 14 Global Step: 250350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:30:32,350-Speed 1989.59 samples/sec Loss 0.6018 Epoch: 15 Global Step: 250400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:30:45,544-Speed 3880.60 samples/sec Loss 0.6000 Epoch: 15 Global Step: 250450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:30:57,478-Speed 4290.69 samples/sec Loss 0.5993 Epoch: 15 Global Step: 250500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:31:09,287-Speed 4336.03 samples/sec Loss 0.5992 Epoch: 15 Global Step: 250550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:31:20,924-Speed 4399.78 samples/sec Loss 0.6052 Epoch: 15 Global Step: 250600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:31:32,535-Speed 4410.04 samples/sec Loss 0.6117 Epoch: 15 Global Step: 250650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:31:44,315-Speed 4346.60 samples/sec Loss 0.6082 Epoch: 15 Global Step: 250700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:31:56,045-Speed 4365.05 samples/sec Loss 0.6082 Epoch: 15 Global Step: 250750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:32:08,010-Speed 4279.39 samples/sec Loss 0.6168 Epoch: 15 Global Step: 250800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:32:19,965-Speed 4282.85 samples/sec Loss 0.6032 Epoch: 15 Global Step: 250850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:32:31,746-Speed 4346.14 samples/sec Loss 0.6165 Epoch: 15 Global Step: 250900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:32:43,487-Speed 4360.90 samples/sec Loss 0.6076 Epoch: 15 Global Step: 250950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:32:56,006-Speed 4089.79 samples/sec Loss 0.5969 Epoch: 15 Global Step: 251000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:33:07,956-Speed 4284.60 samples/sec Loss 0.6078 Epoch: 15 Global Step: 251050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 17:33:19,666-Speed 4372.80 samples/sec Loss 0.6205 Epoch: 15 Global Step: 251100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:33:31,596-Speed 4291.64 samples/sec Loss 0.6044 Epoch: 15 Global Step: 251150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:33:43,750-Speed 4212.97 samples/sec Loss 0.5959 Epoch: 15 Global Step: 251200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:33:55,547-Speed 4340.36 samples/sec Loss 0.6057 Epoch: 15 Global Step: 251250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:34:07,480-Speed 4290.62 samples/sec Loss 0.6074 Epoch: 15 Global Step: 251300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:34:19,119-Speed 4399.10 samples/sec Loss 0.6066 Epoch: 15 Global Step: 251350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:34:30,841-Speed 4368.04 samples/sec Loss 0.6081 Epoch: 15 Global Step: 251400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:34:42,670-Speed 4328.59 samples/sec Loss 0.6103 Epoch: 15 Global Step: 251450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:34:54,568-Speed 4303.41 samples/sec Loss 0.6111 Epoch: 15 Global Step: 251500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:35:07,139-Speed 4072.91 samples/sec Loss 0.6105 Epoch: 15 Global Step: 251550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:35:19,874-Speed 4020.64 samples/sec Loss 0.6214 Epoch: 15 Global Step: 251600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:35:31,579-Speed 4374.56 samples/sec Loss 0.6007 Epoch: 15 Global Step: 251650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:35:43,220-Speed 4398.42 samples/sec Loss 0.6069 Epoch: 15 Global Step: 251700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:35:55,082-Speed 4316.18 samples/sec Loss 0.6074 Epoch: 15 Global Step: 251750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:36:06,889-Speed 4336.73 samples/sec Loss 0.6056 Epoch: 15 Global Step: 251800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:36:18,567-Speed 4384.58 samples/sec Loss 0.5987 Epoch: 15 Global Step: 251850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:36:30,361-Speed 4341.22 samples/sec Loss 0.6044 Epoch: 15 Global Step: 251900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:36:43,433-Speed 3916.99 samples/sec Loss 0.6083 Epoch: 15 Global Step: 251950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:36:55,298-Speed 4315.33 samples/sec Loss 0.5995 Epoch: 15 Global Step: 252000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:37:25,569-[lfw][252000]XNorm: 21.909586 Training: 2021-03-15 17:37:25,570-[lfw][252000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 17:37:25,570-[lfw][252000]Accuracy-Highest: 0.99833 Training: 2021-03-15 17:38:00,631-[cfp_fp][252000]XNorm: 21.836403 Training: 2021-03-15 17:38:00,631-[cfp_fp][252000]Accuracy-Flip: 0.99171+-0.00398 Training: 2021-03-15 17:38:00,631-[cfp_fp][252000]Accuracy-Highest: 0.99186 Training: 2021-03-15 17:38:30,785-[agedb_30][252000]XNorm: 22.626034 Training: 2021-03-15 17:38:30,785-[agedb_30][252000]Accuracy-Flip: 0.98367+-0.00702 Training: 2021-03-15 17:38:30,785-[agedb_30][252000]Accuracy-Highest: 0.98367 Training: 2021-03-15 17:38:42,397-Speed 478.07 samples/sec Loss 0.5933 Epoch: 15 Global Step: 252050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:38:54,108-Speed 4371.97 samples/sec Loss 0.6110 Epoch: 15 Global Step: 252100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:39:05,814-Speed 4373.99 samples/sec Loss 0.6133 Epoch: 15 Global Step: 252150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:39:17,798-Speed 4272.62 samples/sec Loss 0.5969 Epoch: 15 Global Step: 252200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:39:29,572-Speed 4348.85 samples/sec Loss 0.6162 Epoch: 15 Global Step: 252250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:39:41,503-Speed 4291.49 samples/sec Loss 0.5974 Epoch: 15 Global Step: 252300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:39:54,141-Speed 4051.47 samples/sec Loss 0.6077 Epoch: 15 Global Step: 252350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:40:05,853-Speed 4371.61 samples/sec Loss 0.5823 Epoch: 15 Global Step: 252400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:40:17,594-Speed 4360.90 samples/sec Loss 0.5964 Epoch: 15 Global Step: 252450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:40:29,298-Speed 4374.73 samples/sec Loss 0.5979 Epoch: 15 Global Step: 252500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:40:41,055-Speed 4355.10 samples/sec Loss 0.6068 Epoch: 15 Global Step: 252550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:40:52,848-Speed 4341.64 samples/sec Loss 0.6013 Epoch: 15 Global Step: 252600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:41:04,476-Speed 4403.30 samples/sec Loss 0.6075 Epoch: 15 Global Step: 252650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:41:17,267-Speed 4003.11 samples/sec Loss 0.6027 Epoch: 15 Global Step: 252700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:41:28,968-Speed 4376.02 samples/sec Loss 0.6105 Epoch: 15 Global Step: 252750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:41:41,453-Speed 4100.81 samples/sec Loss 0.6160 Epoch: 15 Global Step: 252800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:41:53,385-Speed 4291.19 samples/sec Loss 0.6055 Epoch: 15 Global Step: 252850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:42:05,245-Speed 4317.39 samples/sec Loss 0.6119 Epoch: 15 Global Step: 252900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:42:16,990-Speed 4359.54 samples/sec Loss 0.6029 Epoch: 15 Global Step: 252950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:42:29,011-Speed 4259.17 samples/sec Loss 0.6061 Epoch: 15 Global Step: 253000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:42:40,643-Speed 4401.92 samples/sec Loss 0.6083 Epoch: 15 Global Step: 253050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:42:52,512-Speed 4313.90 samples/sec Loss 0.5981 Epoch: 15 Global Step: 253100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:43:05,100-Speed 4067.58 samples/sec Loss 0.5976 Epoch: 15 Global Step: 253150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:43:16,834-Speed 4363.57 samples/sec Loss 0.6062 Epoch: 15 Global Step: 253200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:43:28,506-Speed 4386.78 samples/sec Loss 0.5992 Epoch: 15 Global Step: 253250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:43:40,365-Speed 4317.26 samples/sec Loss 0.6191 Epoch: 15 Global Step: 253300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:43:51,965-Speed 4414.08 samples/sec Loss 0.5922 Epoch: 15 Global Step: 253350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:44:03,904-Speed 4288.74 samples/sec Loss 0.6069 Epoch: 15 Global Step: 253400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:44:16,549-Speed 4049.05 samples/sec Loss 0.5993 Epoch: 15 Global Step: 253450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:44:28,370-Speed 4331.51 samples/sec Loss 0.6077 Epoch: 15 Global Step: 253500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:44:40,081-Speed 4372.22 samples/sec Loss 0.6013 Epoch: 15 Global Step: 253550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:44:51,748-Speed 4388.60 samples/sec Loss 0.6067 Epoch: 15 Global Step: 253600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:45:03,629-Speed 4309.70 samples/sec Loss 0.6045 Epoch: 15 Global Step: 253650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:45:15,600-Speed 4277.02 samples/sec Loss 0.6124 Epoch: 15 Global Step: 253700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:45:27,265-Speed 4389.52 samples/sec Loss 0.6167 Epoch: 15 Global Step: 253750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:45:39,061-Speed 4340.46 samples/sec Loss 0.5880 Epoch: 15 Global Step: 253800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:45:50,875-Speed 4334.02 samples/sec Loss 0.5910 Epoch: 15 Global Step: 253850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:46:02,615-Speed 4361.19 samples/sec Loss 0.6066 Epoch: 15 Global Step: 253900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:46:14,339-Speed 4367.32 samples/sec Loss 0.6000 Epoch: 15 Global Step: 253950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:46:25,827-Speed 4456.93 samples/sec Loss 0.6054 Epoch: 15 Global Step: 254000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:46:56,138-[lfw][254000]XNorm: 22.060943 Training: 2021-03-15 17:46:56,138-[lfw][254000]Accuracy-Flip: 0.99783+-0.00259 Training: 2021-03-15 17:46:56,138-[lfw][254000]Accuracy-Highest: 0.99833 Training: 2021-03-15 17:47:31,278-[cfp_fp][254000]XNorm: 21.873400 Training: 2021-03-15 17:47:31,278-[cfp_fp][254000]Accuracy-Flip: 0.99157+-0.00386 Training: 2021-03-15 17:47:31,278-[cfp_fp][254000]Accuracy-Highest: 0.99186 Training: 2021-03-15 17:48:01,551-[agedb_30][254000]XNorm: 22.696296 Training: 2021-03-15 17:48:01,551-[agedb_30][254000]Accuracy-Flip: 0.98300+-0.00706 Training: 2021-03-15 17:48:01,551-[agedb_30][254000]Accuracy-Highest: 0.98367 Training: 2021-03-15 17:48:13,384-Speed 476.03 samples/sec Loss 0.6043 Epoch: 15 Global Step: 254050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:48:24,977-Speed 4416.94 samples/sec Loss 0.6066 Epoch: 15 Global Step: 254100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:48:37,951-Speed 3946.51 samples/sec Loss 0.6070 Epoch: 15 Global Step: 254150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:48:49,629-Speed 4384.53 samples/sec Loss 0.6164 Epoch: 15 Global Step: 254200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:49:01,222-Speed 4416.66 samples/sec Loss 0.6126 Epoch: 15 Global Step: 254250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:49:14,371-Speed 3893.88 samples/sec Loss 0.6126 Epoch: 15 Global Step: 254300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:49:26,054-Speed 4382.56 samples/sec Loss 0.5983 Epoch: 15 Global Step: 254350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:49:37,641-Speed 4419.09 samples/sec Loss 0.6051 Epoch: 15 Global Step: 254400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:49:50,404-Speed 4011.52 samples/sec Loss 0.6045 Epoch: 15 Global Step: 254450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:50:02,004-Speed 4414.01 samples/sec Loss 0.6029 Epoch: 15 Global Step: 254500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:50:13,952-Speed 4285.42 samples/sec Loss 0.6169 Epoch: 15 Global Step: 254550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:50:25,882-Speed 4291.85 samples/sec Loss 0.6020 Epoch: 15 Global Step: 254600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:50:37,558-Speed 4385.44 samples/sec Loss 0.5959 Epoch: 15 Global Step: 254650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:50:49,499-Speed 4287.83 samples/sec Loss 0.6093 Epoch: 15 Global Step: 254700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:51:01,186-Speed 4381.12 samples/sec Loss 0.6064 Epoch: 15 Global Step: 254750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:51:12,965-Speed 4346.89 samples/sec Loss 0.5939 Epoch: 15 Global Step: 254800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:51:24,497-Speed 4439.92 samples/sec Loss 0.5959 Epoch: 15 Global Step: 254850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:51:36,297-Speed 4339.02 samples/sec Loss 0.6078 Epoch: 15 Global Step: 254900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:51:48,028-Speed 4364.70 samples/sec Loss 0.6077 Epoch: 15 Global Step: 254950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:51:59,828-Speed 4339.43 samples/sec Loss 0.6034 Epoch: 15 Global Step: 255000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:52:11,434-Speed 4411.70 samples/sec Loss 0.6083 Epoch: 15 Global Step: 255050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:52:23,804-Speed 4139.10 samples/sec Loss 0.5875 Epoch: 15 Global Step: 255100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:52:35,267-Speed 4466.56 samples/sec Loss 0.5864 Epoch: 15 Global Step: 255150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:52:46,897-Speed 4402.52 samples/sec Loss 0.6048 Epoch: 15 Global Step: 255200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:52:59,362-Speed 4107.75 samples/sec Loss 0.5921 Epoch: 15 Global Step: 255250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:53:11,170-Speed 4336.22 samples/sec Loss 0.5951 Epoch: 15 Global Step: 255300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:53:22,982-Speed 4334.88 samples/sec Loss 0.6071 Epoch: 15 Global Step: 255350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:53:34,773-Speed 4342.50 samples/sec Loss 0.6006 Epoch: 15 Global Step: 255400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:53:47,576-Speed 3999.02 samples/sec Loss 0.6126 Epoch: 15 Global Step: 255450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:53:59,332-Speed 4355.46 samples/sec Loss 0.6001 Epoch: 15 Global Step: 255500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:54:11,117-Speed 4344.65 samples/sec Loss 0.5951 Epoch: 15 Global Step: 255550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:54:22,896-Speed 4347.09 samples/sec Loss 0.6074 Epoch: 15 Global Step: 255600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:54:34,735-Speed 4324.81 samples/sec Loss 0.5985 Epoch: 15 Global Step: 255650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:54:46,606-Speed 4313.23 samples/sec Loss 0.6091 Epoch: 15 Global Step: 255700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:54:58,289-Speed 4382.42 samples/sec Loss 0.6068 Epoch: 15 Global Step: 255750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:55:10,117-Speed 4328.90 samples/sec Loss 0.6054 Epoch: 15 Global Step: 255800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:55:22,866-Speed 4016.34 samples/sec Loss 0.5996 Epoch: 15 Global Step: 255850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:55:34,524-Speed 4391.99 samples/sec Loss 0.6108 Epoch: 15 Global Step: 255900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:55:46,972-Speed 4113.27 samples/sec Loss 0.5988 Epoch: 15 Global Step: 255950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:55:58,832-Speed 4316.96 samples/sec Loss 0.6038 Epoch: 15 Global Step: 256000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:56:29,095-[lfw][256000]XNorm: 22.111687 Training: 2021-03-15 17:56:29,096-[lfw][256000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 17:56:29,096-[lfw][256000]Accuracy-Highest: 0.99833 Training: 2021-03-15 17:57:04,264-[cfp_fp][256000]XNorm: 21.997169 Training: 2021-03-15 17:57:04,264-[cfp_fp][256000]Accuracy-Flip: 0.99200+-0.00379 Training: 2021-03-15 17:57:04,264-[cfp_fp][256000]Accuracy-Highest: 0.99200 Training: 2021-03-15 17:57:34,573-[agedb_30][256000]XNorm: 22.826705 Training: 2021-03-15 17:57:34,573-[agedb_30][256000]Accuracy-Flip: 0.98350+-0.00721 Training: 2021-03-15 17:57:34,573-[agedb_30][256000]Accuracy-Highest: 0.98367 Training: 2021-03-15 17:57:46,284-Speed 476.49 samples/sec Loss 0.6270 Epoch: 15 Global Step: 256050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:57:58,050-Speed 4351.82 samples/sec Loss 0.5950 Epoch: 15 Global Step: 256100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:58:09,769-Speed 4368.89 samples/sec Loss 0.6018 Epoch: 15 Global Step: 256150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:58:21,820-Speed 4248.80 samples/sec Loss 0.6140 Epoch: 15 Global Step: 256200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:58:33,746-Speed 4293.25 samples/sec Loss 0.6090 Epoch: 15 Global Step: 256250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:58:45,591-Speed 4322.88 samples/sec Loss 0.6195 Epoch: 15 Global Step: 256300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:58:57,512-Speed 4295.00 samples/sec Loss 0.5885 Epoch: 15 Global Step: 256350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:59:09,252-Speed 4361.34 samples/sec Loss 0.6085 Epoch: 15 Global Step: 256400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:59:20,820-Speed 4426.22 samples/sec Loss 0.6049 Epoch: 15 Global Step: 256450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:59:32,490-Speed 4387.54 samples/sec Loss 0.5968 Epoch: 15 Global Step: 256500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:59:44,371-Speed 4309.60 samples/sec Loss 0.5958 Epoch: 15 Global Step: 256550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 17:59:56,142-Speed 4349.86 samples/sec Loss 0.6056 Epoch: 15 Global Step: 256600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:00:07,846-Speed 4374.77 samples/sec Loss 0.6061 Epoch: 15 Global Step: 256650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:00:19,594-Speed 4358.12 samples/sec Loss 0.6097 Epoch: 15 Global Step: 256700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:00:32,522-Speed 3960.55 samples/sec Loss 0.5967 Epoch: 15 Global Step: 256750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:00:44,314-Speed 4342.39 samples/sec Loss 0.6019 Epoch: 15 Global Step: 256800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:00:56,129-Speed 4333.39 samples/sec Loss 0.6009 Epoch: 15 Global Step: 256850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:01:08,834-Speed 4030.17 samples/sec Loss 0.5976 Epoch: 15 Global Step: 256900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:01:21,230-Speed 4130.65 samples/sec Loss 0.6046 Epoch: 15 Global Step: 256950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:01:33,117-Speed 4307.28 samples/sec Loss 0.6069 Epoch: 15 Global Step: 257000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:01:44,683-Speed 4427.02 samples/sec Loss 0.6006 Epoch: 15 Global Step: 257050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:01:56,240-Speed 4430.45 samples/sec Loss 0.5963 Epoch: 15 Global Step: 257100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:02:08,085-Speed 4322.71 samples/sec Loss 0.6007 Epoch: 15 Global Step: 257150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:02:19,783-Speed 4376.89 samples/sec Loss 0.5998 Epoch: 15 Global Step: 257200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:02:31,326-Speed 4435.77 samples/sec Loss 0.6106 Epoch: 15 Global Step: 257250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:02:43,055-Speed 4365.62 samples/sec Loss 0.6016 Epoch: 15 Global Step: 257300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:02:54,742-Speed 4380.99 samples/sec Loss 0.6032 Epoch: 15 Global Step: 257350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:03:06,431-Speed 4380.39 samples/sec Loss 0.6011 Epoch: 15 Global Step: 257400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:03:18,119-Speed 4380.56 samples/sec Loss 0.5980 Epoch: 15 Global Step: 257450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:03:29,820-Speed 4375.86 samples/sec Loss 0.6089 Epoch: 15 Global Step: 257500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:03:41,514-Speed 4378.65 samples/sec Loss 0.6005 Epoch: 15 Global Step: 257550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:03:53,256-Speed 4360.52 samples/sec Loss 0.6029 Epoch: 15 Global Step: 257600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:04:05,881-Speed 4055.56 samples/sec Loss 0.5978 Epoch: 15 Global Step: 257650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:04:17,792-Speed 4298.96 samples/sec Loss 0.5943 Epoch: 15 Global Step: 257700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:04:29,770-Speed 4274.36 samples/sec Loss 0.6009 Epoch: 15 Global Step: 257750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:04:42,476-Speed 4029.83 samples/sec Loss 0.6140 Epoch: 15 Global Step: 257800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:04:54,209-Speed 4364.18 samples/sec Loss 0.6121 Epoch: 15 Global Step: 257850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:05:06,078-Speed 4313.76 samples/sec Loss 0.6012 Epoch: 15 Global Step: 257900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:05:18,123-Speed 4250.77 samples/sec Loss 0.5961 Epoch: 15 Global Step: 257950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:05:29,959-Speed 4326.08 samples/sec Loss 0.6101 Epoch: 15 Global Step: 258000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:06:00,341-[lfw][258000]XNorm: 21.889838 Training: 2021-03-15 18:06:00,342-[lfw][258000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 18:06:00,342-[lfw][258000]Accuracy-Highest: 0.99833 Training: 2021-03-15 18:06:35,412-[cfp_fp][258000]XNorm: 21.826863 Training: 2021-03-15 18:06:35,413-[cfp_fp][258000]Accuracy-Flip: 0.99143+-0.00414 Training: 2021-03-15 18:06:35,413-[cfp_fp][258000]Accuracy-Highest: 0.99200 Training: 2021-03-15 18:07:05,586-[agedb_30][258000]XNorm: 22.622699 Training: 2021-03-15 18:07:05,586-[agedb_30][258000]Accuracy-Flip: 0.98200+-0.00722 Training: 2021-03-15 18:07:05,586-[agedb_30][258000]Accuracy-Highest: 0.98367 Training: 2021-03-15 18:07:17,296-Speed 477.00 samples/sec Loss 0.5980 Epoch: 15 Global Step: 258050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:07:28,973-Speed 4384.95 samples/sec Loss 0.6098 Epoch: 15 Global Step: 258100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:07:41,675-Speed 4031.01 samples/sec Loss 0.6090 Epoch: 15 Global Step: 258150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:07:53,393-Speed 4369.42 samples/sec Loss 0.5939 Epoch: 15 Global Step: 258200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:08:05,209-Speed 4333.29 samples/sec Loss 0.6137 Epoch: 15 Global Step: 258250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:08:16,944-Speed 4363.40 samples/sec Loss 0.5911 Epoch: 15 Global Step: 258300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:08:28,570-Speed 4403.84 samples/sec Loss 0.6012 Epoch: 15 Global Step: 258350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:08:40,299-Speed 4365.46 samples/sec Loss 0.6097 Epoch: 15 Global Step: 258400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:08:53,651-Speed 3834.73 samples/sec Loss 0.5973 Epoch: 15 Global Step: 258450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:09:05,305-Speed 4393.57 samples/sec Loss 0.6067 Epoch: 15 Global Step: 258500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:09:16,906-Speed 4413.74 samples/sec Loss 0.6037 Epoch: 15 Global Step: 258550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:09:28,518-Speed 4409.40 samples/sec Loss 0.6080 Epoch: 15 Global Step: 258600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:09:40,096-Speed 4422.08 samples/sec Loss 0.5964 Epoch: 15 Global Step: 258650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:09:51,749-Speed 4393.87 samples/sec Loss 0.6042 Epoch: 15 Global Step: 258700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:10:03,396-Speed 4396.35 samples/sec Loss 0.6093 Epoch: 15 Global Step: 258750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:10:15,205-Speed 4335.90 samples/sec Loss 0.5992 Epoch: 15 Global Step: 258800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:10:27,049-Speed 4323.00 samples/sec Loss 0.5935 Epoch: 15 Global Step: 258850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:10:38,887-Speed 4325.32 samples/sec Loss 0.6229 Epoch: 15 Global Step: 258900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:10:50,396-Speed 4448.67 samples/sec Loss 0.5968 Epoch: 15 Global Step: 258950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:11:02,079-Speed 4382.61 samples/sec Loss 0.5890 Epoch: 15 Global Step: 259000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:11:13,695-Speed 4407.88 samples/sec Loss 0.6008 Epoch: 15 Global Step: 259050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:11:25,465-Speed 4350.46 samples/sec Loss 0.6026 Epoch: 15 Global Step: 259100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:11:37,280-Speed 4333.43 samples/sec Loss 0.5983 Epoch: 15 Global Step: 259150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:11:49,092-Speed 4334.96 samples/sec Loss 0.6113 Epoch: 15 Global Step: 259200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:12:00,770-Speed 4384.29 samples/sec Loss 0.5914 Epoch: 15 Global Step: 259250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:12:12,504-Speed 4363.46 samples/sec Loss 0.6171 Epoch: 15 Global Step: 259300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:12:24,147-Speed 4398.01 samples/sec Loss 0.5900 Epoch: 15 Global Step: 259350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:12:36,697-Speed 4079.54 samples/sec Loss 0.5967 Epoch: 15 Global Step: 259400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:12:49,332-Speed 4052.47 samples/sec Loss 0.6011 Epoch: 15 Global Step: 259450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:13:01,169-Speed 4325.50 samples/sec Loss 0.5985 Epoch: 15 Global Step: 259500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:13:12,911-Speed 4360.78 samples/sec Loss 0.5965 Epoch: 15 Global Step: 259550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:13:25,532-Speed 4056.89 samples/sec Loss 0.6063 Epoch: 15 Global Step: 259600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:13:37,529-Speed 4267.83 samples/sec Loss 0.5945 Epoch: 15 Global Step: 259650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:13:49,185-Speed 4392.79 samples/sec Loss 0.5876 Epoch: 15 Global Step: 259700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:14:00,893-Speed 4373.30 samples/sec Loss 0.6080 Epoch: 15 Global Step: 259750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:14:12,678-Speed 4344.56 samples/sec Loss 0.5937 Epoch: 15 Global Step: 259800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:14:24,419-Speed 4360.89 samples/sec Loss 0.6105 Epoch: 15 Global Step: 259850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:14:36,251-Speed 4327.67 samples/sec Loss 0.5980 Epoch: 15 Global Step: 259900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:14:48,099-Speed 4321.41 samples/sec Loss 0.6067 Epoch: 15 Global Step: 259950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:14:59,726-Speed 4403.74 samples/sec Loss 0.5989 Epoch: 15 Global Step: 260000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:15:29,935-[lfw][260000]XNorm: 21.842881 Training: 2021-03-15 18:15:29,936-[lfw][260000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 18:15:29,936-[lfw][260000]Accuracy-Highest: 0.99833 Training: 2021-03-15 18:16:05,165-[cfp_fp][260000]XNorm: 21.767352 Training: 2021-03-15 18:16:05,166-[cfp_fp][260000]Accuracy-Flip: 0.99114+-0.00388 Training: 2021-03-15 18:16:05,166-[cfp_fp][260000]Accuracy-Highest: 0.99200 Training: 2021-03-15 18:16:35,570-[agedb_30][260000]XNorm: 22.553079 Training: 2021-03-15 18:16:35,571-[agedb_30][260000]Accuracy-Flip: 0.98283+-0.00764 Training: 2021-03-15 18:16:35,571-[agedb_30][260000]Accuracy-Highest: 0.98367 Training: 2021-03-15 18:16:48,366-Speed 471.28 samples/sec Loss 0.6040 Epoch: 15 Global Step: 260050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:17:00,098-Speed 4364.21 samples/sec Loss 0.6015 Epoch: 15 Global Step: 260100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:17:11,877-Speed 4347.13 samples/sec Loss 0.5936 Epoch: 15 Global Step: 260150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:17:23,518-Speed 4398.20 samples/sec Loss 0.6057 Epoch: 15 Global Step: 260200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:17:35,093-Speed 4423.72 samples/sec Loss 0.5903 Epoch: 15 Global Step: 260250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:17:46,831-Speed 4361.93 samples/sec Loss 0.6114 Epoch: 15 Global Step: 260300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:17:58,599-Speed 4350.94 samples/sec Loss 0.6105 Epoch: 15 Global Step: 260350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:18:10,205-Speed 4411.69 samples/sec Loss 0.6062 Epoch: 15 Global Step: 260400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:18:21,994-Speed 4343.11 samples/sec Loss 0.5997 Epoch: 15 Global Step: 260450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:18:33,970-Speed 4275.40 samples/sec Loss 0.6121 Epoch: 15 Global Step: 260500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:18:46,669-Speed 4031.95 samples/sec Loss 0.5929 Epoch: 15 Global Step: 260550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:18:58,486-Speed 4332.93 samples/sec Loss 0.6088 Epoch: 15 Global Step: 260600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:19:10,410-Speed 4294.27 samples/sec Loss 0.5997 Epoch: 15 Global Step: 260650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:19:22,175-Speed 4351.91 samples/sec Loss 0.6008 Epoch: 15 Global Step: 260700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:19:33,954-Speed 4346.75 samples/sec Loss 0.5986 Epoch: 15 Global Step: 260750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:19:45,735-Speed 4346.34 samples/sec Loss 0.5925 Epoch: 15 Global Step: 260800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:19:59,216-Speed 3798.16 samples/sec Loss 0.6134 Epoch: 15 Global Step: 260850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:20:10,941-Speed 4366.84 samples/sec Loss 0.6243 Epoch: 15 Global Step: 260900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:20:22,723-Speed 4345.65 samples/sec Loss 0.5955 Epoch: 15 Global Step: 260950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:20:34,427-Speed 4374.63 samples/sec Loss 0.5901 Epoch: 15 Global Step: 261000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:20:46,245-Speed 4332.70 samples/sec Loss 0.6044 Epoch: 15 Global Step: 261050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:20:57,914-Speed 4387.92 samples/sec Loss 0.6086 Epoch: 15 Global Step: 261100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:21:10,511-Speed 4064.62 samples/sec Loss 0.5850 Epoch: 15 Global Step: 261150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:21:22,407-Speed 4303.99 samples/sec Loss 0.6044 Epoch: 15 Global Step: 261200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:21:34,031-Speed 4405.00 samples/sec Loss 0.6025 Epoch: 15 Global Step: 261250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:21:45,808-Speed 4347.77 samples/sec Loss 0.6076 Epoch: 15 Global Step: 261300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:21:57,538-Speed 4364.72 samples/sec Loss 0.5978 Epoch: 15 Global Step: 261350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:22:09,425-Speed 4307.43 samples/sec Loss 0.6043 Epoch: 15 Global Step: 261400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:22:21,047-Speed 4405.67 samples/sec Loss 0.6055 Epoch: 15 Global Step: 261450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:22:32,816-Speed 4350.48 samples/sec Loss 0.5910 Epoch: 15 Global Step: 261500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:22:44,595-Speed 4347.08 samples/sec Loss 0.6015 Epoch: 15 Global Step: 261550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:22:56,429-Speed 4326.74 samples/sec Loss 0.6135 Epoch: 15 Global Step: 261600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:23:08,230-Speed 4338.79 samples/sec Loss 0.5943 Epoch: 15 Global Step: 261650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:23:19,980-Speed 4357.43 samples/sec Loss 0.6192 Epoch: 15 Global Step: 261700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:23:31,658-Speed 4384.69 samples/sec Loss 0.6108 Epoch: 15 Global Step: 261750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:23:43,372-Speed 4371.08 samples/sec Loss 0.5894 Epoch: 15 Global Step: 261800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:23:55,776-Speed 4127.76 samples/sec Loss 0.6012 Epoch: 15 Global Step: 261850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:24:07,559-Speed 4345.52 samples/sec Loss 0.6001 Epoch: 15 Global Step: 261900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:24:19,214-Speed 4392.87 samples/sec Loss 0.5893 Epoch: 15 Global Step: 261950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:24:31,002-Speed 4343.69 samples/sec Loss 0.6025 Epoch: 15 Global Step: 262000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:25:01,359-[lfw][262000]XNorm: 22.004014 Training: 2021-03-15 18:25:01,360-[lfw][262000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 18:25:01,360-[lfw][262000]Accuracy-Highest: 0.99833 Training: 2021-03-15 18:25:36,594-[cfp_fp][262000]XNorm: 21.957093 Training: 2021-03-15 18:25:36,594-[cfp_fp][262000]Accuracy-Flip: 0.99143+-0.00399 Training: 2021-03-15 18:25:36,595-[cfp_fp][262000]Accuracy-Highest: 0.99200 Training: 2021-03-15 18:26:06,915-[agedb_30][262000]XNorm: 22.745923 Training: 2021-03-15 18:26:06,915-[agedb_30][262000]Accuracy-Flip: 0.98300+-0.00763 Training: 2021-03-15 18:26:06,915-[agedb_30][262000]Accuracy-Highest: 0.98367 Training: 2021-03-15 18:26:18,598-Speed 475.86 samples/sec Loss 0.6119 Epoch: 15 Global Step: 262050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:26:31,262-Speed 4043.20 samples/sec Loss 0.5952 Epoch: 15 Global Step: 262100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:26:43,069-Speed 4336.74 samples/sec Loss 0.5966 Epoch: 15 Global Step: 262150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:26:54,809-Speed 4361.02 samples/sec Loss 0.5935 Epoch: 15 Global Step: 262200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:27:06,529-Speed 4368.78 samples/sec Loss 0.5908 Epoch: 15 Global Step: 262250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:27:19,150-Speed 4056.89 samples/sec Loss 0.6065 Epoch: 15 Global Step: 262300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:27:30,911-Speed 4353.59 samples/sec Loss 0.5999 Epoch: 15 Global Step: 262350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:27:42,527-Speed 4407.81 samples/sec Loss 0.6090 Epoch: 15 Global Step: 262400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:27:55,038-Speed 4092.68 samples/sec Loss 0.6014 Epoch: 15 Global Step: 262450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:28:06,653-Speed 4408.44 samples/sec Loss 0.6096 Epoch: 15 Global Step: 262500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:28:18,263-Speed 4409.82 samples/sec Loss 0.6013 Epoch: 15 Global Step: 262550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:28:30,106-Speed 4323.71 samples/sec Loss 0.6055 Epoch: 15 Global Step: 262600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:28:41,860-Speed 4355.95 samples/sec Loss 0.5977 Epoch: 15 Global Step: 262650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:28:53,483-Speed 4405.37 samples/sec Loss 0.6042 Epoch: 15 Global Step: 262700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:29:05,155-Speed 4386.63 samples/sec Loss 0.6096 Epoch: 15 Global Step: 262750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:29:16,686-Speed 4440.38 samples/sec Loss 0.6030 Epoch: 15 Global Step: 262800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:29:28,400-Speed 4371.13 samples/sec Loss 0.6017 Epoch: 15 Global Step: 262850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:29:40,463-Speed 4244.40 samples/sec Loss 0.6078 Epoch: 15 Global Step: 262900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:29:52,206-Speed 4360.20 samples/sec Loss 0.6103 Epoch: 15 Global Step: 262950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:30:04,130-Speed 4293.94 samples/sec Loss 0.5886 Epoch: 15 Global Step: 263000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:30:15,972-Speed 4323.86 samples/sec Loss 0.6035 Epoch: 15 Global Step: 263050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:30:27,520-Speed 4433.99 samples/sec Loss 0.6075 Epoch: 15 Global Step: 263100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:30:39,243-Speed 4367.73 samples/sec Loss 0.6074 Epoch: 15 Global Step: 263150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:30:51,867-Speed 4055.89 samples/sec Loss 0.6069 Epoch: 15 Global Step: 263200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:31:03,618-Speed 4357.13 samples/sec Loss 0.5941 Epoch: 15 Global Step: 263250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:31:15,310-Speed 4379.37 samples/sec Loss 0.6165 Epoch: 15 Global Step: 263300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:31:27,869-Speed 4076.92 samples/sec Loss 0.5957 Epoch: 15 Global Step: 263350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:31:39,406-Speed 4437.97 samples/sec Loss 0.5995 Epoch: 15 Global Step: 263400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:31:51,090-Speed 4382.27 samples/sec Loss 0.5997 Epoch: 15 Global Step: 263450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:32:02,864-Speed 4348.78 samples/sec Loss 0.6035 Epoch: 15 Global Step: 263500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:32:15,710-Speed 3985.84 samples/sec Loss 0.6156 Epoch: 15 Global Step: 263550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:32:27,378-Speed 4387.95 samples/sec Loss 0.6004 Epoch: 15 Global Step: 263600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:32:38,927-Speed 4433.50 samples/sec Loss 0.6041 Epoch: 15 Global Step: 263650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:32:50,458-Speed 4440.42 samples/sec Loss 0.6060 Epoch: 15 Global Step: 263700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:33:02,229-Speed 4350.04 samples/sec Loss 0.6112 Epoch: 15 Global Step: 263750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 18:33:14,869-Speed 4050.73 samples/sec Loss 0.6049 Epoch: 15 Global Step: 263800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:33:26,636-Speed 4351.27 samples/sec Loss 0.5919 Epoch: 15 Global Step: 263850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:33:38,352-Speed 4370.17 samples/sec Loss 0.5940 Epoch: 15 Global Step: 263900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:33:50,158-Speed 4337.08 samples/sec Loss 0.5954 Epoch: 15 Global Step: 263950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:34:01,877-Speed 4368.90 samples/sec Loss 0.5974 Epoch: 15 Global Step: 264000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:34:32,196-[lfw][264000]XNorm: 22.054531 Training: 2021-03-15 18:34:32,197-[lfw][264000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 18:34:32,197-[lfw][264000]Accuracy-Highest: 0.99833 Training: 2021-03-15 18:35:07,389-[cfp_fp][264000]XNorm: 21.959084 Training: 2021-03-15 18:35:07,389-[cfp_fp][264000]Accuracy-Flip: 0.99171+-0.00413 Training: 2021-03-15 18:35:07,389-[cfp_fp][264000]Accuracy-Highest: 0.99200 Training: 2021-03-15 18:35:37,713-[agedb_30][264000]XNorm: 22.773952 Training: 2021-03-15 18:35:37,713-[agedb_30][264000]Accuracy-Flip: 0.98283+-0.00738 Training: 2021-03-15 18:35:37,713-[agedb_30][264000]Accuracy-Highest: 0.98367 Training: 2021-03-15 18:35:49,369-Speed 476.32 samples/sec Loss 0.6023 Epoch: 15 Global Step: 264050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:36:01,231-Speed 4316.33 samples/sec Loss 0.5981 Epoch: 15 Global Step: 264100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:36:13,119-Speed 4307.14 samples/sec Loss 0.5956 Epoch: 15 Global Step: 264150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:36:24,811-Speed 4379.19 samples/sec Loss 0.5938 Epoch: 15 Global Step: 264200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:36:37,403-Speed 4066.37 samples/sec Loss 0.6094 Epoch: 15 Global Step: 264250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:36:49,097-Speed 4378.46 samples/sec Loss 0.6088 Epoch: 15 Global Step: 264300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:37:00,691-Speed 4416.01 samples/sec Loss 0.6191 Epoch: 15 Global Step: 264350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:37:12,349-Speed 4392.20 samples/sec Loss 0.5942 Epoch: 15 Global Step: 264400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:37:24,261-Speed 4298.31 samples/sec Loss 0.6031 Epoch: 15 Global Step: 264450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:37:35,640-Speed 4499.57 samples/sec Loss 0.6071 Epoch: 15 Global Step: 264500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:37:47,524-Speed 4308.52 samples/sec Loss 0.6057 Epoch: 15 Global Step: 264550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:37:59,173-Speed 4395.45 samples/sec Loss 0.5977 Epoch: 15 Global Step: 264600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:38:10,850-Speed 4384.76 samples/sec Loss 0.6039 Epoch: 15 Global Step: 264650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:38:22,741-Speed 4306.17 samples/sec Loss 0.5896 Epoch: 15 Global Step: 264700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:38:35,305-Speed 4075.08 samples/sec Loss 0.6105 Epoch: 15 Global Step: 264750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:38:47,107-Speed 4338.49 samples/sec Loss 0.5997 Epoch: 15 Global Step: 264800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:38:59,520-Speed 4124.86 samples/sec Loss 0.5961 Epoch: 15 Global Step: 264850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:39:11,487-Speed 4278.52 samples/sec Loss 0.6110 Epoch: 15 Global Step: 264900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:39:23,963-Speed 4104.11 samples/sec Loss 0.5982 Epoch: 15 Global Step: 264950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:39:35,866-Speed 4301.48 samples/sec Loss 0.5996 Epoch: 15 Global Step: 265000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:39:47,758-Speed 4305.56 samples/sec Loss 0.6059 Epoch: 15 Global Step: 265050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:39:59,552-Speed 4341.43 samples/sec Loss 0.6073 Epoch: 15 Global Step: 265100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:40:11,283-Speed 4364.79 samples/sec Loss 0.5976 Epoch: 15 Global Step: 265150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:40:23,304-Speed 4259.39 samples/sec Loss 0.6091 Epoch: 15 Global Step: 265200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:40:35,088-Speed 4345.18 samples/sec Loss 0.5940 Epoch: 15 Global Step: 265250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:40:46,831-Speed 4360.05 samples/sec Loss 0.6072 Epoch: 15 Global Step: 265300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:40:58,587-Speed 4355.48 samples/sec Loss 0.6084 Epoch: 15 Global Step: 265350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:41:10,418-Speed 4327.71 samples/sec Loss 0.5955 Epoch: 15 Global Step: 265400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:41:22,133-Speed 4370.45 samples/sec Loss 0.5898 Epoch: 15 Global Step: 265450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:41:33,736-Speed 4413.16 samples/sec Loss 0.5965 Epoch: 15 Global Step: 265500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:41:45,509-Speed 4349.05 samples/sec Loss 0.6017 Epoch: 15 Global Step: 265550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:41:57,404-Speed 4304.24 samples/sec Loss 0.6038 Epoch: 15 Global Step: 265600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:42:09,067-Speed 4390.28 samples/sec Loss 0.5988 Epoch: 15 Global Step: 265650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:42:20,793-Speed 4366.65 samples/sec Loss 0.6078 Epoch: 15 Global Step: 265700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:42:33,281-Speed 4100.00 samples/sec Loss 0.5960 Epoch: 15 Global Step: 265750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:42:44,916-Speed 4400.62 samples/sec Loss 0.5919 Epoch: 15 Global Step: 265800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:42:56,659-Speed 4360.28 samples/sec Loss 0.5944 Epoch: 15 Global Step: 265850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:43:09,322-Speed 4043.43 samples/sec Loss 0.6047 Epoch: 15 Global Step: 265900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:43:21,110-Speed 4343.63 samples/sec Loss 0.6047 Epoch: 15 Global Step: 265950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:43:33,067-Speed 4281.98 samples/sec Loss 0.5910 Epoch: 15 Global Step: 266000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:44:03,211-[lfw][266000]XNorm: 21.765539 Training: 2021-03-15 18:44:03,211-[lfw][266000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 18:44:03,211-[lfw][266000]Accuracy-Highest: 0.99833 Training: 2021-03-15 18:44:38,444-[cfp_fp][266000]XNorm: 21.769770 Training: 2021-03-15 18:44:38,444-[cfp_fp][266000]Accuracy-Flip: 0.99186+-0.00384 Training: 2021-03-15 18:44:38,446-[cfp_fp][266000]Accuracy-Highest: 0.99200 Training: 2021-03-15 18:45:08,797-[agedb_30][266000]XNorm: 22.575009 Training: 2021-03-15 18:45:08,798-[agedb_30][266000]Accuracy-Flip: 0.98300+-0.00763 Training: 2021-03-15 18:45:08,798-[agedb_30][266000]Accuracy-Highest: 0.98367 Training: 2021-03-15 18:45:20,375-Speed 477.13 samples/sec Loss 0.5986 Epoch: 15 Global Step: 266050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:45:32,132-Speed 4355.27 samples/sec Loss 0.5925 Epoch: 15 Global Step: 266100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:45:43,856-Speed 4367.36 samples/sec Loss 0.5861 Epoch: 15 Global Step: 266150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:45:55,478-Speed 4405.30 samples/sec Loss 0.5837 Epoch: 15 Global Step: 266200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:46:07,265-Speed 4344.09 samples/sec Loss 0.5910 Epoch: 15 Global Step: 266250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:46:20,089-Speed 3992.52 samples/sec Loss 0.5936 Epoch: 15 Global Step: 266300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:46:32,029-Speed 4288.42 samples/sec Loss 0.5895 Epoch: 15 Global Step: 266350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:46:43,925-Speed 4304.19 samples/sec Loss 0.5966 Epoch: 15 Global Step: 266400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:46:56,563-Speed 4051.33 samples/sec Loss 0.6012 Epoch: 15 Global Step: 266450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:47:08,324-Speed 4353.83 samples/sec Loss 0.5908 Epoch: 15 Global Step: 266500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:47:20,091-Speed 4351.17 samples/sec Loss 0.6085 Epoch: 15 Global Step: 266550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:47:31,696-Speed 4412.13 samples/sec Loss 0.5968 Epoch: 15 Global Step: 266600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:47:43,559-Speed 4316.07 samples/sec Loss 0.6070 Epoch: 15 Global Step: 266650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:47:55,214-Speed 4393.26 samples/sec Loss 0.6021 Epoch: 15 Global Step: 266700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:48:07,560-Speed 4147.10 samples/sec Loss 0.5969 Epoch: 15 Global Step: 266750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:48:19,193-Speed 4401.62 samples/sec Loss 0.6070 Epoch: 15 Global Step: 266800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:48:30,982-Speed 4342.91 samples/sec Loss 0.6020 Epoch: 15 Global Step: 266850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:48:42,934-Speed 4284.02 samples/sec Loss 0.5970 Epoch: 15 Global Step: 266900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:48:54,707-Speed 4348.97 samples/sec Loss 0.5955 Epoch: 15 Global Step: 266950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:49:06,568-Speed 4317.12 samples/sec Loss 0.6061 Epoch: 15 Global Step: 267000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:49:18,458-Speed 4306.19 samples/sec Loss 0.6088 Epoch: 15 Global Step: 267050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:49:43,493-Speed 2045.20 samples/sec Loss 0.5879 Epoch: 16 Global Step: 267100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:49:55,704-Speed 4193.16 samples/sec Loss 0.5971 Epoch: 16 Global Step: 267150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:50:07,509-Speed 4337.03 samples/sec Loss 0.5897 Epoch: 16 Global Step: 267200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:50:19,276-Speed 4351.34 samples/sec Loss 0.5817 Epoch: 16 Global Step: 267250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:50:31,928-Speed 4046.95 samples/sec Loss 0.5772 Epoch: 16 Global Step: 267300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:50:44,525-Speed 4064.86 samples/sec Loss 0.5887 Epoch: 16 Global Step: 267350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:50:56,324-Speed 4339.38 samples/sec Loss 0.5798 Epoch: 16 Global Step: 267400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:51:08,260-Speed 4289.69 samples/sec Loss 0.5907 Epoch: 16 Global Step: 267450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:51:20,030-Speed 4350.22 samples/sec Loss 0.5780 Epoch: 16 Global Step: 267500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:51:31,889-Speed 4317.60 samples/sec Loss 0.5951 Epoch: 16 Global Step: 267550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:51:44,590-Speed 4031.31 samples/sec Loss 0.5812 Epoch: 16 Global Step: 267600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:51:56,276-Speed 4381.64 samples/sec Loss 0.5893 Epoch: 16 Global Step: 267650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:52:07,888-Speed 4409.44 samples/sec Loss 0.6045 Epoch: 16 Global Step: 267700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:52:19,658-Speed 4350.08 samples/sec Loss 0.5683 Epoch: 16 Global Step: 267750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:52:31,468-Speed 4335.78 samples/sec Loss 0.5895 Epoch: 16 Global Step: 267800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:52:43,277-Speed 4335.61 samples/sec Loss 0.5902 Epoch: 16 Global Step: 267850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:52:55,107-Speed 4328.31 samples/sec Loss 0.5805 Epoch: 16 Global Step: 267900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:53:06,793-Speed 4381.29 samples/sec Loss 0.5828 Epoch: 16 Global Step: 267950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:53:18,557-Speed 4352.72 samples/sec Loss 0.5830 Epoch: 16 Global Step: 268000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:53:48,803-[lfw][268000]XNorm: 22.104463 Training: 2021-03-15 18:53:48,804-[lfw][268000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 18:53:48,804-[lfw][268000]Accuracy-Highest: 0.99833 Training: 2021-03-15 18:54:23,739-[cfp_fp][268000]XNorm: 21.983164 Training: 2021-03-15 18:54:23,739-[cfp_fp][268000]Accuracy-Flip: 0.99129+-0.00426 Training: 2021-03-15 18:54:23,739-[cfp_fp][268000]Accuracy-Highest: 0.99200 Training: 2021-03-15 18:54:53,861-[agedb_30][268000]XNorm: 22.744577 Training: 2021-03-15 18:54:53,861-[agedb_30][268000]Accuracy-Flip: 0.98333+-0.00715 Training: 2021-03-15 18:54:53,861-[agedb_30][268000]Accuracy-Highest: 0.98367 Training: 2021-03-15 18:55:05,678-Speed 477.96 samples/sec Loss 0.5997 Epoch: 16 Global Step: 268050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:55:17,278-Speed 4414.20 samples/sec Loss 0.5850 Epoch: 16 Global Step: 268100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:55:29,772-Speed 4097.85 samples/sec Loss 0.5744 Epoch: 16 Global Step: 268150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:55:41,280-Speed 4449.57 samples/sec Loss 0.5768 Epoch: 16 Global Step: 268200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:55:53,019-Speed 4361.49 samples/sec Loss 0.5851 Epoch: 16 Global Step: 268250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:56:05,047-Speed 4256.79 samples/sec Loss 0.5943 Epoch: 16 Global Step: 268300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:56:16,788-Speed 4361.02 samples/sec Loss 0.5904 Epoch: 16 Global Step: 268350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:56:28,298-Speed 4448.70 samples/sec Loss 0.5887 Epoch: 16 Global Step: 268400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:56:40,079-Speed 4346.15 samples/sec Loss 0.5863 Epoch: 16 Global Step: 268450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:56:53,186-Speed 3906.29 samples/sec Loss 0.5885 Epoch: 16 Global Step: 268500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:57:04,992-Speed 4336.95 samples/sec Loss 0.5921 Epoch: 16 Global Step: 268550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:57:16,612-Speed 4406.54 samples/sec Loss 0.5877 Epoch: 16 Global Step: 268600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:57:28,126-Speed 4446.72 samples/sec Loss 0.5878 Epoch: 16 Global Step: 268650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:57:39,847-Speed 4368.37 samples/sec Loss 0.5945 Epoch: 16 Global Step: 268700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:57:51,494-Speed 4396.21 samples/sec Loss 0.5771 Epoch: 16 Global Step: 268750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:58:03,085-Speed 4417.36 samples/sec Loss 0.5824 Epoch: 16 Global Step: 268800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:58:14,939-Speed 4319.40 samples/sec Loss 0.5851 Epoch: 16 Global Step: 268850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:58:26,772-Speed 4327.18 samples/sec Loss 0.5941 Epoch: 16 Global Step: 268900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:58:39,850-Speed 3915.06 samples/sec Loss 0.5884 Epoch: 16 Global Step: 268950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:58:51,749-Speed 4302.93 samples/sec Loss 0.5787 Epoch: 16 Global Step: 269000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:59:04,506-Speed 4013.67 samples/sec Loss 0.5851 Epoch: 16 Global Step: 269050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:59:16,232-Speed 4366.71 samples/sec Loss 0.5937 Epoch: 16 Global Step: 269100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:59:27,745-Speed 4447.25 samples/sec Loss 0.5907 Epoch: 16 Global Step: 269150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:59:39,361-Speed 4407.93 samples/sec Loss 0.5974 Epoch: 16 Global Step: 269200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 18:59:51,857-Speed 4097.38 samples/sec Loss 0.5900 Epoch: 16 Global Step: 269250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:00:03,650-Speed 4341.61 samples/sec Loss 0.5832 Epoch: 16 Global Step: 269300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:00:15,225-Speed 4423.84 samples/sec Loss 0.5856 Epoch: 16 Global Step: 269350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:00:27,002-Speed 4347.47 samples/sec Loss 0.5856 Epoch: 16 Global Step: 269400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:00:38,642-Speed 4398.94 samples/sec Loss 0.5908 Epoch: 16 Global Step: 269450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:00:50,469-Speed 4328.94 samples/sec Loss 0.5976 Epoch: 16 Global Step: 269500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:01:02,211-Speed 4360.58 samples/sec Loss 0.5871 Epoch: 16 Global Step: 269550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:01:13,811-Speed 4414.05 samples/sec Loss 0.5983 Epoch: 16 Global Step: 269600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:01:25,302-Speed 4455.98 samples/sec Loss 0.5970 Epoch: 16 Global Step: 269650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:01:36,803-Speed 4451.73 samples/sec Loss 0.5983 Epoch: 16 Global Step: 269700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:01:49,335-Speed 4085.89 samples/sec Loss 0.5909 Epoch: 16 Global Step: 269750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:02:01,016-Speed 4383.35 samples/sec Loss 0.6068 Epoch: 16 Global Step: 269800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:02:12,650-Speed 4401.10 samples/sec Loss 0.5932 Epoch: 16 Global Step: 269850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:02:25,284-Speed 4052.74 samples/sec Loss 0.5902 Epoch: 16 Global Step: 269900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:02:36,948-Speed 4389.66 samples/sec Loss 0.5832 Epoch: 16 Global Step: 269950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:02:48,749-Speed 4338.62 samples/sec Loss 0.5962 Epoch: 16 Global Step: 270000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:03:18,904-[lfw][270000]XNorm: 21.931279 Training: 2021-03-15 19:03:18,904-[lfw][270000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 19:03:18,905-[lfw][270000]Accuracy-Highest: 0.99833 Training: 2021-03-15 19:03:53,887-[cfp_fp][270000]XNorm: 21.889325 Training: 2021-03-15 19:03:53,887-[cfp_fp][270000]Accuracy-Flip: 0.99114+-0.00442 Training: 2021-03-15 19:03:53,887-[cfp_fp][270000]Accuracy-Highest: 0.99200 Training: 2021-03-15 19:04:24,119-[agedb_30][270000]XNorm: 22.649428 Training: 2021-03-15 19:04:24,119-[agedb_30][270000]Accuracy-Flip: 0.98183+-0.00773 Training: 2021-03-15 19:04:24,119-[agedb_30][270000]Accuracy-Highest: 0.98367 Training: 2021-03-15 19:04:35,649-Speed 478.96 samples/sec Loss 0.5962 Epoch: 16 Global Step: 270050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:04:47,623-Speed 4276.10 samples/sec Loss 0.5815 Epoch: 16 Global Step: 270100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:04:59,242-Speed 4406.58 samples/sec Loss 0.5802 Epoch: 16 Global Step: 270150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:05:10,941-Speed 4376.61 samples/sec Loss 0.5895 Epoch: 16 Global Step: 270200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:05:23,505-Speed 4075.45 samples/sec Loss 0.5905 Epoch: 16 Global Step: 270250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:05:35,091-Speed 4419.27 samples/sec Loss 0.5722 Epoch: 16 Global Step: 270300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:05:46,697-Speed 4411.52 samples/sec Loss 0.5845 Epoch: 16 Global Step: 270350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:05:58,398-Speed 4376.15 samples/sec Loss 0.5942 Epoch: 16 Global Step: 270400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:06:10,078-Speed 4383.44 samples/sec Loss 0.5846 Epoch: 16 Global Step: 270450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:06:21,692-Speed 4408.77 samples/sec Loss 0.6004 Epoch: 16 Global Step: 270500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:06:33,457-Speed 4352.00 samples/sec Loss 0.5842 Epoch: 16 Global Step: 270550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:06:45,132-Speed 4385.54 samples/sec Loss 0.5876 Epoch: 16 Global Step: 270600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:06:56,872-Speed 4361.40 samples/sec Loss 0.5871 Epoch: 16 Global Step: 270650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:07:09,368-Speed 4097.56 samples/sec Loss 0.5851 Epoch: 16 Global Step: 270700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:07:21,066-Speed 4376.76 samples/sec Loss 0.5854 Epoch: 16 Global Step: 270750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:07:32,908-Speed 4324.13 samples/sec Loss 0.5881 Epoch: 16 Global Step: 270800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:07:44,835-Speed 4292.71 samples/sec Loss 0.5939 Epoch: 16 Global Step: 270850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:07:56,502-Speed 4388.68 samples/sec Loss 0.5749 Epoch: 16 Global Step: 270900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:08:08,121-Speed 4406.65 samples/sec Loss 0.5919 Epoch: 16 Global Step: 270950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:08:19,735-Speed 4408.67 samples/sec Loss 0.5800 Epoch: 16 Global Step: 271000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:08:31,528-Speed 4341.74 samples/sec Loss 0.5935 Epoch: 16 Global Step: 271050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:08:43,308-Speed 4346.52 samples/sec Loss 0.5926 Epoch: 16 Global Step: 271100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:08:55,093-Speed 4344.89 samples/sec Loss 0.5985 Epoch: 16 Global Step: 271150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:09:06,836-Speed 4360.09 samples/sec Loss 0.5878 Epoch: 16 Global Step: 271200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:09:19,326-Speed 4099.31 samples/sec Loss 0.6036 Epoch: 16 Global Step: 271250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:09:31,079-Speed 4356.55 samples/sec Loss 0.5809 Epoch: 16 Global Step: 271300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:09:42,846-Speed 4351.34 samples/sec Loss 0.5980 Epoch: 16 Global Step: 271350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:09:54,582-Speed 4362.77 samples/sec Loss 0.5836 Epoch: 16 Global Step: 271400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:10:06,378-Speed 4340.80 samples/sec Loss 0.5901 Epoch: 16 Global Step: 271450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:10:18,067-Speed 4380.48 samples/sec Loss 0.5889 Epoch: 16 Global Step: 271500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:10:29,849-Speed 4345.54 samples/sec Loss 0.5825 Epoch: 16 Global Step: 271550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:10:42,940-Speed 3911.30 samples/sec Loss 0.5845 Epoch: 16 Global Step: 271600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:10:54,647-Speed 4373.73 samples/sec Loss 0.5868 Epoch: 16 Global Step: 271650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:11:08,042-Speed 3822.25 samples/sec Loss 0.5853 Epoch: 16 Global Step: 271700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:11:19,847-Speed 4337.51 samples/sec Loss 0.5981 Epoch: 16 Global Step: 271750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:11:31,712-Speed 4315.25 samples/sec Loss 0.6010 Epoch: 16 Global Step: 271800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:11:43,553-Speed 4324.22 samples/sec Loss 0.5989 Epoch: 16 Global Step: 271850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:11:55,240-Speed 4381.01 samples/sec Loss 0.5926 Epoch: 16 Global Step: 271900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:12:06,901-Speed 4390.92 samples/sec Loss 0.5827 Epoch: 16 Global Step: 271950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:12:18,897-Speed 4268.31 samples/sec Loss 0.5849 Epoch: 16 Global Step: 272000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:12:49,143-[lfw][272000]XNorm: 21.978001 Training: 2021-03-15 19:12:49,144-[lfw][272000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 19:12:49,144-[lfw][272000]Accuracy-Highest: 0.99833 Training: 2021-03-15 19:13:24,267-[cfp_fp][272000]XNorm: 21.912475 Training: 2021-03-15 19:13:24,268-[cfp_fp][272000]Accuracy-Flip: 0.99157+-0.00401 Training: 2021-03-15 19:13:24,268-[cfp_fp][272000]Accuracy-Highest: 0.99200 Training: 2021-03-15 19:13:54,643-[agedb_30][272000]XNorm: 22.685418 Training: 2021-03-15 19:13:54,643-[agedb_30][272000]Accuracy-Flip: 0.98383+-0.00683 Training: 2021-03-15 19:13:54,643-[agedb_30][272000]Accuracy-Highest: 0.98383 Training: 2021-03-15 19:14:06,639-Speed 475.21 samples/sec Loss 0.5878 Epoch: 16 Global Step: 272050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:14:18,308-Speed 4388.02 samples/sec Loss 0.5917 Epoch: 16 Global Step: 272100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:14:29,988-Speed 4383.68 samples/sec Loss 0.5865 Epoch: 16 Global Step: 272150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:14:41,733-Speed 4359.57 samples/sec Loss 0.5901 Epoch: 16 Global Step: 272200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:14:53,401-Speed 4388.16 samples/sec Loss 0.6133 Epoch: 16 Global Step: 272250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:15:06,008-Speed 4061.32 samples/sec Loss 0.5696 Epoch: 16 Global Step: 272300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:15:17,764-Speed 4355.42 samples/sec Loss 0.5893 Epoch: 16 Global Step: 272350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:15:29,574-Speed 4335.63 samples/sec Loss 0.5906 Epoch: 16 Global Step: 272400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:15:41,421-Speed 4321.73 samples/sec Loss 0.5902 Epoch: 16 Global Step: 272450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:15:53,381-Speed 4281.18 samples/sec Loss 0.5722 Epoch: 16 Global Step: 272500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:16:06,203-Speed 3993.36 samples/sec Loss 0.5930 Epoch: 16 Global Step: 272550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:16:18,233-Speed 4256.13 samples/sec Loss 0.5901 Epoch: 16 Global Step: 272600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:16:30,015-Speed 4345.74 samples/sec Loss 0.5955 Epoch: 16 Global Step: 272650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:16:41,590-Speed 4423.56 samples/sec Loss 0.5900 Epoch: 16 Global Step: 272700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:16:53,369-Speed 4346.98 samples/sec Loss 0.5884 Epoch: 16 Global Step: 272750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:17:05,221-Speed 4319.97 samples/sec Loss 0.5969 Epoch: 16 Global Step: 272800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:17:17,237-Speed 4261.26 samples/sec Loss 0.5981 Epoch: 16 Global Step: 272850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:17:29,832-Speed 4065.32 samples/sec Loss 0.5809 Epoch: 16 Global Step: 272900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:17:41,771-Speed 4288.70 samples/sec Loss 0.5875 Epoch: 16 Global Step: 272950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:17:53,416-Speed 4396.66 samples/sec Loss 0.6010 Epoch: 16 Global Step: 273000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:18:05,935-Speed 4089.96 samples/sec Loss 0.5949 Epoch: 16 Global Step: 273050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:18:17,719-Speed 4345.23 samples/sec Loss 0.5914 Epoch: 16 Global Step: 273100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:18:29,394-Speed 4385.64 samples/sec Loss 0.5805 Epoch: 16 Global Step: 273150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:18:41,154-Speed 4353.84 samples/sec Loss 0.5911 Epoch: 16 Global Step: 273200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:18:52,754-Speed 4413.99 samples/sec Loss 0.5885 Epoch: 16 Global Step: 273250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:19:04,448-Speed 4378.50 samples/sec Loss 0.5935 Epoch: 16 Global Step: 273300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:19:16,047-Speed 4414.36 samples/sec Loss 0.5808 Epoch: 16 Global Step: 273350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:19:27,645-Speed 4414.65 samples/sec Loss 0.5943 Epoch: 16 Global Step: 273400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:19:39,237-Speed 4417.00 samples/sec Loss 0.5929 Epoch: 16 Global Step: 273450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:19:51,025-Speed 4343.62 samples/sec Loss 0.6003 Epoch: 16 Global Step: 273500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:20:02,810-Speed 4344.83 samples/sec Loss 0.5878 Epoch: 16 Global Step: 273550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:20:14,583-Speed 4349.07 samples/sec Loss 0.5941 Epoch: 16 Global Step: 273600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:20:26,380-Speed 4339.98 samples/sec Loss 0.5782 Epoch: 16 Global Step: 273650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:20:38,091-Speed 4372.16 samples/sec Loss 0.5886 Epoch: 16 Global Step: 273700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:20:49,824-Speed 4364.16 samples/sec Loss 0.5956 Epoch: 16 Global Step: 273750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:21:01,769-Speed 4286.44 samples/sec Loss 0.5905 Epoch: 16 Global Step: 273800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:21:13,591-Speed 4330.97 samples/sec Loss 0.5746 Epoch: 16 Global Step: 273850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:21:25,306-Speed 4370.62 samples/sec Loss 0.5914 Epoch: 16 Global Step: 273900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:21:37,014-Speed 4373.27 samples/sec Loss 0.5881 Epoch: 16 Global Step: 273950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:21:49,502-Speed 4100.04 samples/sec Loss 0.5819 Epoch: 16 Global Step: 274000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:22:19,769-[lfw][274000]XNorm: 22.019325 Training: 2021-03-15 19:22:19,769-[lfw][274000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 19:22:19,769-[lfw][274000]Accuracy-Highest: 0.99833 Training: 2021-03-15 19:22:54,658-[cfp_fp][274000]XNorm: 21.941980 Training: 2021-03-15 19:22:54,659-[cfp_fp][274000]Accuracy-Flip: 0.99171+-0.00393 Training: 2021-03-15 19:22:54,659-[cfp_fp][274000]Accuracy-Highest: 0.99200 Training: 2021-03-15 19:23:24,680-[agedb_30][274000]XNorm: 22.791182 Training: 2021-03-15 19:23:24,681-[agedb_30][274000]Accuracy-Flip: 0.98233+-0.00800 Training: 2021-03-15 19:23:24,681-[agedb_30][274000]Accuracy-Highest: 0.98383 Training: 2021-03-15 19:23:36,506-Speed 478.49 samples/sec Loss 0.5927 Epoch: 16 Global Step: 274050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:23:49,036-Speed 4086.45 samples/sec Loss 0.5973 Epoch: 16 Global Step: 274100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:24:00,813-Speed 4347.67 samples/sec Loss 0.5946 Epoch: 16 Global Step: 274150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:24:12,465-Speed 4393.99 samples/sec Loss 0.5860 Epoch: 16 Global Step: 274200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:24:24,309-Speed 4323.14 samples/sec Loss 0.5917 Epoch: 16 Global Step: 274250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:24:36,079-Speed 4350.42 samples/sec Loss 0.5936 Epoch: 16 Global Step: 274300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:24:49,064-Speed 3942.92 samples/sec Loss 0.5948 Epoch: 16 Global Step: 274350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:25:01,456-Speed 4131.97 samples/sec Loss 0.5960 Epoch: 16 Global Step: 274400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:25:13,180-Speed 4367.25 samples/sec Loss 0.5895 Epoch: 16 Global Step: 274450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:25:24,938-Speed 4354.77 samples/sec Loss 0.5945 Epoch: 16 Global Step: 274500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:25:36,631-Speed 4378.79 samples/sec Loss 0.5745 Epoch: 16 Global Step: 274550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:25:48,252-Speed 4406.11 samples/sec Loss 0.5897 Epoch: 16 Global Step: 274600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:26:00,084-Speed 4327.42 samples/sec Loss 0.5900 Epoch: 16 Global Step: 274650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:26:11,817-Speed 4363.99 samples/sec Loss 0.5896 Epoch: 16 Global Step: 274700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:26:24,311-Speed 4098.01 samples/sec Loss 0.5917 Epoch: 16 Global Step: 274750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:26:35,934-Speed 4405.09 samples/sec Loss 0.5818 Epoch: 16 Global Step: 274800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:26:47,663-Speed 4365.57 samples/sec Loss 0.5966 Epoch: 16 Global Step: 274850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:26:59,456-Speed 4341.80 samples/sec Loss 0.5891 Epoch: 16 Global Step: 274900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:27:11,148-Speed 4379.18 samples/sec Loss 0.5858 Epoch: 16 Global Step: 274950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:27:22,923-Speed 4348.50 samples/sec Loss 0.5992 Epoch: 16 Global Step: 275000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:27:34,715-Speed 4341.85 samples/sec Loss 0.6013 Epoch: 16 Global Step: 275050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:27:46,325-Speed 4410.16 samples/sec Loss 0.5815 Epoch: 16 Global Step: 275100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:27:58,126-Speed 4339.03 samples/sec Loss 0.5986 Epoch: 16 Global Step: 275150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:28:10,674-Speed 4080.44 samples/sec Loss 0.5773 Epoch: 16 Global Step: 275200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:28:22,452-Speed 4346.96 samples/sec Loss 0.5839 Epoch: 16 Global Step: 275250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:28:34,263-Speed 4335.15 samples/sec Loss 0.5838 Epoch: 16 Global Step: 275300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:28:46,014-Speed 4357.41 samples/sec Loss 0.5958 Epoch: 16 Global Step: 275350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:28:57,720-Speed 4374.15 samples/sec Loss 0.5860 Epoch: 16 Global Step: 275400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:29:10,374-Speed 4046.15 samples/sec Loss 0.5757 Epoch: 16 Global Step: 275450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:29:22,036-Speed 4390.53 samples/sec Loss 0.5948 Epoch: 16 Global Step: 275500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:29:33,648-Speed 4409.40 samples/sec Loss 0.5950 Epoch: 16 Global Step: 275550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:29:46,196-Speed 4080.53 samples/sec Loss 0.5921 Epoch: 16 Global Step: 275600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:29:57,960-Speed 4352.53 samples/sec Loss 0.6057 Epoch: 16 Global Step: 275650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:30:09,715-Speed 4355.62 samples/sec Loss 0.5986 Epoch: 16 Global Step: 275700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:30:21,549-Speed 4326.84 samples/sec Loss 0.5811 Epoch: 16 Global Step: 275750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:30:33,406-Speed 4318.11 samples/sec Loss 0.5958 Epoch: 16 Global Step: 275800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:30:44,982-Speed 4423.20 samples/sec Loss 0.5970 Epoch: 16 Global Step: 275850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:30:56,563-Speed 4421.05 samples/sec Loss 0.5889 Epoch: 16 Global Step: 275900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:31:08,455-Speed 4305.82 samples/sec Loss 0.5929 Epoch: 16 Global Step: 275950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:31:20,559-Speed 4229.95 samples/sec Loss 0.5925 Epoch: 16 Global Step: 276000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:31:50,681-[lfw][276000]XNorm: 21.896180 Training: 2021-03-15 19:31:50,681-[lfw][276000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 19:31:50,681-[lfw][276000]Accuracy-Highest: 0.99833 Training: 2021-03-15 19:32:25,674-[cfp_fp][276000]XNorm: 21.793652 Training: 2021-03-15 19:32:25,674-[cfp_fp][276000]Accuracy-Flip: 0.99186+-0.00384 Training: 2021-03-15 19:32:25,674-[cfp_fp][276000]Accuracy-Highest: 0.99200 Training: 2021-03-15 19:32:55,985-[agedb_30][276000]XNorm: 22.607115 Training: 2021-03-15 19:32:55,985-[agedb_30][276000]Accuracy-Flip: 0.98283+-0.00760 Training: 2021-03-15 19:32:55,985-[agedb_30][276000]Accuracy-Highest: 0.98383 Training: 2021-03-15 19:33:07,887-Speed 477.05 samples/sec Loss 0.5865 Epoch: 16 Global Step: 276050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:33:19,471-Speed 4419.82 samples/sec Loss 0.5810 Epoch: 16 Global Step: 276100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:33:31,169-Speed 4377.05 samples/sec Loss 0.5906 Epoch: 16 Global Step: 276150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:33:43,078-Speed 4299.48 samples/sec Loss 0.5844 Epoch: 16 Global Step: 276200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:33:54,809-Speed 4364.58 samples/sec Loss 0.5849 Epoch: 16 Global Step: 276250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:34:06,506-Speed 4377.47 samples/sec Loss 0.5892 Epoch: 16 Global Step: 276300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:34:18,293-Speed 4343.99 samples/sec Loss 0.5835 Epoch: 16 Global Step: 276350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:34:30,200-Speed 4300.00 samples/sec Loss 0.5876 Epoch: 16 Global Step: 276400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:34:41,828-Speed 4403.57 samples/sec Loss 0.5845 Epoch: 16 Global Step: 276450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:34:53,667-Speed 4324.77 samples/sec Loss 0.5815 Epoch: 16 Global Step: 276500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:35:05,434-Speed 4351.31 samples/sec Loss 0.5750 Epoch: 16 Global Step: 276550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:35:17,817-Speed 4134.67 samples/sec Loss 0.5896 Epoch: 16 Global Step: 276600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 19:35:30,404-Speed 4067.92 samples/sec Loss 0.5955 Epoch: 16 Global Step: 276650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:35:42,153-Speed 4358.21 samples/sec Loss 0.5846 Epoch: 16 Global Step: 276700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:35:54,041-Speed 4307.05 samples/sec Loss 0.5958 Epoch: 16 Global Step: 276750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:36:05,771-Speed 4364.74 samples/sec Loss 0.5989 Epoch: 16 Global Step: 276800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:36:17,605-Speed 4326.90 samples/sec Loss 0.5886 Epoch: 16 Global Step: 276850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:36:29,557-Speed 4284.02 samples/sec Loss 0.5929 Epoch: 16 Global Step: 276900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:36:41,361-Speed 4337.71 samples/sec Loss 0.5866 Epoch: 16 Global Step: 276950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:36:53,122-Speed 4353.42 samples/sec Loss 0.5720 Epoch: 16 Global Step: 277000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:37:05,982-Speed 3981.71 samples/sec Loss 0.5889 Epoch: 16 Global Step: 277050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:37:17,584-Speed 4412.97 samples/sec Loss 0.5828 Epoch: 16 Global Step: 277100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:37:29,452-Speed 4314.26 samples/sec Loss 0.5874 Epoch: 16 Global Step: 277150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:37:41,209-Speed 4354.94 samples/sec Loss 0.5891 Epoch: 16 Global Step: 277200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:37:53,923-Speed 4027.40 samples/sec Loss 0.5858 Epoch: 16 Global Step: 277250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:38:05,934-Speed 4262.94 samples/sec Loss 0.5806 Epoch: 16 Global Step: 277300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:38:17,751-Speed 4332.81 samples/sec Loss 0.5865 Epoch: 16 Global Step: 277350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:38:29,614-Speed 4315.93 samples/sec Loss 0.5895 Epoch: 16 Global Step: 277400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:38:41,356-Speed 4360.85 samples/sec Loss 0.5902 Epoch: 16 Global Step: 277450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:38:53,220-Speed 4315.46 samples/sec Loss 0.5868 Epoch: 16 Global Step: 277500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:39:05,207-Speed 4271.74 samples/sec Loss 0.6027 Epoch: 16 Global Step: 277550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:39:17,004-Speed 4340.02 samples/sec Loss 0.5797 Epoch: 16 Global Step: 277600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:39:28,816-Speed 4334.95 samples/sec Loss 0.5977 Epoch: 16 Global Step: 277650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:39:40,503-Speed 4380.97 samples/sec Loss 0.5904 Epoch: 16 Global Step: 277700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:39:52,277-Speed 4348.84 samples/sec Loss 0.5909 Epoch: 16 Global Step: 277750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:40:04,940-Speed 4043.34 samples/sec Loss 0.5941 Epoch: 16 Global Step: 277800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:40:16,760-Speed 4331.96 samples/sec Loss 0.6033 Epoch: 16 Global Step: 277850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:40:28,588-Speed 4328.85 samples/sec Loss 0.5951 Epoch: 16 Global Step: 277900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:40:41,070-Speed 4102.16 samples/sec Loss 0.5772 Epoch: 16 Global Step: 277950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:40:52,673-Speed 4412.70 samples/sec Loss 0.5981 Epoch: 16 Global Step: 278000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:41:23,003-[lfw][278000]XNorm: 21.916007 Training: 2021-03-15 19:41:23,004-[lfw][278000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 19:41:23,004-[lfw][278000]Accuracy-Highest: 0.99833 Training: 2021-03-15 19:41:58,026-[cfp_fp][278000]XNorm: 21.763370 Training: 2021-03-15 19:41:58,027-[cfp_fp][278000]Accuracy-Flip: 0.99186+-0.00389 Training: 2021-03-15 19:41:58,027-[cfp_fp][278000]Accuracy-Highest: 0.99200 Training: 2021-03-15 19:42:28,235-[agedb_30][278000]XNorm: 22.610896 Training: 2021-03-15 19:42:28,236-[agedb_30][278000]Accuracy-Flip: 0.98283+-0.00742 Training: 2021-03-15 19:42:28,236-[agedb_30][278000]Accuracy-Highest: 0.98383 Training: 2021-03-15 19:42:40,156-Speed 476.36 samples/sec Loss 0.5850 Epoch: 16 Global Step: 278050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:42:51,912-Speed 4355.15 samples/sec Loss 0.5999 Epoch: 16 Global Step: 278100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:43:03,665-Speed 4356.80 samples/sec Loss 0.5921 Epoch: 16 Global Step: 278150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:43:15,454-Speed 4343.00 samples/sec Loss 0.5936 Epoch: 16 Global Step: 278200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:43:27,938-Speed 4101.37 samples/sec Loss 0.5887 Epoch: 16 Global Step: 278250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:43:39,763-Speed 4330.08 samples/sec Loss 0.5830 Epoch: 16 Global Step: 278300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:43:51,383-Speed 4406.29 samples/sec Loss 0.5721 Epoch: 16 Global Step: 278350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:44:03,076-Speed 4379.16 samples/sec Loss 0.5895 Epoch: 16 Global Step: 278400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:44:14,740-Speed 4389.67 samples/sec Loss 0.5943 Epoch: 16 Global Step: 278450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:44:26,555-Speed 4333.37 samples/sec Loss 0.5847 Epoch: 16 Global Step: 278500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:44:38,226-Speed 4387.19 samples/sec Loss 0.5895 Epoch: 16 Global Step: 278550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:44:49,890-Speed 4389.66 samples/sec Loss 0.5931 Epoch: 16 Global Step: 278600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:45:01,765-Speed 4311.97 samples/sec Loss 0.5845 Epoch: 16 Global Step: 278650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:45:13,380-Speed 4408.22 samples/sec Loss 0.5862 Epoch: 16 Global Step: 278700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:45:24,958-Speed 4422.32 samples/sec Loss 0.5809 Epoch: 16 Global Step: 278750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:45:36,533-Speed 4423.57 samples/sec Loss 0.5900 Epoch: 16 Global Step: 278800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:45:48,131-Speed 4414.88 samples/sec Loss 0.5948 Epoch: 16 Global Step: 278850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:45:59,900-Speed 4350.47 samples/sec Loss 0.5803 Epoch: 16 Global Step: 278900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:46:11,707-Speed 4336.47 samples/sec Loss 0.5953 Epoch: 16 Global Step: 278950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:46:23,548-Speed 4324.18 samples/sec Loss 0.5907 Epoch: 16 Global Step: 279000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:46:35,153-Speed 4411.91 samples/sec Loss 0.5966 Epoch: 16 Global Step: 279050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:46:47,486-Speed 4151.84 samples/sec Loss 0.5930 Epoch: 16 Global Step: 279100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:46:59,193-Speed 4373.55 samples/sec Loss 0.5827 Epoch: 16 Global Step: 279150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:47:10,845-Speed 4394.17 samples/sec Loss 0.5885 Epoch: 16 Global Step: 279200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:47:22,652-Speed 4336.69 samples/sec Loss 0.5844 Epoch: 16 Global Step: 279250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:47:34,421-Speed 4350.69 samples/sec Loss 0.5890 Epoch: 16 Global Step: 279300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:47:46,202-Speed 4345.88 samples/sec Loss 0.5905 Epoch: 16 Global Step: 279350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:47:58,525-Speed 4155.21 samples/sec Loss 0.5814 Epoch: 16 Global Step: 279400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:48:10,115-Speed 4417.66 samples/sec Loss 0.5842 Epoch: 16 Global Step: 279450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:48:21,898-Speed 4345.44 samples/sec Loss 0.5937 Epoch: 16 Global Step: 279500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:48:33,682-Speed 4344.81 samples/sec Loss 0.5932 Epoch: 16 Global Step: 279550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:48:45,592-Speed 4299.08 samples/sec Loss 0.5888 Epoch: 16 Global Step: 279600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:48:57,546-Speed 4283.33 samples/sec Loss 0.5940 Epoch: 16 Global Step: 279650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:49:09,952-Speed 4127.38 samples/sec Loss 0.5767 Epoch: 16 Global Step: 279700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:49:23,273-Speed 3843.67 samples/sec Loss 0.5948 Epoch: 16 Global Step: 279750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:49:34,985-Speed 4371.49 samples/sec Loss 0.5853 Epoch: 16 Global Step: 279800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:49:46,757-Speed 4349.58 samples/sec Loss 0.6084 Epoch: 16 Global Step: 279850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:49:58,364-Speed 4411.24 samples/sec Loss 0.5998 Epoch: 16 Global Step: 279900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:50:10,149-Speed 4344.89 samples/sec Loss 0.5958 Epoch: 16 Global Step: 279950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:50:21,753-Speed 4412.33 samples/sec Loss 0.5904 Epoch: 16 Global Step: 280000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:50:51,979-[lfw][280000]XNorm: 22.044226 Training: 2021-03-15 19:50:51,979-[lfw][280000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 19:50:51,979-[lfw][280000]Accuracy-Highest: 0.99833 Training: 2021-03-15 19:51:27,197-[cfp_fp][280000]XNorm: 21.937221 Training: 2021-03-15 19:51:27,198-[cfp_fp][280000]Accuracy-Flip: 0.99143+-0.00414 Training: 2021-03-15 19:51:27,198-[cfp_fp][280000]Accuracy-Highest: 0.99200 Training: 2021-03-15 19:51:57,478-[agedb_30][280000]XNorm: 22.722081 Training: 2021-03-15 19:51:57,478-[agedb_30][280000]Accuracy-Flip: 0.98317+-0.00740 Training: 2021-03-15 19:51:57,478-[agedb_30][280000]Accuracy-Highest: 0.98383 Training: 2021-03-15 19:52:09,169-Speed 476.66 samples/sec Loss 0.6015 Epoch: 16 Global Step: 280050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:52:20,923-Speed 4355.96 samples/sec Loss 0.5848 Epoch: 16 Global Step: 280100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:52:32,750-Speed 4329.23 samples/sec Loss 0.5898 Epoch: 16 Global Step: 280150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:52:44,408-Speed 4391.96 samples/sec Loss 0.5922 Epoch: 16 Global Step: 280200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:52:56,052-Speed 4397.48 samples/sec Loss 0.5845 Epoch: 16 Global Step: 280250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:53:07,615-Speed 4427.88 samples/sec Loss 0.5890 Epoch: 16 Global Step: 280300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:53:19,299-Speed 4382.21 samples/sec Loss 0.5809 Epoch: 16 Global Step: 280350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:53:32,758-Speed 3804.45 samples/sec Loss 0.5968 Epoch: 16 Global Step: 280400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:53:44,484-Speed 4366.58 samples/sec Loss 0.5897 Epoch: 16 Global Step: 280450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:53:56,213-Speed 4365.54 samples/sec Loss 0.5848 Epoch: 16 Global Step: 280500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:54:08,021-Speed 4336.15 samples/sec Loss 0.5965 Epoch: 16 Global Step: 280550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:54:19,848-Speed 4329.27 samples/sec Loss 0.5917 Epoch: 16 Global Step: 280600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:54:31,605-Speed 4355.02 samples/sec Loss 0.5814 Epoch: 16 Global Step: 280650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:54:43,457-Speed 4320.11 samples/sec Loss 0.5883 Epoch: 16 Global Step: 280700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:54:55,034-Speed 4422.82 samples/sec Loss 0.5849 Epoch: 16 Global Step: 280750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:55:06,769-Speed 4362.92 samples/sec Loss 0.5920 Epoch: 16 Global Step: 280800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:55:18,553-Speed 4345.33 samples/sec Loss 0.5741 Epoch: 16 Global Step: 280850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:55:30,290-Speed 4362.25 samples/sec Loss 0.5985 Epoch: 16 Global Step: 280900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:55:42,847-Speed 4077.45 samples/sec Loss 0.5927 Epoch: 16 Global Step: 280950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:55:54,575-Speed 4366.12 samples/sec Loss 0.5876 Epoch: 16 Global Step: 281000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:56:06,222-Speed 4395.98 samples/sec Loss 0.5815 Epoch: 16 Global Step: 281050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:56:17,767-Speed 4435.00 samples/sec Loss 0.5875 Epoch: 16 Global Step: 281100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:56:29,749-Speed 4273.13 samples/sec Loss 0.6003 Epoch: 16 Global Step: 281150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:56:41,494-Speed 4359.64 samples/sec Loss 0.5794 Epoch: 16 Global Step: 281200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:56:53,179-Speed 4381.72 samples/sec Loss 0.5838 Epoch: 16 Global Step: 281250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:57:04,772-Speed 4416.80 samples/sec Loss 0.5865 Epoch: 16 Global Step: 281300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:57:16,602-Speed 4327.92 samples/sec Loss 0.5875 Epoch: 16 Global Step: 281350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:57:28,466-Speed 4315.91 samples/sec Loss 0.5984 Epoch: 16 Global Step: 281400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:57:40,588-Speed 4223.81 samples/sec Loss 0.5858 Epoch: 16 Global Step: 281450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:57:53,165-Speed 4071.01 samples/sec Loss 0.5865 Epoch: 16 Global Step: 281500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:58:04,743-Speed 4422.57 samples/sec Loss 0.5781 Epoch: 16 Global Step: 281550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:58:16,357-Speed 4408.50 samples/sec Loss 0.5910 Epoch: 16 Global Step: 281600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:58:27,900-Speed 4435.72 samples/sec Loss 0.5793 Epoch: 16 Global Step: 281650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:58:39,733-Speed 4327.26 samples/sec Loss 0.5981 Epoch: 16 Global Step: 281700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:58:51,608-Speed 4311.79 samples/sec Loss 0.5894 Epoch: 16 Global Step: 281750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:59:03,241-Speed 4401.30 samples/sec Loss 0.5833 Epoch: 16 Global Step: 281800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:59:14,844-Speed 4412.85 samples/sec Loss 0.5819 Epoch: 16 Global Step: 281850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:59:26,709-Speed 4315.32 samples/sec Loss 0.5900 Epoch: 16 Global Step: 281900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:59:38,522-Speed 4334.48 samples/sec Loss 0.5888 Epoch: 16 Global Step: 281950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 19:59:50,238-Speed 4370.11 samples/sec Loss 0.5844 Epoch: 16 Global Step: 282000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:00:20,542-[lfw][282000]XNorm: 21.926741 Training: 2021-03-15 20:00:20,542-[lfw][282000]Accuracy-Flip: 0.99800+-0.00287 Training: 2021-03-15 20:00:20,542-[lfw][282000]Accuracy-Highest: 0.99833 Training: 2021-03-15 20:00:55,570-[cfp_fp][282000]XNorm: 21.786393 Training: 2021-03-15 20:00:55,571-[cfp_fp][282000]Accuracy-Flip: 0.99186+-0.00399 Training: 2021-03-15 20:00:55,571-[cfp_fp][282000]Accuracy-Highest: 0.99200 Training: 2021-03-15 20:01:25,741-[agedb_30][282000]XNorm: 22.625588 Training: 2021-03-15 20:01:25,742-[agedb_30][282000]Accuracy-Flip: 0.98250+-0.00779 Training: 2021-03-15 20:01:25,742-[agedb_30][282000]Accuracy-Highest: 0.98383 Training: 2021-03-15 20:01:38,176-Speed 474.35 samples/sec Loss 0.5823 Epoch: 16 Global Step: 282050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:01:49,948-Speed 4349.44 samples/sec Loss 0.5975 Epoch: 16 Global Step: 282100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:02:01,584-Speed 4400.54 samples/sec Loss 0.5787 Epoch: 16 Global Step: 282150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:02:14,096-Speed 4092.06 samples/sec Loss 0.5815 Epoch: 16 Global Step: 282200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:02:25,717-Speed 4406.29 samples/sec Loss 0.5991 Epoch: 16 Global Step: 282250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:02:37,476-Speed 4354.30 samples/sec Loss 0.5858 Epoch: 16 Global Step: 282300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:02:49,112-Speed 4400.29 samples/sec Loss 0.5920 Epoch: 16 Global Step: 282350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:03:00,804-Speed 4378.91 samples/sec Loss 0.5820 Epoch: 16 Global Step: 282400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:03:13,238-Speed 4118.03 samples/sec Loss 0.5870 Epoch: 16 Global Step: 282450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:03:25,784-Speed 4081.03 samples/sec Loss 0.5769 Epoch: 16 Global Step: 282500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:03:37,475-Speed 4379.82 samples/sec Loss 0.5841 Epoch: 16 Global Step: 282550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:03:49,279-Speed 4337.76 samples/sec Loss 0.5970 Epoch: 16 Global Step: 282600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:04:00,804-Speed 4442.59 samples/sec Loss 0.5908 Epoch: 16 Global Step: 282650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:04:12,770-Speed 4278.95 samples/sec Loss 0.5848 Epoch: 16 Global Step: 282700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:04:24,305-Speed 4438.89 samples/sec Loss 0.5835 Epoch: 16 Global Step: 282750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:04:36,013-Speed 4373.20 samples/sec Loss 0.5923 Epoch: 16 Global Step: 282800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:04:48,416-Speed 4128.11 samples/sec Loss 0.5899 Epoch: 16 Global Step: 282850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:05:00,320-Speed 4301.16 samples/sec Loss 0.5916 Epoch: 16 Global Step: 282900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:05:11,979-Speed 4391.73 samples/sec Loss 0.5866 Epoch: 16 Global Step: 282950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:05:24,388-Speed 4126.28 samples/sec Loss 0.5841 Epoch: 16 Global Step: 283000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:05:36,221-Speed 4327.12 samples/sec Loss 0.5936 Epoch: 16 Global Step: 283050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:05:47,952-Speed 4364.43 samples/sec Loss 0.5885 Epoch: 16 Global Step: 283100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:05:59,715-Speed 4352.68 samples/sec Loss 0.5901 Epoch: 16 Global Step: 283150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:06:11,568-Speed 4320.07 samples/sec Loss 0.5964 Epoch: 16 Global Step: 283200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:06:23,327-Speed 4354.22 samples/sec Loss 0.5918 Epoch: 16 Global Step: 283250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:06:34,928-Speed 4413.38 samples/sec Loss 0.5841 Epoch: 16 Global Step: 283300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:06:46,755-Speed 4329.40 samples/sec Loss 0.5881 Epoch: 16 Global Step: 283350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:06:58,571-Speed 4333.05 samples/sec Loss 0.5933 Epoch: 16 Global Step: 283400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:07:10,433-Speed 4316.52 samples/sec Loss 0.5931 Epoch: 16 Global Step: 283450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:07:22,281-Speed 4321.80 samples/sec Loss 0.5906 Epoch: 16 Global Step: 283500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:07:34,016-Speed 4363.05 samples/sec Loss 0.5975 Epoch: 16 Global Step: 283550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:07:46,642-Speed 4055.25 samples/sec Loss 0.5890 Epoch: 16 Global Step: 283600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:07:58,429-Speed 4344.22 samples/sec Loss 0.5776 Epoch: 16 Global Step: 283650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:08:10,066-Speed 4399.55 samples/sec Loss 0.5844 Epoch: 16 Global Step: 283700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:08:33,658-Speed 2170.36 samples/sec Loss 0.5907 Epoch: 17 Global Step: 283750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:08:46,502-Speed 3986.28 samples/sec Loss 0.5707 Epoch: 17 Global Step: 283800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:08:58,690-Speed 4201.33 samples/sec Loss 0.5757 Epoch: 17 Global Step: 283850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:09:10,429-Speed 4361.76 samples/sec Loss 0.5717 Epoch: 17 Global Step: 283900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:09:22,193-Speed 4352.39 samples/sec Loss 0.5850 Epoch: 17 Global Step: 283950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:09:34,928-Speed 4020.35 samples/sec Loss 0.5684 Epoch: 17 Global Step: 284000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:10:05,204-[lfw][284000]XNorm: 21.895995 Training: 2021-03-15 20:10:05,205-[lfw][284000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 20:10:05,205-[lfw][284000]Accuracy-Highest: 0.99833 Training: 2021-03-15 20:10:40,374-[cfp_fp][284000]XNorm: 21.822878 Training: 2021-03-15 20:10:40,375-[cfp_fp][284000]Accuracy-Flip: 0.99029+-0.00464 Training: 2021-03-15 20:10:40,375-[cfp_fp][284000]Accuracy-Highest: 0.99200 Training: 2021-03-15 20:11:10,862-[agedb_30][284000]XNorm: 22.648354 Training: 2021-03-15 20:11:10,862-[agedb_30][284000]Accuracy-Flip: 0.98350+-0.00697 Training: 2021-03-15 20:11:10,862-[agedb_30][284000]Accuracy-Highest: 0.98383 Training: 2021-03-15 20:11:22,692-Speed 475.12 samples/sec Loss 0.5685 Epoch: 17 Global Step: 284050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:11:34,671-Speed 4274.35 samples/sec Loss 0.5771 Epoch: 17 Global Step: 284100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:11:46,311-Speed 4398.90 samples/sec Loss 0.5732 Epoch: 17 Global Step: 284150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:11:58,123-Speed 4334.57 samples/sec Loss 0.5708 Epoch: 17 Global Step: 284200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:12:09,752-Speed 4402.94 samples/sec Loss 0.5795 Epoch: 17 Global Step: 284250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:12:21,541-Speed 4343.38 samples/sec Loss 0.5754 Epoch: 17 Global Step: 284300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:12:33,259-Speed 4369.25 samples/sec Loss 0.5790 Epoch: 17 Global Step: 284350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:12:44,991-Speed 4364.45 samples/sec Loss 0.5706 Epoch: 17 Global Step: 284400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:12:56,846-Speed 4318.82 samples/sec Loss 0.5715 Epoch: 17 Global Step: 284450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:13:08,498-Speed 4394.62 samples/sec Loss 0.5670 Epoch: 17 Global Step: 284500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:13:20,349-Speed 4320.48 samples/sec Loss 0.5737 Epoch: 17 Global Step: 284550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:13:32,095-Speed 4359.00 samples/sec Loss 0.5701 Epoch: 17 Global Step: 284600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:13:43,628-Speed 4439.77 samples/sec Loss 0.5816 Epoch: 17 Global Step: 284650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:13:56,031-Speed 4128.02 samples/sec Loss 0.5627 Epoch: 17 Global Step: 284700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:14:07,660-Speed 4402.93 samples/sec Loss 0.5815 Epoch: 17 Global Step: 284750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:14:20,175-Speed 4091.29 samples/sec Loss 0.5731 Epoch: 17 Global Step: 284800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:14:32,183-Speed 4263.80 samples/sec Loss 0.5605 Epoch: 17 Global Step: 284850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:14:44,008-Speed 4329.94 samples/sec Loss 0.5566 Epoch: 17 Global Step: 284900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:14:55,674-Speed 4389.17 samples/sec Loss 0.5696 Epoch: 17 Global Step: 284950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:15:07,425-Speed 4357.37 samples/sec Loss 0.5849 Epoch: 17 Global Step: 285000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:15:19,243-Speed 4332.34 samples/sec Loss 0.5752 Epoch: 17 Global Step: 285050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:15:32,423-Speed 3884.94 samples/sec Loss 0.5738 Epoch: 17 Global Step: 285100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:15:44,370-Speed 4285.76 samples/sec Loss 0.5782 Epoch: 17 Global Step: 285150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:15:56,580-Speed 4193.32 samples/sec Loss 0.5758 Epoch: 17 Global Step: 285200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:16:09,031-Speed 4112.30 samples/sec Loss 0.5720 Epoch: 17 Global Step: 285250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:16:20,872-Speed 4324.13 samples/sec Loss 0.5801 Epoch: 17 Global Step: 285300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:16:32,529-Speed 4392.40 samples/sec Loss 0.5882 Epoch: 17 Global Step: 285350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:16:44,496-Speed 4278.70 samples/sec Loss 0.5561 Epoch: 17 Global Step: 285400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:16:56,198-Speed 4375.56 samples/sec Loss 0.5759 Epoch: 17 Global Step: 285450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:17:07,828-Speed 4402.46 samples/sec Loss 0.5728 Epoch: 17 Global Step: 285500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:17:19,547-Speed 4369.29 samples/sec Loss 0.5679 Epoch: 17 Global Step: 285550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:17:32,062-Speed 4091.18 samples/sec Loss 0.5719 Epoch: 17 Global Step: 285600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:17:43,788-Speed 4366.55 samples/sec Loss 0.5613 Epoch: 17 Global Step: 285650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:17:55,564-Speed 4347.91 samples/sec Loss 0.5620 Epoch: 17 Global Step: 285700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:18:07,392-Speed 4328.74 samples/sec Loss 0.5728 Epoch: 17 Global Step: 285750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:18:19,075-Speed 4382.80 samples/sec Loss 0.5821 Epoch: 17 Global Step: 285800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:18:30,767-Speed 4379.04 samples/sec Loss 0.5628 Epoch: 17 Global Step: 285850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:18:42,545-Speed 4347.43 samples/sec Loss 0.5818 Epoch: 17 Global Step: 285900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:18:54,261-Speed 4370.19 samples/sec Loss 0.5845 Epoch: 17 Global Step: 285950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:19:05,935-Speed 4386.25 samples/sec Loss 0.5554 Epoch: 17 Global Step: 286000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:19:36,196-[lfw][286000]XNorm: 21.988558 Training: 2021-03-15 20:19:36,197-[lfw][286000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 20:19:36,197-[lfw][286000]Accuracy-Highest: 0.99833 Training: 2021-03-15 20:20:11,350-[cfp_fp][286000]XNorm: 21.931819 Training: 2021-03-15 20:20:11,351-[cfp_fp][286000]Accuracy-Flip: 0.99114+-0.00432 Training: 2021-03-15 20:20:11,351-[cfp_fp][286000]Accuracy-Highest: 0.99200 Training: 2021-03-15 20:20:41,640-[agedb_30][286000]XNorm: 22.763087 Training: 2021-03-15 20:20:41,640-[agedb_30][286000]Accuracy-Flip: 0.98333+-0.00745 Training: 2021-03-15 20:20:41,640-[agedb_30][286000]Accuracy-Highest: 0.98383 Training: 2021-03-15 20:20:53,326-Speed 476.76 samples/sec Loss 0.5812 Epoch: 17 Global Step: 286050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:21:05,045-Speed 4369.01 samples/sec Loss 0.5704 Epoch: 17 Global Step: 286100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:21:17,147-Speed 4231.10 samples/sec Loss 0.5714 Epoch: 17 Global Step: 286150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:21:28,724-Speed 4422.62 samples/sec Loss 0.5699 Epoch: 17 Global Step: 286200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:21:41,317-Speed 4065.99 samples/sec Loss 0.5753 Epoch: 17 Global Step: 286250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:21:52,985-Speed 4388.29 samples/sec Loss 0.5700 Epoch: 17 Global Step: 286300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:22:04,786-Speed 4338.84 samples/sec Loss 0.5773 Epoch: 17 Global Step: 286350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:22:16,549-Speed 4352.66 samples/sec Loss 0.5651 Epoch: 17 Global Step: 286400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:22:29,200-Speed 4047.10 samples/sec Loss 0.5780 Epoch: 17 Global Step: 286450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:22:41,038-Speed 4325.31 samples/sec Loss 0.5707 Epoch: 17 Global Step: 286500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:22:52,848-Speed 4335.42 samples/sec Loss 0.5743 Epoch: 17 Global Step: 286550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:23:04,554-Speed 4374.06 samples/sec Loss 0.5671 Epoch: 17 Global Step: 286600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:23:16,383-Speed 4328.52 samples/sec Loss 0.5677 Epoch: 17 Global Step: 286650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:23:28,127-Speed 4359.88 samples/sec Loss 0.5588 Epoch: 17 Global Step: 286700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:23:39,566-Speed 4476.30 samples/sec Loss 0.5744 Epoch: 17 Global Step: 286750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:23:51,244-Speed 4384.37 samples/sec Loss 0.5876 Epoch: 17 Global Step: 286800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:24:02,907-Speed 4390.30 samples/sec Loss 0.5739 Epoch: 17 Global Step: 286850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:24:14,670-Speed 4352.61 samples/sec Loss 0.5666 Epoch: 17 Global Step: 286900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:24:26,326-Speed 4392.70 samples/sec Loss 0.5635 Epoch: 17 Global Step: 286950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:24:37,949-Speed 4405.33 samples/sec Loss 0.5786 Epoch: 17 Global Step: 287000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:24:49,586-Speed 4399.85 samples/sec Loss 0.5664 Epoch: 17 Global Step: 287050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:25:01,297-Speed 4372.32 samples/sec Loss 0.5824 Epoch: 17 Global Step: 287100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:25:13,279-Speed 4273.12 samples/sec Loss 0.5665 Epoch: 17 Global Step: 287150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:25:25,867-Speed 4067.53 samples/sec Loss 0.5801 Epoch: 17 Global Step: 287200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:25:37,509-Speed 4398.21 samples/sec Loss 0.5709 Epoch: 17 Global Step: 287250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:25:49,269-Speed 4353.56 samples/sec Loss 0.5839 Epoch: 17 Global Step: 287300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:26:00,910-Speed 4398.67 samples/sec Loss 0.5667 Epoch: 17 Global Step: 287350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:26:12,676-Speed 4351.61 samples/sec Loss 0.5690 Epoch: 17 Global Step: 287400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:26:24,412-Speed 4362.85 samples/sec Loss 0.5650 Epoch: 17 Global Step: 287450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:26:36,840-Speed 4119.91 samples/sec Loss 0.5662 Epoch: 17 Global Step: 287500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:26:48,567-Speed 4365.99 samples/sec Loss 0.5701 Epoch: 17 Global Step: 287550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:27:00,364-Speed 4340.51 samples/sec Loss 0.5596 Epoch: 17 Global Step: 287600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:27:12,142-Speed 4347.02 samples/sec Loss 0.5641 Epoch: 17 Global Step: 287650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:27:24,682-Speed 4083.26 samples/sec Loss 0.5776 Epoch: 17 Global Step: 287700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:27:36,329-Speed 4395.84 samples/sec Loss 0.5791 Epoch: 17 Global Step: 287750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:27:48,865-Speed 4084.56 samples/sec Loss 0.5744 Epoch: 17 Global Step: 287800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:28:00,654-Speed 4343.21 samples/sec Loss 0.5784 Epoch: 17 Global Step: 287850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:28:13,237-Speed 4069.09 samples/sec Loss 0.5629 Epoch: 17 Global Step: 287900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:28:24,711-Speed 4462.53 samples/sec Loss 0.5658 Epoch: 17 Global Step: 287950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:28:36,349-Speed 4399.56 samples/sec Loss 0.5707 Epoch: 17 Global Step: 288000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:29:06,592-[lfw][288000]XNorm: 21.875605 Training: 2021-03-15 20:29:06,593-[lfw][288000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 20:29:06,593-[lfw][288000]Accuracy-Highest: 0.99833 Training: 2021-03-15 20:29:41,750-[cfp_fp][288000]XNorm: 21.858439 Training: 2021-03-15 20:29:41,750-[cfp_fp][288000]Accuracy-Flip: 0.99129+-0.00431 Training: 2021-03-15 20:29:41,750-[cfp_fp][288000]Accuracy-Highest: 0.99200 Training: 2021-03-15 20:30:12,064-[agedb_30][288000]XNorm: 22.655605 Training: 2021-03-15 20:30:12,064-[agedb_30][288000]Accuracy-Flip: 0.98383+-0.00687 Training: 2021-03-15 20:30:12,064-[agedb_30][288000]Accuracy-Highest: 0.98383 Training: 2021-03-15 20:30:23,673-Speed 477.06 samples/sec Loss 0.5608 Epoch: 17 Global Step: 288050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:30:35,325-Speed 4394.55 samples/sec Loss 0.5664 Epoch: 17 Global Step: 288100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:30:47,187-Speed 4316.40 samples/sec Loss 0.5569 Epoch: 17 Global Step: 288150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:30:59,016-Speed 4328.24 samples/sec Loss 0.5716 Epoch: 17 Global Step: 288200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:31:10,977-Speed 4281.06 samples/sec Loss 0.5671 Epoch: 17 Global Step: 288250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:31:23,625-Speed 4048.07 samples/sec Loss 0.5811 Epoch: 17 Global Step: 288300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:31:35,420-Speed 4341.02 samples/sec Loss 0.5496 Epoch: 17 Global Step: 288350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:31:47,258-Speed 4325.12 samples/sec Loss 0.5667 Epoch: 17 Global Step: 288400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:31:59,069-Speed 4335.30 samples/sec Loss 0.5707 Epoch: 17 Global Step: 288450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:32:10,874-Speed 4337.09 samples/sec Loss 0.5700 Epoch: 17 Global Step: 288500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:32:22,612-Speed 4362.24 samples/sec Loss 0.5717 Epoch: 17 Global Step: 288550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:32:34,257-Speed 4396.84 samples/sec Loss 0.5816 Epoch: 17 Global Step: 288600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:32:45,903-Speed 4396.47 samples/sec Loss 0.5800 Epoch: 17 Global Step: 288650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:32:57,645-Speed 4360.68 samples/sec Loss 0.5693 Epoch: 17 Global Step: 288700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:33:09,382-Speed 4362.28 samples/sec Loss 0.5662 Epoch: 17 Global Step: 288750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:33:21,256-Speed 4312.37 samples/sec Loss 0.5718 Epoch: 17 Global Step: 288800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:33:33,055-Speed 4339.47 samples/sec Loss 0.5618 Epoch: 17 Global Step: 288850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:33:46,631-Speed 3771.30 samples/sec Loss 0.5625 Epoch: 17 Global Step: 288900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:33:58,710-Speed 4239.14 samples/sec Loss 0.5604 Epoch: 17 Global Step: 288950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:34:10,507-Speed 4340.04 samples/sec Loss 0.5769 Epoch: 17 Global Step: 289000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:34:22,231-Speed 4367.29 samples/sec Loss 0.5582 Epoch: 17 Global Step: 289050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:34:33,969-Speed 4362.10 samples/sec Loss 0.5830 Epoch: 17 Global Step: 289100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:34:45,757-Speed 4343.60 samples/sec Loss 0.5764 Epoch: 17 Global Step: 289150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:34:57,386-Speed 4402.94 samples/sec Loss 0.5561 Epoch: 17 Global Step: 289200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:35:08,985-Speed 4414.30 samples/sec Loss 0.5686 Epoch: 17 Global Step: 289250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:35:20,724-Speed 4362.01 samples/sec Loss 0.5609 Epoch: 17 Global Step: 289300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 20:35:32,587-Speed 4316.05 samples/sec Loss 0.5727 Epoch: 17 Global Step: 289350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:35:44,351-Speed 4352.38 samples/sec Loss 0.5668 Epoch: 17 Global Step: 289400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:35:56,072-Speed 4368.28 samples/sec Loss 0.5815 Epoch: 17 Global Step: 289450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:36:07,686-Speed 4408.92 samples/sec Loss 0.5510 Epoch: 17 Global Step: 289500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:36:19,172-Speed 4457.79 samples/sec Loss 0.5761 Epoch: 17 Global Step: 289550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:36:31,505-Speed 4151.61 samples/sec Loss 0.5612 Epoch: 17 Global Step: 289600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:36:43,152-Speed 4395.99 samples/sec Loss 0.5689 Epoch: 17 Global Step: 289650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:36:54,855-Speed 4375.14 samples/sec Loss 0.5665 Epoch: 17 Global Step: 289700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:37:06,554-Speed 4376.52 samples/sec Loss 0.5719 Epoch: 17 Global Step: 289750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:37:18,385-Speed 4327.82 samples/sec Loss 0.5686 Epoch: 17 Global Step: 289800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:37:30,014-Speed 4403.07 samples/sec Loss 0.5747 Epoch: 17 Global Step: 289850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:37:41,481-Speed 4465.15 samples/sec Loss 0.5678 Epoch: 17 Global Step: 289900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:37:53,050-Speed 4425.82 samples/sec Loss 0.5781 Epoch: 17 Global Step: 289950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:38:04,821-Speed 4349.77 samples/sec Loss 0.5761 Epoch: 17 Global Step: 290000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:38:35,008-[lfw][290000]XNorm: 21.961160 Training: 2021-03-15 20:38:35,008-[lfw][290000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 20:38:35,008-[lfw][290000]Accuracy-Highest: 0.99833 Training: 2021-03-15 20:39:09,973-[cfp_fp][290000]XNorm: 21.866103 Training: 2021-03-15 20:39:09,973-[cfp_fp][290000]Accuracy-Flip: 0.99057+-0.00470 Training: 2021-03-15 20:39:09,973-[cfp_fp][290000]Accuracy-Highest: 0.99200 Training: 2021-03-15 20:39:40,045-[agedb_30][290000]XNorm: 22.672583 Training: 2021-03-15 20:39:40,046-[agedb_30][290000]Accuracy-Flip: 0.98333+-0.00753 Training: 2021-03-15 20:39:40,046-[agedb_30][290000]Accuracy-Highest: 0.98383 Training: 2021-03-15 20:39:52,772-Speed 474.29 samples/sec Loss 0.5673 Epoch: 17 Global Step: 290050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:40:04,552-Speed 4346.37 samples/sec Loss 0.5736 Epoch: 17 Global Step: 290100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:40:16,309-Speed 4355.24 samples/sec Loss 0.5781 Epoch: 17 Global Step: 290150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:40:28,145-Speed 4325.76 samples/sec Loss 0.5738 Epoch: 17 Global Step: 290200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:40:40,737-Speed 4066.24 samples/sec Loss 0.5881 Epoch: 17 Global Step: 290250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:40:52,484-Speed 4358.73 samples/sec Loss 0.5612 Epoch: 17 Global Step: 290300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:41:05,248-Speed 4011.59 samples/sec Loss 0.5619 Epoch: 17 Global Step: 290350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:41:16,918-Speed 4387.61 samples/sec Loss 0.5756 Epoch: 17 Global Step: 290400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:41:28,535-Speed 4407.28 samples/sec Loss 0.5729 Epoch: 17 Global Step: 290450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:41:40,300-Speed 4352.04 samples/sec Loss 0.5715 Epoch: 17 Global Step: 290500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:41:51,845-Speed 4435.24 samples/sec Loss 0.5866 Epoch: 17 Global Step: 290550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:42:04,310-Speed 4107.46 samples/sec Loss 0.5634 Epoch: 17 Global Step: 290600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:42:16,168-Speed 4318.17 samples/sec Loss 0.5684 Epoch: 17 Global Step: 290650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:42:27,857-Speed 4380.32 samples/sec Loss 0.5514 Epoch: 17 Global Step: 290700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:42:39,428-Speed 4424.95 samples/sec Loss 0.5686 Epoch: 17 Global Step: 290750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:42:51,235-Speed 4336.56 samples/sec Loss 0.5691 Epoch: 17 Global Step: 290800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:43:02,911-Speed 4385.07 samples/sec Loss 0.5794 Epoch: 17 Global Step: 290850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:43:15,216-Speed 4161.08 samples/sec Loss 0.5597 Epoch: 17 Global Step: 290900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:43:26,917-Speed 4375.95 samples/sec Loss 0.5737 Epoch: 17 Global Step: 290950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:43:39,061-Speed 4216.17 samples/sec Loss 0.5625 Epoch: 17 Global Step: 291000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:43:50,774-Speed 4371.51 samples/sec Loss 0.5696 Epoch: 17 Global Step: 291050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:44:02,506-Speed 4364.15 samples/sec Loss 0.5697 Epoch: 17 Global Step: 291100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:44:14,151-Speed 4396.95 samples/sec Loss 0.5646 Epoch: 17 Global Step: 291150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:44:25,845-Speed 4378.65 samples/sec Loss 0.5609 Epoch: 17 Global Step: 291200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:44:37,491-Speed 4396.22 samples/sec Loss 0.5671 Epoch: 17 Global Step: 291250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:44:49,204-Speed 4371.65 samples/sec Loss 0.5696 Epoch: 17 Global Step: 291300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:45:01,700-Speed 4097.24 samples/sec Loss 0.5618 Epoch: 17 Global Step: 291350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:45:13,267-Speed 4426.64 samples/sec Loss 0.5712 Epoch: 17 Global Step: 291400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:45:24,923-Speed 4392.83 samples/sec Loss 0.5628 Epoch: 17 Global Step: 291450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:45:36,827-Speed 4301.38 samples/sec Loss 0.5799 Epoch: 17 Global Step: 291500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:45:48,701-Speed 4311.89 samples/sec Loss 0.5778 Epoch: 17 Global Step: 291550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:46:01,403-Speed 4030.96 samples/sec Loss 0.5740 Epoch: 17 Global Step: 291600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:46:13,145-Speed 4360.65 samples/sec Loss 0.5700 Epoch: 17 Global Step: 291650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:46:24,967-Speed 4331.26 samples/sec Loss 0.5587 Epoch: 17 Global Step: 291700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:46:36,542-Speed 4423.50 samples/sec Loss 0.5868 Epoch: 17 Global Step: 291750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:46:48,374-Speed 4327.29 samples/sec Loss 0.5607 Epoch: 17 Global Step: 291800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:47:00,256-Speed 4309.39 samples/sec Loss 0.5771 Epoch: 17 Global Step: 291850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:47:11,941-Speed 4381.78 samples/sec Loss 0.5790 Epoch: 17 Global Step: 291900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:47:23,785-Speed 4322.93 samples/sec Loss 0.5569 Epoch: 17 Global Step: 291950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:47:35,888-Speed 4230.69 samples/sec Loss 0.5617 Epoch: 17 Global Step: 292000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:48:06,140-[lfw][292000]XNorm: 21.959268 Training: 2021-03-15 20:48:06,140-[lfw][292000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 20:48:06,140-[lfw][292000]Accuracy-Highest: 0.99833 Training: 2021-03-15 20:48:41,331-[cfp_fp][292000]XNorm: 21.859277 Training: 2021-03-15 20:48:41,331-[cfp_fp][292000]Accuracy-Flip: 0.99200+-0.00400 Training: 2021-03-15 20:48:41,333-[cfp_fp][292000]Accuracy-Highest: 0.99200 Training: 2021-03-15 20:49:11,553-[agedb_30][292000]XNorm: 22.689525 Training: 2021-03-15 20:49:11,553-[agedb_30][292000]Accuracy-Flip: 0.98333+-0.00749 Training: 2021-03-15 20:49:11,553-[agedb_30][292000]Accuracy-Highest: 0.98383 Training: 2021-03-15 20:49:23,277-Speed 476.77 samples/sec Loss 0.5696 Epoch: 17 Global Step: 292050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:49:35,880-Speed 4062.74 samples/sec Loss 0.5889 Epoch: 17 Global Step: 292100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:49:47,480-Speed 4413.95 samples/sec Loss 0.5729 Epoch: 17 Global Step: 292150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:49:59,194-Speed 4371.03 samples/sec Loss 0.5628 Epoch: 17 Global Step: 292200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:50:11,001-Speed 4336.43 samples/sec Loss 0.5767 Epoch: 17 Global Step: 292250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:50:22,913-Speed 4298.39 samples/sec Loss 0.5811 Epoch: 17 Global Step: 292300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:50:34,721-Speed 4336.35 samples/sec Loss 0.5650 Epoch: 17 Global Step: 292350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:50:46,322-Speed 4413.60 samples/sec Loss 0.5752 Epoch: 17 Global Step: 292400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:50:58,769-Speed 4113.37 samples/sec Loss 0.5567 Epoch: 17 Global Step: 292450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:51:10,579-Speed 4335.49 samples/sec Loss 0.5735 Epoch: 17 Global Step: 292500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:51:22,311-Speed 4364.58 samples/sec Loss 0.5646 Epoch: 17 Global Step: 292550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:51:34,080-Speed 4350.31 samples/sec Loss 0.5823 Epoch: 17 Global Step: 292600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:51:45,598-Speed 4445.64 samples/sec Loss 0.5770 Epoch: 17 Global Step: 292650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:51:57,409-Speed 4335.16 samples/sec Loss 0.5703 Epoch: 17 Global Step: 292700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:52:09,177-Speed 4350.78 samples/sec Loss 0.5735 Epoch: 17 Global Step: 292750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:52:20,805-Speed 4403.50 samples/sec Loss 0.5627 Epoch: 17 Global Step: 292800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:52:32,401-Speed 4415.28 samples/sec Loss 0.5775 Epoch: 17 Global Step: 292850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:52:44,102-Speed 4375.83 samples/sec Loss 0.5697 Epoch: 17 Global Step: 292900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:52:55,607-Speed 4450.64 samples/sec Loss 0.5675 Epoch: 17 Global Step: 292950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:53:08,861-Speed 3863.00 samples/sec Loss 0.5697 Epoch: 17 Global Step: 293000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:53:20,603-Speed 4360.59 samples/sec Loss 0.5651 Epoch: 17 Global Step: 293050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:53:32,165-Speed 4428.55 samples/sec Loss 0.5701 Epoch: 17 Global Step: 293100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:53:43,755-Speed 4417.62 samples/sec Loss 0.5746 Epoch: 17 Global Step: 293150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:53:55,544-Speed 4343.34 samples/sec Loss 0.5679 Epoch: 17 Global Step: 293200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:54:08,025-Speed 4102.43 samples/sec Loss 0.5598 Epoch: 17 Global Step: 293250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:54:19,773-Speed 4358.23 samples/sec Loss 0.5664 Epoch: 17 Global Step: 293300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:54:31,564-Speed 4342.49 samples/sec Loss 0.5680 Epoch: 17 Global Step: 293350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:54:43,368-Speed 4337.73 samples/sec Loss 0.5740 Epoch: 17 Global Step: 293400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:54:55,196-Speed 4328.87 samples/sec Loss 0.5666 Epoch: 17 Global Step: 293450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:55:06,852-Speed 4392.61 samples/sec Loss 0.5713 Epoch: 17 Global Step: 293500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:55:18,565-Speed 4371.64 samples/sec Loss 0.5754 Epoch: 17 Global Step: 293550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:55:31,189-Speed 4055.82 samples/sec Loss 0.5684 Epoch: 17 Global Step: 293600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:55:42,792-Speed 4412.69 samples/sec Loss 0.5768 Epoch: 17 Global Step: 293650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:55:54,391-Speed 4414.42 samples/sec Loss 0.5593 Epoch: 17 Global Step: 293700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:56:06,310-Speed 4295.86 samples/sec Loss 0.5704 Epoch: 17 Global Step: 293750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:56:18,878-Speed 4073.96 samples/sec Loss 0.5636 Epoch: 17 Global Step: 293800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:56:30,517-Speed 4399.50 samples/sec Loss 0.5596 Epoch: 17 Global Step: 293850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:56:42,310-Speed 4341.55 samples/sec Loss 0.5743 Epoch: 17 Global Step: 293900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:56:54,116-Speed 4336.80 samples/sec Loss 0.5691 Epoch: 17 Global Step: 293950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:57:05,904-Speed 4343.78 samples/sec Loss 0.5664 Epoch: 17 Global Step: 294000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:57:36,216-[lfw][294000]XNorm: 21.872684 Training: 2021-03-15 20:57:36,217-[lfw][294000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 20:57:36,217-[lfw][294000]Accuracy-Highest: 0.99833 Training: 2021-03-15 20:58:11,263-[cfp_fp][294000]XNorm: 21.817520 Training: 2021-03-15 20:58:11,263-[cfp_fp][294000]Accuracy-Flip: 0.99157+-0.00426 Training: 2021-03-15 20:58:11,263-[cfp_fp][294000]Accuracy-Highest: 0.99200 Training: 2021-03-15 20:58:41,517-[agedb_30][294000]XNorm: 22.621681 Training: 2021-03-15 20:58:41,517-[agedb_30][294000]Accuracy-Flip: 0.98367+-0.00702 Training: 2021-03-15 20:58:41,517-[agedb_30][294000]Accuracy-Highest: 0.98383 Training: 2021-03-15 20:58:53,100-Speed 477.63 samples/sec Loss 0.5614 Epoch: 17 Global Step: 294050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:59:04,881-Speed 4346.11 samples/sec Loss 0.5608 Epoch: 17 Global Step: 294100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:59:16,778-Speed 4303.81 samples/sec Loss 0.5650 Epoch: 17 Global Step: 294150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:59:28,501-Speed 4367.78 samples/sec Loss 0.5768 Epoch: 17 Global Step: 294200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:59:40,716-Speed 4191.65 samples/sec Loss 0.5899 Epoch: 17 Global Step: 294250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 20:59:53,239-Speed 4088.68 samples/sec Loss 0.5607 Epoch: 17 Global Step: 294300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:00:04,823-Speed 4419.95 samples/sec Loss 0.5611 Epoch: 17 Global Step: 294350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:00:16,650-Speed 4329.45 samples/sec Loss 0.5850 Epoch: 17 Global Step: 294400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:00:28,336-Speed 4381.50 samples/sec Loss 0.5575 Epoch: 17 Global Step: 294450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:00:39,917-Speed 4421.14 samples/sec Loss 0.5687 Epoch: 17 Global Step: 294500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:00:52,629-Speed 4027.92 samples/sec Loss 0.5712 Epoch: 17 Global Step: 294550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:01:04,287-Speed 4391.81 samples/sec Loss 0.5604 Epoch: 17 Global Step: 294600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:01:15,964-Speed 4384.75 samples/sec Loss 0.5601 Epoch: 17 Global Step: 294650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:01:27,797-Speed 4327.26 samples/sec Loss 0.5707 Epoch: 17 Global Step: 294700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:01:39,491-Speed 4378.34 samples/sec Loss 0.5671 Epoch: 17 Global Step: 294750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:01:51,000-Speed 4448.90 samples/sec Loss 0.5698 Epoch: 17 Global Step: 294800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:02:02,613-Speed 4409.19 samples/sec Loss 0.5597 Epoch: 17 Global Step: 294850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:02:14,989-Speed 4137.16 samples/sec Loss 0.5697 Epoch: 17 Global Step: 294900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:02:26,529-Speed 4436.78 samples/sec Loss 0.5700 Epoch: 17 Global Step: 294950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:02:38,246-Speed 4370.09 samples/sec Loss 0.5629 Epoch: 17 Global Step: 295000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:02:50,109-Speed 4315.89 samples/sec Loss 0.5671 Epoch: 17 Global Step: 295050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:03:01,793-Speed 4382.45 samples/sec Loss 0.5730 Epoch: 17 Global Step: 295100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:03:13,668-Speed 4311.56 samples/sec Loss 0.5661 Epoch: 17 Global Step: 295150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:03:25,447-Speed 4347.07 samples/sec Loss 0.5651 Epoch: 17 Global Step: 295200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:03:37,006-Speed 4429.69 samples/sec Loss 0.5656 Epoch: 17 Global Step: 295250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:03:48,921-Speed 4297.07 samples/sec Loss 0.5696 Epoch: 17 Global Step: 295300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:04:00,678-Speed 4354.96 samples/sec Loss 0.5710 Epoch: 17 Global Step: 295350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:04:12,502-Speed 4330.52 samples/sec Loss 0.5741 Epoch: 17 Global Step: 295400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:04:24,118-Speed 4407.84 samples/sec Loss 0.5734 Epoch: 17 Global Step: 295450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:04:35,771-Speed 4393.69 samples/sec Loss 0.5654 Epoch: 17 Global Step: 295500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:04:47,391-Speed 4406.72 samples/sec Loss 0.5579 Epoch: 17 Global Step: 295550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:04:59,063-Speed 4386.68 samples/sec Loss 0.5582 Epoch: 17 Global Step: 295600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:05:10,734-Speed 4387.12 samples/sec Loss 0.5749 Epoch: 17 Global Step: 295650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:05:23,234-Speed 4096.05 samples/sec Loss 0.5699 Epoch: 17 Global Step: 295700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:05:35,923-Speed 4035.21 samples/sec Loss 0.5653 Epoch: 17 Global Step: 295750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:05:47,772-Speed 4321.21 samples/sec Loss 0.5636 Epoch: 17 Global Step: 295800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:05:59,429-Speed 4392.28 samples/sec Loss 0.5694 Epoch: 17 Global Step: 295850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:06:11,154-Speed 4366.93 samples/sec Loss 0.5736 Epoch: 17 Global Step: 295900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:06:22,754-Speed 4414.00 samples/sec Loss 0.5656 Epoch: 17 Global Step: 295950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:06:35,317-Speed 4075.64 samples/sec Loss 0.5694 Epoch: 17 Global Step: 296000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:07:05,443-[lfw][296000]XNorm: 21.933718 Training: 2021-03-15 21:07:05,443-[lfw][296000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 21:07:05,443-[lfw][296000]Accuracy-Highest: 0.99833 Training: 2021-03-15 21:07:40,600-[cfp_fp][296000]XNorm: 21.874434 Training: 2021-03-15 21:07:40,601-[cfp_fp][296000]Accuracy-Flip: 0.99114+-0.00428 Training: 2021-03-15 21:07:40,601-[cfp_fp][296000]Accuracy-Highest: 0.99200 Training: 2021-03-15 21:08:10,824-[agedb_30][296000]XNorm: 22.674727 Training: 2021-03-15 21:08:10,824-[agedb_30][296000]Accuracy-Flip: 0.98217+-0.00746 Training: 2021-03-15 21:08:10,824-[agedb_30][296000]Accuracy-Highest: 0.98383 Training: 2021-03-15 21:08:22,500-Speed 477.69 samples/sec Loss 0.5730 Epoch: 17 Global Step: 296050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:08:34,114-Speed 4408.69 samples/sec Loss 0.5635 Epoch: 17 Global Step: 296100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:08:45,900-Speed 4344.42 samples/sec Loss 0.5672 Epoch: 17 Global Step: 296150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:08:57,533-Speed 4401.17 samples/sec Loss 0.5497 Epoch: 17 Global Step: 296200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:09:11,055-Speed 3786.52 samples/sec Loss 0.5617 Epoch: 17 Global Step: 296250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:09:22,640-Speed 4420.00 samples/sec Loss 0.5834 Epoch: 17 Global Step: 296300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:09:34,364-Speed 4367.07 samples/sec Loss 0.5649 Epoch: 17 Global Step: 296350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:09:46,188-Speed 4330.50 samples/sec Loss 0.5646 Epoch: 17 Global Step: 296400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:09:58,253-Speed 4243.66 samples/sec Loss 0.5601 Epoch: 17 Global Step: 296450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:10:10,020-Speed 4351.43 samples/sec Loss 0.5705 Epoch: 17 Global Step: 296500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:10:21,900-Speed 4309.91 samples/sec Loss 0.5706 Epoch: 17 Global Step: 296550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:10:33,725-Speed 4330.12 samples/sec Loss 0.5680 Epoch: 17 Global Step: 296600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:10:45,643-Speed 4296.02 samples/sec Loss 0.5693 Epoch: 17 Global Step: 296650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:10:57,295-Speed 4394.27 samples/sec Loss 0.5730 Epoch: 17 Global Step: 296700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:11:09,101-Speed 4336.87 samples/sec Loss 0.5790 Epoch: 17 Global Step: 296750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:11:20,949-Speed 4321.73 samples/sec Loss 0.5781 Epoch: 17 Global Step: 296800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:11:32,561-Speed 4409.28 samples/sec Loss 0.5614 Epoch: 17 Global Step: 296850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:11:44,226-Speed 4389.58 samples/sec Loss 0.5735 Epoch: 17 Global Step: 296900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:11:56,632-Speed 4126.99 samples/sec Loss 0.5700 Epoch: 17 Global Step: 296950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:12:08,288-Speed 4392.69 samples/sec Loss 0.5609 Epoch: 17 Global Step: 297000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:12:19,999-Speed 4372.45 samples/sec Loss 0.5730 Epoch: 17 Global Step: 297050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:12:32,824-Speed 3992.11 samples/sec Loss 0.5658 Epoch: 17 Global Step: 297100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:12:44,405-Speed 4421.26 samples/sec Loss 0.5785 Epoch: 17 Global Step: 297150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:12:55,939-Speed 4439.09 samples/sec Loss 0.5759 Epoch: 17 Global Step: 297200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:13:07,820-Speed 4309.81 samples/sec Loss 0.5709 Epoch: 17 Global Step: 297250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:13:19,477-Speed 4392.11 samples/sec Loss 0.5765 Epoch: 17 Global Step: 297300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:13:32,046-Speed 4073.96 samples/sec Loss 0.5750 Epoch: 17 Global Step: 297350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:13:43,976-Speed 4291.74 samples/sec Loss 0.5746 Epoch: 17 Global Step: 297400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:13:55,771-Speed 4340.88 samples/sec Loss 0.5697 Epoch: 17 Global Step: 297450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:14:07,405-Speed 4401.21 samples/sec Loss 0.5624 Epoch: 17 Global Step: 297500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:14:19,030-Speed 4404.22 samples/sec Loss 0.5737 Epoch: 17 Global Step: 297550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:14:30,756-Speed 4366.71 samples/sec Loss 0.5618 Epoch: 17 Global Step: 297600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:14:42,442-Speed 4381.60 samples/sec Loss 0.5829 Epoch: 17 Global Step: 297650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:14:54,201-Speed 4354.17 samples/sec Loss 0.5782 Epoch: 17 Global Step: 297700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:15:05,985-Speed 4345.15 samples/sec Loss 0.5546 Epoch: 17 Global Step: 297750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:15:17,640-Speed 4392.97 samples/sec Loss 0.5771 Epoch: 17 Global Step: 297800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:15:29,262-Speed 4405.73 samples/sec Loss 0.5696 Epoch: 17 Global Step: 297850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:15:40,976-Speed 4371.04 samples/sec Loss 0.5783 Epoch: 17 Global Step: 297900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:15:52,495-Speed 4444.74 samples/sec Loss 0.5620 Epoch: 17 Global Step: 297950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:16:04,158-Speed 4390.19 samples/sec Loss 0.5859 Epoch: 17 Global Step: 298000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:16:34,439-[lfw][298000]XNorm: 21.921282 Training: 2021-03-15 21:16:34,439-[lfw][298000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 21:16:34,439-[lfw][298000]Accuracy-Highest: 0.99833 Training: 2021-03-15 21:17:09,574-[cfp_fp][298000]XNorm: 21.892866 Training: 2021-03-15 21:17:09,574-[cfp_fp][298000]Accuracy-Flip: 0.99043+-0.00461 Training: 2021-03-15 21:17:09,574-[cfp_fp][298000]Accuracy-Highest: 0.99200 Training: 2021-03-15 21:17:39,884-[agedb_30][298000]XNorm: 22.683492 Training: 2021-03-15 21:17:39,884-[agedb_30][298000]Accuracy-Flip: 0.98367+-0.00702 Training: 2021-03-15 21:17:39,884-[agedb_30][298000]Accuracy-Highest: 0.98383 Training: 2021-03-15 21:17:51,672-Speed 476.22 samples/sec Loss 0.5670 Epoch: 17 Global Step: 298050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:18:03,304-Speed 4401.61 samples/sec Loss 0.5621 Epoch: 17 Global Step: 298100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:18:14,981-Speed 4384.88 samples/sec Loss 0.5646 Epoch: 17 Global Step: 298150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:18:26,732-Speed 4357.49 samples/sec Loss 0.5677 Epoch: 17 Global Step: 298200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:18:38,393-Speed 4390.76 samples/sec Loss 0.5668 Epoch: 17 Global Step: 298250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:18:50,070-Speed 4384.73 samples/sec Loss 0.5550 Epoch: 17 Global Step: 298300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:19:01,638-Speed 4426.47 samples/sec Loss 0.5760 Epoch: 17 Global Step: 298350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:19:14,364-Speed 4023.22 samples/sec Loss 0.5717 Epoch: 17 Global Step: 298400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:19:26,125-Speed 4353.61 samples/sec Loss 0.5690 Epoch: 17 Global Step: 298450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:19:37,803-Speed 4384.58 samples/sec Loss 0.5589 Epoch: 17 Global Step: 298500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:19:50,355-Speed 4079.09 samples/sec Loss 0.5718 Epoch: 17 Global Step: 298550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:20:02,023-Speed 4388.12 samples/sec Loss 0.5663 Epoch: 17 Global Step: 298600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:20:13,723-Speed 4376.53 samples/sec Loss 0.5776 Epoch: 17 Global Step: 298650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:20:27,195-Speed 3800.48 samples/sec Loss 0.5627 Epoch: 17 Global Step: 298700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:20:38,917-Speed 4367.89 samples/sec Loss 0.5627 Epoch: 17 Global Step: 298750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:20:50,957-Speed 4252.79 samples/sec Loss 0.5828 Epoch: 17 Global Step: 298800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:21:03,541-Speed 4068.83 samples/sec Loss 0.5740 Epoch: 17 Global Step: 298850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:21:15,243-Speed 4375.37 samples/sec Loss 0.5723 Epoch: 17 Global Step: 298900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:21:26,885-Speed 4398.00 samples/sec Loss 0.5736 Epoch: 17 Global Step: 298950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:21:38,515-Speed 4402.74 samples/sec Loss 0.5729 Epoch: 17 Global Step: 299000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:21:50,140-Speed 4404.61 samples/sec Loss 0.5718 Epoch: 17 Global Step: 299050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:22:01,481-Speed 4514.77 samples/sec Loss 0.5642 Epoch: 17 Global Step: 299100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:22:13,404-Speed 4294.29 samples/sec Loss 0.5693 Epoch: 17 Global Step: 299150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:22:25,156-Speed 4356.70 samples/sec Loss 0.5650 Epoch: 17 Global Step: 299200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:22:36,708-Speed 4432.37 samples/sec Loss 0.5661 Epoch: 17 Global Step: 299250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:22:48,369-Speed 4391.05 samples/sec Loss 0.5614 Epoch: 17 Global Step: 299300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:22:59,995-Speed 4404.08 samples/sec Loss 0.5691 Epoch: 17 Global Step: 299350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:23:11,725-Speed 4364.98 samples/sec Loss 0.5729 Epoch: 17 Global Step: 299400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:23:22,977-Speed 4550.40 samples/sec Loss 0.5746 Epoch: 17 Global Step: 299450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:23:34,564-Speed 4419.09 samples/sec Loss 0.5793 Epoch: 17 Global Step: 299500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:23:47,978-Speed 3816.85 samples/sec Loss 0.5801 Epoch: 17 Global Step: 299550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:23:59,783-Speed 4337.55 samples/sec Loss 0.5614 Epoch: 17 Global Step: 299600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:24:11,535-Speed 4356.89 samples/sec Loss 0.5751 Epoch: 17 Global Step: 299650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:24:23,112-Speed 4422.78 samples/sec Loss 0.5814 Epoch: 17 Global Step: 299700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:24:34,947-Speed 4326.23 samples/sec Loss 0.5666 Epoch: 17 Global Step: 299750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:24:47,542-Speed 4065.10 samples/sec Loss 0.5614 Epoch: 17 Global Step: 299800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:24:59,175-Speed 4401.48 samples/sec Loss 0.5762 Epoch: 17 Global Step: 299850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:25:11,165-Speed 4270.36 samples/sec Loss 0.5693 Epoch: 17 Global Step: 299900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:25:22,872-Speed 4373.60 samples/sec Loss 0.5669 Epoch: 17 Global Step: 299950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:25:34,721-Speed 4321.39 samples/sec Loss 0.5543 Epoch: 17 Global Step: 300000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:26:04,985-[lfw][300000]XNorm: 21.875979 Training: 2021-03-15 21:26:04,986-[lfw][300000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 21:26:04,986-[lfw][300000]Accuracy-Highest: 0.99833 Training: 2021-03-15 21:26:39,859-[cfp_fp][300000]XNorm: 21.847178 Training: 2021-03-15 21:26:39,860-[cfp_fp][300000]Accuracy-Flip: 0.99114+-0.00428 Training: 2021-03-15 21:26:39,860-[cfp_fp][300000]Accuracy-Highest: 0.99200 Training: 2021-03-15 21:27:10,047-[agedb_30][300000]XNorm: 22.636613 Training: 2021-03-15 21:27:10,047-[agedb_30][300000]Accuracy-Flip: 0.98383+-0.00687 Training: 2021-03-15 21:27:10,047-[agedb_30][300000]Accuracy-Highest: 0.98383 Training: 2021-03-15 21:27:21,801-Speed 478.15 samples/sec Loss 0.5663 Epoch: 17 Global Step: 300050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:27:33,518-Speed 4369.64 samples/sec Loss 0.5898 Epoch: 17 Global Step: 300100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:27:45,145-Speed 4403.90 samples/sec Loss 0.5728 Epoch: 17 Global Step: 300150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:27:56,817-Speed 4386.72 samples/sec Loss 0.5646 Epoch: 17 Global Step: 300200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:28:08,692-Speed 4311.76 samples/sec Loss 0.5771 Epoch: 17 Global Step: 300250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:28:20,414-Speed 4368.04 samples/sec Loss 0.5777 Epoch: 17 Global Step: 300300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:28:31,961-Speed 4434.31 samples/sec Loss 0.5743 Epoch: 17 Global Step: 300350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:28:43,586-Speed 4404.23 samples/sec Loss 0.5531 Epoch: 17 Global Step: 300400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:29:08,095-Speed 2089.13 samples/sec Loss 0.5701 Epoch: 18 Global Step: 300450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:29:20,681-Speed 4068.15 samples/sec Loss 0.5646 Epoch: 18 Global Step: 300500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:29:32,794-Speed 4227.06 samples/sec Loss 0.5862 Epoch: 18 Global Step: 300550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:29:44,615-Speed 4331.42 samples/sec Loss 0.5697 Epoch: 18 Global Step: 300600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:29:56,422-Speed 4336.59 samples/sec Loss 0.5651 Epoch: 18 Global Step: 300650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:30:08,326-Speed 4301.34 samples/sec Loss 0.5612 Epoch: 18 Global Step: 300700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:30:20,280-Speed 4283.11 samples/sec Loss 0.5662 Epoch: 18 Global Step: 300750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:30:32,134-Speed 4319.47 samples/sec Loss 0.5753 Epoch: 18 Global Step: 300800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:30:44,047-Speed 4297.84 samples/sec Loss 0.5753 Epoch: 18 Global Step: 300850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:30:56,059-Speed 4262.64 samples/sec Loss 0.5633 Epoch: 18 Global Step: 300900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:31:07,851-Speed 4342.19 samples/sec Loss 0.5743 Epoch: 18 Global Step: 300950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:31:19,673-Speed 4331.29 samples/sec Loss 0.5566 Epoch: 18 Global Step: 301000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:31:31,346-Speed 4386.29 samples/sec Loss 0.5657 Epoch: 18 Global Step: 301050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:31:43,035-Speed 4380.14 samples/sec Loss 0.5683 Epoch: 18 Global Step: 301100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:31:56,765-Speed 3729.42 samples/sec Loss 0.5631 Epoch: 18 Global Step: 301150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:32:08,389-Speed 4404.46 samples/sec Loss 0.5616 Epoch: 18 Global Step: 301200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:32:21,249-Speed 3981.65 samples/sec Loss 0.5602 Epoch: 18 Global Step: 301250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:32:32,953-Speed 4374.75 samples/sec Loss 0.5683 Epoch: 18 Global Step: 301300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:32:44,699-Speed 4358.90 samples/sec Loss 0.5710 Epoch: 18 Global Step: 301350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:32:57,724-Speed 3931.31 samples/sec Loss 0.5632 Epoch: 18 Global Step: 301400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:33:09,464-Speed 4361.05 samples/sec Loss 0.5677 Epoch: 18 Global Step: 301450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:33:21,081-Speed 4407.52 samples/sec Loss 0.5638 Epoch: 18 Global Step: 301500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:33:33,685-Speed 4062.48 samples/sec Loss 0.5701 Epoch: 18 Global Step: 301550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:33:45,475-Speed 4342.72 samples/sec Loss 0.5596 Epoch: 18 Global Step: 301600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:33:57,236-Speed 4353.59 samples/sec Loss 0.5707 Epoch: 18 Global Step: 301650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:34:09,257-Speed 4259.27 samples/sec Loss 0.5664 Epoch: 18 Global Step: 301700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:34:20,945-Speed 4381.08 samples/sec Loss 0.5592 Epoch: 18 Global Step: 301750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:34:32,652-Speed 4373.35 samples/sec Loss 0.5699 Epoch: 18 Global Step: 301800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:34:44,351-Speed 4376.66 samples/sec Loss 0.5702 Epoch: 18 Global Step: 301850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:34:56,708-Speed 4143.68 samples/sec Loss 0.5599 Epoch: 18 Global Step: 301900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:35:08,670-Speed 4280.43 samples/sec Loss 0.5732 Epoch: 18 Global Step: 301950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:35:20,394-Speed 4367.35 samples/sec Loss 0.5661 Epoch: 18 Global Step: 302000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:35:50,733-[lfw][302000]XNorm: 21.936079 Training: 2021-03-15 21:35:50,733-[lfw][302000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 21:35:50,734-[lfw][302000]Accuracy-Highest: 0.99833 Training: 2021-03-15 21:36:25,838-[cfp_fp][302000]XNorm: 21.900437 Training: 2021-03-15 21:36:25,839-[cfp_fp][302000]Accuracy-Flip: 0.99071+-0.00439 Training: 2021-03-15 21:36:25,839-[cfp_fp][302000]Accuracy-Highest: 0.99200 Training: 2021-03-15 21:36:56,188-[agedb_30][302000]XNorm: 22.705552 Training: 2021-03-15 21:36:56,188-[agedb_30][302000]Accuracy-Flip: 0.98300+-0.00706 Training: 2021-03-15 21:36:56,188-[agedb_30][302000]Accuracy-Highest: 0.98383 Training: 2021-03-15 21:37:07,770-Speed 476.83 samples/sec Loss 0.5722 Epoch: 18 Global Step: 302050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 21:37:19,480-Speed 4372.57 samples/sec Loss 0.5656 Epoch: 18 Global Step: 302100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:37:32,103-Speed 4056.13 samples/sec Loss 0.5632 Epoch: 18 Global Step: 302150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:37:43,889-Speed 4344.20 samples/sec Loss 0.5722 Epoch: 18 Global Step: 302200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:37:55,540-Speed 4394.64 samples/sec Loss 0.5569 Epoch: 18 Global Step: 302250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:38:08,026-Speed 4100.93 samples/sec Loss 0.5685 Epoch: 18 Global Step: 302300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:38:19,721-Speed 4377.97 samples/sec Loss 0.5613 Epoch: 18 Global Step: 302350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:38:31,490-Speed 4350.43 samples/sec Loss 0.5568 Epoch: 18 Global Step: 302400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:38:43,343-Speed 4320.09 samples/sec Loss 0.5776 Epoch: 18 Global Step: 302450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:38:55,121-Speed 4347.26 samples/sec Loss 0.5792 Epoch: 18 Global Step: 302500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:39:06,902-Speed 4345.90 samples/sec Loss 0.5663 Epoch: 18 Global Step: 302550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:39:18,677-Speed 4348.51 samples/sec Loss 0.5649 Epoch: 18 Global Step: 302600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:39:30,236-Speed 4429.70 samples/sec Loss 0.5649 Epoch: 18 Global Step: 302650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:39:42,016-Speed 4346.39 samples/sec Loss 0.5681 Epoch: 18 Global Step: 302700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:39:53,732-Speed 4370.20 samples/sec Loss 0.5629 Epoch: 18 Global Step: 302750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:40:05,559-Speed 4329.41 samples/sec Loss 0.5603 Epoch: 18 Global Step: 302800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:40:17,375-Speed 4333.29 samples/sec Loss 0.5627 Epoch: 18 Global Step: 302850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:40:29,032-Speed 4392.14 samples/sec Loss 0.5639 Epoch: 18 Global Step: 302900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:40:40,775-Speed 4360.27 samples/sec Loss 0.5637 Epoch: 18 Global Step: 302950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:40:52,658-Speed 4308.93 samples/sec Loss 0.5533 Epoch: 18 Global Step: 303000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:41:04,355-Speed 4377.49 samples/sec Loss 0.5663 Epoch: 18 Global Step: 303050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:41:16,047-Speed 4379.32 samples/sec Loss 0.5677 Epoch: 18 Global Step: 303100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:41:27,836-Speed 4343.07 samples/sec Loss 0.5594 Epoch: 18 Global Step: 303150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:41:39,745-Speed 4299.20 samples/sec Loss 0.5559 Epoch: 18 Global Step: 303200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:41:51,394-Speed 4395.49 samples/sec Loss 0.5557 Epoch: 18 Global Step: 303250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:42:03,040-Speed 4396.76 samples/sec Loss 0.5753 Epoch: 18 Global Step: 303300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:42:14,870-Speed 4328.02 samples/sec Loss 0.5569 Epoch: 18 Global Step: 303350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:42:26,682-Speed 4334.63 samples/sec Loss 0.5596 Epoch: 18 Global Step: 303400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:42:38,442-Speed 4354.17 samples/sec Loss 0.5685 Epoch: 18 Global Step: 303450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:42:50,313-Speed 4313.09 samples/sec Loss 0.5652 Epoch: 18 Global Step: 303500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:43:02,242-Speed 4292.22 samples/sec Loss 0.5573 Epoch: 18 Global Step: 303550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:43:13,997-Speed 4355.87 samples/sec Loss 0.5700 Epoch: 18 Global Step: 303600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:43:26,691-Speed 4033.36 samples/sec Loss 0.5524 Epoch: 18 Global Step: 303650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:43:38,646-Speed 4282.84 samples/sec Loss 0.5765 Epoch: 18 Global Step: 303700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:43:50,323-Speed 4384.85 samples/sec Loss 0.5806 Epoch: 18 Global Step: 303750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:44:02,217-Speed 4305.07 samples/sec Loss 0.5793 Epoch: 18 Global Step: 303800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:44:14,945-Speed 4022.62 samples/sec Loss 0.5710 Epoch: 18 Global Step: 303850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:44:26,820-Speed 4311.81 samples/sec Loss 0.5679 Epoch: 18 Global Step: 303900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:44:39,634-Speed 3995.70 samples/sec Loss 0.5555 Epoch: 18 Global Step: 303950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:44:51,570-Speed 4289.78 samples/sec Loss 0.5712 Epoch: 18 Global Step: 304000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:45:21,822-[lfw][304000]XNorm: 21.932844 Training: 2021-03-15 21:45:21,823-[lfw][304000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 21:45:21,823-[lfw][304000]Accuracy-Highest: 0.99833 Training: 2021-03-15 21:45:56,912-[cfp_fp][304000]XNorm: 21.880623 Training: 2021-03-15 21:45:56,912-[cfp_fp][304000]Accuracy-Flip: 0.99100+-0.00443 Training: 2021-03-15 21:45:56,912-[cfp_fp][304000]Accuracy-Highest: 0.99200 Training: 2021-03-15 21:46:27,224-[agedb_30][304000]XNorm: 22.692628 Training: 2021-03-15 21:46:27,224-[agedb_30][304000]Accuracy-Flip: 0.98267+-0.00684 Training: 2021-03-15 21:46:27,224-[agedb_30][304000]Accuracy-Highest: 0.98383 Training: 2021-03-15 21:46:38,895-Speed 477.06 samples/sec Loss 0.5616 Epoch: 18 Global Step: 304050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:46:50,677-Speed 4345.62 samples/sec Loss 0.5575 Epoch: 18 Global Step: 304100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:47:03,192-Speed 4091.45 samples/sec Loss 0.5690 Epoch: 18 Global Step: 304150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:47:14,825-Speed 4401.30 samples/sec Loss 0.5614 Epoch: 18 Global Step: 304200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:47:27,338-Speed 4091.88 samples/sec Loss 0.5564 Epoch: 18 Global Step: 304250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:47:39,020-Speed 4382.99 samples/sec Loss 0.5644 Epoch: 18 Global Step: 304300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:47:51,926-Speed 3967.29 samples/sec Loss 0.5662 Epoch: 18 Global Step: 304350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:48:03,591-Speed 4389.54 samples/sec Loss 0.5644 Epoch: 18 Global Step: 304400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:48:15,195-Speed 4412.41 samples/sec Loss 0.5609 Epoch: 18 Global Step: 304450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:48:27,116-Speed 4295.30 samples/sec Loss 0.5667 Epoch: 18 Global Step: 304500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:48:38,969-Speed 4319.44 samples/sec Loss 0.5686 Epoch: 18 Global Step: 304550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:48:50,702-Speed 4364.03 samples/sec Loss 0.5768 Epoch: 18 Global Step: 304600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:49:02,272-Speed 4425.44 samples/sec Loss 0.5650 Epoch: 18 Global Step: 304650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:49:13,954-Speed 4382.85 samples/sec Loss 0.5634 Epoch: 18 Global Step: 304700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:49:25,552-Speed 4414.96 samples/sec Loss 0.5686 Epoch: 18 Global Step: 304750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:49:38,848-Speed 3850.80 samples/sec Loss 0.5699 Epoch: 18 Global Step: 304800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:49:50,574-Speed 4366.52 samples/sec Loss 0.5763 Epoch: 18 Global Step: 304850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:50:02,326-Speed 4356.94 samples/sec Loss 0.5710 Epoch: 18 Global Step: 304900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:50:14,120-Speed 4341.32 samples/sec Loss 0.5654 Epoch: 18 Global Step: 304950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:50:26,014-Speed 4304.75 samples/sec Loss 0.5785 Epoch: 18 Global Step: 305000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:50:37,886-Speed 4313.05 samples/sec Loss 0.5643 Epoch: 18 Global Step: 305050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:50:49,917-Speed 4255.69 samples/sec Loss 0.5676 Epoch: 18 Global Step: 305100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:51:01,651-Speed 4363.82 samples/sec Loss 0.5777 Epoch: 18 Global Step: 305150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:51:13,297-Speed 4396.25 samples/sec Loss 0.5651 Epoch: 18 Global Step: 305200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:51:24,777-Speed 4460.37 samples/sec Loss 0.5616 Epoch: 18 Global Step: 305250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:51:36,340-Speed 4428.04 samples/sec Loss 0.5625 Epoch: 18 Global Step: 305300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:51:48,081-Speed 4360.93 samples/sec Loss 0.5467 Epoch: 18 Global Step: 305350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:51:59,730-Speed 4395.30 samples/sec Loss 0.5687 Epoch: 18 Global Step: 305400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:52:11,450-Speed 4368.70 samples/sec Loss 0.5694 Epoch: 18 Global Step: 305450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:52:23,293-Speed 4323.58 samples/sec Loss 0.5651 Epoch: 18 Global Step: 305500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:52:35,153-Speed 4317.34 samples/sec Loss 0.5650 Epoch: 18 Global Step: 305550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:52:46,817-Speed 4389.56 samples/sec Loss 0.5729 Epoch: 18 Global Step: 305600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:52:58,630-Speed 4334.34 samples/sec Loss 0.5671 Epoch: 18 Global Step: 305650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:53:10,356-Speed 4366.51 samples/sec Loss 0.5648 Epoch: 18 Global Step: 305700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:53:21,964-Speed 4411.02 samples/sec Loss 0.5679 Epoch: 18 Global Step: 305750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:53:33,676-Speed 4371.56 samples/sec Loss 0.5762 Epoch: 18 Global Step: 305800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:53:45,451-Speed 4348.39 samples/sec Loss 0.5601 Epoch: 18 Global Step: 305850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:53:57,213-Speed 4353.43 samples/sec Loss 0.5616 Epoch: 18 Global Step: 305900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:54:08,831-Speed 4407.06 samples/sec Loss 0.5720 Epoch: 18 Global Step: 305950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:54:20,487-Speed 4392.88 samples/sec Loss 0.5658 Epoch: 18 Global Step: 306000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:54:50,690-[lfw][306000]XNorm: 21.950462 Training: 2021-03-15 21:54:50,690-[lfw][306000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 21:54:50,690-[lfw][306000]Accuracy-Highest: 0.99833 Training: 2021-03-15 21:55:25,839-[cfp_fp][306000]XNorm: 21.846069 Training: 2021-03-15 21:55:25,840-[cfp_fp][306000]Accuracy-Flip: 0.99086+-0.00434 Training: 2021-03-15 21:55:25,840-[cfp_fp][306000]Accuracy-Highest: 0.99200 Training: 2021-03-15 21:55:56,254-[agedb_30][306000]XNorm: 22.677579 Training: 2021-03-15 21:55:56,254-[agedb_30][306000]Accuracy-Flip: 0.98383+-0.00687 Training: 2021-03-15 21:55:56,254-[agedb_30][306000]Accuracy-Highest: 0.98383 Training: 2021-03-15 21:56:07,960-Speed 476.40 samples/sec Loss 0.5717 Epoch: 18 Global Step: 306050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:56:19,740-Speed 4346.26 samples/sec Loss 0.5666 Epoch: 18 Global Step: 306100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:56:31,583-Speed 4323.55 samples/sec Loss 0.5648 Epoch: 18 Global Step: 306150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:56:44,244-Speed 4043.98 samples/sec Loss 0.5691 Epoch: 18 Global Step: 306200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:56:55,951-Speed 4373.62 samples/sec Loss 0.5549 Epoch: 18 Global Step: 306250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:57:07,760-Speed 4336.01 samples/sec Loss 0.5686 Epoch: 18 Global Step: 306300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:57:19,733-Speed 4276.38 samples/sec Loss 0.5615 Epoch: 18 Global Step: 306350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:57:31,411-Speed 4384.64 samples/sec Loss 0.5783 Epoch: 18 Global Step: 306400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:57:43,193-Speed 4345.62 samples/sec Loss 0.5705 Epoch: 18 Global Step: 306450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:57:55,786-Speed 4065.89 samples/sec Loss 0.5551 Epoch: 18 Global Step: 306500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:58:07,566-Speed 4346.47 samples/sec Loss 0.5584 Epoch: 18 Global Step: 306550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:58:19,250-Speed 4382.32 samples/sec Loss 0.5673 Epoch: 18 Global Step: 306600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:58:31,940-Speed 4034.96 samples/sec Loss 0.5642 Epoch: 18 Global Step: 306650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:58:43,738-Speed 4339.77 samples/sec Loss 0.5617 Epoch: 18 Global Step: 306700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:58:55,456-Speed 4369.52 samples/sec Loss 0.5722 Epoch: 18 Global Step: 306750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:59:08,787-Speed 3840.71 samples/sec Loss 0.5732 Epoch: 18 Global Step: 306800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:59:20,405-Speed 4407.08 samples/sec Loss 0.5554 Epoch: 18 Global Step: 306850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:59:32,199-Speed 4341.67 samples/sec Loss 0.5761 Epoch: 18 Global Step: 306900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:59:43,999-Speed 4338.97 samples/sec Loss 0.5724 Epoch: 18 Global Step: 306950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 21:59:56,781-Speed 4005.69 samples/sec Loss 0.5726 Epoch: 18 Global Step: 307000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:00:08,550-Speed 4350.53 samples/sec Loss 0.5714 Epoch: 18 Global Step: 307050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:00:20,338-Speed 4343.82 samples/sec Loss 0.5631 Epoch: 18 Global Step: 307100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:00:32,810-Speed 4105.14 samples/sec Loss 0.5588 Epoch: 18 Global Step: 307150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:00:44,561-Speed 4357.46 samples/sec Loss 0.5736 Epoch: 18 Global Step: 307200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:00:56,484-Speed 4294.22 samples/sec Loss 0.5679 Epoch: 18 Global Step: 307250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:01:08,024-Speed 4436.97 samples/sec Loss 0.5709 Epoch: 18 Global Step: 307300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:01:19,819-Speed 4341.02 samples/sec Loss 0.5741 Epoch: 18 Global Step: 307350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:01:31,539-Speed 4368.88 samples/sec Loss 0.5641 Epoch: 18 Global Step: 307400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:01:44,107-Speed 4073.94 samples/sec Loss 0.5594 Epoch: 18 Global Step: 307450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:01:55,857-Speed 4357.53 samples/sec Loss 0.5586 Epoch: 18 Global Step: 307500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:02:07,610-Speed 4356.40 samples/sec Loss 0.5714 Epoch: 18 Global Step: 307550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:02:19,463-Speed 4319.90 samples/sec Loss 0.5757 Epoch: 18 Global Step: 307600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:02:31,022-Speed 4429.67 samples/sec Loss 0.5671 Epoch: 18 Global Step: 307650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:02:42,754-Speed 4364.21 samples/sec Loss 0.5668 Epoch: 18 Global Step: 307700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:02:54,441-Speed 4381.40 samples/sec Loss 0.5588 Epoch: 18 Global Step: 307750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:03:06,231-Speed 4342.81 samples/sec Loss 0.5602 Epoch: 18 Global Step: 307800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:03:17,973-Speed 4360.63 samples/sec Loss 0.5639 Epoch: 18 Global Step: 307850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:03:29,630-Speed 4392.36 samples/sec Loss 0.5677 Epoch: 18 Global Step: 307900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:03:41,337-Speed 4373.56 samples/sec Loss 0.5743 Epoch: 18 Global Step: 307950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:03:53,220-Speed 4308.87 samples/sec Loss 0.5730 Epoch: 18 Global Step: 308000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:04:23,334-[lfw][308000]XNorm: 21.867518 Training: 2021-03-15 22:04:23,334-[lfw][308000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 22:04:23,334-[lfw][308000]Accuracy-Highest: 0.99833 Training: 2021-03-15 22:04:58,348-[cfp_fp][308000]XNorm: 21.839351 Training: 2021-03-15 22:04:58,349-[cfp_fp][308000]Accuracy-Flip: 0.99157+-0.00396 Training: 2021-03-15 22:04:58,349-[cfp_fp][308000]Accuracy-Highest: 0.99200 Training: 2021-03-15 22:05:28,515-[agedb_30][308000]XNorm: 22.646200 Training: 2021-03-15 22:05:28,516-[agedb_30][308000]Accuracy-Flip: 0.98333+-0.00707 Training: 2021-03-15 22:05:28,516-[agedb_30][308000]Accuracy-Highest: 0.98383 Training: 2021-03-15 22:05:40,216-Speed 478.52 samples/sec Loss 0.5626 Epoch: 18 Global Step: 308050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:05:51,988-Speed 4349.52 samples/sec Loss 0.5635 Epoch: 18 Global Step: 308100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:06:03,750-Speed 4353.25 samples/sec Loss 0.5620 Epoch: 18 Global Step: 308150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:06:15,456-Speed 4373.84 samples/sec Loss 0.5668 Epoch: 18 Global Step: 308200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:06:27,359-Speed 4301.46 samples/sec Loss 0.5704 Epoch: 18 Global Step: 308250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:06:39,025-Speed 4389.06 samples/sec Loss 0.5617 Epoch: 18 Global Step: 308300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:06:50,966-Speed 4287.83 samples/sec Loss 0.5630 Epoch: 18 Global Step: 308350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:07:02,639-Speed 4386.63 samples/sec Loss 0.5658 Epoch: 18 Global Step: 308400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:07:14,171-Speed 4439.87 samples/sec Loss 0.5742 Epoch: 18 Global Step: 308450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:07:25,924-Speed 4356.51 samples/sec Loss 0.5643 Epoch: 18 Global Step: 308500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:07:37,385-Speed 4467.44 samples/sec Loss 0.5646 Epoch: 18 Global Step: 308550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:07:50,366-Speed 3944.61 samples/sec Loss 0.5726 Epoch: 18 Global Step: 308600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:08:01,989-Speed 4405.22 samples/sec Loss 0.5665 Epoch: 18 Global Step: 308650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:08:13,785-Speed 4340.45 samples/sec Loss 0.5624 Epoch: 18 Global Step: 308700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:08:25,276-Speed 4455.95 samples/sec Loss 0.5527 Epoch: 18 Global Step: 308750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:08:36,861-Speed 4419.63 samples/sec Loss 0.5617 Epoch: 18 Global Step: 308800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:08:48,690-Speed 4328.34 samples/sec Loss 0.5606 Epoch: 18 Global Step: 308850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:09:00,473-Speed 4345.48 samples/sec Loss 0.5703 Epoch: 18 Global Step: 308900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:09:12,162-Speed 4380.42 samples/sec Loss 0.5544 Epoch: 18 Global Step: 308950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:09:24,051-Speed 4306.70 samples/sec Loss 0.5633 Epoch: 18 Global Step: 309000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:09:35,653-Speed 4413.19 samples/sec Loss 0.5727 Epoch: 18 Global Step: 309050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:09:46,990-Speed 4516.24 samples/sec Loss 0.5621 Epoch: 18 Global Step: 309100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:09:59,446-Speed 4110.72 samples/sec Loss 0.5801 Epoch: 18 Global Step: 309150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:10:11,016-Speed 4425.63 samples/sec Loss 0.5583 Epoch: 18 Global Step: 309200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:10:22,733-Speed 4369.81 samples/sec Loss 0.5786 Epoch: 18 Global Step: 309250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:10:36,249-Speed 3788.26 samples/sec Loss 0.5657 Epoch: 18 Global Step: 309300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:10:48,070-Speed 4331.29 samples/sec Loss 0.5806 Epoch: 18 Global Step: 309350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:10:59,750-Speed 4383.93 samples/sec Loss 0.5753 Epoch: 18 Global Step: 309400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:11:11,268-Speed 4445.37 samples/sec Loss 0.5598 Epoch: 18 Global Step: 309450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:11:23,753-Speed 4101.06 samples/sec Loss 0.5628 Epoch: 18 Global Step: 309500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:11:35,607-Speed 4319.29 samples/sec Loss 0.5610 Epoch: 18 Global Step: 309550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:11:47,305-Speed 4376.94 samples/sec Loss 0.5686 Epoch: 18 Global Step: 309600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:11:59,957-Speed 4047.09 samples/sec Loss 0.5664 Epoch: 18 Global Step: 309650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:12:12,720-Speed 4011.56 samples/sec Loss 0.5616 Epoch: 18 Global Step: 309700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:12:24,455-Speed 4363.41 samples/sec Loss 0.5631 Epoch: 18 Global Step: 309750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:12:36,009-Speed 4431.58 samples/sec Loss 0.5673 Epoch: 18 Global Step: 309800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:12:47,637-Speed 4403.14 samples/sec Loss 0.5623 Epoch: 18 Global Step: 309850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:12:59,342-Speed 4374.38 samples/sec Loss 0.5532 Epoch: 18 Global Step: 309900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:13:10,985-Speed 4397.67 samples/sec Loss 0.5674 Epoch: 18 Global Step: 309950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:13:22,621-Speed 4400.32 samples/sec Loss 0.5618 Epoch: 18 Global Step: 310000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:13:52,933-[lfw][310000]XNorm: 21.904289 Training: 2021-03-15 22:13:52,933-[lfw][310000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 22:13:52,933-[lfw][310000]Accuracy-Highest: 0.99833 Training: 2021-03-15 22:14:27,986-[cfp_fp][310000]XNorm: 21.872727 Training: 2021-03-15 22:14:27,987-[cfp_fp][310000]Accuracy-Flip: 0.99114+-0.00423 Training: 2021-03-15 22:14:27,987-[cfp_fp][310000]Accuracy-Highest: 0.99200 Training: 2021-03-15 22:14:58,264-[agedb_30][310000]XNorm: 22.660017 Training: 2021-03-15 22:14:58,265-[agedb_30][310000]Accuracy-Flip: 0.98383+-0.00687 Training: 2021-03-15 22:14:58,265-[agedb_30][310000]Accuracy-Highest: 0.98383 Training: 2021-03-15 22:15:09,932-Speed 477.12 samples/sec Loss 0.5620 Epoch: 18 Global Step: 310050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:15:22,426-Speed 4098.16 samples/sec Loss 0.5717 Epoch: 18 Global Step: 310100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:15:33,995-Speed 4425.81 samples/sec Loss 0.5782 Epoch: 18 Global Step: 310150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:15:45,697-Speed 4375.58 samples/sec Loss 0.5578 Epoch: 18 Global Step: 310200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:15:57,413-Speed 4370.29 samples/sec Loss 0.5720 Epoch: 18 Global Step: 310250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:16:09,226-Speed 4334.11 samples/sec Loss 0.5825 Epoch: 18 Global Step: 310300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:16:20,939-Speed 4371.62 samples/sec Loss 0.5546 Epoch: 18 Global Step: 310350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:16:32,545-Speed 4411.63 samples/sec Loss 0.5514 Epoch: 18 Global Step: 310400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:16:44,337-Speed 4342.05 samples/sec Loss 0.5653 Epoch: 18 Global Step: 310450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:16:55,960-Speed 4405.15 samples/sec Loss 0.5680 Epoch: 18 Global Step: 310500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:17:07,822-Speed 4316.57 samples/sec Loss 0.5798 Epoch: 18 Global Step: 310550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:17:19,480-Speed 4391.90 samples/sec Loss 0.5806 Epoch: 18 Global Step: 310600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:17:31,361-Speed 4309.55 samples/sec Loss 0.5710 Epoch: 18 Global Step: 310650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:17:43,153-Speed 4342.13 samples/sec Loss 0.5731 Epoch: 18 Global Step: 310700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:17:54,917-Speed 4352.43 samples/sec Loss 0.5655 Epoch: 18 Global Step: 310750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:18:06,684-Speed 4351.57 samples/sec Loss 0.5730 Epoch: 18 Global Step: 310800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:18:18,376-Speed 4379.26 samples/sec Loss 0.5751 Epoch: 18 Global Step: 310850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:18:30,162-Speed 4344.24 samples/sec Loss 0.5706 Epoch: 18 Global Step: 310900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:18:41,663-Speed 4451.83 samples/sec Loss 0.5794 Epoch: 18 Global Step: 310950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:18:53,302-Speed 4399.07 samples/sec Loss 0.5678 Epoch: 18 Global Step: 311000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:19:04,863-Speed 4429.14 samples/sec Loss 0.5656 Epoch: 18 Global Step: 311050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:19:17,459-Speed 4064.72 samples/sec Loss 0.5750 Epoch: 18 Global Step: 311100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:19:29,101-Speed 4398.27 samples/sec Loss 0.5731 Epoch: 18 Global Step: 311150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:19:40,833-Speed 4364.32 samples/sec Loss 0.5672 Epoch: 18 Global Step: 311200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:19:52,636-Speed 4338.09 samples/sec Loss 0.5634 Epoch: 18 Global Step: 311250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:20:04,319-Speed 4382.27 samples/sec Loss 0.5599 Epoch: 18 Global Step: 311300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:20:15,926-Speed 4411.55 samples/sec Loss 0.5654 Epoch: 18 Global Step: 311350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:20:27,662-Speed 4362.58 samples/sec Loss 0.5734 Epoch: 18 Global Step: 311400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:20:39,596-Speed 4290.61 samples/sec Loss 0.5670 Epoch: 18 Global Step: 311450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:20:51,338-Speed 4360.42 samples/sec Loss 0.5639 Epoch: 18 Global Step: 311500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:21:03,063-Speed 4366.87 samples/sec Loss 0.5620 Epoch: 18 Global Step: 311550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:21:14,632-Speed 4426.14 samples/sec Loss 0.5723 Epoch: 18 Global Step: 311600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:21:26,428-Speed 4340.31 samples/sec Loss 0.5618 Epoch: 18 Global Step: 311650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:21:38,239-Speed 4335.11 samples/sec Loss 0.5764 Epoch: 18 Global Step: 311700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:21:50,689-Speed 4112.79 samples/sec Loss 0.5656 Epoch: 18 Global Step: 311750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:22:02,473-Speed 4344.87 samples/sec Loss 0.5677 Epoch: 18 Global Step: 311800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:22:15,171-Speed 4032.38 samples/sec Loss 0.5592 Epoch: 18 Global Step: 311850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:22:26,917-Speed 4359.02 samples/sec Loss 0.5616 Epoch: 18 Global Step: 311900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:22:38,526-Speed 4410.81 samples/sec Loss 0.5639 Epoch: 18 Global Step: 311950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:22:50,190-Speed 4389.62 samples/sec Loss 0.5696 Epoch: 18 Global Step: 312000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:23:20,391-[lfw][312000]XNorm: 21.985398 Training: 2021-03-15 22:23:20,391-[lfw][312000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 22:23:20,391-[lfw][312000]Accuracy-Highest: 0.99833 Training: 2021-03-15 22:23:55,342-[cfp_fp][312000]XNorm: 21.898330 Training: 2021-03-15 22:23:55,343-[cfp_fp][312000]Accuracy-Flip: 0.99143+-0.00409 Training: 2021-03-15 22:23:55,343-[cfp_fp][312000]Accuracy-Highest: 0.99200 Training: 2021-03-15 22:24:25,595-[agedb_30][312000]XNorm: 22.710942 Training: 2021-03-15 22:24:25,595-[agedb_30][312000]Accuracy-Flip: 0.98200+-0.00714 Training: 2021-03-15 22:24:25,596-[agedb_30][312000]Accuracy-Highest: 0.98383 Training: 2021-03-15 22:24:39,556-Speed 468.16 samples/sec Loss 0.5689 Epoch: 18 Global Step: 312050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:24:51,239-Speed 4382.49 samples/sec Loss 0.5620 Epoch: 18 Global Step: 312100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:25:02,808-Speed 4425.83 samples/sec Loss 0.5511 Epoch: 18 Global Step: 312150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:25:15,513-Speed 4030.12 samples/sec Loss 0.5671 Epoch: 18 Global Step: 312200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:25:27,183-Speed 4387.41 samples/sec Loss 0.5600 Epoch: 18 Global Step: 312250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:25:39,734-Speed 4079.49 samples/sec Loss 0.5680 Epoch: 18 Global Step: 312300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:25:51,451-Speed 4369.87 samples/sec Loss 0.5663 Epoch: 18 Global Step: 312350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:26:03,078-Speed 4403.70 samples/sec Loss 0.5729 Epoch: 18 Global Step: 312400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:26:14,793-Speed 4370.99 samples/sec Loss 0.5594 Epoch: 18 Global Step: 312450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:26:26,459-Speed 4388.90 samples/sec Loss 0.5651 Epoch: 18 Global Step: 312500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:26:38,150-Speed 4379.48 samples/sec Loss 0.5753 Epoch: 18 Global Step: 312550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:26:49,974-Speed 4330.51 samples/sec Loss 0.5624 Epoch: 18 Global Step: 312600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:27:01,637-Speed 4390.08 samples/sec Loss 0.5597 Epoch: 18 Global Step: 312650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:27:13,272-Speed 4400.54 samples/sec Loss 0.5638 Epoch: 18 Global Step: 312700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:27:25,760-Speed 4100.08 samples/sec Loss 0.5709 Epoch: 18 Global Step: 312750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:27:37,415-Speed 4393.23 samples/sec Loss 0.5649 Epoch: 18 Global Step: 312800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:27:48,992-Speed 4422.67 samples/sec Loss 0.5684 Epoch: 18 Global Step: 312850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:28:00,701-Speed 4372.89 samples/sec Loss 0.5578 Epoch: 18 Global Step: 312900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:28:12,614-Speed 4297.87 samples/sec Loss 0.5692 Epoch: 18 Global Step: 312950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:28:24,367-Speed 4356.46 samples/sec Loss 0.5669 Epoch: 18 Global Step: 313000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:28:36,009-Speed 4398.28 samples/sec Loss 0.5623 Epoch: 18 Global Step: 313050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:28:47,699-Speed 4379.84 samples/sec Loss 0.5774 Epoch: 18 Global Step: 313100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:28:59,310-Speed 4409.92 samples/sec Loss 0.5550 Epoch: 18 Global Step: 313150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:29:11,115-Speed 4337.26 samples/sec Loss 0.5600 Epoch: 18 Global Step: 313200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:29:22,902-Speed 4344.01 samples/sec Loss 0.5683 Epoch: 18 Global Step: 313250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:29:34,762-Speed 4317.29 samples/sec Loss 0.5757 Epoch: 18 Global Step: 313300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:29:46,523-Speed 4353.22 samples/sec Loss 0.5670 Epoch: 18 Global Step: 313350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:29:58,195-Speed 4387.08 samples/sec Loss 0.5590 Epoch: 18 Global Step: 313400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:30:09,999-Speed 4337.44 samples/sec Loss 0.5705 Epoch: 18 Global Step: 313450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:30:21,707-Speed 4373.46 samples/sec Loss 0.5659 Epoch: 18 Global Step: 313500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:30:33,623-Speed 4296.95 samples/sec Loss 0.5516 Epoch: 18 Global Step: 313550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:30:46,064-Speed 4115.35 samples/sec Loss 0.5731 Epoch: 18 Global Step: 313600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:30:57,765-Speed 4376.00 samples/sec Loss 0.5760 Epoch: 18 Global Step: 313650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:31:09,551-Speed 4344.17 samples/sec Loss 0.5708 Epoch: 18 Global Step: 313700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:31:21,075-Speed 4443.05 samples/sec Loss 0.5661 Epoch: 18 Global Step: 313750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:31:32,880-Speed 4337.35 samples/sec Loss 0.5747 Epoch: 18 Global Step: 313800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:31:44,541-Speed 4390.91 samples/sec Loss 0.5752 Epoch: 18 Global Step: 313850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:31:56,147-Speed 4411.61 samples/sec Loss 0.5689 Epoch: 18 Global Step: 313900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:32:07,828-Speed 4383.38 samples/sec Loss 0.5574 Epoch: 18 Global Step: 313950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:32:19,484-Speed 4392.87 samples/sec Loss 0.5721 Epoch: 18 Global Step: 314000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:32:49,821-[lfw][314000]XNorm: 21.803750 Training: 2021-03-15 22:32:49,821-[lfw][314000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 22:32:49,821-[lfw][314000]Accuracy-Highest: 0.99833 Training: 2021-03-15 22:33:24,936-[cfp_fp][314000]XNorm: 21.783921 Training: 2021-03-15 22:33:24,936-[cfp_fp][314000]Accuracy-Flip: 0.99071+-0.00425 Training: 2021-03-15 22:33:24,936-[cfp_fp][314000]Accuracy-Highest: 0.99200 Training: 2021-03-15 22:33:55,203-[agedb_30][314000]XNorm: 22.567487 Training: 2021-03-15 22:33:55,203-[agedb_30][314000]Accuracy-Flip: 0.98250+-0.00761 Training: 2021-03-15 22:33:55,203-[agedb_30][314000]Accuracy-Highest: 0.98383 Training: 2021-03-15 22:34:06,982-Speed 476.29 samples/sec Loss 0.5701 Epoch: 18 Global Step: 314050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:34:18,804-Speed 4331.22 samples/sec Loss 0.5749 Epoch: 18 Global Step: 314100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:34:30,708-Speed 4301.31 samples/sec Loss 0.5637 Epoch: 18 Global Step: 314150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:34:43,163-Speed 4110.66 samples/sec Loss 0.5713 Epoch: 18 Global Step: 314200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:34:55,014-Speed 4320.48 samples/sec Loss 0.5690 Epoch: 18 Global Step: 314250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:35:06,668-Speed 4393.51 samples/sec Loss 0.5710 Epoch: 18 Global Step: 314300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:35:18,353-Speed 4382.18 samples/sec Loss 0.5807 Epoch: 18 Global Step: 314350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:35:30,005-Speed 4393.95 samples/sec Loss 0.5762 Epoch: 18 Global Step: 314400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:35:42,664-Speed 4044.74 samples/sec Loss 0.5705 Epoch: 18 Global Step: 314450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:35:54,423-Speed 4354.32 samples/sec Loss 0.5760 Epoch: 18 Global Step: 314500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:36:07,108-Speed 4036.59 samples/sec Loss 0.5714 Epoch: 18 Global Step: 314550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:36:18,848-Speed 4361.20 samples/sec Loss 0.5843 Epoch: 18 Global Step: 314600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:36:31,103-Speed 4178.17 samples/sec Loss 0.5531 Epoch: 18 Global Step: 314650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:36:42,779-Speed 4384.95 samples/sec Loss 0.5642 Epoch: 18 Global Step: 314700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:36:54,589-Speed 4335.43 samples/sec Loss 0.5742 Epoch: 18 Global Step: 314750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 22:37:07,402-Speed 3996.11 samples/sec Loss 0.5748 Epoch: 18 Global Step: 314800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:37:19,057-Speed 4393.07 samples/sec Loss 0.5769 Epoch: 18 Global Step: 314850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:37:30,703-Speed 4396.57 samples/sec Loss 0.5684 Epoch: 18 Global Step: 314900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:37:43,219-Speed 4090.92 samples/sec Loss 0.5714 Epoch: 18 Global Step: 314950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:37:55,746-Speed 4087.29 samples/sec Loss 0.5724 Epoch: 18 Global Step: 315000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:38:07,397-Speed 4394.72 samples/sec Loss 0.5641 Epoch: 18 Global Step: 315050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:38:19,118-Speed 4368.52 samples/sec Loss 0.5636 Epoch: 18 Global Step: 315100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:38:30,806-Speed 4380.86 samples/sec Loss 0.5698 Epoch: 18 Global Step: 315150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:38:42,378-Speed 4424.46 samples/sec Loss 0.5691 Epoch: 18 Global Step: 315200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:38:54,112-Speed 4363.43 samples/sec Loss 0.5679 Epoch: 18 Global Step: 315250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:39:05,815-Speed 4375.18 samples/sec Loss 0.5697 Epoch: 18 Global Step: 315300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:39:17,594-Speed 4346.98 samples/sec Loss 0.5608 Epoch: 18 Global Step: 315350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:39:30,350-Speed 4013.93 samples/sec Loss 0.5705 Epoch: 18 Global Step: 315400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:39:42,168-Speed 4332.69 samples/sec Loss 0.5651 Epoch: 18 Global Step: 315450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:39:54,021-Speed 4319.57 samples/sec Loss 0.5657 Epoch: 18 Global Step: 315500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:40:05,760-Speed 4361.70 samples/sec Loss 0.5723 Epoch: 18 Global Step: 315550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:40:17,658-Speed 4303.37 samples/sec Loss 0.5706 Epoch: 18 Global Step: 315600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:40:29,329-Speed 4387.26 samples/sec Loss 0.5626 Epoch: 18 Global Step: 315650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:40:41,213-Speed 4308.34 samples/sec Loss 0.5709 Epoch: 18 Global Step: 315700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:40:52,823-Speed 4410.31 samples/sec Loss 0.5723 Epoch: 18 Global Step: 315750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:41:04,510-Speed 4380.90 samples/sec Loss 0.5698 Epoch: 18 Global Step: 315800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:41:16,097-Speed 4418.95 samples/sec Loss 0.5680 Epoch: 18 Global Step: 315850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:41:28,824-Speed 4023.22 samples/sec Loss 0.5622 Epoch: 18 Global Step: 315900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:41:40,515-Speed 4379.60 samples/sec Loss 0.5608 Epoch: 18 Global Step: 315950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:41:52,311-Speed 4340.84 samples/sec Loss 0.5716 Epoch: 18 Global Step: 316000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:42:22,572-[lfw][316000]XNorm: 21.903220 Training: 2021-03-15 22:42:22,572-[lfw][316000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 22:42:22,573-[lfw][316000]Accuracy-Highest: 0.99833 Training: 2021-03-15 22:42:57,666-[cfp_fp][316000]XNorm: 21.849233 Training: 2021-03-15 22:42:57,667-[cfp_fp][316000]Accuracy-Flip: 0.99057+-0.00448 Training: 2021-03-15 22:42:57,667-[cfp_fp][316000]Accuracy-Highest: 0.99200 Training: 2021-03-15 22:43:27,801-[agedb_30][316000]XNorm: 22.642484 Training: 2021-03-15 22:43:27,801-[agedb_30][316000]Accuracy-Flip: 0.98233+-0.00720 Training: 2021-03-15 22:43:27,801-[agedb_30][316000]Accuracy-Highest: 0.98383 Training: 2021-03-15 22:43:39,329-Speed 478.43 samples/sec Loss 0.5487 Epoch: 18 Global Step: 316050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:43:51,070-Speed 4360.94 samples/sec Loss 0.5854 Epoch: 18 Global Step: 316100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:44:02,954-Speed 4308.34 samples/sec Loss 0.5634 Epoch: 18 Global Step: 316150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:44:14,731-Speed 4347.61 samples/sec Loss 0.5823 Epoch: 18 Global Step: 316200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:44:26,451-Speed 4368.93 samples/sec Loss 0.5579 Epoch: 18 Global Step: 316250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:44:38,167-Speed 4370.32 samples/sec Loss 0.5685 Epoch: 18 Global Step: 316300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:44:50,075-Speed 4299.71 samples/sec Loss 0.5635 Epoch: 18 Global Step: 316350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:45:01,900-Speed 4330.18 samples/sec Loss 0.5723 Epoch: 18 Global Step: 316400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:45:13,761-Speed 4316.53 samples/sec Loss 0.5651 Epoch: 18 Global Step: 316450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:45:25,658-Speed 4303.79 samples/sec Loss 0.5710 Epoch: 18 Global Step: 316500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:45:37,412-Speed 4356.32 samples/sec Loss 0.5789 Epoch: 18 Global Step: 316550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:45:49,912-Speed 4096.07 samples/sec Loss 0.5727 Epoch: 18 Global Step: 316600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:46:01,484-Speed 4424.72 samples/sec Loss 0.5842 Epoch: 18 Global Step: 316650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:46:13,228-Speed 4359.94 samples/sec Loss 0.5478 Epoch: 18 Global Step: 316700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:46:24,927-Speed 4376.37 samples/sec Loss 0.5531 Epoch: 18 Global Step: 316750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:46:36,622-Speed 4378.07 samples/sec Loss 0.5600 Epoch: 18 Global Step: 316800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:46:48,250-Speed 4403.48 samples/sec Loss 0.5596 Epoch: 18 Global Step: 316850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:47:00,976-Speed 4023.30 samples/sec Loss 0.5834 Epoch: 18 Global Step: 316900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:47:12,715-Speed 4361.77 samples/sec Loss 0.5662 Epoch: 18 Global Step: 316950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:47:24,523-Speed 4336.23 samples/sec Loss 0.5572 Epoch: 18 Global Step: 317000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:47:36,157-Speed 4401.20 samples/sec Loss 0.5681 Epoch: 18 Global Step: 317050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:47:47,885-Speed 4365.66 samples/sec Loss 0.5659 Epoch: 18 Global Step: 317100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:48:12,488-Speed 2081.14 samples/sec Loss 0.5663 Epoch: 19 Global Step: 317150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:48:24,801-Speed 4158.39 samples/sec Loss 0.5732 Epoch: 19 Global Step: 317200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:48:37,660-Speed 3981.69 samples/sec Loss 0.5615 Epoch: 19 Global Step: 317250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:48:49,402-Speed 4361.00 samples/sec Loss 0.5617 Epoch: 19 Global Step: 317300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:49:01,291-Speed 4306.58 samples/sec Loss 0.5647 Epoch: 19 Global Step: 317350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:49:12,926-Speed 4400.70 samples/sec Loss 0.5609 Epoch: 19 Global Step: 317400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:49:24,491-Speed 4427.28 samples/sec Loss 0.5683 Epoch: 19 Global Step: 317450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:49:36,897-Speed 4127.28 samples/sec Loss 0.5616 Epoch: 19 Global Step: 317500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:49:48,611-Speed 4371.27 samples/sec Loss 0.5666 Epoch: 19 Global Step: 317550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:50:00,193-Speed 4420.69 samples/sec Loss 0.5724 Epoch: 19 Global Step: 317600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:50:14,026-Speed 3701.55 samples/sec Loss 0.5714 Epoch: 19 Global Step: 317650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:50:25,486-Speed 4467.59 samples/sec Loss 0.5506 Epoch: 19 Global Step: 317700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:50:37,530-Speed 4251.26 samples/sec Loss 0.5648 Epoch: 19 Global Step: 317750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:50:49,245-Speed 4370.84 samples/sec Loss 0.5691 Epoch: 19 Global Step: 317800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:51:00,973-Speed 4365.80 samples/sec Loss 0.5742 Epoch: 19 Global Step: 317850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:51:12,549-Speed 4423.12 samples/sec Loss 0.5634 Epoch: 19 Global Step: 317900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:51:24,127-Speed 4422.23 samples/sec Loss 0.5617 Epoch: 19 Global Step: 317950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:51:35,796-Speed 4388.13 samples/sec Loss 0.5757 Epoch: 19 Global Step: 318000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:52:06,133-[lfw][318000]XNorm: 21.982586 Training: 2021-03-15 22:52:06,134-[lfw][318000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 22:52:06,134-[lfw][318000]Accuracy-Highest: 0.99833 Training: 2021-03-15 22:52:41,306-[cfp_fp][318000]XNorm: 21.948441 Training: 2021-03-15 22:52:41,307-[cfp_fp][318000]Accuracy-Flip: 0.99157+-0.00396 Training: 2021-03-15 22:52:41,307-[cfp_fp][318000]Accuracy-Highest: 0.99200 Training: 2021-03-15 22:53:11,657-[agedb_30][318000]XNorm: 22.708209 Training: 2021-03-15 22:53:11,657-[agedb_30][318000]Accuracy-Flip: 0.98333+-0.00745 Training: 2021-03-15 22:53:11,657-[agedb_30][318000]Accuracy-Highest: 0.98383 Training: 2021-03-15 22:53:23,390-Speed 475.86 samples/sec Loss 0.5687 Epoch: 19 Global Step: 318050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:53:34,933-Speed 4435.64 samples/sec Loss 0.5641 Epoch: 19 Global Step: 318100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:53:46,558-Speed 4404.73 samples/sec Loss 0.5637 Epoch: 19 Global Step: 318150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:53:58,073-Speed 4446.60 samples/sec Loss 0.5585 Epoch: 19 Global Step: 318200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:54:09,665-Speed 4416.75 samples/sec Loss 0.5682 Epoch: 19 Global Step: 318250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:54:21,314-Speed 4395.60 samples/sec Loss 0.5667 Epoch: 19 Global Step: 318300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:54:32,787-Speed 4462.87 samples/sec Loss 0.5677 Epoch: 19 Global Step: 318350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:54:45,174-Speed 4133.57 samples/sec Loss 0.5639 Epoch: 19 Global Step: 318400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:54:56,644-Speed 4463.97 samples/sec Loss 0.5576 Epoch: 19 Global Step: 318450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:55:08,214-Speed 4425.35 samples/sec Loss 0.5687 Epoch: 19 Global Step: 318500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:55:19,907-Speed 4379.13 samples/sec Loss 0.5783 Epoch: 19 Global Step: 318550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:55:31,518-Speed 4409.84 samples/sec Loss 0.5531 Epoch: 19 Global Step: 318600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:55:42,937-Speed 4483.78 samples/sec Loss 0.5702 Epoch: 19 Global Step: 318650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:55:54,420-Speed 4459.36 samples/sec Loss 0.5654 Epoch: 19 Global Step: 318700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:56:05,883-Speed 4466.58 samples/sec Loss 0.5793 Epoch: 19 Global Step: 318750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:56:17,306-Speed 4482.49 samples/sec Loss 0.5615 Epoch: 19 Global Step: 318800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:56:28,919-Speed 4409.14 samples/sec Loss 0.5711 Epoch: 19 Global Step: 318850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:56:40,416-Speed 4453.32 samples/sec Loss 0.5662 Epoch: 19 Global Step: 318900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:56:51,941-Speed 4442.63 samples/sec Loss 0.5624 Epoch: 19 Global Step: 318950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:57:03,580-Speed 4399.58 samples/sec Loss 0.5583 Epoch: 19 Global Step: 319000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:57:15,144-Speed 4427.46 samples/sec Loss 0.5683 Epoch: 19 Global Step: 319050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:57:26,619-Speed 4462.09 samples/sec Loss 0.5714 Epoch: 19 Global Step: 319100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:57:38,233-Speed 4408.92 samples/sec Loss 0.5495 Epoch: 19 Global Step: 319150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:57:50,935-Speed 4030.81 samples/sec Loss 0.5585 Epoch: 19 Global Step: 319200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:58:02,480-Speed 4435.24 samples/sec Loss 0.5670 Epoch: 19 Global Step: 319250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:58:14,096-Speed 4407.92 samples/sec Loss 0.5732 Epoch: 19 Global Step: 319300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:58:25,714-Speed 4406.95 samples/sec Loss 0.5686 Epoch: 19 Global Step: 319350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:58:37,445-Speed 4364.80 samples/sec Loss 0.5615 Epoch: 19 Global Step: 319400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:58:48,881-Speed 4477.28 samples/sec Loss 0.5611 Epoch: 19 Global Step: 319450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:59:00,425-Speed 4435.23 samples/sec Loss 0.5634 Epoch: 19 Global Step: 319500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:59:12,710-Speed 4168.17 samples/sec Loss 0.5656 Epoch: 19 Global Step: 319550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:59:24,321-Speed 4409.75 samples/sec Loss 0.5582 Epoch: 19 Global Step: 319600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:59:35,725-Speed 4489.58 samples/sec Loss 0.5574 Epoch: 19 Global Step: 319650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:59:47,423-Speed 4377.06 samples/sec Loss 0.5759 Epoch: 19 Global Step: 319700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 22:59:59,097-Speed 4385.93 samples/sec Loss 0.5684 Epoch: 19 Global Step: 319750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:00:11,500-Speed 4128.50 samples/sec Loss 0.5539 Epoch: 19 Global Step: 319800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:00:23,157-Speed 4392.14 samples/sec Loss 0.5664 Epoch: 19 Global Step: 319850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:00:34,862-Speed 4374.65 samples/sec Loss 0.5656 Epoch: 19 Global Step: 319900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:00:46,348-Speed 4457.69 samples/sec Loss 0.5529 Epoch: 19 Global Step: 319950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:00:57,979-Speed 4402.33 samples/sec Loss 0.5772 Epoch: 19 Global Step: 320000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:01:28,172-[lfw][320000]XNorm: 21.946718 Training: 2021-03-15 23:01:28,172-[lfw][320000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 23:01:28,172-[lfw][320000]Accuracy-Highest: 0.99833 Training: 2021-03-15 23:02:03,239-[cfp_fp][320000]XNorm: 21.917946 Training: 2021-03-15 23:02:03,239-[cfp_fp][320000]Accuracy-Flip: 0.99043+-0.00447 Training: 2021-03-15 23:02:03,239-[cfp_fp][320000]Accuracy-Highest: 0.99200 Training: 2021-03-15 23:02:33,593-[agedb_30][320000]XNorm: 22.666437 Training: 2021-03-15 23:02:33,594-[agedb_30][320000]Accuracy-Flip: 0.98250+-0.00739 Training: 2021-03-15 23:02:33,594-[agedb_30][320000]Accuracy-Highest: 0.98383 Training: 2021-03-15 23:02:45,132-Speed 477.82 samples/sec Loss 0.5604 Epoch: 19 Global Step: 320050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:02:56,755-Speed 4405.29 samples/sec Loss 0.5631 Epoch: 19 Global Step: 320100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:03:08,288-Speed 4439.94 samples/sec Loss 0.5563 Epoch: 19 Global Step: 320150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:03:20,648-Speed 4142.60 samples/sec Loss 0.5785 Epoch: 19 Global Step: 320200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:03:33,024-Speed 4137.01 samples/sec Loss 0.5733 Epoch: 19 Global Step: 320250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:03:44,842-Speed 4332.78 samples/sec Loss 0.5620 Epoch: 19 Global Step: 320300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:03:57,696-Speed 3983.35 samples/sec Loss 0.5587 Epoch: 19 Global Step: 320350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:04:09,264-Speed 4425.87 samples/sec Loss 0.5623 Epoch: 19 Global Step: 320400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:04:20,835-Speed 4425.17 samples/sec Loss 0.5546 Epoch: 19 Global Step: 320450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:04:32,536-Speed 4375.83 samples/sec Loss 0.5708 Epoch: 19 Global Step: 320500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:04:43,973-Speed 4477.20 samples/sec Loss 0.5581 Epoch: 19 Global Step: 320550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:04:55,561-Speed 4418.62 samples/sec Loss 0.5543 Epoch: 19 Global Step: 320600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:05:07,213-Speed 4394.16 samples/sec Loss 0.5647 Epoch: 19 Global Step: 320650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:05:18,859-Speed 4396.70 samples/sec Loss 0.5586 Epoch: 19 Global Step: 320700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:05:31,154-Speed 4164.63 samples/sec Loss 0.5592 Epoch: 19 Global Step: 320750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:05:42,611-Speed 4469.08 samples/sec Loss 0.5602 Epoch: 19 Global Step: 320800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:05:54,133-Speed 4443.90 samples/sec Loss 0.5532 Epoch: 19 Global Step: 320850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:06:05,662-Speed 4441.23 samples/sec Loss 0.5614 Epoch: 19 Global Step: 320900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:06:17,187-Speed 4442.64 samples/sec Loss 0.5660 Epoch: 19 Global Step: 320950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:06:28,592-Speed 4489.49 samples/sec Loss 0.5671 Epoch: 19 Global Step: 321000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:06:41,184-Speed 4066.35 samples/sec Loss 0.5711 Epoch: 19 Global Step: 321050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:06:52,624-Speed 4475.48 samples/sec Loss 0.5610 Epoch: 19 Global Step: 321100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:07:04,242-Speed 4407.15 samples/sec Loss 0.5600 Epoch: 19 Global Step: 321150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:07:15,774-Speed 4440.05 samples/sec Loss 0.5520 Epoch: 19 Global Step: 321200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:07:27,296-Speed 4443.85 samples/sec Loss 0.5677 Epoch: 19 Global Step: 321250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:07:38,884-Speed 4418.83 samples/sec Loss 0.5652 Epoch: 19 Global Step: 321300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:07:50,478-Speed 4416.01 samples/sec Loss 0.5542 Epoch: 19 Global Step: 321350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:08:02,109-Speed 4402.38 samples/sec Loss 0.5618 Epoch: 19 Global Step: 321400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:08:13,574-Speed 4466.12 samples/sec Loss 0.5580 Epoch: 19 Global Step: 321450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:08:25,097-Speed 4443.23 samples/sec Loss 0.5627 Epoch: 19 Global Step: 321500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:08:36,712-Speed 4408.50 samples/sec Loss 0.5556 Epoch: 19 Global Step: 321550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:08:48,182-Speed 4463.94 samples/sec Loss 0.5658 Epoch: 19 Global Step: 321600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:08:59,761-Speed 4422.06 samples/sec Loss 0.5592 Epoch: 19 Global Step: 321650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:09:11,243-Speed 4459.54 samples/sec Loss 0.5609 Epoch: 19 Global Step: 321700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:09:22,782-Speed 4437.08 samples/sec Loss 0.5478 Epoch: 19 Global Step: 321750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:09:34,293-Speed 4448.07 samples/sec Loss 0.5616 Epoch: 19 Global Step: 321800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:09:45,789-Speed 4454.20 samples/sec Loss 0.5744 Epoch: 19 Global Step: 321850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:09:57,254-Speed 4465.74 samples/sec Loss 0.5693 Epoch: 19 Global Step: 321900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:10:09,698-Speed 4114.98 samples/sec Loss 0.5699 Epoch: 19 Global Step: 321950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:10:21,179-Speed 4459.62 samples/sec Loss 0.5635 Epoch: 19 Global Step: 322000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:10:51,489-[lfw][322000]XNorm: 21.855504 Training: 2021-03-15 23:10:51,490-[lfw][322000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 23:10:51,490-[lfw][322000]Accuracy-Highest: 0.99833 Training: 2021-03-15 23:11:26,474-[cfp_fp][322000]XNorm: 21.815730 Training: 2021-03-15 23:11:26,474-[cfp_fp][322000]Accuracy-Flip: 0.99186+-0.00384 Training: 2021-03-15 23:11:26,474-[cfp_fp][322000]Accuracy-Highest: 0.99200 Training: 2021-03-15 23:11:56,668-[agedb_30][322000]XNorm: 22.615503 Training: 2021-03-15 23:11:56,668-[agedb_30][322000]Accuracy-Flip: 0.98183+-0.00751 Training: 2021-03-15 23:11:56,668-[agedb_30][322000]Accuracy-Highest: 0.98383 Training: 2021-03-15 23:12:08,418-Speed 477.44 samples/sec Loss 0.5578 Epoch: 19 Global Step: 322050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:12:19,870-Speed 4470.95 samples/sec Loss 0.5719 Epoch: 19 Global Step: 322100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:12:31,505-Speed 4400.77 samples/sec Loss 0.5576 Epoch: 19 Global Step: 322150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:12:42,939-Speed 4478.18 samples/sec Loss 0.5561 Epoch: 19 Global Step: 322200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:12:55,332-Speed 4131.40 samples/sec Loss 0.5677 Epoch: 19 Global Step: 322250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:13:06,946-Speed 4408.87 samples/sec Loss 0.5725 Epoch: 19 Global Step: 322300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:13:18,344-Speed 4492.14 samples/sec Loss 0.5670 Epoch: 19 Global Step: 322350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:13:29,945-Speed 4413.69 samples/sec Loss 0.5523 Epoch: 19 Global Step: 322400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:13:42,489-Speed 4081.86 samples/sec Loss 0.5687 Epoch: 19 Global Step: 322450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:13:53,854-Speed 4505.26 samples/sec Loss 0.5591 Epoch: 19 Global Step: 322500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:14:05,342-Speed 4456.73 samples/sec Loss 0.5668 Epoch: 19 Global Step: 322550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:14:16,713-Speed 4503.00 samples/sec Loss 0.5651 Epoch: 19 Global Step: 322600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:14:28,219-Speed 4450.15 samples/sec Loss 0.5754 Epoch: 19 Global Step: 322650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:14:39,784-Speed 4427.58 samples/sec Loss 0.5685 Epoch: 19 Global Step: 322700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:14:51,305-Speed 4444.02 samples/sec Loss 0.5680 Epoch: 19 Global Step: 322750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:15:02,848-Speed 4435.88 samples/sec Loss 0.5719 Epoch: 19 Global Step: 322800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:15:14,404-Speed 4430.88 samples/sec Loss 0.5719 Epoch: 19 Global Step: 322850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:15:27,729-Speed 3842.57 samples/sec Loss 0.5610 Epoch: 19 Global Step: 322900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:15:39,253-Speed 4443.25 samples/sec Loss 0.5517 Epoch: 19 Global Step: 322950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:15:50,661-Speed 4488.09 samples/sec Loss 0.5735 Epoch: 19 Global Step: 323000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:16:02,994-Speed 4151.69 samples/sec Loss 0.5547 Epoch: 19 Global Step: 323050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:16:14,513-Speed 4445.07 samples/sec Loss 0.5546 Epoch: 19 Global Step: 323100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:16:26,010-Speed 4453.64 samples/sec Loss 0.5728 Epoch: 19 Global Step: 323150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:16:37,656-Speed 4396.70 samples/sec Loss 0.5723 Epoch: 19 Global Step: 323200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:16:49,113-Speed 4468.93 samples/sec Loss 0.5638 Epoch: 19 Global Step: 323250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:17:00,528-Speed 4485.70 samples/sec Loss 0.5650 Epoch: 19 Global Step: 323300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:17:12,867-Speed 4149.46 samples/sec Loss 0.5592 Epoch: 19 Global Step: 323350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:17:24,453-Speed 4419.40 samples/sec Loss 0.5660 Epoch: 19 Global Step: 323400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:17:36,050-Speed 4415.04 samples/sec Loss 0.5543 Epoch: 19 Global Step: 323450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:17:47,676-Speed 4404.06 samples/sec Loss 0.5644 Epoch: 19 Global Step: 323500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:17:59,294-Speed 4407.26 samples/sec Loss 0.5621 Epoch: 19 Global Step: 323550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:18:10,753-Speed 4468.43 samples/sec Loss 0.5529 Epoch: 19 Global Step: 323600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:18:22,371-Speed 4406.95 samples/sec Loss 0.5738 Epoch: 19 Global Step: 323650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:18:34,100-Speed 4365.70 samples/sec Loss 0.5615 Epoch: 19 Global Step: 323700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:18:46,464-Speed 4141.06 samples/sec Loss 0.5639 Epoch: 19 Global Step: 323750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:18:58,009-Speed 4435.22 samples/sec Loss 0.5519 Epoch: 19 Global Step: 323800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:19:09,477-Speed 4464.92 samples/sec Loss 0.5547 Epoch: 19 Global Step: 323850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:19:21,139-Speed 4390.24 samples/sec Loss 0.5602 Epoch: 19 Global Step: 323900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:19:32,634-Speed 4454.47 samples/sec Loss 0.5732 Epoch: 19 Global Step: 323950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:19:44,401-Speed 4351.29 samples/sec Loss 0.5633 Epoch: 19 Global Step: 324000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:20:14,395-[lfw][324000]XNorm: 21.937462 Training: 2021-03-15 23:20:14,395-[lfw][324000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 23:20:14,395-[lfw][324000]Accuracy-Highest: 0.99833 Training: 2021-03-15 23:20:49,144-[cfp_fp][324000]XNorm: 21.880989 Training: 2021-03-15 23:20:49,145-[cfp_fp][324000]Accuracy-Flip: 0.99129+-0.00426 Training: 2021-03-15 23:20:49,145-[cfp_fp][324000]Accuracy-Highest: 0.99200 Training: 2021-03-15 23:21:19,140-[agedb_30][324000]XNorm: 22.669433 Training: 2021-03-15 23:21:19,140-[agedb_30][324000]Accuracy-Flip: 0.98267+-0.00684 Training: 2021-03-15 23:21:19,140-[agedb_30][324000]Accuracy-Highest: 0.98383 Training: 2021-03-15 23:21:30,653-Speed 481.88 samples/sec Loss 0.5612 Epoch: 19 Global Step: 324050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:21:42,187-Speed 4439.09 samples/sec Loss 0.5734 Epoch: 19 Global Step: 324100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:21:53,653-Speed 4465.88 samples/sec Loss 0.5608 Epoch: 19 Global Step: 324150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:22:05,199-Speed 4434.54 samples/sec Loss 0.5770 Epoch: 19 Global Step: 324200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:22:16,891-Speed 4379.33 samples/sec Loss 0.5662 Epoch: 19 Global Step: 324250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:22:28,603-Speed 4371.67 samples/sec Loss 0.5597 Epoch: 19 Global Step: 324300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:22:40,176-Speed 4424.17 samples/sec Loss 0.5675 Epoch: 19 Global Step: 324350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:22:51,556-Speed 4499.70 samples/sec Loss 0.5595 Epoch: 19 Global Step: 324400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:23:03,184-Speed 4403.28 samples/sec Loss 0.5674 Epoch: 19 Global Step: 324450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:23:14,842-Speed 4391.83 samples/sec Loss 0.5675 Epoch: 19 Global Step: 324500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:23:26,507-Speed 4389.40 samples/sec Loss 0.5558 Epoch: 19 Global Step: 324550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:23:38,762-Speed 4178.13 samples/sec Loss 0.5578 Epoch: 19 Global Step: 324600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:23:50,374-Speed 4409.52 samples/sec Loss 0.5780 Epoch: 19 Global Step: 324650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:24:01,950-Speed 4423.03 samples/sec Loss 0.5737 Epoch: 19 Global Step: 324700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:24:13,656-Speed 4374.20 samples/sec Loss 0.5679 Epoch: 19 Global Step: 324750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:24:25,269-Speed 4409.04 samples/sec Loss 0.5601 Epoch: 19 Global Step: 324800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:24:36,851-Speed 4420.71 samples/sec Loss 0.5687 Epoch: 19 Global Step: 324850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:24:49,132-Speed 4169.24 samples/sec Loss 0.5635 Epoch: 19 Global Step: 324900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:25:00,770-Speed 4399.56 samples/sec Loss 0.5790 Epoch: 19 Global Step: 324950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:25:12,173-Speed 4490.30 samples/sec Loss 0.5719 Epoch: 19 Global Step: 325000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:25:23,771-Speed 4414.73 samples/sec Loss 0.5644 Epoch: 19 Global Step: 325050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:25:35,364-Speed 4416.68 samples/sec Loss 0.5633 Epoch: 19 Global Step: 325100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:25:47,939-Speed 4071.74 samples/sec Loss 0.5742 Epoch: 19 Global Step: 325150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:25:59,577-Speed 4399.55 samples/sec Loss 0.5663 Epoch: 19 Global Step: 325200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:26:11,090-Speed 4447.25 samples/sec Loss 0.5647 Epoch: 19 Global Step: 325250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:26:22,665-Speed 4423.74 samples/sec Loss 0.5675 Epoch: 19 Global Step: 325300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:26:34,254-Speed 4418.29 samples/sec Loss 0.5524 Epoch: 19 Global Step: 325350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:26:45,737-Speed 4458.63 samples/sec Loss 0.5734 Epoch: 19 Global Step: 325400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:26:57,302-Speed 4427.47 samples/sec Loss 0.5468 Epoch: 19 Global Step: 325450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:27:08,754-Speed 4470.99 samples/sec Loss 0.5709 Epoch: 19 Global Step: 325500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:27:20,226-Speed 4463.30 samples/sec Loss 0.5586 Epoch: 19 Global Step: 325550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:27:32,580-Speed 4144.83 samples/sec Loss 0.5759 Epoch: 19 Global Step: 325600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:27:44,999-Speed 4122.93 samples/sec Loss 0.5708 Epoch: 19 Global Step: 325650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:27:56,730-Speed 4364.31 samples/sec Loss 0.5756 Epoch: 19 Global Step: 325700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:28:09,171-Speed 4115.59 samples/sec Loss 0.5576 Epoch: 19 Global Step: 325750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:28:20,797-Speed 4404.45 samples/sec Loss 0.5696 Epoch: 19 Global Step: 325800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:28:32,294-Speed 4453.43 samples/sec Loss 0.5745 Epoch: 19 Global Step: 325850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:28:43,842-Speed 4433.64 samples/sec Loss 0.5594 Epoch: 19 Global Step: 325900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:28:56,199-Speed 4143.86 samples/sec Loss 0.5654 Epoch: 19 Global Step: 325950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:29:07,688-Speed 4456.46 samples/sec Loss 0.5546 Epoch: 19 Global Step: 326000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:29:37,810-[lfw][326000]XNorm: 21.915543 Training: 2021-03-15 23:29:37,810-[lfw][326000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 23:29:37,810-[lfw][326000]Accuracy-Highest: 0.99833 Training: 2021-03-15 23:30:12,919-[cfp_fp][326000]XNorm: 21.883184 Training: 2021-03-15 23:30:12,920-[cfp_fp][326000]Accuracy-Flip: 0.99086+-0.00448 Training: 2021-03-15 23:30:12,920-[cfp_fp][326000]Accuracy-Highest: 0.99200 Training: 2021-03-15 23:30:43,085-[agedb_30][326000]XNorm: 22.667921 Training: 2021-03-15 23:30:43,085-[agedb_30][326000]Accuracy-Flip: 0.98317+-0.00769 Training: 2021-03-15 23:30:43,085-[agedb_30][326000]Accuracy-Highest: 0.98383 Training: 2021-03-15 23:30:54,423-Speed 479.70 samples/sec Loss 0.5787 Epoch: 19 Global Step: 326050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:31:05,915-Speed 4455.34 samples/sec Loss 0.5773 Epoch: 19 Global Step: 326100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:31:17,486-Speed 4425.14 samples/sec Loss 0.5620 Epoch: 19 Global Step: 326150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:31:29,098-Speed 4409.49 samples/sec Loss 0.5707 Epoch: 19 Global Step: 326200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:31:40,550-Speed 4470.75 samples/sec Loss 0.5616 Epoch: 19 Global Step: 326250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:31:52,111-Speed 4429.18 samples/sec Loss 0.5634 Epoch: 19 Global Step: 326300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:32:03,694-Speed 4420.18 samples/sec Loss 0.5647 Epoch: 19 Global Step: 326350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:32:15,193-Speed 4452.87 samples/sec Loss 0.5622 Epoch: 19 Global Step: 326400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:32:26,657-Speed 4466.44 samples/sec Loss 0.5743 Epoch: 19 Global Step: 326450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:32:39,016-Speed 4143.00 samples/sec Loss 0.5470 Epoch: 19 Global Step: 326500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:32:50,548-Speed 4439.90 samples/sec Loss 0.5610 Epoch: 19 Global Step: 326550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:33:02,205-Speed 4392.30 samples/sec Loss 0.5693 Epoch: 19 Global Step: 326600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:33:13,521-Speed 4524.86 samples/sec Loss 0.5655 Epoch: 19 Global Step: 326650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:33:25,117-Speed 4415.64 samples/sec Loss 0.5641 Epoch: 19 Global Step: 326700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:33:36,680-Speed 4428.05 samples/sec Loss 0.5605 Epoch: 19 Global Step: 326750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:33:48,368-Speed 4380.85 samples/sec Loss 0.5647 Epoch: 19 Global Step: 326800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:33:59,957-Speed 4418.14 samples/sec Loss 0.5600 Epoch: 19 Global Step: 326850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:34:11,602-Speed 4396.79 samples/sec Loss 0.5628 Epoch: 19 Global Step: 326900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:34:23,057-Speed 4469.94 samples/sec Loss 0.5675 Epoch: 19 Global Step: 326950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:34:34,817-Speed 4353.85 samples/sec Loss 0.5659 Epoch: 19 Global Step: 327000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:34:46,209-Speed 4494.38 samples/sec Loss 0.5660 Epoch: 19 Global Step: 327050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:34:57,873-Speed 4389.93 samples/sec Loss 0.5565 Epoch: 19 Global Step: 327100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:35:09,391-Speed 4445.30 samples/sec Loss 0.5802 Epoch: 19 Global Step: 327150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:35:20,812-Speed 4483.35 samples/sec Loss 0.5857 Epoch: 19 Global Step: 327200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:35:33,195-Speed 4134.77 samples/sec Loss 0.5716 Epoch: 19 Global Step: 327250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:35:44,826-Speed 4402.38 samples/sec Loss 0.5641 Epoch: 19 Global Step: 327300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:35:56,379-Speed 4431.85 samples/sec Loss 0.5654 Epoch: 19 Global Step: 327350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:36:08,095-Speed 4370.20 samples/sec Loss 0.5583 Epoch: 19 Global Step: 327400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:36:19,631-Speed 4438.89 samples/sec Loss 0.5751 Epoch: 19 Global Step: 327450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 23:36:31,244-Speed 4408.93 samples/sec Loss 0.5667 Epoch: 19 Global Step: 327500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:36:43,562-Speed 4156.68 samples/sec Loss 0.5792 Epoch: 19 Global Step: 327550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:36:55,051-Speed 4456.67 samples/sec Loss 0.5720 Epoch: 19 Global Step: 327600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:37:06,669-Speed 4407.23 samples/sec Loss 0.5559 Epoch: 19 Global Step: 327650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:37:18,202-Speed 4439.34 samples/sec Loss 0.5662 Epoch: 19 Global Step: 327700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:37:30,772-Speed 4073.40 samples/sec Loss 0.5705 Epoch: 19 Global Step: 327750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:37:42,186-Speed 4485.86 samples/sec Loss 0.5713 Epoch: 19 Global Step: 327800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:37:53,719-Speed 4439.81 samples/sec Loss 0.5636 Epoch: 19 Global Step: 327850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:38:05,395-Speed 4385.42 samples/sec Loss 0.5685 Epoch: 19 Global Step: 327900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:38:17,068-Speed 4386.40 samples/sec Loss 0.5607 Epoch: 19 Global Step: 327950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:38:28,710-Speed 4397.75 samples/sec Loss 0.5735 Epoch: 19 Global Step: 328000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:38:59,022-[lfw][328000]XNorm: 22.040189 Training: 2021-03-15 23:38:59,022-[lfw][328000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-15 23:38:59,022-[lfw][328000]Accuracy-Highest: 0.99833 Training: 2021-03-15 23:39:34,072-[cfp_fp][328000]XNorm: 21.946611 Training: 2021-03-15 23:39:34,072-[cfp_fp][328000]Accuracy-Flip: 0.99100+-0.00443 Training: 2021-03-15 23:39:34,072-[cfp_fp][328000]Accuracy-Highest: 0.99200 Training: 2021-03-15 23:40:04,118-[agedb_30][328000]XNorm: 22.749643 Training: 2021-03-15 23:40:04,118-[agedb_30][328000]Accuracy-Flip: 0.98233+-0.00692 Training: 2021-03-15 23:40:04,119-[agedb_30][328000]Accuracy-Highest: 0.98383 Training: 2021-03-15 23:40:15,613-Speed 478.94 samples/sec Loss 0.5676 Epoch: 19 Global Step: 328050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:40:27,265-Speed 4394.48 samples/sec Loss 0.5674 Epoch: 19 Global Step: 328100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:40:38,953-Speed 4380.59 samples/sec Loss 0.5636 Epoch: 19 Global Step: 328150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:40:51,215-Speed 4175.60 samples/sec Loss 0.5564 Epoch: 19 Global Step: 328200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:41:02,637-Speed 4483.00 samples/sec Loss 0.5472 Epoch: 19 Global Step: 328250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:41:14,885-Speed 4180.52 samples/sec Loss 0.5595 Epoch: 19 Global Step: 328300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:41:26,433-Speed 4433.73 samples/sec Loss 0.5701 Epoch: 19 Global Step: 328350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:41:37,881-Speed 4472.66 samples/sec Loss 0.5737 Epoch: 19 Global Step: 328400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:41:50,223-Speed 4148.58 samples/sec Loss 0.5566 Epoch: 19 Global Step: 328450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:42:01,605-Speed 4498.81 samples/sec Loss 0.5558 Epoch: 19 Global Step: 328500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:42:13,199-Speed 4416.26 samples/sec Loss 0.5698 Epoch: 19 Global Step: 328550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:42:24,607-Speed 4488.36 samples/sec Loss 0.5646 Epoch: 19 Global Step: 328600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:42:37,061-Speed 4111.30 samples/sec Loss 0.5641 Epoch: 19 Global Step: 328650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:42:48,652-Speed 4417.47 samples/sec Loss 0.5621 Epoch: 19 Global Step: 328700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:43:00,285-Speed 4401.54 samples/sec Loss 0.5676 Epoch: 19 Global Step: 328750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:43:11,847-Speed 4428.37 samples/sec Loss 0.5591 Epoch: 19 Global Step: 328800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:43:23,345-Speed 4453.24 samples/sec Loss 0.5554 Epoch: 19 Global Step: 328850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:43:34,838-Speed 4455.04 samples/sec Loss 0.5647 Epoch: 19 Global Step: 328900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:43:46,549-Speed 4372.09 samples/sec Loss 0.5588 Epoch: 19 Global Step: 328950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:43:58,118-Speed 4425.87 samples/sec Loss 0.5718 Epoch: 19 Global Step: 329000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:44:09,728-Speed 4410.25 samples/sec Loss 0.5636 Epoch: 19 Global Step: 329050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:44:21,218-Speed 4456.05 samples/sec Loss 0.5652 Epoch: 19 Global Step: 329100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:44:32,882-Speed 4390.07 samples/sec Loss 0.5603 Epoch: 19 Global Step: 329150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:44:44,323-Speed 4475.32 samples/sec Loss 0.5631 Epoch: 19 Global Step: 329200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:44:56,675-Speed 4145.28 samples/sec Loss 0.5655 Epoch: 19 Global Step: 329250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:45:08,192-Speed 4445.86 samples/sec Loss 0.5655 Epoch: 19 Global Step: 329300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:45:19,802-Speed 4410.23 samples/sec Loss 0.5634 Epoch: 19 Global Step: 329350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:45:31,332-Speed 4440.49 samples/sec Loss 0.5696 Epoch: 19 Global Step: 329400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:45:42,928-Speed 4415.43 samples/sec Loss 0.5631 Epoch: 19 Global Step: 329450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:45:54,499-Speed 4425.26 samples/sec Loss 0.5683 Epoch: 19 Global Step: 329500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:46:06,243-Speed 4359.99 samples/sec Loss 0.5720 Epoch: 19 Global Step: 329550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:46:17,921-Speed 4384.51 samples/sec Loss 0.5726 Epoch: 19 Global Step: 329600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:46:29,448-Speed 4441.73 samples/sec Loss 0.5620 Epoch: 19 Global Step: 329650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:46:40,925-Speed 4461.58 samples/sec Loss 0.5529 Epoch: 19 Global Step: 329700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:46:52,615-Speed 4379.73 samples/sec Loss 0.5658 Epoch: 19 Global Step: 329750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:47:05,206-Speed 4066.74 samples/sec Loss 0.5636 Epoch: 19 Global Step: 329800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:47:16,557-Speed 4510.65 samples/sec Loss 0.5612 Epoch: 19 Global Step: 329850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:47:28,134-Speed 4422.77 samples/sec Loss 0.5767 Epoch: 19 Global Step: 329900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:47:39,543-Speed 4488.08 samples/sec Loss 0.5638 Epoch: 19 Global Step: 329950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:47:51,102-Speed 4429.63 samples/sec Loss 0.5723 Epoch: 19 Global Step: 330000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:48:21,404-[lfw][330000]XNorm: 22.001582 Training: 2021-03-15 23:48:21,404-[lfw][330000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 23:48:21,404-[lfw][330000]Accuracy-Highest: 0.99833 Training: 2021-03-15 23:48:56,591-[cfp_fp][330000]XNorm: 21.926014 Training: 2021-03-15 23:48:56,591-[cfp_fp][330000]Accuracy-Flip: 0.99071+-0.00457 Training: 2021-03-15 23:48:56,591-[cfp_fp][330000]Accuracy-Highest: 0.99200 Training: 2021-03-15 23:49:26,894-[agedb_30][330000]XNorm: 22.734340 Training: 2021-03-15 23:49:26,894-[agedb_30][330000]Accuracy-Flip: 0.98233+-0.00720 Training: 2021-03-15 23:49:26,894-[agedb_30][330000]Accuracy-Highest: 0.98383 Training: 2021-03-15 23:49:38,387-Speed 477.23 samples/sec Loss 0.5658 Epoch: 19 Global Step: 330050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:49:50,081-Speed 4378.62 samples/sec Loss 0.5559 Epoch: 19 Global Step: 330100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:50:01,538-Speed 4468.85 samples/sec Loss 0.5680 Epoch: 19 Global Step: 330150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:50:13,201-Speed 4390.47 samples/sec Loss 0.5625 Epoch: 19 Global Step: 330200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:50:25,659-Speed 4109.69 samples/sec Loss 0.5694 Epoch: 19 Global Step: 330250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:50:38,355-Speed 4033.04 samples/sec Loss 0.5625 Epoch: 19 Global Step: 330300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:50:49,788-Speed 4478.59 samples/sec Loss 0.5693 Epoch: 19 Global Step: 330350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:51:01,277-Speed 4456.78 samples/sec Loss 0.5529 Epoch: 19 Global Step: 330400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:51:12,801-Speed 4443.04 samples/sec Loss 0.5720 Epoch: 19 Global Step: 330450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:51:24,378-Speed 4422.87 samples/sec Loss 0.5654 Epoch: 19 Global Step: 330500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:51:35,877-Speed 4452.83 samples/sec Loss 0.5746 Epoch: 19 Global Step: 330550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:51:47,303-Speed 4480.91 samples/sec Loss 0.5594 Epoch: 19 Global Step: 330600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:51:58,916-Speed 4409.25 samples/sec Loss 0.5602 Epoch: 19 Global Step: 330650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:52:10,639-Speed 4367.66 samples/sec Loss 0.5619 Epoch: 19 Global Step: 330700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:52:22,091-Speed 4471.00 samples/sec Loss 0.5692 Epoch: 19 Global Step: 330750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:52:33,777-Speed 4381.95 samples/sec Loss 0.5541 Epoch: 19 Global Step: 330800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:52:45,218-Speed 4475.24 samples/sec Loss 0.5612 Epoch: 19 Global Step: 330850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:52:57,576-Speed 4143.39 samples/sec Loss 0.5688 Epoch: 19 Global Step: 330900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:53:09,174-Speed 4414.84 samples/sec Loss 0.5715 Epoch: 19 Global Step: 330950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:53:21,620-Speed 4113.66 samples/sec Loss 0.5799 Epoch: 19 Global Step: 331000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:53:32,925-Speed 4529.14 samples/sec Loss 0.5634 Epoch: 19 Global Step: 331050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:53:45,433-Speed 4093.87 samples/sec Loss 0.5750 Epoch: 19 Global Step: 331100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:53:56,880-Speed 4472.78 samples/sec Loss 0.5652 Epoch: 19 Global Step: 331150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:54:08,492-Speed 4409.68 samples/sec Loss 0.5597 Epoch: 19 Global Step: 331200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:54:20,877-Speed 4134.19 samples/sec Loss 0.5668 Epoch: 19 Global Step: 331250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:54:32,587-Speed 4372.40 samples/sec Loss 0.5645 Epoch: 19 Global Step: 331300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:54:44,003-Speed 4485.09 samples/sec Loss 0.5645 Epoch: 19 Global Step: 331350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:54:55,496-Speed 4455.41 samples/sec Loss 0.5653 Epoch: 19 Global Step: 331400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:55:07,176-Speed 4383.58 samples/sec Loss 0.5672 Epoch: 19 Global Step: 331450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:55:18,696-Speed 4444.71 samples/sec Loss 0.5709 Epoch: 19 Global Step: 331500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:55:30,282-Speed 4419.49 samples/sec Loss 0.5519 Epoch: 19 Global Step: 331550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:55:41,622-Speed 4514.95 samples/sec Loss 0.5574 Epoch: 19 Global Step: 331600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:55:53,203-Speed 4421.54 samples/sec Loss 0.5535 Epoch: 19 Global Step: 331650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:56:04,611-Speed 4488.11 samples/sec Loss 0.5734 Epoch: 19 Global Step: 331700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:56:16,209-Speed 4414.80 samples/sec Loss 0.5584 Epoch: 19 Global Step: 331750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:56:27,722-Speed 4447.22 samples/sec Loss 0.5713 Epoch: 19 Global Step: 331800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:56:39,161-Speed 4476.38 samples/sec Loss 0.5657 Epoch: 19 Global Step: 331850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:56:50,776-Speed 4408.25 samples/sec Loss 0.5667 Epoch: 19 Global Step: 331900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:57:03,240-Speed 4108.00 samples/sec Loss 0.5645 Epoch: 19 Global Step: 331950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:57:14,954-Speed 4370.92 samples/sec Loss 0.5563 Epoch: 19 Global Step: 332000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:57:45,281-[lfw][332000]XNorm: 21.864663 Training: 2021-03-15 23:57:45,281-[lfw][332000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-15 23:57:45,281-[lfw][332000]Accuracy-Highest: 0.99833 Training: 2021-03-15 23:58:20,472-[cfp_fp][332000]XNorm: 21.831351 Training: 2021-03-15 23:58:20,472-[cfp_fp][332000]Accuracy-Flip: 0.99100+-0.00452 Training: 2021-03-15 23:58:20,472-[cfp_fp][332000]Accuracy-Highest: 0.99200 Training: 2021-03-15 23:58:50,729-[agedb_30][332000]XNorm: 22.610139 Training: 2021-03-15 23:58:50,729-[agedb_30][332000]Accuracy-Flip: 0.98350+-0.00721 Training: 2021-03-15 23:58:50,729-[agedb_30][332000]Accuracy-Highest: 0.98383 Training: 2021-03-15 23:59:02,351-Speed 476.74 samples/sec Loss 0.5564 Epoch: 19 Global Step: 332050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:59:13,910-Speed 4429.63 samples/sec Loss 0.5630 Epoch: 19 Global Step: 332100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:59:25,620-Speed 4372.62 samples/sec Loss 0.5599 Epoch: 19 Global Step: 332150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:59:37,142-Speed 4444.11 samples/sec Loss 0.5707 Epoch: 19 Global Step: 332200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-15 23:59:48,660-Speed 4445.24 samples/sec Loss 0.5618 Epoch: 19 Global Step: 332250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:00:00,157-Speed 4453.50 samples/sec Loss 0.5592 Epoch: 19 Global Step: 332300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:00:11,533-Speed 4501.14 samples/sec Loss 0.5647 Epoch: 19 Global Step: 332350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:00:23,775-Speed 4182.27 samples/sec Loss 0.5700 Epoch: 19 Global Step: 332400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:00:35,297-Speed 4444.10 samples/sec Loss 0.5657 Epoch: 19 Global Step: 332450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:00:46,733-Speed 4477.14 samples/sec Loss 0.5652 Epoch: 19 Global Step: 332500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:00:58,309-Speed 4423.18 samples/sec Loss 0.5555 Epoch: 19 Global Step: 332550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:01:09,887-Speed 4422.54 samples/sec Loss 0.5531 Epoch: 19 Global Step: 332600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:01:21,693-Speed 4337.02 samples/sec Loss 0.5633 Epoch: 19 Global Step: 332650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:01:33,225-Speed 4439.84 samples/sec Loss 0.5748 Epoch: 19 Global Step: 332700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:01:44,859-Speed 4401.05 samples/sec Loss 0.5575 Epoch: 19 Global Step: 332750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:01:56,204-Speed 4513.57 samples/sec Loss 0.5710 Epoch: 19 Global Step: 332800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:02:07,866-Speed 4390.47 samples/sec Loss 0.5620 Epoch: 19 Global Step: 332850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:02:20,163-Speed 4164.05 samples/sec Loss 0.5651 Epoch: 19 Global Step: 332900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:02:32,675-Speed 4092.17 samples/sec Loss 0.5632 Epoch: 19 Global Step: 332950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:02:44,152-Speed 4461.48 samples/sec Loss 0.5706 Epoch: 19 Global Step: 333000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:02:55,811-Speed 4391.61 samples/sec Loss 0.5644 Epoch: 19 Global Step: 333050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:03:07,422-Speed 4409.63 samples/sec Loss 0.5699 Epoch: 19 Global Step: 333100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:03:19,113-Speed 4379.75 samples/sec Loss 0.5544 Epoch: 19 Global Step: 333150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:03:30,576-Speed 4466.78 samples/sec Loss 0.5782 Epoch: 19 Global Step: 333200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:03:42,198-Speed 4405.74 samples/sec Loss 0.5852 Epoch: 19 Global Step: 333250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:03:53,798-Speed 4414.03 samples/sec Loss 0.5608 Epoch: 19 Global Step: 333300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:04:05,314-Speed 4445.86 samples/sec Loss 0.5730 Epoch: 19 Global Step: 333350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:04:16,991-Speed 4384.94 samples/sec Loss 0.5519 Epoch: 19 Global Step: 333400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:04:28,544-Speed 4431.85 samples/sec Loss 0.5720 Epoch: 19 Global Step: 333450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:04:40,105-Speed 4429.01 samples/sec Loss 0.5638 Epoch: 19 Global Step: 333500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:04:52,398-Speed 4165.26 samples/sec Loss 0.5814 Epoch: 19 Global Step: 333550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:05:04,146-Speed 4358.14 samples/sec Loss 0.5693 Epoch: 19 Global Step: 333600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:05:16,511-Speed 4141.03 samples/sec Loss 0.5745 Epoch: 19 Global Step: 333650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:05:28,159-Speed 4395.83 samples/sec Loss 0.5593 Epoch: 19 Global Step: 333700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:05:39,711-Speed 4432.20 samples/sec Loss 0.5634 Epoch: 19 Global Step: 333750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-16 00:05:51,355-Speed 4397.27 samples/sec Loss 0.5673 Epoch: 19 Global Step: 333800 Fp16 Grad Scale: 16384 Required: 0 hours