Training: 2021-03-17 21:48:03,448-rank_id: 0
Training: 2021-03-17 21:48:10,195-resume fail, backbone init successfully!
Training: 2021-03-17 21:48:12,183-softmax weight resume fail!
Training: 2021-03-17 21:48:12,184-softmax weight mom resume fail!
Training: 2021-03-17 21:48:12,186-Total Step is: 124550
Training: 2021-03-17 21:48:52,150-Reducer buckets have been rebuilt in this iteration.
Training: 2021-03-17 21:49:12,144-Speed 4802.20 samples/sec Loss 56.3661 Epoch: 0 Global Step: 100 Fp16 Grad Scale: 256 Required: 8 hours
Training: 2021-03-17 21:49:21,895-Speed 5251.22 samples/sec Loss 55.9345 Epoch: 0 Global Step: 150 Fp16 Grad Scale: 256 Required: 8 hours
Training: 2021-03-17 21:49:31,878-Speed 5129.19 samples/sec Loss 53.9155 Epoch: 0 Global Step: 200 Fp16 Grad Scale: 512 Required: 7 hours
Training: 2021-03-17 21:49:41,525-Speed 5307.96 samples/sec Loss 51.4414 Epoch: 0 Global Step: 250 Fp16 Grad Scale: 512 Required: 7 hours
Training: 2021-03-17 21:49:51,325-Speed 5224.83 samples/sec Loss 49.3224 Epoch: 0 Global Step: 300 Fp16 Grad Scale: 1024 Required: 7 hours
Training: 2021-03-17 21:50:01,070-Speed 5253.94 samples/sec Loss 47.5518 Epoch: 0 Global Step: 350 Fp16 Grad Scale: 1024 Required: 7 hours
Training: 2021-03-17 21:50:10,759-Speed 5284.62 samples/sec Loss 46.1695 Epoch: 0 Global Step: 400 Fp16 Grad Scale: 2048 Required: 7 hours
Training: 2021-03-17 21:50:21,798-Speed 4638.49 samples/sec Loss 45.4753 Epoch: 0 Global Step: 450 Fp16 Grad Scale: 2048 Required: 7 hours
Training: 2021-03-17 21:50:31,681-Speed 5181.01 samples/sec Loss 44.9452 Epoch: 0 Global Step: 500 Fp16 Grad Scale: 4096 Required: 7 hours
Training: 2021-03-17 21:50:41,340-Speed 5301.30 samples/sec Loss 44.5123 Epoch: 0 Global Step: 550 Fp16 Grad Scale: 4096 Required: 7 hours
Training: 2021-03-17 21:50:50,974-Speed 5315.13 samples/sec Loss 44.0159 Epoch: 0 Global Step: 600 Fp16 Grad Scale: 8192 Required: 7 hours
Training: 2021-03-17 21:51:00,626-Speed 5305.02 samples/sec Loss 43.7007 Epoch: 0 Global Step: 650 Fp16 Grad Scale: 8192 Required: 7 hours
Training: 2021-03-17 21:51:10,276-Speed 5306.44 samples/sec Loss 43.3346 Epoch: 0 Global Step: 700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:51:19,678-Speed 5445.99 samples/sec Loss 42.9431 Epoch: 0 Global Step: 750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:51:28,948-Speed 5523.29 samples/sec Loss 42.5568 Epoch: 0 Global Step: 800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:51:38,249-Speed 5505.28 samples/sec Loss 42.1404 Epoch: 0 Global Step: 850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:51:47,712-Speed 5410.77 samples/sec Loss 41.6610 Epoch: 0 Global Step: 900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:51:57,325-Speed 5326.35 samples/sec Loss 41.1947 Epoch: 0 Global Step: 950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:52:06,660-Speed 5485.46 samples/sec Loss 40.7021 Epoch: 0 Global Step: 1000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:52:16,117-Speed 5414.12 samples/sec Loss 40.1330 Epoch: 0 Global Step: 1050 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:52:25,771-Speed 5304.04 samples/sec Loss 39.5653 Epoch: 0 Global Step: 1100 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:52:35,311-Speed 5367.12 samples/sec Loss 38.9146 Epoch: 0 Global Step: 1150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:52:44,701-Speed 5453.12 samples/sec Loss 38.2946 Epoch: 0 Global Step: 1200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:52:54,051-Speed 5476.49 samples/sec Loss 37.4315 Epoch: 0 Global Step: 1250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:53:03,570-Speed 5378.78 samples/sec Loss 36.7773 Epoch: 0 Global Step: 1300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:53:13,112-Speed 5366.23 samples/sec Loss 36.0071 Epoch: 0 Global Step: 1350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:53:22,511-Speed 5447.92 samples/sec Loss 35.2078 Epoch: 0 Global Step: 1400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:53:31,917-Speed 5443.58 samples/sec Loss 34.3247 Epoch: 0 Global Step: 1450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:53:41,329-Speed 5439.98 samples/sec Loss 33.5242 Epoch: 0 Global Step: 1500 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:53:50,815-Speed 5398.09 samples/sec Loss 32.7252 Epoch: 0 Global Step: 1550 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:54:00,238-Speed 5434.01 samples/sec Loss 31.9180 Epoch: 0 Global Step: 1600 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:54:09,739-Speed 5389.34 samples/sec Loss 31.1384 Epoch: 0 Global Step: 1650 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:54:19,732-Speed 5123.78 samples/sec Loss 30.3958 Epoch: 0 Global Step: 1700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:54:29,407-Speed 5292.58 samples/sec Loss 29.6963 Epoch: 0 Global Step: 1750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:54:39,324-Speed 5162.76 samples/sec Loss 29.0635 Epoch: 0 Global Step: 1800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:54:49,045-Speed 5267.58 samples/sec Loss 28.3328 Epoch: 0 Global Step: 1850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:54:58,441-Speed 5449.28 samples/sec Loss 27.8047 Epoch: 0 Global Step: 1900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:55:07,973-Speed 5371.71 samples/sec Loss 27.2298 Epoch: 0 Global Step: 1950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:55:17,404-Speed 5429.04 samples/sec Loss 26.6488 Epoch: 0 Global Step: 2000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:55:35,578-[lfw][2000]XNorm: 23.546258
Training: 2021-03-17 21:55:35,578-[lfw][2000]Accuracy-Flip: 0.97167+-0.00662
Training: 2021-03-17 21:55:35,579-[lfw][2000]Accuracy-Highest: 0.97167
Training: 2021-03-17 21:55:56,250-[cfp_fp][2000]XNorm: 19.143787
Training: 2021-03-17 21:55:56,250-[cfp_fp][2000]Accuracy-Flip: 0.81786+-0.01623
Training: 2021-03-17 21:55:56,250-[cfp_fp][2000]Accuracy-Highest: 0.81786
Training: 2021-03-17 21:56:14,253-[agedb_30][2000]XNorm: 22.067003
Training: 2021-03-17 21:56:14,253-[agedb_30][2000]Accuracy-Flip: 0.85183+-0.02375
Training: 2021-03-17 21:56:14,253-[agedb_30][2000]Accuracy-Highest: 0.85183
Training: 2021-03-17 21:56:23,717-Speed 772.11 samples/sec Loss 26.1256 Epoch: 0 Global Step: 2050 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-17 21:56:33,024-Speed 5501.96 samples/sec Loss 25.5930 Epoch: 0 Global Step: 2100 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-17 21:56:42,431-Speed 5442.90 samples/sec Loss 25.3210 Epoch: 0 Global Step: 2150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:56:51,804-Speed 5462.86 samples/sec Loss 24.8110 Epoch: 0 Global Step: 2200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:57:01,309-Speed 5387.26 samples/sec Loss 24.4338 Epoch: 0 Global Step: 2250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:57:10,669-Speed 5470.29 samples/sec Loss 24.0648 Epoch: 0 Global Step: 2300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:57:20,199-Speed 5372.98 samples/sec Loss 23.7370 Epoch: 0 Global Step: 2350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:57:29,518-Speed 5494.21 samples/sec Loss 23.4394 Epoch: 0 Global Step: 2400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:57:39,015-Speed 5391.58 samples/sec Loss 23.1105 Epoch: 0 Global Step: 2450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:57:48,493-Speed 5402.25 samples/sec Loss 22.9217 Epoch: 0 Global Step: 2500 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:57:57,934-Speed 5423.44 samples/sec Loss 22.6385 Epoch: 0 Global Step: 2550 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:58:07,537-Speed 5332.50 samples/sec Loss 22.2754 Epoch: 0 Global Step: 2600 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:58:16,972-Speed 5426.94 samples/sec Loss 22.0433 Epoch: 0 Global Step: 2650 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:58:26,399-Speed 5431.37 samples/sec Loss 21.9430 Epoch: 0 Global Step: 2700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:58:35,914-Speed 5381.43 samples/sec Loss 21.7049 Epoch: 0 Global Step: 2750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:58:45,502-Speed 5340.63 samples/sec Loss 21.6310 Epoch: 0 Global Step: 2800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:58:54,888-Speed 5455.20 samples/sec Loss 21.5356 Epoch: 0 Global Step: 2850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:59:04,404-Speed 5380.68 samples/sec Loss 21.2773 Epoch: 0 Global Step: 2900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:59:13,785-Speed 5458.23 samples/sec Loss 21.1034 Epoch: 0 Global Step: 2950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:59:23,014-Speed 5548.12 samples/sec Loss 21.0128 Epoch: 0 Global Step: 3000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:59:32,415-Speed 5446.39 samples/sec Loss 20.9475 Epoch: 0 Global Step: 3050 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:59:41,968-Speed 5360.29 samples/sec Loss 20.8278 Epoch: 0 Global Step: 3100 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 21:59:51,834-Speed 5189.54 samples/sec Loss 20.5728 Epoch: 0 Global Step: 3150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:00:01,351-Speed 5380.07 samples/sec Loss 20.5743 Epoch: 0 Global Step: 3200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:00:10,677-Speed 5490.43 samples/sec Loss 20.4370 Epoch: 0 Global Step: 3250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:00:20,173-Speed 5392.46 samples/sec Loss 20.2880 Epoch: 0 Global Step: 3300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:00:29,783-Speed 5327.79 samples/sec Loss 20.2188 Epoch: 0 Global Step: 3350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:00:39,286-Speed 5388.32 samples/sec Loss 20.0719 Epoch: 0 Global Step: 3400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:00:49,112-Speed 5210.66 samples/sec Loss 20.0080 Epoch: 0 Global Step: 3450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:00:58,698-Speed 5342.20 samples/sec Loss 19.9646 Epoch: 0 Global Step: 3500 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:01:08,519-Speed 5213.35 samples/sec Loss 19.9067 Epoch: 0 Global Step: 3550 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:01:18,170-Speed 5305.67 samples/sec Loss 19.8644 Epoch: 0 Global Step: 3600 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:01:27,718-Speed 5362.85 samples/sec Loss 19.6879 Epoch: 0 Global Step: 3650 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:01:37,174-Speed 5414.96 samples/sec Loss 19.6480 Epoch: 0 Global Step: 3700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:01:46,724-Speed 5361.62 samples/sec Loss 19.6492 Epoch: 0 Global Step: 3750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:01:56,059-Speed 5485.08 samples/sec Loss 19.5080 Epoch: 0 Global Step: 3800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:02:05,696-Speed 5312.83 samples/sec Loss 19.4267 Epoch: 0 Global Step: 3850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:02:15,303-Speed 5329.86 samples/sec Loss 19.3586 Epoch: 0 Global Step: 3900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:02:24,656-Speed 5474.42 samples/sec Loss 19.2534 Epoch: 0 Global Step: 3950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:02:34,042-Speed 5455.44 samples/sec Loss 19.1920 Epoch: 0 Global Step: 4000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:02:50,748-[lfw][4000]XNorm: 23.058168
Training: 2021-03-17 22:02:50,748-[lfw][4000]Accuracy-Flip: 0.98883+-0.00615
Training: 2021-03-17 22:02:50,748-[lfw][4000]Accuracy-Highest: 0.98883
Training: 2021-03-17 22:03:09,295-[cfp_fp][4000]XNorm: 18.948849
Training: 2021-03-17 22:03:09,295-[cfp_fp][4000]Accuracy-Flip: 0.89357+-0.01701
Training: 2021-03-17 22:03:09,295-[cfp_fp][4000]Accuracy-Highest: 0.89357
Training: 2021-03-17 22:03:25,209-[agedb_30][4000]XNorm: 21.612716
Training: 2021-03-17 22:03:25,209-[agedb_30][4000]Accuracy-Flip: 0.90600+-0.01997
Training: 2021-03-17 22:03:25,209-[agedb_30][4000]Accuracy-Highest: 0.90600
Training: 2021-03-17 22:03:34,627-Speed 845.09 samples/sec Loss 19.1949 Epoch: 0 Global Step: 4050 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:03:43,913-Speed 5513.97 samples/sec Loss 19.2069 Epoch: 0 Global Step: 4100 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:03:53,388-Speed 5404.30 samples/sec Loss 19.1898 Epoch: 0 Global Step: 4150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:04:02,660-Speed 5522.55 samples/sec Loss 19.0859 Epoch: 0 Global Step: 4200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:04:11,906-Speed 5537.72 samples/sec Loss 19.2021 Epoch: 0 Global Step: 4250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:04:21,293-Speed 5454.48 samples/sec Loss 19.0536 Epoch: 0 Global Step: 4300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:04:30,643-Speed 5476.71 samples/sec Loss 18.9613 Epoch: 0 Global Step: 4350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:04:40,131-Speed 5396.56 samples/sec Loss 18.9237 Epoch: 0 Global Step: 4400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:04:49,427-Speed 5508.20 samples/sec Loss 18.9133 Epoch: 0 Global Step: 4450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:04:58,759-Speed 5486.59 samples/sec Loss 18.8925 Epoch: 0 Global Step: 4500 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:05:08,269-Speed 5384.65 samples/sec Loss 18.7347 Epoch: 0 Global Step: 4550 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:05:17,677-Speed 5442.40 samples/sec Loss 18.7727 Epoch: 0 Global Step: 4600 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:05:27,241-Speed 5353.64 samples/sec Loss 18.6398 Epoch: 0 Global Step: 4650 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:05:36,969-Speed 5263.64 samples/sec Loss 18.7401 Epoch: 0 Global Step: 4700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:05:46,356-Speed 5454.81 samples/sec Loss 18.6341 Epoch: 0 Global Step: 4750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:05:55,877-Speed 5377.75 samples/sec Loss 18.6089 Epoch: 0 Global Step: 4800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:06:05,266-Speed 5453.52 samples/sec Loss 18.6245 Epoch: 0 Global Step: 4850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:06:14,401-Speed 5605.56 samples/sec Loss 18.5925 Epoch: 0 Global Step: 4900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:06:23,988-Speed 5340.72 samples/sec Loss 18.5198 Epoch: 0 Global Step: 4950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:06:34,659-Speed 4798.37 samples/sec Loss 18.1914 Epoch: 1 Global Step: 5000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:06:44,266-Speed 5329.68 samples/sec Loss 17.7235 Epoch: 1 Global Step: 5050 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:06:54,021-Speed 5249.13 samples/sec Loss 17.7575 Epoch: 1 Global Step: 5100 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:07:03,579-Speed 5357.10 samples/sec Loss 17.8764 Epoch: 1 Global Step: 5150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:07:13,213-Speed 5314.94 samples/sec Loss 17.8391 Epoch: 1 Global Step: 5200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:07:22,637-Speed 5433.64 samples/sec Loss 18.0062 Epoch: 1 Global Step: 5250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:07:32,058-Speed 5434.58 samples/sec Loss 17.9836 Epoch: 1 Global Step: 5300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:07:41,836-Speed 5236.97 samples/sec Loss 18.0687 Epoch: 1 Global Step: 5350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:07:51,154-Speed 5494.56 samples/sec Loss 18.1271 Epoch: 1 Global Step: 5400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:08:00,546-Speed 5452.10 samples/sec Loss 18.1466 Epoch: 1 Global Step: 5450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:08:11,066-Speed 4867.04 samples/sec Loss 18.0927 Epoch: 1 Global Step: 5500 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:08:20,679-Speed 5326.41 samples/sec Loss 18.0866 Epoch: 1 Global Step: 5550 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:08:30,906-Speed 5006.85 samples/sec Loss 18.0897 Epoch: 1 Global Step: 5600 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:08:40,227-Speed 5493.26 samples/sec Loss 18.0217 Epoch: 1 Global Step: 5650 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:08:49,684-Speed 5414.29 samples/sec Loss 18.0900 Epoch: 1 Global Step: 5700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:08:59,078-Speed 5451.03 samples/sec Loss 18.0824 Epoch: 1 Global Step: 5750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:09:08,736-Speed 5301.35 samples/sec Loss 18.0370 Epoch: 1 Global Step: 5800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:09:18,366-Speed 5317.14 samples/sec Loss 17.9483 Epoch: 1 Global Step: 5850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:09:27,903-Speed 5368.67 samples/sec Loss 18.0401 Epoch: 1 Global Step: 5900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:09:37,457-Speed 5359.55 samples/sec Loss 17.9165 Epoch: 1 Global Step: 5950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:09:46,746-Speed 5511.97 samples/sec Loss 18.0554 Epoch: 1 Global Step: 6000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:10:03,286-[lfw][6000]XNorm: 24.499241
Training: 2021-03-17 22:10:03,286-[lfw][6000]Accuracy-Flip: 0.99033+-0.00464
Training: 2021-03-17 22:10:03,286-[lfw][6000]Accuracy-Highest: 0.99033
Training: 2021-03-17 22:10:21,713-[cfp_fp][6000]XNorm: 20.013780
Training: 2021-03-17 22:10:21,713-[cfp_fp][6000]Accuracy-Flip: 0.89043+-0.01419
Training: 2021-03-17 22:10:21,714-[cfp_fp][6000]Accuracy-Highest: 0.89357
Training: 2021-03-17 22:10:37,668-[agedb_30][6000]XNorm: 22.975213
Training: 2021-03-17 22:10:37,668-[agedb_30][6000]Accuracy-Flip: 0.92100+-0.01482
Training: 2021-03-17 22:10:37,668-[agedb_30][6000]Accuracy-Highest: 0.92100
Training: 2021-03-17 22:10:47,204-Speed 846.89 samples/sec Loss 17.9337 Epoch: 1 Global Step: 6050 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:10:56,788-Speed 5342.58 samples/sec Loss 17.9521 Epoch: 1 Global Step: 6100 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:11:06,344-Speed 5358.12 samples/sec Loss 17.9693 Epoch: 1 Global Step: 6150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:11:16,026-Speed 5288.66 samples/sec Loss 17.9495 Epoch: 1 Global Step: 6200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:11:25,418-Speed 5451.68 samples/sec Loss 17.8320 Epoch: 1 Global Step: 6250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:11:34,935-Speed 5379.82 samples/sec Loss 17.8434 Epoch: 1 Global Step: 6300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:11:44,356-Speed 5435.13 samples/sec Loss 17.8551 Epoch: 1 Global Step: 6350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:11:53,876-Speed 5378.56 samples/sec Loss 17.8225 Epoch: 1 Global Step: 6400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:12:03,247-Speed 5464.09 samples/sec Loss 17.7956 Epoch: 1 Global Step: 6450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:12:12,676-Speed 5430.27 samples/sec Loss 17.8611 Epoch: 1 Global Step: 6500 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:12:22,129-Speed 5416.57 samples/sec Loss 17.8678 Epoch: 1 Global Step: 6550 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:12:31,743-Speed 5325.88 samples/sec Loss 17.7456 Epoch: 1 Global Step: 6600 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:12:41,186-Speed 5422.83 samples/sec Loss 17.8441 Epoch: 1 Global Step: 6650 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:12:50,831-Speed 5308.46 samples/sec Loss 17.7519 Epoch: 1 Global Step: 6700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:13:00,518-Speed 5286.08 samples/sec Loss 17.7630 Epoch: 1 Global Step: 6750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:13:09,969-Speed 5417.67 samples/sec Loss 17.8216 Epoch: 1 Global Step: 6800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:13:19,243-Speed 5521.05 samples/sec Loss 17.6342 Epoch: 1 Global Step: 6850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:13:28,752-Speed 5385.23 samples/sec Loss 17.5887 Epoch: 1 Global Step: 6900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:13:38,210-Speed 5413.47 samples/sec Loss 17.6551 Epoch: 1 Global Step: 6950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:13:47,573-Speed 5468.82 samples/sec Loss 17.6052 Epoch: 1 Global Step: 7000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:13:56,981-Speed 5442.24 samples/sec Loss 17.6729 Epoch: 1 Global Step: 7050 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:14:06,326-Speed 5479.05 samples/sec Loss 17.6794 Epoch: 1 Global Step: 7100 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:14:16,211-Speed 5179.89 samples/sec Loss 17.5017 Epoch: 1 Global Step: 7150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:14:25,866-Speed 5303.55 samples/sec Loss 17.6735 Epoch: 1 Global Step: 7200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:14:35,244-Speed 5459.71 samples/sec Loss 17.6351 Epoch: 1 Global Step: 7250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:14:44,737-Speed 5393.95 samples/sec Loss 17.4922 Epoch: 1 Global Step: 7300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:14:54,568-Speed 5208.00 samples/sec Loss 17.5862 Epoch: 1 Global Step: 7350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:15:03,900-Speed 5486.57 samples/sec Loss 17.5152 Epoch: 1 Global Step: 7400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:15:13,359-Speed 5413.27 samples/sec Loss 17.4060 Epoch: 1 Global Step: 7450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:15:22,690-Speed 5487.73 samples/sec Loss 17.4545 Epoch: 1 Global Step: 7500 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:15:32,085-Speed 5449.86 samples/sec Loss 17.5307 Epoch: 1 Global Step: 7550 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:15:41,897-Speed 5218.41 samples/sec Loss 17.3541 Epoch: 1 Global Step: 7600 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:15:51,766-Speed 5188.33 samples/sec Loss 17.3057 Epoch: 1 Global Step: 7650 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:16:01,531-Speed 5243.46 samples/sec Loss 17.4150 Epoch: 1 Global Step: 7700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:16:11,480-Speed 5146.72 samples/sec Loss 17.4240 Epoch: 1 Global Step: 7750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:16:20,896-Speed 5437.65 samples/sec Loss 17.4288 Epoch: 1 Global Step: 7800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:16:30,251-Speed 5473.69 samples/sec Loss 17.4217 Epoch: 1 Global Step: 7850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:16:39,798-Speed 5363.02 samples/sec Loss 17.4797 Epoch: 1 Global Step: 7900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:16:49,189-Speed 5452.45 samples/sec Loss 17.4454 Epoch: 1 Global Step: 7950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:16:58,720-Speed 5372.17 samples/sec Loss 17.4408 Epoch: 1 Global Step: 8000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:17:15,313-[lfw][8000]XNorm: 23.745133
Training: 2021-03-17 22:17:15,313-[lfw][8000]Accuracy-Flip: 0.99067+-0.00461
Training: 2021-03-17 22:17:15,313-[lfw][8000]Accuracy-Highest: 0.99067
Training: 2021-03-17 22:17:33,682-[cfp_fp][8000]XNorm: 19.540213
Training: 2021-03-17 22:17:33,683-[cfp_fp][8000]Accuracy-Flip: 0.90586+-0.01313
Training: 2021-03-17 22:17:33,683-[cfp_fp][8000]Accuracy-Highest: 0.90586
Training: 2021-03-17 22:17:49,580-[agedb_30][8000]XNorm: 22.427195
Training: 2021-03-17 22:17:49,580-[agedb_30][8000]Accuracy-Flip: 0.92450+-0.01745
Training: 2021-03-17 22:17:49,580-[agedb_30][8000]Accuracy-Highest: 0.92450
Training: 2021-03-17 22:17:59,037-Speed 848.86 samples/sec Loss 17.3199 Epoch: 1 Global Step: 8050 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:18:08,658-Speed 5321.79 samples/sec Loss 17.3521 Epoch: 1 Global Step: 8100 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:18:18,221-Speed 5354.68 samples/sec Loss 17.2776 Epoch: 1 Global Step: 8150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:18:27,487-Speed 5525.34 samples/sec Loss 17.3619 Epoch: 1 Global Step: 8200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:18:36,912-Speed 5432.85 samples/sec Loss 17.3375 Epoch: 1 Global Step: 8250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:18:46,294-Speed 5457.87 samples/sec Loss 17.3063 Epoch: 1 Global Step: 8300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:18:55,910-Speed 5324.71 samples/sec Loss 17.3091 Epoch: 1 Global Step: 8350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:19:05,592-Speed 5288.51 samples/sec Loss 17.1618 Epoch: 1 Global Step: 8400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:19:14,836-Speed 5538.59 samples/sec Loss 17.3334 Epoch: 1 Global Step: 8450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:19:24,137-Speed 5505.28 samples/sec Loss 17.2936 Epoch: 1 Global Step: 8500 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:19:33,564-Speed 5431.79 samples/sec Loss 17.2565 Epoch: 1 Global Step: 8550 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:19:42,881-Speed 5495.63 samples/sec Loss 17.2534 Epoch: 1 Global Step: 8600 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:19:52,626-Speed 5254.14 samples/sec Loss 17.3234 Epoch: 1 Global Step: 8650 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:20:02,233-Speed 5330.04 samples/sec Loss 17.2601 Epoch: 1 Global Step: 8700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:20:11,685-Speed 5417.04 samples/sec Loss 17.2230 Epoch: 1 Global Step: 8750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:20:20,989-Speed 5503.12 samples/sec Loss 17.1640 Epoch: 1 Global Step: 8800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:20:30,465-Speed 5403.74 samples/sec Loss 17.1743 Epoch: 1 Global Step: 8850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:20:39,775-Speed 5499.57 samples/sec Loss 17.2462 Epoch: 1 Global Step: 8900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:20:49,305-Speed 5372.79 samples/sec Loss 17.2305 Epoch: 1 Global Step: 8950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:20:58,748-Speed 5422.13 samples/sec Loss 17.1548 Epoch: 1 Global Step: 9000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:21:08,021-Speed 5521.78 samples/sec Loss 17.1607 Epoch: 1 Global Step: 9050 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:21:17,568-Speed 5363.47 samples/sec Loss 17.1692 Epoch: 1 Global Step: 9100 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:21:26,877-Speed 5500.74 samples/sec Loss 17.2552 Epoch: 1 Global Step: 9150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:21:36,197-Speed 5493.79 samples/sec Loss 17.1856 Epoch: 1 Global Step: 9200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:21:45,345-Speed 5597.17 samples/sec Loss 17.0938 Epoch: 1 Global Step: 9250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:21:54,843-Speed 5390.99 samples/sec Loss 17.0242 Epoch: 1 Global Step: 9300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:22:04,528-Speed 5286.69 samples/sec Loss 17.1124 Epoch: 1 Global Step: 9350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:22:13,953-Speed 5433.12 samples/sec Loss 17.1342 Epoch: 1 Global Step: 9400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:22:23,435-Speed 5399.84 samples/sec Loss 17.0685 Epoch: 1 Global Step: 9450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:22:32,850-Speed 5438.23 samples/sec Loss 17.0657 Epoch: 1 Global Step: 9500 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:22:42,500-Speed 5306.53 samples/sec Loss 17.1229 Epoch: 1 Global Step: 9550 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:22:52,042-Speed 5365.91 samples/sec Loss 17.0277 Epoch: 1 Global Step: 9600 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:23:01,426-Speed 5456.51 samples/sec Loss 17.0192 Epoch: 1 Global Step: 9650 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:23:10,865-Speed 5424.54 samples/sec Loss 17.0863 Epoch: 1 Global Step: 9700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:23:20,356-Speed 5395.04 samples/sec Loss 16.9545 Epoch: 1 Global Step: 9750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:23:30,084-Speed 5263.15 samples/sec Loss 16.9830 Epoch: 1 Global Step: 9800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:23:39,877-Speed 5228.80 samples/sec Loss 17.0358 Epoch: 1 Global Step: 9850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:23:49,283-Speed 5443.50 samples/sec Loss 17.0774 Epoch: 1 Global Step: 9900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:23:59,412-Speed 5055.05 samples/sec Loss 17.0578 Epoch: 1 Global Step: 9950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:24:12,303-Speed 3971.96 samples/sec Loss 16.3617 Epoch: 2 Global Step: 10000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:24:29,137-[lfw][10000]XNorm: 23.722221
Training: 2021-03-17 22:24:29,138-[lfw][10000]Accuracy-Flip: 0.99133+-0.00470
Training: 2021-03-17 22:24:29,138-[lfw][10000]Accuracy-Highest: 0.99133
Training: 2021-03-17 22:24:47,639-[cfp_fp][10000]XNorm: 19.613400
Training: 2021-03-17 22:24:47,639-[cfp_fp][10000]Accuracy-Flip: 0.90043+-0.01847
Training: 2021-03-17 22:24:47,639-[cfp_fp][10000]Accuracy-Highest: 0.90586
Training: 2021-03-17 22:25:03,761-[agedb_30][10000]XNorm: 22.916831
Training: 2021-03-17 22:25:03,761-[agedb_30][10000]Accuracy-Flip: 0.93400+-0.01711
Training: 2021-03-17 22:25:03,761-[agedb_30][10000]Accuracy-Highest: 0.93400
Training: 2021-03-17 22:25:13,097-Speed 842.20 samples/sec Loss 16.2315 Epoch: 2 Global Step: 10050 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:25:22,714-Speed 5324.49 samples/sec Loss 16.4057 Epoch: 2 Global Step: 10100 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:25:32,435-Speed 5267.31 samples/sec Loss 16.4703 Epoch: 2 Global Step: 10150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:25:41,975-Speed 5367.16 samples/sec Loss 16.5653 Epoch: 2 Global Step: 10200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:25:51,406-Speed 5429.39 samples/sec Loss 16.6587 Epoch: 2 Global Step: 10250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:26:00,940-Speed 5370.37 samples/sec Loss 16.7072 Epoch: 2 Global Step: 10300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:26:10,224-Speed 5515.28 samples/sec Loss 16.8121 Epoch: 2 Global Step: 10350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:26:19,789-Speed 5353.00 samples/sec Loss 16.8419 Epoch: 2 Global Step: 10400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:26:29,355-Speed 5352.87 samples/sec Loss 16.8676 Epoch: 2 Global Step: 10450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:26:38,999-Speed 5309.45 samples/sec Loss 16.7779 Epoch: 2 Global Step: 10500 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:26:48,567-Speed 5351.45 samples/sec Loss 16.9437 Epoch: 2 Global Step: 10550 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:26:58,146-Speed 5345.48 samples/sec Loss 16.9216 Epoch: 2 Global Step: 10600 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:27:07,695-Speed 5361.62 samples/sec Loss 16.8399 Epoch: 2 Global Step: 10650 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:27:17,216-Speed 5378.30 samples/sec Loss 16.8428 Epoch: 2 Global Step: 10700 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:27:26,870-Speed 5303.71 samples/sec Loss 16.9076 Epoch: 2 Global Step: 10750 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:27:36,469-Speed 5334.50 samples/sec Loss 16.8749 Epoch: 2 Global Step: 10800 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:27:45,985-Speed 5380.72 samples/sec Loss 16.9125 Epoch: 2 Global Step: 10850 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:27:55,293-Speed 5501.03 samples/sec Loss 16.8816 Epoch: 2 Global Step: 10900 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:28:04,678-Speed 5455.84 samples/sec Loss 16.8442 Epoch: 2 Global Step: 10950 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:28:14,101-Speed 5434.31 samples/sec Loss 16.8088 Epoch: 2 Global Step: 11000 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:28:23,904-Speed 5222.97 samples/sec Loss 16.8019 Epoch: 2 Global Step: 11050 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:28:33,421-Speed 5380.19 samples/sec Loss 16.8753 Epoch: 2 Global Step: 11100 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:28:42,802-Speed 5458.20 samples/sec Loss 16.8684 Epoch: 2 Global Step: 11150 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:28:52,575-Speed 5239.65 samples/sec Loss 16.8184 Epoch: 2 Global Step: 11200 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:29:02,067-Speed 5394.07 samples/sec Loss 16.7827 Epoch: 2 Global Step: 11250 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:29:11,739-Speed 5294.56 samples/sec Loss 16.7466 Epoch: 2 Global Step: 11300 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:29:21,454-Speed 5270.68 samples/sec Loss 16.8978 Epoch: 2 Global Step: 11350 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:29:30,628-Speed 5581.10 samples/sec Loss 16.8292 Epoch: 2 Global Step: 11400 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:29:40,437-Speed 5219.92 samples/sec Loss 16.8232 Epoch: 2 Global Step: 11450 Fp16 Grad Scale: 16384 Required: 7 hours
Training: 2021-03-17 22:29:49,790-Speed 5474.81 samples/sec Loss 16.9371 Epoch: 2 Global Step: 11500 
Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:29:59,348-Speed 5356.69 samples/sec Loss 16.8409 Epoch: 2 Global Step: 11550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:30:08,649-Speed 5505.64 samples/sec Loss 16.8986 Epoch: 2 Global Step: 11600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:30:18,109-Speed 5412.14 samples/sec Loss 16.8248 Epoch: 2 Global Step: 11650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:30:27,689-Speed 5345.12 samples/sec Loss 16.7642 Epoch: 2 Global Step: 11700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:30:37,267-Speed 5345.67 samples/sec Loss 16.8898 Epoch: 2 Global Step: 11750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:30:46,699-Speed 5429.10 samples/sec Loss 16.7087 Epoch: 2 Global Step: 11800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:30:56,142-Speed 5422.55 samples/sec Loss 16.7270 Epoch: 2 Global Step: 11850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:31:05,552-Speed 5440.95 samples/sec Loss 16.7078 Epoch: 2 Global Step: 11900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:31:14,917-Speed 5467.66 samples/sec Loss 16.7622 Epoch: 2 Global Step: 11950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:31:24,294-Speed 5460.62 samples/sec Loss 16.7830 Epoch: 2 Global Step: 12000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:31:40,891-[lfw][12000]XNorm: 24.056785 Training: 2021-03-17 22:31:40,892-[lfw][12000]Accuracy-Flip: 0.98967+-0.00600 Training: 2021-03-17 22:31:40,894-[lfw][12000]Accuracy-Highest: 0.99133 Training: 2021-03-17 22:31:59,333-[cfp_fp][12000]XNorm: 19.556121 Training: 2021-03-17 22:31:59,334-[cfp_fp][12000]Accuracy-Flip: 0.91357+-0.01310 Training: 2021-03-17 22:31:59,334-[cfp_fp][12000]Accuracy-Highest: 0.91357 Training: 2021-03-17 22:32:15,386-[agedb_30][12000]XNorm: 22.957785 Training: 2021-03-17 
22:32:15,386-[agedb_30][12000]Accuracy-Flip: 0.92800+-0.01206 Training: 2021-03-17 22:32:15,386-[agedb_30][12000]Accuracy-Highest: 0.93400 Training: 2021-03-17 22:32:25,470-Speed 836.94 samples/sec Loss 16.8526 Epoch: 2 Global Step: 12050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:32:34,728-Speed 5530.49 samples/sec Loss 16.8091 Epoch: 2 Global Step: 12100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:32:44,215-Speed 5397.09 samples/sec Loss 16.6878 Epoch: 2 Global Step: 12150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:32:54,141-Speed 5158.49 samples/sec Loss 16.6819 Epoch: 2 Global Step: 12200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:33:03,747-Speed 5330.37 samples/sec Loss 16.7687 Epoch: 2 Global Step: 12250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:33:13,256-Speed 5384.88 samples/sec Loss 16.6685 Epoch: 2 Global Step: 12300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:33:22,489-Speed 5546.09 samples/sec Loss 16.6747 Epoch: 2 Global Step: 12350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:33:31,947-Speed 5413.68 samples/sec Loss 16.5915 Epoch: 2 Global Step: 12400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:33:41,431-Speed 5398.68 samples/sec Loss 16.7427 Epoch: 2 Global Step: 12450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:33:50,788-Speed 5472.23 samples/sec Loss 16.7898 Epoch: 2 Global Step: 12500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:34:00,406-Speed 5324.08 samples/sec Loss 16.6779 Epoch: 2 Global Step: 12550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:34:09,956-Speed 5361.54 samples/sec Loss 16.6820 Epoch: 2 Global Step: 12600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:34:19,173-Speed 5555.42 samples/sec Loss 16.7229 Epoch: 2 Global Step: 12650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 
2021-03-17 22:34:28,770-Speed 5335.40 samples/sec Loss 16.6771 Epoch: 2 Global Step: 12700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:34:38,006-Speed 5543.72 samples/sec Loss 16.7884 Epoch: 2 Global Step: 12750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:34:47,526-Speed 5378.77 samples/sec Loss 16.6417 Epoch: 2 Global Step: 12800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:34:56,957-Speed 5428.89 samples/sec Loss 16.6356 Epoch: 2 Global Step: 12850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:35:06,446-Speed 5395.95 samples/sec Loss 16.6633 Epoch: 2 Global Step: 12900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:35:16,022-Speed 5347.32 samples/sec Loss 16.7338 Epoch: 2 Global Step: 12950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:35:25,394-Speed 5463.27 samples/sec Loss 16.6651 Epoch: 2 Global Step: 13000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:35:34,866-Speed 5406.19 samples/sec Loss 16.6119 Epoch: 2 Global Step: 13050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:35:44,252-Speed 5455.35 samples/sec Loss 16.7062 Epoch: 2 Global Step: 13100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:35:53,758-Speed 5386.23 samples/sec Loss 16.6972 Epoch: 2 Global Step: 13150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:36:03,114-Speed 5472.94 samples/sec Loss 16.6733 Epoch: 2 Global Step: 13200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:36:12,568-Speed 5415.58 samples/sec Loss 16.6373 Epoch: 2 Global Step: 13250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:36:21,855-Speed 5514.11 samples/sec Loss 16.5966 Epoch: 2 Global Step: 13300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:36:31,418-Speed 5354.25 samples/sec Loss 16.5926 Epoch: 2 Global Step: 13350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 
22:36:40,829-Speed 5440.86 samples/sec Loss 16.6752 Epoch: 2 Global Step: 13400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:36:50,548-Speed 5268.02 samples/sec Loss 16.7202 Epoch: 2 Global Step: 13450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:37:00,176-Speed 5318.13 samples/sec Loss 16.6850 Epoch: 2 Global Step: 13500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:37:09,753-Speed 5346.41 samples/sec Loss 16.5432 Epoch: 2 Global Step: 13550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:37:19,056-Speed 5503.95 samples/sec Loss 16.5794 Epoch: 2 Global Step: 13600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:37:28,680-Speed 5321.06 samples/sec Loss 16.6170 Epoch: 2 Global Step: 13650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:37:38,234-Speed 5359.21 samples/sec Loss 16.6980 Epoch: 2 Global Step: 13700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:37:47,593-Speed 5471.32 samples/sec Loss 16.6811 Epoch: 2 Global Step: 13750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:37:57,085-Speed 5394.31 samples/sec Loss 16.5539 Epoch: 2 Global Step: 13800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:38:06,525-Speed 5423.96 samples/sec Loss 16.6685 Epoch: 2 Global Step: 13850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:38:16,079-Speed 5359.39 samples/sec Loss 16.6281 Epoch: 2 Global Step: 13900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:38:25,443-Speed 5468.21 samples/sec Loss 16.6559 Epoch: 2 Global Step: 13950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:38:34,887-Speed 5421.85 samples/sec Loss 16.5135 Epoch: 2 Global Step: 14000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:38:51,502-[lfw][14000]XNorm: 23.204801 Training: 2021-03-17 22:38:51,502-[lfw][14000]Accuracy-Flip: 0.98967+-0.00591 Training: 2021-03-17 
22:38:51,502-[lfw][14000]Accuracy-Highest: 0.99133 Training: 2021-03-17 22:39:09,967-[cfp_fp][14000]XNorm: 19.049849 Training: 2021-03-17 22:39:09,968-[cfp_fp][14000]Accuracy-Flip: 0.90900+-0.01616 Training: 2021-03-17 22:39:09,968-[cfp_fp][14000]Accuracy-Highest: 0.91357 Training: 2021-03-17 22:39:25,936-[agedb_30][14000]XNorm: 22.296440 Training: 2021-03-17 22:39:25,936-[agedb_30][14000]Accuracy-Flip: 0.93267+-0.01635 Training: 2021-03-17 22:39:25,936-[agedb_30][14000]Accuracy-Highest: 0.93400 Training: 2021-03-17 22:39:35,152-Speed 849.59 samples/sec Loss 16.5919 Epoch: 2 Global Step: 14050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:39:44,711-Speed 5357.06 samples/sec Loss 16.6294 Epoch: 2 Global Step: 14100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:39:54,332-Speed 5322.18 samples/sec Loss 16.6225 Epoch: 2 Global Step: 14150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:40:03,774-Speed 5422.70 samples/sec Loss 16.6895 Epoch: 2 Global Step: 14200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:40:13,228-Speed 5415.95 samples/sec Loss 16.6270 Epoch: 2 Global Step: 14250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:40:22,961-Speed 5260.60 samples/sec Loss 16.5428 Epoch: 2 Global Step: 14300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:40:32,400-Speed 5424.82 samples/sec Loss 16.5835 Epoch: 2 Global Step: 14350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:40:42,045-Speed 5308.93 samples/sec Loss 16.5250 Epoch: 2 Global Step: 14400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:40:52,002-Speed 5142.20 samples/sec Loss 16.4751 Epoch: 2 Global Step: 14450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:41:01,867-Speed 5190.55 samples/sec Loss 16.5017 Epoch: 2 Global Step: 14500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:41:11,232-Speed 5467.58 samples/sec Loss 16.7406 Epoch: 
2 Global Step: 14550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:41:20,539-Speed 5501.73 samples/sec Loss 16.6063 Epoch: 2 Global Step: 14600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:41:29,761-Speed 5552.13 samples/sec Loss 16.6587 Epoch: 2 Global Step: 14650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:41:39,179-Speed 5436.87 samples/sec Loss 16.5102 Epoch: 2 Global Step: 14700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:41:48,537-Speed 5471.62 samples/sec Loss 16.5103 Epoch: 2 Global Step: 14750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:41:57,865-Speed 5489.26 samples/sec Loss 16.5160 Epoch: 2 Global Step: 14800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:42:07,303-Speed 5425.18 samples/sec Loss 16.5066 Epoch: 2 Global Step: 14850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:42:16,921-Speed 5323.97 samples/sec Loss 16.5867 Epoch: 2 Global Step: 14900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:42:28,942-Speed 4259.22 samples/sec Loss 16.5402 Epoch: 3 Global Step: 14950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:42:38,959-Speed 5111.84 samples/sec Loss 15.7323 Epoch: 3 Global Step: 15000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:42:48,611-Speed 5304.93 samples/sec Loss 15.8345 Epoch: 3 Global Step: 15050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:42:58,039-Speed 5430.99 samples/sec Loss 15.9551 Epoch: 3 Global Step: 15100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:43:07,789-Speed 5251.88 samples/sec Loss 16.0478 Epoch: 3 Global Step: 15150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:43:17,498-Speed 5274.05 samples/sec Loss 16.1132 Epoch: 3 Global Step: 15200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:43:26,854-Speed 5472.51 samples/sec Loss 16.2132 Epoch: 3 Global 
Step: 15250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:43:36,613-Speed 5247.03 samples/sec Loss 16.2153 Epoch: 3 Global Step: 15300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:43:46,187-Speed 5347.94 samples/sec Loss 16.2678 Epoch: 3 Global Step: 15350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:43:55,538-Speed 5476.07 samples/sec Loss 16.2722 Epoch: 3 Global Step: 15400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:44:04,857-Speed 5494.45 samples/sec Loss 16.3328 Epoch: 3 Global Step: 15450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:44:14,294-Speed 5426.12 samples/sec Loss 16.4676 Epoch: 3 Global Step: 15500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:44:23,918-Speed 5320.19 samples/sec Loss 16.4364 Epoch: 3 Global Step: 15550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:44:33,629-Speed 5272.87 samples/sec Loss 16.4577 Epoch: 3 Global Step: 15600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:44:43,266-Speed 5312.95 samples/sec Loss 16.3827 Epoch: 3 Global Step: 15650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:44:52,818-Speed 5360.80 samples/sec Loss 16.4353 Epoch: 3 Global Step: 15700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:45:02,105-Speed 5513.35 samples/sec Loss 16.4541 Epoch: 3 Global Step: 15750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:45:11,931-Speed 5210.71 samples/sec Loss 16.3922 Epoch: 3 Global Step: 15800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:45:21,331-Speed 5447.38 samples/sec Loss 16.4624 Epoch: 3 Global Step: 15850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:45:30,562-Speed 5546.89 samples/sec Loss 16.3289 Epoch: 3 Global Step: 15900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:45:40,132-Speed 5350.09 samples/sec Loss 16.3628 Epoch: 3 Global Step: 15950 
Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:45:49,664-Speed 5372.00 samples/sec Loss 16.3800 Epoch: 3 Global Step: 16000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:46:06,379-[lfw][16000]XNorm: 22.491023 Training: 2021-03-17 22:46:06,380-[lfw][16000]Accuracy-Flip: 0.99250+-0.00410 Training: 2021-03-17 22:46:06,380-[lfw][16000]Accuracy-Highest: 0.99250 Training: 2021-03-17 22:46:24,946-[cfp_fp][16000]XNorm: 18.416064 Training: 2021-03-17 22:46:24,947-[cfp_fp][16000]Accuracy-Flip: 0.92100+-0.01369 Training: 2021-03-17 22:46:24,947-[cfp_fp][16000]Accuracy-Highest: 0.92100 Training: 2021-03-17 22:46:40,998-[agedb_30][16000]XNorm: 21.535176 Training: 2021-03-17 22:46:40,998-[agedb_30][16000]Accuracy-Flip: 0.93333+-0.01562 Training: 2021-03-17 22:46:40,998-[agedb_30][16000]Accuracy-Highest: 0.93400 Training: 2021-03-17 22:46:50,340-Speed 843.83 samples/sec Loss 16.5058 Epoch: 3 Global Step: 16050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:46:59,770-Speed 5430.18 samples/sec Loss 16.3987 Epoch: 3 Global Step: 16100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:47:09,271-Speed 5389.22 samples/sec Loss 16.4173 Epoch: 3 Global Step: 16150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:47:18,593-Speed 5492.42 samples/sec Loss 16.3848 Epoch: 3 Global Step: 16200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:47:28,076-Speed 5399.40 samples/sec Loss 16.4493 Epoch: 3 Global Step: 16250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:47:37,642-Speed 5352.87 samples/sec Loss 16.3363 Epoch: 3 Global Step: 16300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:47:47,329-Speed 5285.61 samples/sec Loss 16.4579 Epoch: 3 Global Step: 16350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 22:47:56,794-Speed 5410.14 samples/sec Loss 16.4431 Epoch: 3 Global Step: 16400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-17 22:48:06,399-Speed 5330.61 samples/sec Loss 16.4212 Epoch: 3 Global Step: 16450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:48:15,808-Speed 5442.29 samples/sec Loss 16.4934 Epoch: 3 Global Step: 16500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:48:25,206-Speed 5448.00 samples/sec Loss 16.3958 Epoch: 3 Global Step: 16550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:48:34,582-Speed 5461.14 samples/sec Loss 16.4046 Epoch: 3 Global Step: 16600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:48:44,533-Speed 5145.70 samples/sec Loss 16.4366 Epoch: 3 Global Step: 16650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:48:54,343-Speed 5219.40 samples/sec Loss 16.3964 Epoch: 3 Global Step: 16700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:49:03,956-Speed 5326.32 samples/sec Loss 16.4772 Epoch: 3 Global Step: 16750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:49:13,909-Speed 5144.54 samples/sec Loss 16.4590 Epoch: 3 Global Step: 16800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:49:23,947-Speed 5101.08 samples/sec Loss 16.3419 Epoch: 3 Global Step: 16850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:49:33,639-Speed 5282.77 samples/sec Loss 16.4635 Epoch: 3 Global Step: 16900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:49:43,102-Speed 5411.19 samples/sec Loss 16.3674 Epoch: 3 Global Step: 16950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:49:52,610-Speed 5385.46 samples/sec Loss 16.4312 Epoch: 3 Global Step: 17000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:50:02,115-Speed 5386.64 samples/sec Loss 16.3976 Epoch: 3 Global Step: 17050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:50:11,514-Speed 5448.04 samples/sec Loss 16.3555 Epoch: 3 Global Step: 17100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 
22:50:20,917-Speed 5445.30 samples/sec Loss 16.3700 Epoch: 3 Global Step: 17150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:50:30,343-Speed 5431.95 samples/sec Loss 16.3951 Epoch: 3 Global Step: 17200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:50:39,875-Speed 5371.42 samples/sec Loss 16.3215 Epoch: 3 Global Step: 17250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:50:49,457-Speed 5344.00 samples/sec Loss 16.2632 Epoch: 3 Global Step: 17300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:50:58,922-Speed 5410.01 samples/sec Loss 16.3291 Epoch: 3 Global Step: 17350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:51:08,234-Speed 5498.35 samples/sec Loss 16.4409 Epoch: 3 Global Step: 17400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:51:17,612-Speed 5460.10 samples/sec Loss 16.3526 Epoch: 3 Global Step: 17450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:51:27,057-Speed 5420.85 samples/sec Loss 16.3186 Epoch: 3 Global Step: 17500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:51:36,415-Speed 5471.80 samples/sec Loss 16.3382 Epoch: 3 Global Step: 17550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:51:45,756-Speed 5481.37 samples/sec Loss 16.3374 Epoch: 3 Global Step: 17600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:51:55,258-Speed 5389.10 samples/sec Loss 16.4150 Epoch: 3 Global Step: 17650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:52:04,592-Speed 5485.59 samples/sec Loss 16.4338 Epoch: 3 Global Step: 17700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:52:14,098-Speed 5386.10 samples/sec Loss 16.3709 Epoch: 3 Global Step: 17750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:52:23,712-Speed 5325.89 samples/sec Loss 16.5183 Epoch: 3 Global Step: 17800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 
22:52:33,051-Speed 5483.07 samples/sec Loss 16.3344 Epoch: 3 Global Step: 17850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:52:42,679-Speed 5318.14 samples/sec Loss 16.3122 Epoch: 3 Global Step: 17900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:52:51,993-Speed 5497.05 samples/sec Loss 16.4371 Epoch: 3 Global Step: 17950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:53:01,396-Speed 5445.39 samples/sec Loss 16.3844 Epoch: 3 Global Step: 18000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:53:18,098-[lfw][18000]XNorm: 23.406828 Training: 2021-03-17 22:53:18,098-[lfw][18000]Accuracy-Flip: 0.99217+-0.00624 Training: 2021-03-17 22:53:18,098-[lfw][18000]Accuracy-Highest: 0.99250 Training: 2021-03-17 22:53:36,560-[cfp_fp][18000]XNorm: 18.740843 Training: 2021-03-17 22:53:36,560-[cfp_fp][18000]Accuracy-Flip: 0.91486+-0.01526 Training: 2021-03-17 22:53:36,560-[cfp_fp][18000]Accuracy-Highest: 0.92100 Training: 2021-03-17 22:53:52,510-[agedb_30][18000]XNorm: 22.191585 Training: 2021-03-17 22:53:52,510-[agedb_30][18000]Accuracy-Flip: 0.93817+-0.01383 Training: 2021-03-17 22:53:52,510-[agedb_30][18000]Accuracy-Highest: 0.93817 Training: 2021-03-17 22:54:01,989-Speed 845.00 samples/sec Loss 16.3323 Epoch: 3 Global Step: 18050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:54:11,514-Speed 5375.28 samples/sec Loss 16.3179 Epoch: 3 Global Step: 18100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:54:21,040-Speed 5375.16 samples/sec Loss 16.3268 Epoch: 3 Global Step: 18150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:54:30,511-Speed 5406.48 samples/sec Loss 16.4731 Epoch: 3 Global Step: 18200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:54:40,018-Speed 5385.95 samples/sec Loss 16.3369 Epoch: 3 Global Step: 18250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:54:49,625-Speed 5329.62 samples/sec Loss 16.3227 Epoch: 
3 Global Step: 18300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:54:59,056-Speed 5429.37 samples/sec Loss 16.4317 Epoch: 3 Global Step: 18350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:55:08,582-Speed 5375.13 samples/sec Loss 16.2989 Epoch: 3 Global Step: 18400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:55:18,115-Speed 5371.41 samples/sec Loss 16.4328 Epoch: 3 Global Step: 18450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:55:27,646-Speed 5371.93 samples/sec Loss 16.2716 Epoch: 3 Global Step: 18500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:55:37,118-Speed 5406.19 samples/sec Loss 16.3698 Epoch: 3 Global Step: 18550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:55:46,786-Speed 5295.85 samples/sec Loss 16.3326 Epoch: 3 Global Step: 18600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:55:56,115-Speed 5488.48 samples/sec Loss 16.3524 Epoch: 3 Global Step: 18650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:56:05,675-Speed 5356.29 samples/sec Loss 16.3000 Epoch: 3 Global Step: 18700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:56:14,864-Speed 5572.53 samples/sec Loss 16.2692 Epoch: 3 Global Step: 18750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:56:24,248-Speed 5456.15 samples/sec Loss 16.3860 Epoch: 3 Global Step: 18800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:56:33,728-Speed 5401.38 samples/sec Loss 16.2456 Epoch: 3 Global Step: 18850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:56:43,180-Speed 5417.27 samples/sec Loss 16.3599 Epoch: 3 Global Step: 18900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:56:52,836-Speed 5302.57 samples/sec Loss 16.3681 Epoch: 3 Global Step: 18950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:57:02,738-Speed 5170.74 samples/sec Loss 16.3893 Epoch: 3 Global 
Step: 19000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:57:12,159-Speed 5435.33 samples/sec Loss 16.3425 Epoch: 3 Global Step: 19050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:57:21,729-Speed 5350.17 samples/sec Loss 16.2363 Epoch: 3 Global Step: 19100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:57:31,124-Speed 5450.23 samples/sec Loss 16.2223 Epoch: 3 Global Step: 19150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:57:41,424-Speed 4971.30 samples/sec Loss 16.3462 Epoch: 3 Global Step: 19200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:57:51,612-Speed 5025.44 samples/sec Loss 16.2692 Epoch: 3 Global Step: 19250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:58:01,089-Speed 5403.34 samples/sec Loss 16.3295 Epoch: 3 Global Step: 19300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:58:10,461-Speed 5463.30 samples/sec Loss 16.3672 Epoch: 3 Global Step: 19350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:58:19,958-Speed 5391.85 samples/sec Loss 16.2718 Epoch: 3 Global Step: 19400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:58:29,669-Speed 5272.31 samples/sec Loss 16.3023 Epoch: 3 Global Step: 19450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:58:39,259-Speed 5339.65 samples/sec Loss 16.2819 Epoch: 3 Global Step: 19500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:58:48,491-Speed 5546.09 samples/sec Loss 16.3986 Epoch: 3 Global Step: 19550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:58:57,938-Speed 5419.65 samples/sec Loss 16.3003 Epoch: 3 Global Step: 19600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:59:07,339-Speed 5446.95 samples/sec Loss 16.3412 Epoch: 3 Global Step: 19650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:59:17,059-Speed 5267.77 samples/sec Loss 16.3069 Epoch: 3 Global Step: 19700 
Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:59:26,539-Speed 5400.99 samples/sec Loss 16.2510 Epoch: 3 Global Step: 19750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:59:36,107-Speed 5351.33 samples/sec Loss 16.2324 Epoch: 3 Global Step: 19800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:59:45,570-Speed 5410.88 samples/sec Loss 16.2342 Epoch: 3 Global Step: 19850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 22:59:55,166-Speed 5336.20 samples/sec Loss 16.2572 Epoch: 3 Global Step: 19900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:00:07,824-Speed 4045.23 samples/sec Loss 15.9111 Epoch: 4 Global Step: 19950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:00:17,535-Speed 5273.06 samples/sec Loss 15.5399 Epoch: 4 Global Step: 20000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:00:33,938-[lfw][20000]XNorm: 23.069220 Training: 2021-03-17 23:00:33,938-[lfw][20000]Accuracy-Flip: 0.99217+-0.00495 Training: 2021-03-17 23:00:33,938-[lfw][20000]Accuracy-Highest: 0.99250 Training: 2021-03-17 23:00:52,378-[cfp_fp][20000]XNorm: 19.019159 Training: 2021-03-17 23:00:52,378-[cfp_fp][20000]Accuracy-Flip: 0.92200+-0.01442 Training: 2021-03-17 23:00:52,378-[cfp_fp][20000]Accuracy-Highest: 0.92200 Training: 2021-03-17 23:01:08,459-[agedb_30][20000]XNorm: 22.315032 Training: 2021-03-17 23:01:08,460-[agedb_30][20000]Accuracy-Flip: 0.94233+-0.01332 Training: 2021-03-17 23:01:08,460-[agedb_30][20000]Accuracy-Highest: 0.94233 Training: 2021-03-17 23:01:17,804-Speed 849.54 samples/sec Loss 15.5579 Epoch: 4 Global Step: 20050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:01:27,173-Speed 5464.89 samples/sec Loss 15.7264 Epoch: 4 Global Step: 20100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:01:36,601-Speed 5431.46 samples/sec Loss 15.9612 Epoch: 4 Global Step: 20150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-17 23:01:46,226-Speed 5319.63 samples/sec Loss 15.9854 Epoch: 4 Global Step: 20200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:01:55,652-Speed 5431.97 samples/sec Loss 15.9598 Epoch: 4 Global Step: 20250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:02:05,058-Speed 5443.98 samples/sec Loss 16.0020 Epoch: 4 Global Step: 20300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:02:14,628-Speed 5350.36 samples/sec Loss 16.0186 Epoch: 4 Global Step: 20350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:02:24,122-Speed 5393.36 samples/sec Loss 16.1570 Epoch: 4 Global Step: 20400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:02:34,234-Speed 5063.66 samples/sec Loss 16.1168 Epoch: 4 Global Step: 20450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:02:43,629-Speed 5449.66 samples/sec Loss 16.1902 Epoch: 4 Global Step: 20500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:02:53,149-Speed 5378.79 samples/sec Loss 16.2372 Epoch: 4 Global Step: 20550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:03:02,446-Speed 5507.62 samples/sec Loss 16.2357 Epoch: 4 Global Step: 20600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:03:11,949-Speed 5387.66 samples/sec Loss 16.2943 Epoch: 4 Global Step: 20650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:03:21,207-Speed 5531.01 samples/sec Loss 16.1213 Epoch: 4 Global Step: 20700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:03:30,569-Speed 5468.97 samples/sec Loss 16.1636 Epoch: 4 Global Step: 20750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:03:40,012-Speed 5422.54 samples/sec Loss 16.1829 Epoch: 4 Global Step: 20800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:03:49,543-Speed 5372.64 samples/sec Loss 16.2445 Epoch: 4 Global Step: 20850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 
23:03:58,906-Speed 5468.50 samples/sec Loss 16.1525 Epoch: 4 Global Step: 20900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:04:08,528-Speed 5321.49 samples/sec Loss 16.1030 Epoch: 4 Global Step: 20950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:04:18,160-Speed 5316.58 samples/sec Loss 16.2050 Epoch: 4 Global Step: 21000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:04:27,565-Speed 5443.95 samples/sec Loss 16.1978 Epoch: 4 Global Step: 21050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:04:36,974-Speed 5441.99 samples/sec Loss 16.2359 Epoch: 4 Global Step: 21100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:04:46,595-Speed 5321.77 samples/sec Loss 16.2199 Epoch: 4 Global Step: 21150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:04:56,104-Speed 5385.15 samples/sec Loss 16.2802 Epoch: 4 Global Step: 21200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:05:05,447-Speed 5480.51 samples/sec Loss 16.2223 Epoch: 4 Global Step: 21250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:05:15,037-Speed 5338.94 samples/sec Loss 16.1451 Epoch: 4 Global Step: 21300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:05:24,443-Speed 5443.74 samples/sec Loss 16.2027 Epoch: 4 Global Step: 21350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:05:34,052-Speed 5328.84 samples/sec Loss 16.1658 Epoch: 4 Global Step: 21400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:05:43,658-Speed 5330.53 samples/sec Loss 16.2318 Epoch: 4 Global Step: 21450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:05:53,300-Speed 5310.22 samples/sec Loss 16.1527 Epoch: 4 Global Step: 21500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:06:02,998-Speed 5279.97 samples/sec Loss 16.2114 Epoch: 4 Global Step: 21550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 
23:06:13,122-Speed 5057.39 samples/sec Loss 16.2499 Epoch: 4 Global Step: 21600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:06:22,635-Speed 5382.67 samples/sec Loss 16.2164 Epoch: 4 Global Step: 21650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:06:32,399-Speed 5243.98 samples/sec Loss 16.2338 Epoch: 4 Global Step: 21700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:06:41,808-Speed 5441.86 samples/sec Loss 16.1561 Epoch: 4 Global Step: 21750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:06:51,316-Speed 5385.49 samples/sec Loss 16.1252 Epoch: 4 Global Step: 21800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:07:00,684-Speed 5465.78 samples/sec Loss 16.1176 Epoch: 4 Global Step: 21850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:07:10,136-Speed 5417.43 samples/sec Loss 16.1903 Epoch: 4 Global Step: 21900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:07:19,415-Speed 5517.92 samples/sec Loss 16.2503 Epoch: 4 Global Step: 21950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:07:28,868-Speed 5416.76 samples/sec Loss 16.2540 Epoch: 4 Global Step: 22000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:07:45,225-[lfw][22000]XNorm: 22.559328 Training: 2021-03-17 23:07:45,225-[lfw][22000]Accuracy-Flip: 0.99183+-0.00462 Training: 2021-03-17 23:07:45,225-[lfw][22000]Accuracy-Highest: 0.99250 Training: 2021-03-17 23:08:03,697-[cfp_fp][22000]XNorm: 18.569706 Training: 2021-03-17 23:08:03,697-[cfp_fp][22000]Accuracy-Flip: 0.91571+-0.01229 Training: 2021-03-17 23:08:03,697-[cfp_fp][22000]Accuracy-Highest: 0.92200 Training: 2021-03-17 23:08:19,678-[agedb_30][22000]XNorm: 21.598036 Training: 2021-03-17 23:08:19,678-[agedb_30][22000]Accuracy-Flip: 0.93583+-0.01268 Training: 2021-03-17 23:08:19,678-[agedb_30][22000]Accuracy-Highest: 0.94233 Training: 2021-03-17 23:08:29,210-Speed 848.51 samples/sec Loss 16.1817 Epoch: 
4 Global Step: 22050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:08:38,495-Speed 5514.88 samples/sec Loss 16.1916 Epoch: 4 Global Step: 22100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:08:47,999-Speed 5387.82 samples/sec Loss 16.1117 Epoch: 4 Global Step: 22150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:08:57,487-Speed 5396.59 samples/sec Loss 16.1358 Epoch: 4 Global Step: 22200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:09:07,017-Speed 5372.72 samples/sec Loss 16.1375 Epoch: 4 Global Step: 22250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:09:16,686-Speed 5295.50 samples/sec Loss 16.1685 Epoch: 4 Global Step: 22300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:09:26,310-Speed 5320.40 samples/sec Loss 16.1436 Epoch: 4 Global Step: 22350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:09:35,691-Speed 5458.08 samples/sec Loss 16.1848 Epoch: 4 Global Step: 22400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:09:45,286-Speed 5336.42 samples/sec Loss 16.2199 Epoch: 4 Global Step: 22450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:09:54,909-Speed 5321.33 samples/sec Loss 16.1439 Epoch: 4 Global Step: 22500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:10:04,349-Speed 5423.97 samples/sec Loss 16.2216 Epoch: 4 Global Step: 22550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:10:13,695-Speed 5478.57 samples/sec Loss 16.1696 Epoch: 4 Global Step: 22600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:10:23,167-Speed 5405.71 samples/sec Loss 16.2075 Epoch: 4 Global Step: 22650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:10:32,730-Speed 5354.71 samples/sec Loss 16.2390 Epoch: 4 Global Step: 22700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:10:42,374-Speed 5309.25 samples/sec Loss 16.2064 Epoch: 4 Global 
Step: 22750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:10:51,838-Speed 5409.94 samples/sec Loss 16.1222 Epoch: 4 Global Step: 22800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:11:01,222-Speed 5456.80 samples/sec Loss 16.1210 Epoch: 4 Global Step: 22850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:11:10,708-Speed 5397.79 samples/sec Loss 16.1641 Epoch: 4 Global Step: 22900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:11:20,239-Speed 5372.18 samples/sec Loss 16.1369 Epoch: 4 Global Step: 22950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:11:29,672-Speed 5427.94 samples/sec Loss 16.2280 Epoch: 4 Global Step: 23000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:11:39,320-Speed 5307.50 samples/sec Loss 16.1868 Epoch: 4 Global Step: 23050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:11:48,773-Speed 5416.76 samples/sec Loss 16.0931 Epoch: 4 Global Step: 23100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:11:58,257-Speed 5398.58 samples/sec Loss 16.2587 Epoch: 4 Global Step: 23150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:12:07,780-Speed 5376.85 samples/sec Loss 16.1702 Epoch: 4 Global Step: 23200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:12:17,231-Speed 5417.39 samples/sec Loss 16.1094 Epoch: 4 Global Step: 23250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:12:26,567-Speed 5484.46 samples/sec Loss 16.2421 Epoch: 4 Global Step: 23300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:12:36,080-Speed 5382.81 samples/sec Loss 16.2039 Epoch: 4 Global Step: 23350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:12:45,551-Speed 5406.44 samples/sec Loss 16.1182 Epoch: 4 Global Step: 23400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:12:54,930-Speed 5459.43 samples/sec Loss 16.1886 Epoch: 4 Global Step: 23450 
Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:13:04,385-Speed 5415.71 samples/sec Loss 16.0775 Epoch: 4 Global Step: 23500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:13:13,954-Speed 5350.86 samples/sec Loss 16.1140 Epoch: 4 Global Step: 23550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:13:23,340-Speed 5455.31 samples/sec Loss 16.0755 Epoch: 4 Global Step: 23600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:13:32,871-Speed 5372.28 samples/sec Loss 16.1679 Epoch: 4 Global Step: 23650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:13:42,524-Speed 5304.76 samples/sec Loss 16.2172 Epoch: 4 Global Step: 23700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:13:52,247-Speed 5266.01 samples/sec Loss 16.0704 Epoch: 4 Global Step: 23750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:14:01,997-Speed 5251.88 samples/sec Loss 16.0917 Epoch: 4 Global Step: 23800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:14:11,341-Speed 5479.80 samples/sec Loss 16.1123 Epoch: 4 Global Step: 23850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:14:20,963-Speed 5321.55 samples/sec Loss 16.0986 Epoch: 4 Global Step: 23900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:14:30,596-Speed 5315.64 samples/sec Loss 16.1929 Epoch: 4 Global Step: 23950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:14:40,235-Speed 5311.85 samples/sec Loss 16.1771 Epoch: 4 Global Step: 24000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:14:56,752-[lfw][24000]XNorm: 22.894636 Training: 2021-03-17 23:14:56,753-[lfw][24000]Accuracy-Flip: 0.99333+-0.00422 Training: 2021-03-17 23:14:56,753-[lfw][24000]Accuracy-Highest: 0.99333 Training: 2021-03-17 23:15:15,203-[cfp_fp][24000]XNorm: 18.974399 Training: 2021-03-17 23:15:15,203-[cfp_fp][24000]Accuracy-Flip: 0.91400+-0.01610 Training: 2021-03-17 
23:15:15,203-[cfp_fp][24000]Accuracy-Highest: 0.92200 Training: 2021-03-17 23:15:31,175-[agedb_30][24000]XNorm: 22.086760 Training: 2021-03-17 23:15:31,175-[agedb_30][24000]Accuracy-Flip: 0.93750+-0.01342 Training: 2021-03-17 23:15:31,176-[agedb_30][24000]Accuracy-Highest: 0.94233 Training: 2021-03-17 23:15:40,537-Speed 849.07 samples/sec Loss 16.1126 Epoch: 4 Global Step: 24050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:15:49,980-Speed 5422.62 samples/sec Loss 16.1575 Epoch: 4 Global Step: 24100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:15:59,390-Speed 5441.47 samples/sec Loss 16.0885 Epoch: 4 Global Step: 24150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:16:08,726-Speed 5484.18 samples/sec Loss 16.1601 Epoch: 4 Global Step: 24200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:16:18,295-Speed 5351.29 samples/sec Loss 16.1912 Epoch: 4 Global Step: 24250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:16:27,651-Speed 5473.03 samples/sec Loss 16.1546 Epoch: 4 Global Step: 24300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:16:37,194-Speed 5365.70 samples/sec Loss 16.1463 Epoch: 4 Global Step: 24350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:16:46,725-Speed 5372.19 samples/sec Loss 16.1445 Epoch: 4 Global Step: 24400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:16:56,128-Speed 5445.51 samples/sec Loss 16.0729 Epoch: 4 Global Step: 24450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:17:05,540-Speed 5440.00 samples/sec Loss 16.0612 Epoch: 4 Global Step: 24500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:17:15,027-Speed 5397.36 samples/sec Loss 16.0896 Epoch: 4 Global Step: 24550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:17:24,218-Speed 5571.17 samples/sec Loss 16.1079 Epoch: 4 Global Step: 24600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-17 23:17:34,090-Speed 5186.72 samples/sec Loss 16.1144 Epoch: 4 Global Step: 24650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:17:43,509-Speed 5436.02 samples/sec Loss 16.1293 Epoch: 4 Global Step: 24700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:17:53,065-Speed 5358.62 samples/sec Loss 16.1484 Epoch: 4 Global Step: 24750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:18:02,693-Speed 5318.20 samples/sec Loss 16.0953 Epoch: 4 Global Step: 24800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:18:12,195-Speed 5388.37 samples/sec Loss 16.0984 Epoch: 4 Global Step: 24850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:18:21,647-Speed 5416.99 samples/sec Loss 16.1729 Epoch: 4 Global Step: 24900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:18:34,428-Speed 4006.27 samples/sec Loss 15.4563 Epoch: 5 Global Step: 24950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:18:43,934-Speed 5386.25 samples/sec Loss 15.3308 Epoch: 5 Global Step: 25000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:18:53,532-Speed 5334.58 samples/sec Loss 15.5184 Epoch: 5 Global Step: 25050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:19:03,199-Speed 5296.95 samples/sec Loss 15.6093 Epoch: 5 Global Step: 25100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:19:12,756-Speed 5357.74 samples/sec Loss 15.7368 Epoch: 5 Global Step: 25150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:19:22,286-Speed 5373.10 samples/sec Loss 15.8197 Epoch: 5 Global Step: 25200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:19:31,793-Speed 5386.09 samples/sec Loss 15.8443 Epoch: 5 Global Step: 25250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:19:41,241-Speed 5419.24 samples/sec Loss 15.9416 Epoch: 5 Global Step: 25300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 
23:19:50,860-Speed 5323.25 samples/sec Loss 15.9252 Epoch: 5 Global Step: 25350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:20:00,480-Speed 5322.38 samples/sec Loss 15.9715 Epoch: 5 Global Step: 25400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:20:09,790-Speed 5499.90 samples/sec Loss 16.0291 Epoch: 5 Global Step: 25450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:20:19,379-Speed 5339.80 samples/sec Loss 16.0660 Epoch: 5 Global Step: 25500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:20:28,887-Speed 5385.61 samples/sec Loss 16.0945 Epoch: 5 Global Step: 25550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:20:38,391-Speed 5387.31 samples/sec Loss 16.0199 Epoch: 5 Global Step: 25600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:20:47,651-Speed 5529.61 samples/sec Loss 15.9871 Epoch: 5 Global Step: 25650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:20:57,092-Speed 5423.43 samples/sec Loss 16.0741 Epoch: 5 Global Step: 25700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:21:06,409-Speed 5495.58 samples/sec Loss 16.0927 Epoch: 5 Global Step: 25750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:21:15,972-Speed 5354.30 samples/sec Loss 16.1339 Epoch: 5 Global Step: 25800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:21:25,502-Speed 5372.98 samples/sec Loss 16.0699 Epoch: 5 Global Step: 25850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:21:34,956-Speed 5415.93 samples/sec Loss 16.0422 Epoch: 5 Global Step: 25900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:21:44,405-Speed 5419.00 samples/sec Loss 16.1344 Epoch: 5 Global Step: 25950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:21:53,903-Speed 5391.19 samples/sec Loss 16.0768 Epoch: 5 Global Step: 26000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 
23:22:10,729-[lfw][26000]XNorm: 23.711174 Training: 2021-03-17 23:22:10,729-[lfw][26000]Accuracy-Flip: 0.99033+-0.00464 Training: 2021-03-17 23:22:10,729-[lfw][26000]Accuracy-Highest: 0.99333 Training: 2021-03-17 23:22:29,300-[cfp_fp][26000]XNorm: 19.515315 Training: 2021-03-17 23:22:29,300-[cfp_fp][26000]Accuracy-Flip: 0.91471+-0.01579 Training: 2021-03-17 23:22:29,300-[cfp_fp][26000]Accuracy-Highest: 0.92200 Training: 2021-03-17 23:22:45,303-[agedb_30][26000]XNorm: 23.184868 Training: 2021-03-17 23:22:45,303-[agedb_30][26000]Accuracy-Flip: 0.93217+-0.01592 Training: 2021-03-17 23:22:45,303-[agedb_30][26000]Accuracy-Highest: 0.94233 Training: 2021-03-17 23:22:54,590-Speed 843.69 samples/sec Loss 16.0878 Epoch: 5 Global Step: 26050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:23:04,424-Speed 5206.60 samples/sec Loss 16.0828 Epoch: 5 Global Step: 26100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:23:14,163-Speed 5257.66 samples/sec Loss 16.0107 Epoch: 5 Global Step: 26150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:23:23,785-Speed 5321.36 samples/sec Loss 16.0457 Epoch: 5 Global Step: 26200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:23:33,075-Speed 5511.86 samples/sec Loss 16.0778 Epoch: 5 Global Step: 26250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:23:42,565-Speed 5395.85 samples/sec Loss 15.9891 Epoch: 5 Global Step: 26300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:23:52,712-Speed 5045.96 samples/sec Loss 16.1406 Epoch: 5 Global Step: 26350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:24:02,428-Speed 5269.84 samples/sec Loss 16.1197 Epoch: 5 Global Step: 26400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:24:11,761-Speed 5486.56 samples/sec Loss 15.9901 Epoch: 5 Global Step: 26450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:24:21,191-Speed 5429.88 samples/sec Loss 16.0392 Epoch: 
5 Global Step: 26500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:24:30,953-Speed 5245.05 samples/sec Loss 16.1044 Epoch: 5 Global Step: 26550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:24:40,324-Speed 5463.91 samples/sec Loss 16.0626 Epoch: 5 Global Step: 26600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:24:49,770-Speed 5420.37 samples/sec Loss 16.1241 Epoch: 5 Global Step: 26650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:24:59,263-Speed 5394.04 samples/sec Loss 16.1635 Epoch: 5 Global Step: 26700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:25:08,913-Speed 5306.12 samples/sec Loss 16.0463 Epoch: 5 Global Step: 26750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:25:18,202-Speed 5512.19 samples/sec Loss 16.0876 Epoch: 5 Global Step: 26800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:25:27,643-Speed 5423.90 samples/sec Loss 16.0258 Epoch: 5 Global Step: 26850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:25:37,257-Speed 5326.03 samples/sec Loss 16.1150 Epoch: 5 Global Step: 26900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:25:46,589-Speed 5487.12 samples/sec Loss 15.9991 Epoch: 5 Global Step: 26950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:25:55,936-Speed 5478.01 samples/sec Loss 16.1034 Epoch: 5 Global Step: 27000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:26:05,531-Speed 5336.38 samples/sec Loss 16.1178 Epoch: 5 Global Step: 27050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:26:15,063-Speed 5371.34 samples/sec Loss 16.0139 Epoch: 5 Global Step: 27100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:26:24,882-Speed 5214.56 samples/sec Loss 16.1203 Epoch: 5 Global Step: 27150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:26:34,285-Speed 5445.80 samples/sec Loss 16.0847 Epoch: 5 Global 
Step: 27200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:26:43,744-Speed 5412.94 samples/sec Loss 16.0821 Epoch: 5 Global Step: 27250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:26:53,160-Speed 5438.06 samples/sec Loss 16.0138 Epoch: 5 Global Step: 27300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:27:02,642-Speed 5400.26 samples/sec Loss 16.0639 Epoch: 5 Global Step: 27350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:27:12,190-Speed 5362.38 samples/sec Loss 15.9558 Epoch: 5 Global Step: 27400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:27:21,652-Speed 5412.18 samples/sec Loss 16.1113 Epoch: 5 Global Step: 27450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:27:31,420-Speed 5241.95 samples/sec Loss 16.0574 Epoch: 5 Global Step: 27500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:27:40,869-Speed 5418.89 samples/sec Loss 16.0951 Epoch: 5 Global Step: 27550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:27:50,340-Speed 5406.67 samples/sec Loss 16.0822 Epoch: 5 Global Step: 27600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:28:00,083-Speed 5254.93 samples/sec Loss 16.0054 Epoch: 5 Global Step: 27650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:28:09,560-Speed 5403.20 samples/sec Loss 16.0861 Epoch: 5 Global Step: 27700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:28:19,097-Speed 5368.99 samples/sec Loss 16.1564 Epoch: 5 Global Step: 27750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:28:28,621-Speed 5376.31 samples/sec Loss 16.0359 Epoch: 5 Global Step: 27800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:28:37,983-Speed 5469.47 samples/sec Loss 16.0764 Epoch: 5 Global Step: 27850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:28:47,605-Speed 5321.47 samples/sec Loss 15.9627 Epoch: 5 Global Step: 27900 
Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:28:56,893-Speed 5512.75 samples/sec Loss 16.0555 Epoch: 5 Global Step: 27950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:29:06,068-Speed 5581.13 samples/sec Loss 16.0171 Epoch: 5 Global Step: 28000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:29:22,683-[lfw][28000]XNorm: 22.852132 Training: 2021-03-17 23:29:22,683-[lfw][28000]Accuracy-Flip: 0.99167+-0.00511 Training: 2021-03-17 23:29:22,685-[lfw][28000]Accuracy-Highest: 0.99333 Training: 2021-03-17 23:29:41,129-[cfp_fp][28000]XNorm: 18.809203 Training: 2021-03-17 23:29:41,129-[cfp_fp][28000]Accuracy-Flip: 0.92400+-0.01292 Training: 2021-03-17 23:29:41,129-[cfp_fp][28000]Accuracy-Highest: 0.92400 Training: 2021-03-17 23:29:57,064-[agedb_30][28000]XNorm: 22.011289 Training: 2021-03-17 23:29:57,064-[agedb_30][28000]Accuracy-Flip: 0.93800+-0.01318 Training: 2021-03-17 23:29:57,064-[agedb_30][28000]Accuracy-Highest: 0.94233 Training: 2021-03-17 23:30:06,446-Speed 848.00 samples/sec Loss 16.0071 Epoch: 5 Global Step: 28050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:30:15,796-Speed 5476.05 samples/sec Loss 16.0636 Epoch: 5 Global Step: 28100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:30:25,312-Speed 5381.12 samples/sec Loss 16.0262 Epoch: 5 Global Step: 28150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:30:34,870-Speed 5357.27 samples/sec Loss 16.1175 Epoch: 5 Global Step: 28200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:30:44,387-Speed 5379.88 samples/sec Loss 16.0838 Epoch: 5 Global Step: 28250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:30:53,849-Speed 5411.75 samples/sec Loss 16.0016 Epoch: 5 Global Step: 28300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:31:03,091-Speed 5540.04 samples/sec Loss 15.9764 Epoch: 5 Global Step: 28350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-17 23:31:12,837-Speed 5253.70 samples/sec Loss 16.0120 Epoch: 5 Global Step: 28400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:31:22,424-Speed 5340.80 samples/sec Loss 16.0878 Epoch: 5 Global Step: 28450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:31:32,241-Speed 5215.78 samples/sec Loss 15.9363 Epoch: 5 Global Step: 28500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:31:41,682-Speed 5423.84 samples/sec Loss 16.0967 Epoch: 5 Global Step: 28550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:31:51,252-Speed 5350.07 samples/sec Loss 15.9428 Epoch: 5 Global Step: 28600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:32:00,696-Speed 5421.96 samples/sec Loss 15.9665 Epoch: 5 Global Step: 28650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:32:10,406-Speed 5273.28 samples/sec Loss 16.0785 Epoch: 5 Global Step: 28700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:32:19,783-Speed 5460.49 samples/sec Loss 16.1000 Epoch: 5 Global Step: 28750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:32:29,415-Speed 5315.91 samples/sec Loss 16.0699 Epoch: 5 Global Step: 28800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:32:39,166-Speed 5250.83 samples/sec Loss 16.0786 Epoch: 5 Global Step: 28850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:32:48,787-Speed 5322.10 samples/sec Loss 16.0358 Epoch: 5 Global Step: 28900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:32:58,186-Speed 5448.06 samples/sec Loss 15.9785 Epoch: 5 Global Step: 28950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:33:07,510-Speed 5491.07 samples/sec Loss 16.0900 Epoch: 5 Global Step: 29000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:33:16,995-Speed 5398.57 samples/sec Loss 15.9642 Epoch: 5 Global Step: 29050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 
23:33:26,434-Speed 5424.38 samples/sec Loss 15.9662 Epoch: 5 Global Step: 29100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:33:35,917-Speed 5399.71 samples/sec Loss 15.9716 Epoch: 5 Global Step: 29150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:33:45,411-Speed 5393.29 samples/sec Loss 15.9640 Epoch: 5 Global Step: 29200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:33:55,100-Speed 5284.35 samples/sec Loss 15.9957 Epoch: 5 Global Step: 29250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:34:04,691-Speed 5339.27 samples/sec Loss 15.9534 Epoch: 5 Global Step: 29300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:34:14,124-Speed 5427.78 samples/sec Loss 16.0978 Epoch: 5 Global Step: 29350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:34:23,473-Speed 5477.25 samples/sec Loss 16.0279 Epoch: 5 Global Step: 29400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:34:32,962-Speed 5395.73 samples/sec Loss 16.0654 Epoch: 5 Global Step: 29450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:34:42,508-Speed 5364.22 samples/sec Loss 15.9978 Epoch: 5 Global Step: 29500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:34:51,737-Speed 5547.71 samples/sec Loss 16.0204 Epoch: 5 Global Step: 29550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:35:01,082-Speed 5479.30 samples/sec Loss 15.9113 Epoch: 5 Global Step: 29600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:35:10,581-Speed 5390.34 samples/sec Loss 15.9771 Epoch: 5 Global Step: 29650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:35:20,190-Speed 5328.88 samples/sec Loss 15.9782 Epoch: 5 Global Step: 29700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:35:29,580-Speed 5453.09 samples/sec Loss 16.0917 Epoch: 5 Global Step: 29750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 
23:35:39,069-Speed 5396.26 samples/sec Loss 16.0840 Epoch: 5 Global Step: 29800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:35:48,418-Speed 5476.77 samples/sec Loss 15.9865 Epoch: 5 Global Step: 29850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:36:00,704-Speed 4167.39 samples/sec Loss 15.8365 Epoch: 6 Global Step: 29900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:36:10,545-Speed 5203.26 samples/sec Loss 15.1703 Epoch: 6 Global Step: 29950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:36:20,180-Speed 5314.50 samples/sec Loss 15.2700 Epoch: 6 Global Step: 30000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:36:36,942-[lfw][30000]XNorm: 21.142089 Training: 2021-03-17 23:36:36,943-[lfw][30000]Accuracy-Flip: 0.99233+-0.00478 Training: 2021-03-17 23:36:36,943-[lfw][30000]Accuracy-Highest: 0.99333 Training: 2021-03-17 23:36:55,339-[cfp_fp][30000]XNorm: 17.323348 Training: 2021-03-17 23:36:55,339-[cfp_fp][30000]Accuracy-Flip: 0.91743+-0.01205 Training: 2021-03-17 23:36:55,339-[cfp_fp][30000]Accuracy-Highest: 0.92400 Training: 2021-03-17 23:37:11,302-[agedb_30][30000]XNorm: 20.407250 Training: 2021-03-17 23:37:11,302-[agedb_30][30000]Accuracy-Flip: 0.94500+-0.00946 Training: 2021-03-17 23:37:11,302-[agedb_30][30000]Accuracy-Highest: 0.94500 Training: 2021-03-17 23:37:20,575-Speed 847.76 samples/sec Loss 15.4210 Epoch: 6 Global Step: 30050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:37:30,008-Speed 5428.37 samples/sec Loss 15.6143 Epoch: 6 Global Step: 30100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:37:39,479-Speed 5406.13 samples/sec Loss 15.6352 Epoch: 6 Global Step: 30150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:37:49,131-Speed 5305.30 samples/sec Loss 15.7418 Epoch: 6 Global Step: 30200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:37:58,978-Speed 5199.50 samples/sec Loss 15.8525 Epoch: 
6 Global Step: 30250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:38:08,505-Speed 5374.86 samples/sec Loss 15.8420 Epoch: 6 Global Step: 30300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:38:18,213-Speed 5274.17 samples/sec Loss 15.8353 Epoch: 6 Global Step: 30350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:38:27,757-Speed 5365.06 samples/sec Loss 15.9471 Epoch: 6 Global Step: 30400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:38:37,259-Speed 5389.03 samples/sec Loss 15.9152 Epoch: 6 Global Step: 30450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:38:46,675-Speed 5437.69 samples/sec Loss 15.9403 Epoch: 6 Global Step: 30500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:38:56,222-Speed 5363.24 samples/sec Loss 15.9148 Epoch: 6 Global Step: 30550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:39:05,656-Speed 5427.39 samples/sec Loss 15.9595 Epoch: 6 Global Step: 30600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:39:15,197-Speed 5366.62 samples/sec Loss 16.0233 Epoch: 6 Global Step: 30650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:39:24,851-Speed 5303.99 samples/sec Loss 15.9683 Epoch: 6 Global Step: 30700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:39:34,420-Speed 5350.92 samples/sec Loss 15.9603 Epoch: 6 Global Step: 30750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:39:44,088-Speed 5296.14 samples/sec Loss 15.9501 Epoch: 6 Global Step: 30800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:39:53,645-Speed 5357.38 samples/sec Loss 15.9458 Epoch: 6 Global Step: 30850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:40:03,363-Speed 5268.70 samples/sec Loss 15.9972 Epoch: 6 Global Step: 30900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:40:12,846-Speed 5399.69 samples/sec Loss 15.9958 Epoch: 6 Global 
Step: 30950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:40:22,412-Speed 5352.59 samples/sec Loss 15.9735 Epoch: 6 Global Step: 31000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:40:31,934-Speed 5377.47 samples/sec Loss 15.9467 Epoch: 6 Global Step: 31050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:40:41,332-Speed 5448.42 samples/sec Loss 16.0131 Epoch: 6 Global Step: 31100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:40:51,300-Speed 5136.65 samples/sec Loss 15.9333 Epoch: 6 Global Step: 31150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:41:01,128-Speed 5209.98 samples/sec Loss 15.9291 Epoch: 6 Global Step: 31200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:41:10,623-Speed 5392.36 samples/sec Loss 16.0530 Epoch: 6 Global Step: 31250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:41:20,141-Speed 5379.69 samples/sec Loss 15.9721 Epoch: 6 Global Step: 31300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:41:29,413-Speed 5522.41 samples/sec Loss 15.9911 Epoch: 6 Global Step: 31350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:41:38,986-Speed 5348.60 samples/sec Loss 15.9408 Epoch: 6 Global Step: 31400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:41:48,256-Speed 5523.55 samples/sec Loss 15.9530 Epoch: 6 Global Step: 31450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:41:57,798-Speed 5365.74 samples/sec Loss 15.9408 Epoch: 6 Global Step: 31500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:42:07,030-Speed 5546.69 samples/sec Loss 16.0321 Epoch: 6 Global Step: 31550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:42:16,317-Speed 5513.47 samples/sec Loss 15.9590 Epoch: 6 Global Step: 31600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:42:25,682-Speed 5467.43 samples/sec Loss 15.9449 Epoch: 6 Global Step: 31650 
Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:42:34,872-Speed 5571.77 samples/sec Loss 15.9296 Epoch: 6 Global Step: 31700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:42:44,287-Speed 5438.46 samples/sec Loss 15.8523 Epoch: 6 Global Step: 31750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:42:53,657-Speed 5464.69 samples/sec Loss 16.0568 Epoch: 6 Global Step: 31800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:43:03,150-Speed 5393.71 samples/sec Loss 15.9171 Epoch: 6 Global Step: 31850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:43:12,639-Speed 5396.04 samples/sec Loss 15.8815 Epoch: 6 Global Step: 31900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:43:22,166-Speed 5374.40 samples/sec Loss 15.9339 Epoch: 6 Global Step: 31950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:43:31,678-Speed 5383.01 samples/sec Loss 16.0174 Epoch: 6 Global Step: 32000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:43:48,328-[lfw][32000]XNorm: 23.720215 Training: 2021-03-17 23:43:48,328-[lfw][32000]Accuracy-Flip: 0.99317+-0.00398 Training: 2021-03-17 23:43:48,329-[lfw][32000]Accuracy-Highest: 0.99333 Training: 2021-03-17 23:44:06,822-[cfp_fp][32000]XNorm: 19.155370 Training: 2021-03-17 23:44:06,822-[cfp_fp][32000]Accuracy-Flip: 0.91629+-0.01429 Training: 2021-03-17 23:44:06,822-[cfp_fp][32000]Accuracy-Highest: 0.92400 Training: 2021-03-17 23:44:22,823-[agedb_30][32000]XNorm: 23.138137 Training: 2021-03-17 23:44:22,823-[agedb_30][32000]Accuracy-Flip: 0.94017+-0.01146 Training: 2021-03-17 23:44:22,823-[agedb_30][32000]Accuracy-Highest: 0.94500 Training: 2021-03-17 23:44:32,233-Speed 845.52 samples/sec Loss 16.1063 Epoch: 6 Global Step: 32050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:44:42,004-Speed 5240.49 samples/sec Loss 15.9527 Epoch: 6 Global Step: 32100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-17 23:44:51,423-Speed 5435.87 samples/sec Loss 16.0084 Epoch: 6 Global Step: 32150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:45:00,717-Speed 5509.49 samples/sec Loss 15.9321 Epoch: 6 Global Step: 32200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:45:10,141-Speed 5433.03 samples/sec Loss 15.9653 Epoch: 6 Global Step: 32250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:45:19,844-Speed 5276.84 samples/sec Loss 15.9518 Epoch: 6 Global Step: 32300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:45:29,385-Speed 5367.02 samples/sec Loss 16.0237 Epoch: 6 Global Step: 32350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:45:38,847-Speed 5411.08 samples/sec Loss 15.9851 Epoch: 6 Global Step: 32400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:45:48,219-Speed 5463.61 samples/sec Loss 15.9585 Epoch: 6 Global Step: 32450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:45:57,504-Speed 5514.50 samples/sec Loss 15.9633 Epoch: 6 Global Step: 32500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:46:07,006-Speed 5389.02 samples/sec Loss 16.1086 Epoch: 6 Global Step: 32550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:46:16,598-Speed 5338.01 samples/sec Loss 15.9126 Epoch: 6 Global Step: 32600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:46:26,093-Speed 5392.43 samples/sec Loss 15.8844 Epoch: 6 Global Step: 32650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:46:35,386-Speed 5510.10 samples/sec Loss 15.9459 Epoch: 6 Global Step: 32700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:46:44,923-Speed 5369.10 samples/sec Loss 15.8876 Epoch: 6 Global Step: 32750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:46:54,539-Speed 5324.83 samples/sec Loss 16.0347 Epoch: 6 Global Step: 32800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 
23:47:04,167-Speed 5318.32 samples/sec Loss 15.9890 Epoch: 6 Global Step: 32850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 23:47:13,899-Speed 5261.13 samples/sec Loss 16.0450 Epoch: 6 Global Step: 32900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:47:23,106-Speed 5561.42 samples/sec Loss 15.9752 Epoch: 6 Global Step: 32950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:47:32,373-Speed 5525.59 samples/sec Loss 15.8882 Epoch: 6 Global Step: 33000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:47:41,733-Speed 5470.20 samples/sec Loss 15.9675 Epoch: 6 Global Step: 33050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:47:51,116-Speed 5456.72 samples/sec Loss 15.9746 Epoch: 6 Global Step: 33100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:48:00,569-Speed 5416.79 samples/sec Loss 15.9729 Epoch: 6 Global Step: 33150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:48:10,053-Speed 5398.99 samples/sec Loss 15.9226 Epoch: 6 Global Step: 33200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:48:19,604-Speed 5360.53 samples/sec Loss 15.9840 Epoch: 6 Global Step: 33250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:48:29,270-Speed 5297.32 samples/sec Loss 15.9396 Epoch: 6 Global Step: 33300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:48:39,166-Speed 5174.36 samples/sec Loss 16.0039 Epoch: 6 Global Step: 33350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:48:48,627-Speed 5411.77 samples/sec Loss 15.8477 Epoch: 6 Global Step: 33400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:48:58,031-Speed 5444.94 samples/sec Loss 15.9278 Epoch: 6 Global Step: 33450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:49:07,575-Speed 5365.29 samples/sec Loss 15.9704 Epoch: 6 Global Step: 33500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 
23:49:17,150-Speed 5347.61 samples/sec Loss 15.9317 Epoch: 6 Global Step: 33550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:49:27,184-Speed 5102.86 samples/sec Loss 15.9017 Epoch: 6 Global Step: 33600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:49:36,717-Speed 5371.64 samples/sec Loss 15.9514 Epoch: 6 Global Step: 33650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:49:46,169-Speed 5416.76 samples/sec Loss 16.0720 Epoch: 6 Global Step: 33700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:49:55,747-Speed 5346.15 samples/sec Loss 15.8628 Epoch: 6 Global Step: 33750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:50:05,212-Speed 5409.67 samples/sec Loss 15.9131 Epoch: 6 Global Step: 33800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:50:14,724-Speed 5382.61 samples/sec Loss 15.9477 Epoch: 6 Global Step: 33850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:50:24,120-Speed 5449.72 samples/sec Loss 15.9192 Epoch: 6 Global Step: 33900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:50:33,498-Speed 5459.70 samples/sec Loss 15.9613 Epoch: 6 Global Step: 33950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:50:42,610-Speed 5619.34 samples/sec Loss 15.9610 Epoch: 6 Global Step: 34000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:50:59,338-[lfw][34000]XNorm: 23.893689 Training: 2021-03-17 23:50:59,338-[lfw][34000]Accuracy-Flip: 0.98967+-0.00521 Training: 2021-03-17 23:50:59,338-[lfw][34000]Accuracy-Highest: 0.99333 Training: 2021-03-17 23:51:17,844-[cfp_fp][34000]XNorm: 19.598698 Training: 2021-03-17 23:51:17,844-[cfp_fp][34000]Accuracy-Flip: 0.91814+-0.01157 Training: 2021-03-17 23:51:17,844-[cfp_fp][34000]Accuracy-Highest: 0.92400 Training: 2021-03-17 23:51:33,874-[agedb_30][34000]XNorm: 22.490367 Training: 2021-03-17 23:51:33,875-[agedb_30][34000]Accuracy-Flip: 0.93817+-0.00871 Training: 
2021-03-17 23:51:33,875-[agedb_30][34000]Accuracy-Highest: 0.94500 Training: 2021-03-17 23:51:43,246-Speed 844.40 samples/sec Loss 15.9247 Epoch: 6 Global Step: 34050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:51:52,592-Speed 5478.98 samples/sec Loss 15.9373 Epoch: 6 Global Step: 34100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:52:02,164-Speed 5349.02 samples/sec Loss 15.9992 Epoch: 6 Global Step: 34150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:52:11,701-Speed 5369.28 samples/sec Loss 16.0324 Epoch: 6 Global Step: 34200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:52:20,933-Speed 5545.88 samples/sec Loss 15.9243 Epoch: 6 Global Step: 34250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:52:30,250-Speed 5495.70 samples/sec Loss 15.9372 Epoch: 6 Global Step: 34300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:52:39,616-Speed 5466.90 samples/sec Loss 15.9095 Epoch: 6 Global Step: 34350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:52:49,095-Speed 5402.19 samples/sec Loss 15.9262 Epoch: 6 Global Step: 34400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:52:58,412-Speed 5495.64 samples/sec Loss 15.8747 Epoch: 6 Global Step: 34450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:53:07,987-Speed 5347.85 samples/sec Loss 15.9095 Epoch: 6 Global Step: 34500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:53:17,553-Speed 5352.54 samples/sec Loss 15.9200 Epoch: 6 Global Step: 34550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:53:26,997-Speed 5421.62 samples/sec Loss 15.9766 Epoch: 6 Global Step: 34600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:53:36,599-Speed 5332.55 samples/sec Loss 15.8815 Epoch: 6 Global Step: 34650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:53:45,906-Speed 5501.98 samples/sec Loss 15.9200 Epoch: 6 
Global Step: 34700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:53:55,663-Speed 5248.05 samples/sec Loss 15.9708 Epoch: 6 Global Step: 34750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:54:05,401-Speed 5257.66 samples/sec Loss 15.9210 Epoch: 6 Global Step: 34800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:54:14,796-Speed 5450.38 samples/sec Loss 15.9777 Epoch: 6 Global Step: 34850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:54:26,881-Speed 4236.94 samples/sec Loss 15.4415 Epoch: 7 Global Step: 34900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:54:36,582-Speed 5278.41 samples/sec Loss 15.2298 Epoch: 7 Global Step: 34950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:54:46,105-Speed 5376.47 samples/sec Loss 15.3335 Epoch: 7 Global Step: 35000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:54:55,868-Speed 5244.62 samples/sec Loss 15.4480 Epoch: 7 Global Step: 35050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:55:05,563-Speed 5281.93 samples/sec Loss 15.4969 Epoch: 7 Global Step: 35100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:55:15,351-Speed 5231.05 samples/sec Loss 15.6486 Epoch: 7 Global Step: 35150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:55:24,771-Speed 5435.45 samples/sec Loss 15.7095 Epoch: 7 Global Step: 35200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:55:34,176-Speed 5444.74 samples/sec Loss 15.7621 Epoch: 7 Global Step: 35250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:55:43,811-Speed 5314.27 samples/sec Loss 15.7920 Epoch: 7 Global Step: 35300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:55:53,477-Speed 5296.99 samples/sec Loss 15.7494 Epoch: 7 Global Step: 35350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:56:02,790-Speed 5498.30 samples/sec Loss 15.7799 Epoch: 7 Global 
Step: 35400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:56:12,415-Speed 5319.75 samples/sec Loss 15.7840 Epoch: 7 Global Step: 35450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:56:21,836-Speed 5434.80 samples/sec Loss 15.8142 Epoch: 7 Global Step: 35500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:56:31,262-Speed 5432.10 samples/sec Loss 15.9206 Epoch: 7 Global Step: 35550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:56:40,689-Speed 5431.81 samples/sec Loss 15.8567 Epoch: 7 Global Step: 35600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:56:50,328-Speed 5311.76 samples/sec Loss 15.7361 Epoch: 7 Global Step: 35650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:56:59,840-Speed 5383.46 samples/sec Loss 15.8845 Epoch: 7 Global Step: 35700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:57:09,657-Speed 5215.81 samples/sec Loss 15.9200 Epoch: 7 Global Step: 35750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:57:19,480-Speed 5212.32 samples/sec Loss 15.8989 Epoch: 7 Global Step: 35800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:57:28,788-Speed 5500.99 samples/sec Loss 15.9521 Epoch: 7 Global Step: 35850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:57:38,338-Speed 5361.68 samples/sec Loss 15.9133 Epoch: 7 Global Step: 35900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:57:47,862-Speed 5375.96 samples/sec Loss 16.0128 Epoch: 7 Global Step: 35950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:57:57,792-Speed 5156.54 samples/sec Loss 16.0189 Epoch: 7 Global Step: 36000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:58:14,482-[lfw][36000]XNorm: 22.683571 Training: 2021-03-17 23:58:14,482-[lfw][36000]Accuracy-Flip: 0.99233+-0.00473 Training: 2021-03-17 23:58:14,483-[lfw][36000]Accuracy-Highest: 0.99333 Training: 2021-03-17 
23:58:33,057-[cfp_fp][36000]XNorm: 18.513561 Training: 2021-03-17 23:58:33,058-[cfp_fp][36000]Accuracy-Flip: 0.92200+-0.01949 Training: 2021-03-17 23:58:33,058-[cfp_fp][36000]Accuracy-Highest: 0.92400 Training: 2021-03-17 23:58:49,251-[agedb_30][36000]XNorm: 21.671191 Training: 2021-03-17 23:58:49,251-[agedb_30][36000]Accuracy-Flip: 0.93783+-0.01216 Training: 2021-03-17 23:58:49,251-[agedb_30][36000]Accuracy-Highest: 0.94500 Training: 2021-03-17 23:58:58,797-Speed 839.29 samples/sec Loss 15.9661 Epoch: 7 Global Step: 36050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:59:08,339-Speed 5366.22 samples/sec Loss 16.0530 Epoch: 7 Global Step: 36100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:59:17,828-Speed 5395.92 samples/sec Loss 15.9205 Epoch: 7 Global Step: 36150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:59:27,203-Speed 5461.47 samples/sec Loss 15.8157 Epoch: 7 Global Step: 36200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:59:36,761-Speed 5357.18 samples/sec Loss 15.8805 Epoch: 7 Global Step: 36250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:59:46,098-Speed 5484.27 samples/sec Loss 15.8598 Epoch: 7 Global Step: 36300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 23:59:55,454-Speed 5472.56 samples/sec Loss 15.9658 Epoch: 7 Global Step: 36350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:00:05,024-Speed 5351.00 samples/sec Loss 15.8917 Epoch: 7 Global Step: 36400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:00:14,581-Speed 5357.73 samples/sec Loss 15.8837 Epoch: 7 Global Step: 36450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:00:24,119-Speed 5368.08 samples/sec Loss 15.9233 Epoch: 7 Global Step: 36500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:00:33,524-Speed 5444.61 samples/sec Loss 15.9239 Epoch: 7 Global Step: 36550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 
2021-03-18 00:00:42,840-Speed 5496.33 samples/sec Loss 15.9453 Epoch: 7 Global Step: 36600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:00:52,266-Speed 5432.15 samples/sec Loss 15.8859 Epoch: 7 Global Step: 36650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:01:01,714-Speed 5419.75 samples/sec Loss 15.9775 Epoch: 7 Global Step: 36700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:01:11,248-Speed 5370.06 samples/sec Loss 15.8744 Epoch: 7 Global Step: 36750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:01:20,690-Speed 5423.15 samples/sec Loss 15.8780 Epoch: 7 Global Step: 36800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:01:30,357-Speed 5297.16 samples/sec Loss 15.9455 Epoch: 7 Global Step: 36850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:01:39,780-Speed 5433.95 samples/sec Loss 15.9321 Epoch: 7 Global Step: 36900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:01:49,286-Speed 5386.60 samples/sec Loss 15.7714 Epoch: 7 Global Step: 36950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:01:58,815-Speed 5373.21 samples/sec Loss 15.9091 Epoch: 7 Global Step: 37000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:02:08,308-Speed 5393.88 samples/sec Loss 15.8303 Epoch: 7 Global Step: 37050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:02:17,799-Speed 5394.96 samples/sec Loss 15.9303 Epoch: 7 Global Step: 37100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:02:27,274-Speed 5403.75 samples/sec Loss 15.9507 Epoch: 7 Global Step: 37150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:02:36,848-Speed 5348.17 samples/sec Loss 15.9647 Epoch: 7 Global Step: 37200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:02:46,397-Speed 5362.39 samples/sec Loss 15.9110 Epoch: 7 Global Step: 37250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 
00:02:55,870-Speed 5405.24 samples/sec Loss 15.8607 Epoch: 7 Global Step: 37300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:03:05,229-Speed 5470.99 samples/sec Loss 15.8619 Epoch: 7 Global Step: 37350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:03:14,736-Speed 5386.24 samples/sec Loss 15.8485 Epoch: 7 Global Step: 37400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:03:24,113-Speed 5460.07 samples/sec Loss 15.8562 Epoch: 7 Global Step: 37450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:03:33,791-Speed 5291.19 samples/sec Loss 15.8866 Epoch: 7 Global Step: 37500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:03:43,301-Speed 5384.20 samples/sec Loss 15.9416 Epoch: 7 Global Step: 37550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:03:52,849-Speed 5362.60 samples/sec Loss 15.8991 Epoch: 7 Global Step: 37600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:04:02,283-Speed 5427.48 samples/sec Loss 15.8817 Epoch: 7 Global Step: 37650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:04:11,725-Speed 5423.02 samples/sec Loss 15.8981 Epoch: 7 Global Step: 37700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:04:21,376-Speed 5305.47 samples/sec Loss 15.8702 Epoch: 7 Global Step: 37750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:04:30,670-Speed 5509.63 samples/sec Loss 15.8527 Epoch: 7 Global Step: 37800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:04:40,453-Speed 5233.84 samples/sec Loss 15.9009 Epoch: 7 Global Step: 37850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:04:49,778-Speed 5490.49 samples/sec Loss 15.9018 Epoch: 7 Global Step: 37900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:04:59,191-Speed 5440.05 samples/sec Loss 15.8618 Epoch: 7 Global Step: 37950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 
00:05:08,630-Speed 5424.92 samples/sec Loss 15.8907 Epoch: 7 Global Step: 38000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:05:25,387-[lfw][38000]XNorm: 23.433744 Training: 2021-03-18 00:05:25,387-[lfw][38000]Accuracy-Flip: 0.99183+-0.00369 Training: 2021-03-18 00:05:25,399-[lfw][38000]Accuracy-Highest: 0.99333 Training: 2021-03-18 00:05:43,840-[cfp_fp][38000]XNorm: 19.345214 Training: 2021-03-18 00:05:43,840-[cfp_fp][38000]Accuracy-Flip: 0.91186+-0.01523 Training: 2021-03-18 00:05:43,840-[cfp_fp][38000]Accuracy-Highest: 0.92400 Training: 2021-03-18 00:05:59,796-[agedb_30][38000]XNorm: 22.387933 Training: 2021-03-18 00:05:59,797-[agedb_30][38000]Accuracy-Flip: 0.94050+-0.01036 Training: 2021-03-18 00:05:59,797-[agedb_30][38000]Accuracy-Highest: 0.94500 Training: 2021-03-18 00:06:09,213-Speed 845.12 samples/sec Loss 15.9753 Epoch: 7 Global Step: 38050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:06:18,540-Speed 5490.24 samples/sec Loss 15.9165 Epoch: 7 Global Step: 38100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:06:28,080-Speed 5367.08 samples/sec Loss 15.9109 Epoch: 7 Global Step: 38150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:06:37,444-Speed 5467.96 samples/sec Loss 15.7876 Epoch: 7 Global Step: 38200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:06:47,335-Speed 5177.00 samples/sec Loss 15.9399 Epoch: 7 Global Step: 38250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:06:56,724-Speed 5453.24 samples/sec Loss 15.8418 Epoch: 7 Global Step: 38300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:07:06,412-Speed 5285.00 samples/sec Loss 15.8981 Epoch: 7 Global Step: 38350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:07:16,156-Speed 5254.82 samples/sec Loss 15.8497 Epoch: 7 Global Step: 38400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:07:25,896-Speed 5257.14 samples/sec Loss 15.8403 Epoch: 
7 Global Step: 38450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:07:35,684-Speed 5231.06 samples/sec Loss 15.7859 Epoch: 7 Global Step: 38500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:07:45,087-Speed 5445.70 samples/sec Loss 15.9479 Epoch: 7 Global Step: 38550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:07:54,566-Speed 5401.55 samples/sec Loss 15.8819 Epoch: 7 Global Step: 38600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:08:04,210-Speed 5309.25 samples/sec Loss 15.8551 Epoch: 7 Global Step: 38650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:08:13,633-Speed 5434.19 samples/sec Loss 15.8809 Epoch: 7 Global Step: 38700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:08:23,224-Speed 5338.82 samples/sec Loss 15.9392 Epoch: 7 Global Step: 38750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:08:32,636-Speed 5440.12 samples/sec Loss 15.8884 Epoch: 7 Global Step: 38800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:08:42,037-Speed 5446.47 samples/sec Loss 15.9275 Epoch: 7 Global Step: 38850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:08:51,961-Speed 5159.54 samples/sec Loss 15.9030 Epoch: 7 Global Step: 38900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:09:01,371-Speed 5441.29 samples/sec Loss 15.8332 Epoch: 7 Global Step: 38950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:09:11,063-Speed 5283.27 samples/sec Loss 15.7817 Epoch: 7 Global Step: 39000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:09:20,819-Speed 5248.17 samples/sec Loss 15.8876 Epoch: 7 Global Step: 39050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:09:30,353-Speed 5370.92 samples/sec Loss 15.9409 Epoch: 7 Global Step: 39100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:09:39,788-Speed 5427.08 samples/sec Loss 15.7777 Epoch: 7 Global 
Step: 39150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:09:49,220-Speed 5428.57 samples/sec Loss 15.8965 Epoch: 7 Global Step: 39200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:09:58,495-Speed 5520.13 samples/sec Loss 15.9220 Epoch: 7 Global Step: 39250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:10:07,890-Speed 5450.36 samples/sec Loss 15.8619 Epoch: 7 Global Step: 39300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:10:17,431-Speed 5366.62 samples/sec Loss 15.8299 Epoch: 7 Global Step: 39350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:10:26,752-Speed 5493.51 samples/sec Loss 15.8723 Epoch: 7 Global Step: 39400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:10:36,052-Speed 5505.27 samples/sec Loss 15.8990 Epoch: 7 Global Step: 39450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:10:45,609-Speed 5358.07 samples/sec Loss 15.8698 Epoch: 7 Global Step: 39500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:10:55,064-Speed 5415.27 samples/sec Loss 15.8840 Epoch: 7 Global Step: 39550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:11:04,497-Speed 5427.84 samples/sec Loss 15.9111 Epoch: 7 Global Step: 39600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:11:14,407-Speed 5166.97 samples/sec Loss 15.9184 Epoch: 7 Global Step: 39650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:11:23,830-Speed 5433.80 samples/sec Loss 15.8937 Epoch: 7 Global Step: 39700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:11:33,259-Speed 5430.50 samples/sec Loss 15.7784 Epoch: 7 Global Step: 39750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:11:42,784-Speed 5375.62 samples/sec Loss 15.8497 Epoch: 7 Global Step: 39800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:11:52,230-Speed 5421.13 samples/sec Loss 15.8962 Epoch: 7 Global Step: 39850 
Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:12:04,449-Speed 4190.20 samples/sec Loss 15.1826 Epoch: 8 Global Step: 39900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:12:13,970-Speed 5378.09 samples/sec Loss 15.1656 Epoch: 8 Global Step: 39950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:12:23,409-Speed 5424.84 samples/sec Loss 15.3461 Epoch: 8 Global Step: 40000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:12:40,243-[lfw][40000]XNorm: 24.603416 Training: 2021-03-18 00:12:40,243-[lfw][40000]Accuracy-Flip: 0.99267+-0.00467 Training: 2021-03-18 00:12:40,243-[lfw][40000]Accuracy-Highest: 0.99333 Training: 2021-03-18 00:12:58,826-[cfp_fp][40000]XNorm: 20.138668 Training: 2021-03-18 00:12:58,826-[cfp_fp][40000]Accuracy-Flip: 0.91643+-0.01238 Training: 2021-03-18 00:12:58,826-[cfp_fp][40000]Accuracy-Highest: 0.92400 Training: 2021-03-18 00:13:14,888-[agedb_30][40000]XNorm: 23.553715 Training: 2021-03-18 00:13:14,888-[agedb_30][40000]Accuracy-Flip: 0.93950+-0.01340 Training: 2021-03-18 00:13:14,888-[agedb_30][40000]Accuracy-Highest: 0.94500 Training: 2021-03-18 00:13:24,314-Speed 840.66 samples/sec Loss 15.4314 Epoch: 8 Global Step: 40050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:13:33,838-Speed 5375.99 samples/sec Loss 15.5416 Epoch: 8 Global Step: 40100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:13:43,219-Speed 5458.28 samples/sec Loss 15.5421 Epoch: 8 Global Step: 40150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:13:52,598-Speed 5459.33 samples/sec Loss 15.7180 Epoch: 8 Global Step: 40200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:14:02,075-Speed 5402.97 samples/sec Loss 15.7788 Epoch: 8 Global Step: 40250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:14:11,633-Speed 5357.29 samples/sec Loss 15.6425 Epoch: 8 Global Step: 40300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 
2021-03-18 00:14:21,410-Speed 5237.36 samples/sec Loss 15.6264 Epoch: 8 Global Step: 40350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:14:31,173-Speed 5244.68 samples/sec Loss 15.7886 Epoch: 8 Global Step: 40400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:14:40,902-Speed 5262.81 samples/sec Loss 15.8173 Epoch: 8 Global Step: 40450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:14:50,571-Speed 5295.95 samples/sec Loss 15.8454 Epoch: 8 Global Step: 40500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:14:59,919-Speed 5477.42 samples/sec Loss 15.8384 Epoch: 8 Global Step: 40550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:15:09,614-Speed 5281.42 samples/sec Loss 15.7893 Epoch: 8 Global Step: 40600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:15:19,112-Speed 5390.90 samples/sec Loss 15.8464 Epoch: 8 Global Step: 40650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:15:28,690-Speed 5346.05 samples/sec Loss 15.8718 Epoch: 8 Global Step: 40700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:15:38,234-Speed 5364.75 samples/sec Loss 15.8875 Epoch: 8 Global Step: 40750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:15:47,713-Speed 5402.00 samples/sec Loss 15.9600 Epoch: 8 Global Step: 40800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:15:57,377-Speed 5298.52 samples/sec Loss 15.8415 Epoch: 8 Global Step: 40850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:16:07,585-Speed 5015.92 samples/sec Loss 15.8146 Epoch: 8 Global Step: 40900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:16:17,155-Speed 5350.68 samples/sec Loss 15.8346 Epoch: 8 Global Step: 40950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:16:26,722-Speed 5351.67 samples/sec Loss 15.8228 Epoch: 8 Global Step: 41000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 
00:16:36,083-Speed 5469.99 samples/sec Loss 15.8733 Epoch: 8 Global Step: 41050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:16:45,493-Speed 5441.36 samples/sec Loss 15.8967 Epoch: 8 Global Step: 41100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:16:54,944-Speed 5418.17 samples/sec Loss 15.8230 Epoch: 8 Global Step: 41150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:17:04,507-Speed 5354.22 samples/sec Loss 15.8406 Epoch: 8 Global Step: 41200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:17:14,233-Speed 5264.56 samples/sec Loss 15.9099 Epoch: 8 Global Step: 41250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:17:23,765-Speed 5371.85 samples/sec Loss 15.9380 Epoch: 8 Global Step: 41300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:17:33,271-Speed 5386.55 samples/sec Loss 15.9056 Epoch: 8 Global Step: 41350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:17:42,645-Speed 5461.97 samples/sec Loss 15.8866 Epoch: 8 Global Step: 41400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:17:52,235-Speed 5338.86 samples/sec Loss 15.9563 Epoch: 8 Global Step: 41450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:18:01,812-Speed 5346.98 samples/sec Loss 15.8712 Epoch: 8 Global Step: 41500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:18:11,263-Speed 5418.20 samples/sec Loss 15.9214 Epoch: 8 Global Step: 41550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:18:20,859-Speed 5335.47 samples/sec Loss 15.8236 Epoch: 8 Global Step: 41600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:18:30,290-Speed 5429.65 samples/sec Loss 15.8649 Epoch: 8 Global Step: 41650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:18:39,684-Speed 5450.42 samples/sec Loss 15.8495 Epoch: 8 Global Step: 41700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 
00:18:49,462-Speed 5236.81 samples/sec Loss 15.8524 Epoch: 8 Global Step: 41750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:18:59,140-Speed 5290.27 samples/sec Loss 15.8298 Epoch: 8 Global Step: 41800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:19:08,855-Speed 5270.72 samples/sec Loss 15.8415 Epoch: 8 Global Step: 41850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:19:18,279-Speed 5433.17 samples/sec Loss 15.8210 Epoch: 8 Global Step: 41900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:19:28,113-Speed 5207.26 samples/sec Loss 15.8320 Epoch: 8 Global Step: 41950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:19:37,497-Speed 5456.33 samples/sec Loss 15.8296 Epoch: 8 Global Step: 42000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:19:54,185-[lfw][42000]XNorm: 24.014952 Training: 2021-03-18 00:19:54,185-[lfw][42000]Accuracy-Flip: 0.99167+-0.00342 Training: 2021-03-18 00:19:54,185-[lfw][42000]Accuracy-Highest: 0.99333 Training: 2021-03-18 00:20:12,719-[cfp_fp][42000]XNorm: 19.626186 Training: 2021-03-18 00:20:12,719-[cfp_fp][42000]Accuracy-Flip: 0.91586+-0.01308 Training: 2021-03-18 00:20:12,719-[cfp_fp][42000]Accuracy-Highest: 0.92400 Training: 2021-03-18 00:20:28,727-[agedb_30][42000]XNorm: 22.477699 Training: 2021-03-18 00:20:28,728-[agedb_30][42000]Accuracy-Flip: 0.93733+-0.01065 Training: 2021-03-18 00:20:28,728-[agedb_30][42000]Accuracy-Highest: 0.94500 Training: 2021-03-18 00:20:38,053-Speed 845.51 samples/sec Loss 15.8761 Epoch: 8 Global Step: 42050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:20:47,401-Speed 5477.36 samples/sec Loss 15.8807 Epoch: 8 Global Step: 42100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:20:56,837-Speed 5426.22 samples/sec Loss 15.9012 Epoch: 8 Global Step: 42150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:21:06,385-Speed 5362.85 samples/sec Loss 15.9110 Epoch: 
8 Global Step: 42200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:21:15,879-Speed 5393.44 samples/sec Loss 15.8402 Epoch: 8 Global Step: 42250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:21:25,225-Speed 5478.40 samples/sec Loss 15.8971 Epoch: 8 Global Step: 42300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:21:35,216-Speed 5124.76 samples/sec Loss 15.9285 Epoch: 8 Global Step: 42350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:21:44,753-Speed 5368.76 samples/sec Loss 15.9416 Epoch: 8 Global Step: 42400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:21:54,074-Speed 5493.36 samples/sec Loss 15.8154 Epoch: 8 Global Step: 42450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:22:03,772-Speed 5279.66 samples/sec Loss 15.8551 Epoch: 8 Global Step: 42500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:22:13,188-Speed 5438.07 samples/sec Loss 15.8651 Epoch: 8 Global Step: 42550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:22:23,006-Speed 5215.38 samples/sec Loss 15.8081 Epoch: 8 Global Step: 42600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:22:32,413-Speed 5442.95 samples/sec Loss 15.7810 Epoch: 8 Global Step: 42650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:22:41,640-Speed 5549.46 samples/sec Loss 15.8229 Epoch: 8 Global Step: 42700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:22:50,939-Speed 5506.20 samples/sec Loss 15.9769 Epoch: 8 Global Step: 42750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:23:00,394-Speed 5415.44 samples/sec Loss 15.8257 Epoch: 8 Global Step: 42800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:23:10,054-Speed 5300.33 samples/sec Loss 15.8856 Epoch: 8 Global Step: 42850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:23:19,657-Speed 5332.20 samples/sec Loss 15.9427 Epoch: 8 Global 
Step: 42900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:23:29,065-Speed 5442.52 samples/sec Loss 15.7953 Epoch: 8 Global Step: 42950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:23:38,808-Speed 5255.20 samples/sec Loss 15.7574 Epoch: 8 Global Step: 43000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:23:48,331-Speed 5376.95 samples/sec Loss 15.8197 Epoch: 8 Global Step: 43050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:23:58,157-Speed 5210.74 samples/sec Loss 15.7784 Epoch: 8 Global Step: 43100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:24:07,523-Speed 5467.21 samples/sec Loss 15.7786 Epoch: 8 Global Step: 43150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:24:16,889-Speed 5466.82 samples/sec Loss 15.8341 Epoch: 8 Global Step: 43200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:24:26,704-Speed 5216.88 samples/sec Loss 15.8272 Epoch: 8 Global Step: 43250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:24:36,132-Speed 5431.03 samples/sec Loss 15.8459 Epoch: 8 Global Step: 43300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:24:46,011-Speed 5183.26 samples/sec Loss 15.7701 Epoch: 8 Global Step: 43350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:24:55,520-Speed 5384.49 samples/sec Loss 15.7920 Epoch: 8 Global Step: 43400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:25:04,727-Speed 5561.41 samples/sec Loss 15.8662 Epoch: 8 Global Step: 43450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:25:14,128-Speed 5446.95 samples/sec Loss 15.7563 Epoch: 8 Global Step: 43500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:25:23,619-Speed 5394.66 samples/sec Loss 15.8587 Epoch: 8 Global Step: 43550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:25:33,015-Speed 5449.40 samples/sec Loss 15.8405 Epoch: 8 Global Step: 43600 
Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:25:42,661-Speed 5308.54 samples/sec Loss 15.8310 Epoch: 8 Global Step: 43650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:25:51,946-Speed 5514.68 samples/sec Loss 15.8108 Epoch: 8 Global Step: 43700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:26:01,595-Speed 5306.52 samples/sec Loss 15.8220 Epoch: 8 Global Step: 43750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:26:10,931-Speed 5484.75 samples/sec Loss 15.8665 Epoch: 8 Global Step: 43800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:26:20,554-Speed 5320.62 samples/sec Loss 15.7289 Epoch: 8 Global Step: 43850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:26:30,027-Speed 5405.09 samples/sec Loss 15.8625 Epoch: 8 Global Step: 43900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:26:39,391-Speed 5468.48 samples/sec Loss 15.8340 Epoch: 8 Global Step: 43950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:26:48,942-Speed 5360.77 samples/sec Loss 15.7879 Epoch: 8 Global Step: 44000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:27:05,482-[lfw][44000]XNorm: 22.507961 Training: 2021-03-18 00:27:05,482-[lfw][44000]Accuracy-Flip: 0.99267+-0.00403 Training: 2021-03-18 00:27:05,482-[lfw][44000]Accuracy-Highest: 0.99333 Training: 2021-03-18 00:27:23,902-[cfp_fp][44000]XNorm: 18.846041 Training: 2021-03-18 00:27:23,903-[cfp_fp][44000]Accuracy-Flip: 0.92114+-0.01303 Training: 2021-03-18 00:27:23,903-[cfp_fp][44000]Accuracy-Highest: 0.92400 Training: 2021-03-18 00:27:39,844-[agedb_30][44000]XNorm: 21.789017 Training: 2021-03-18 00:27:39,844-[agedb_30][44000]Accuracy-Flip: 0.93617+-0.01155 Training: 2021-03-18 00:27:39,844-[agedb_30][44000]Accuracy-Highest: 0.94500 Training: 2021-03-18 00:27:49,214-Speed 849.49 samples/sec Loss 15.8167 Epoch: 8 Global Step: 44050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 
2021-03-18 00:27:58,595-Speed 5458.26 samples/sec Loss 15.8265 Epoch: 8 Global Step: 44100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:28:08,096-Speed 5389.01 samples/sec Loss 15.8735 Epoch: 8 Global Step: 44150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:28:17,430-Speed 5485.65 samples/sec Loss 15.8660 Epoch: 8 Global Step: 44200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:28:26,838-Speed 5442.50 samples/sec Loss 15.8404 Epoch: 8 Global Step: 44250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:28:36,191-Speed 5475.15 samples/sec Loss 15.7847 Epoch: 8 Global Step: 44300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:28:45,468-Speed 5518.85 samples/sec Loss 15.7493 Epoch: 8 Global Step: 44350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:28:54,930-Speed 5411.49 samples/sec Loss 15.8688 Epoch: 8 Global Step: 44400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:29:04,514-Speed 5342.94 samples/sec Loss 15.8394 Epoch: 8 Global Step: 44450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:29:14,019-Speed 5386.80 samples/sec Loss 15.7663 Epoch: 8 Global Step: 44500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:29:23,605-Speed 5341.41 samples/sec Loss 15.8590 Epoch: 8 Global Step: 44550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:29:32,796-Speed 5570.79 samples/sec Loss 15.7639 Epoch: 8 Global Step: 44600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:29:42,310-Speed 5382.00 samples/sec Loss 15.8833 Epoch: 8 Global Step: 44650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:29:51,726-Speed 5438.12 samples/sec Loss 15.8790 Epoch: 8 Global Step: 44700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:30:01,120-Speed 5450.22 samples/sec Loss 15.8439 Epoch: 8 Global Step: 44750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 
00:30:10,654-Speed 5370.54 samples/sec Loss 15.8119 Epoch: 8 Global Step: 44800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:30:22,783-Speed 4221.57 samples/sec Loss 15.6995 Epoch: 9 Global Step: 44850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:30:32,318-Speed 5370.05 samples/sec Loss 15.0173 Epoch: 9 Global Step: 44900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:30:41,648-Speed 5488.15 samples/sec Loss 15.1475 Epoch: 9 Global Step: 44950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:30:51,420-Speed 5240.13 samples/sec Loss 15.2688 Epoch: 9 Global Step: 45000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:31:00,866-Speed 5420.23 samples/sec Loss 15.4500 Epoch: 9 Global Step: 45050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:31:10,495-Speed 5317.71 samples/sec Loss 15.4627 Epoch: 9 Global Step: 45100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:31:19,892-Speed 5449.07 samples/sec Loss 15.5891 Epoch: 9 Global Step: 45150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:31:29,294-Speed 5446.05 samples/sec Loss 15.7434 Epoch: 9 Global Step: 45200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:31:38,706-Speed 5440.69 samples/sec Loss 15.6544 Epoch: 9 Global Step: 45250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:31:48,575-Speed 5188.29 samples/sec Loss 15.6748 Epoch: 9 Global Step: 45300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:31:58,133-Speed 5356.89 samples/sec Loss 15.7392 Epoch: 9 Global Step: 45350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:32:07,570-Speed 5425.76 samples/sec Loss 15.8177 Epoch: 9 Global Step: 45400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:32:17,619-Speed 5095.62 samples/sec Loss 15.7621 Epoch: 9 Global Step: 45450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 
00:32:27,015-Speed 5449.27 samples/sec Loss 15.7170 Epoch: 9 Global Step: 45500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:32:36,453-Speed 5425.28 samples/sec Loss 15.7724 Epoch: 9 Global Step: 45550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:32:46,228-Speed 5238.27 samples/sec Loss 15.7351 Epoch: 9 Global Step: 45600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:32:55,762-Speed 5370.52 samples/sec Loss 15.7102 Epoch: 9 Global Step: 45650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:33:05,796-Speed 5102.84 samples/sec Loss 15.8554 Epoch: 9 Global Step: 45700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:33:15,736-Speed 5151.45 samples/sec Loss 15.7849 Epoch: 9 Global Step: 45750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:33:25,062-Speed 5490.40 samples/sec Loss 15.8931 Epoch: 9 Global Step: 45800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:33:34,271-Speed 5559.82 samples/sec Loss 15.8076 Epoch: 9 Global Step: 45850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:33:43,830-Speed 5356.82 samples/sec Loss 15.7980 Epoch: 9 Global Step: 45900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:33:53,083-Speed 5533.78 samples/sec Loss 15.7650 Epoch: 9 Global Step: 45950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:34:02,702-Speed 5323.05 samples/sec Loss 15.7599 Epoch: 9 Global Step: 46000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:34:19,282-[lfw][46000]XNorm: 21.754176 Training: 2021-03-18 00:34:19,282-[lfw][46000]Accuracy-Flip: 0.99217+-0.00454 Training: 2021-03-18 00:34:19,282-[lfw][46000]Accuracy-Highest: 0.99333 Training: 2021-03-18 00:34:37,704-[cfp_fp][46000]XNorm: 17.756897 Training: 2021-03-18 00:34:37,705-[cfp_fp][46000]Accuracy-Flip: 0.91586+-0.01504 Training: 2021-03-18 00:34:37,705-[cfp_fp][46000]Accuracy-Highest: 0.92400 Training: 2021-03-18 
00:34:53,668-[agedb_30][46000]XNorm: 20.467319 Training: 2021-03-18 00:34:53,669-[agedb_30][46000]Accuracy-Flip: 0.93817+-0.01217 Training: 2021-03-18 00:34:53,669-[agedb_30][46000]Accuracy-Highest: 0.94500 Training: 2021-03-18 00:35:02,930-Speed 850.11 samples/sec Loss 15.7150 Epoch: 9 Global Step: 46050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:35:12,472-Speed 5366.19 samples/sec Loss 15.8412 Epoch: 9 Global Step: 46100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:35:22,065-Speed 5337.68 samples/sec Loss 15.7103 Epoch: 9 Global Step: 46150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:35:31,456-Speed 5452.68 samples/sec Loss 15.8292 Epoch: 9 Global Step: 46200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:35:40,798-Speed 5481.30 samples/sec Loss 15.7611 Epoch: 9 Global Step: 46250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:35:50,183-Speed 5455.51 samples/sec Loss 15.7795 Epoch: 9 Global Step: 46300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:35:59,566-Speed 5456.99 samples/sec Loss 15.8711 Epoch: 9 Global Step: 46350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:36:09,106-Speed 5367.40 samples/sec Loss 15.8717 Epoch: 9 Global Step: 46400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:36:18,668-Speed 5354.72 samples/sec Loss 15.8052 Epoch: 9 Global Step: 46450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:36:28,197-Speed 5373.97 samples/sec Loss 15.8663 Epoch: 9 Global Step: 46500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:36:37,756-Speed 5356.49 samples/sec Loss 15.8478 Epoch: 9 Global Step: 46550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:36:47,226-Speed 5406.60 samples/sec Loss 15.8580 Epoch: 9 Global Step: 46600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:36:56,807-Speed 5344.23 samples/sec Loss 15.8593 Epoch: 9 Global 
Step: 46650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:37:06,208-Speed 5446.68 samples/sec Loss 15.7771 Epoch: 9 Global Step: 46700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:37:15,748-Speed 5367.26 samples/sec Loss 15.9023 Epoch: 9 Global Step: 46750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:37:25,253-Speed 5387.36 samples/sec Loss 15.7999 Epoch: 9 Global Step: 46800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:37:34,687-Speed 5427.37 samples/sec Loss 15.8504 Epoch: 9 Global Step: 46850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:37:44,025-Speed 5483.02 samples/sec Loss 15.7834 Epoch: 9 Global Step: 46900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:37:53,420-Speed 5450.12 samples/sec Loss 15.8236 Epoch: 9 Global Step: 46950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:38:03,044-Speed 5320.23 samples/sec Loss 15.7314 Epoch: 9 Global Step: 47000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:38:12,544-Speed 5389.80 samples/sec Loss 15.9086 Epoch: 9 Global Step: 47050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:38:21,714-Speed 5584.31 samples/sec Loss 15.8848 Epoch: 9 Global Step: 47100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:38:31,009-Speed 5508.47 samples/sec Loss 15.7766 Epoch: 9 Global Step: 47150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:38:40,371-Speed 5469.54 samples/sec Loss 15.7949 Epoch: 9 Global Step: 47200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:38:49,826-Speed 5415.07 samples/sec Loss 15.7422 Epoch: 9 Global Step: 47250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:38:59,275-Speed 5419.01 samples/sec Loss 15.8460 Epoch: 9 Global Step: 47300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:39:08,617-Speed 5481.03 samples/sec Loss 15.6857 Epoch: 9 Global Step: 47350 
Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:39:18,073-Speed 5415.03 samples/sec Loss 15.7711 Epoch: 9 Global Step: 47400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:39:27,444-Speed 5463.70 samples/sec Loss 15.7260 Epoch: 9 Global Step: 47450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:39:36,833-Speed 5453.78 samples/sec Loss 15.7662 Epoch: 9 Global Step: 47500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:39:46,163-Speed 5488.12 samples/sec Loss 15.8439 Epoch: 9 Global Step: 47550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:39:55,593-Speed 5429.75 samples/sec Loss 15.7435 Epoch: 9 Global Step: 47600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:40:05,504-Speed 5166.24 samples/sec Loss 15.7620 Epoch: 9 Global Step: 47650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:40:15,337-Speed 5207.10 samples/sec Loss 15.8647 Epoch: 9 Global Step: 47700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:40:24,606-Speed 5524.50 samples/sec Loss 15.8503 Epoch: 9 Global Step: 47750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:40:34,051-Speed 5421.35 samples/sec Loss 15.6667 Epoch: 9 Global Step: 47800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:40:43,399-Speed 5477.04 samples/sec Loss 15.8529 Epoch: 9 Global Step: 47850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:40:53,222-Speed 5212.70 samples/sec Loss 15.8002 Epoch: 9 Global Step: 47900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:41:02,832-Speed 5328.16 samples/sec Loss 15.7269 Epoch: 9 Global Step: 47950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:41:12,326-Speed 5393.42 samples/sec Loss 15.8083 Epoch: 9 Global Step: 48000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:41:29,151-[lfw][48000]XNorm: 24.426112 Training: 2021-03-18 
00:41:29,151-[lfw][48000]Accuracy-Flip: 0.99317+-0.00391 Training: 2021-03-18 00:41:29,151-[lfw][48000]Accuracy-Highest: 0.99333 Training: 2021-03-18 00:41:47,756-[cfp_fp][48000]XNorm: 20.227684 Training: 2021-03-18 00:41:47,756-[cfp_fp][48000]Accuracy-Flip: 0.91929+-0.01194 Training: 2021-03-18 00:41:47,756-[cfp_fp][48000]Accuracy-Highest: 0.92400 Training: 2021-03-18 00:42:03,833-[agedb_30][48000]XNorm: 23.889078 Training: 2021-03-18 00:42:03,833-[agedb_30][48000]Accuracy-Flip: 0.93667+-0.01630 Training: 2021-03-18 00:42:03,833-[agedb_30][48000]Accuracy-Highest: 0.94500 Training: 2021-03-18 00:42:13,507-Speed 836.86 samples/sec Loss 15.7560 Epoch: 9 Global Step: 48050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:42:22,951-Speed 5421.71 samples/sec Loss 15.9063 Epoch: 9 Global Step: 48100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:42:32,488-Speed 5369.15 samples/sec Loss 15.6971 Epoch: 9 Global Step: 48150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:42:42,404-Speed 5163.47 samples/sec Loss 15.8286 Epoch: 9 Global Step: 48200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:42:51,768-Speed 5468.47 samples/sec Loss 15.8242 Epoch: 9 Global Step: 48250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:43:01,148-Speed 5458.56 samples/sec Loss 15.9292 Epoch: 9 Global Step: 48300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:43:10,695-Speed 5363.33 samples/sec Loss 15.8051 Epoch: 9 Global Step: 48350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:43:19,914-Speed 5554.08 samples/sec Loss 15.7790 Epoch: 9 Global Step: 48400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:43:29,259-Speed 5479.56 samples/sec Loss 15.8795 Epoch: 9 Global Step: 48450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:43:38,703-Speed 5421.50 samples/sec Loss 15.8478 Epoch: 9 Global Step: 48500 Fp16 Grad Scale: 16384 Required: 5 hours 
Training: 2021-03-18 00:43:48,091-Speed 5454.38 samples/sec Loss 15.8676 Epoch: 9 Global Step: 48550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:43:57,368-Speed 5519.33 samples/sec Loss 15.8261 Epoch: 9 Global Step: 48600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:44:06,967-Speed 5334.41 samples/sec Loss 15.8204 Epoch: 9 Global Step: 48650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:44:16,364-Speed 5449.03 samples/sec Loss 15.8328 Epoch: 9 Global Step: 48700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:44:25,786-Speed 5434.44 samples/sec Loss 15.9462 Epoch: 9 Global Step: 48750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:44:35,182-Speed 5449.91 samples/sec Loss 15.7187 Epoch: 9 Global Step: 48800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:44:44,647-Speed 5409.62 samples/sec Loss 15.7250 Epoch: 9 Global Step: 48850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:44:54,201-Speed 5359.53 samples/sec Loss 15.8306 Epoch: 9 Global Step: 48900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:45:03,662-Speed 5411.72 samples/sec Loss 15.6665 Epoch: 9 Global Step: 48950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:45:13,215-Speed 5359.76 samples/sec Loss 15.7983 Epoch: 9 Global Step: 49000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:45:22,638-Speed 5433.78 samples/sec Loss 15.8566 Epoch: 9 Global Step: 49050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:45:32,023-Speed 5455.80 samples/sec Loss 15.7466 Epoch: 9 Global Step: 49100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:45:41,358-Speed 5485.38 samples/sec Loss 15.9051 Epoch: 9 Global Step: 49150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:45:50,738-Speed 5458.73 samples/sec Loss 15.8174 Epoch: 9 Global Step: 49200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 
2021-03-18 00:46:00,261-Speed 5376.59 samples/sec Loss 15.7655 Epoch: 9 Global Step: 49250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:46:09,699-Speed 5425.21 samples/sec Loss 15.7253 Epoch: 9 Global Step: 49300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:46:19,431-Speed 5261.39 samples/sec Loss 15.7742 Epoch: 9 Global Step: 49350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:46:29,091-Speed 5300.72 samples/sec Loss 15.7500 Epoch: 9 Global Step: 49400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:46:38,732-Speed 5311.09 samples/sec Loss 15.6998 Epoch: 9 Global Step: 49450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-18 00:46:48,108-Speed 5460.93 samples/sec Loss 15.7696 Epoch: 9 Global Step: 49500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:46:57,668-Speed 5355.70 samples/sec Loss 15.8841 Epoch: 9 Global Step: 49550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:47:06,969-Speed 5505.00 samples/sec Loss 15.7846 Epoch: 9 Global Step: 49600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:47:16,704-Speed 5259.83 samples/sec Loss 15.8792 Epoch: 9 Global Step: 49650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:47:25,929-Speed 5550.60 samples/sec Loss 15.8692 Epoch: 9 Global Step: 49700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:47:35,334-Speed 5444.32 samples/sec Loss 15.7082 Epoch: 9 Global Step: 49750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:47:44,905-Speed 5349.49 samples/sec Loss 15.8669 Epoch: 9 Global Step: 49800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:47:56,984-Speed 4239.14 samples/sec Loss 14.7973 Epoch: 10 Global Step: 49850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:48:06,589-Speed 5331.00 samples/sec Loss 13.3208 Epoch: 10 Global Step: 49900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 
2021-03-18 00:48:16,153-Speed 5354.07 samples/sec Loss 12.9042 Epoch: 10 Global Step: 49950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:48:25,987-Speed 5206.68 samples/sec Loss 12.6091 Epoch: 10 Global Step: 50000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:48:42,893-[lfw][50000]XNorm: 23.518752 Training: 2021-03-18 00:48:42,893-[lfw][50000]Accuracy-Flip: 0.99433+-0.00335 Training: 2021-03-18 00:48:42,895-[lfw][50000]Accuracy-Highest: 0.99433 Training: 2021-03-18 00:49:01,426-[cfp_fp][50000]XNorm: 19.051778 Training: 2021-03-18 00:49:01,426-[cfp_fp][50000]Accuracy-Flip: 0.94557+-0.01240 Training: 2021-03-18 00:49:01,426-[cfp_fp][50000]Accuracy-Highest: 0.94557 Training: 2021-03-18 00:49:17,452-[agedb_30][50000]XNorm: 22.488380 Training: 2021-03-18 00:49:17,452-[agedb_30][50000]Accuracy-Flip: 0.95667+-0.00916 Training: 2021-03-18 00:49:17,452-[agedb_30][50000]Accuracy-Highest: 0.95667 Training: 2021-03-18 00:49:26,881-Speed 840.81 samples/sec Loss 12.2897 Epoch: 10 Global Step: 50050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:49:36,873-Speed 5124.47 samples/sec Loss 11.9902 Epoch: 10 Global Step: 50100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:49:46,623-Speed 5251.59 samples/sec Loss 11.7555 Epoch: 10 Global Step: 50150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:49:56,055-Speed 5428.86 samples/sec Loss 11.4969 Epoch: 10 Global Step: 50200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:50:05,346-Speed 5511.07 samples/sec Loss 11.3186 Epoch: 10 Global Step: 50250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:50:14,778-Speed 5428.61 samples/sec Loss 11.0871 Epoch: 10 Global Step: 50300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:50:24,134-Speed 5472.63 samples/sec Loss 11.0242 Epoch: 10 Global Step: 50350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:50:33,746-Speed 5327.35 samples/sec 
Loss 10.8101 Epoch: 10 Global Step: 50400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:50:43,150-Speed 5444.76 samples/sec Loss 10.7072 Epoch: 10 Global Step: 50450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:50:52,697-Speed 5363.16 samples/sec Loss 10.5036 Epoch: 10 Global Step: 50500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:51:02,565-Speed 5188.72 samples/sec Loss 10.4923 Epoch: 10 Global Step: 50550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:51:12,437-Speed 5186.91 samples/sec Loss 10.3381 Epoch: 10 Global Step: 50600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:51:22,299-Speed 5191.95 samples/sec Loss 10.2120 Epoch: 10 Global Step: 50650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:51:31,884-Speed 5341.86 samples/sec Loss 9.9817 Epoch: 10 Global Step: 50700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:51:41,201-Speed 5496.01 samples/sec Loss 9.8523 Epoch: 10 Global Step: 50750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:51:50,737-Speed 5369.33 samples/sec Loss 9.7638 Epoch: 10 Global Step: 50800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:52:00,062-Speed 5491.10 samples/sec Loss 9.7223 Epoch: 10 Global Step: 50850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:52:09,631-Speed 5351.25 samples/sec Loss 9.5923 Epoch: 10 Global Step: 50900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:52:19,010-Speed 5459.03 samples/sec Loss 9.4713 Epoch: 10 Global Step: 50950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:52:28,556-Speed 5364.26 samples/sec Loss 9.4299 Epoch: 10 Global Step: 51000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:52:37,771-Speed 5556.41 samples/sec Loss 9.3667 Epoch: 10 Global Step: 51050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:52:47,210-Speed 5424.72 samples/sec Loss 
9.2276 Epoch: 10 Global Step: 51100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:52:56,784-Speed 5347.76 samples/sec Loss 9.1101 Epoch: 10 Global Step: 51150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:53:06,026-Speed 5540.24 samples/sec Loss 9.1083 Epoch: 10 Global Step: 51200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:53:15,507-Speed 5400.99 samples/sec Loss 8.9410 Epoch: 10 Global Step: 51250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:53:24,849-Speed 5480.73 samples/sec Loss 9.0222 Epoch: 10 Global Step: 51300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:53:34,343-Speed 5393.26 samples/sec Loss 8.7982 Epoch: 10 Global Step: 51350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:53:43,702-Speed 5471.36 samples/sec Loss 8.7518 Epoch: 10 Global Step: 51400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:53:53,358-Speed 5302.51 samples/sec Loss 8.7769 Epoch: 10 Global Step: 51450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:54:03,019-Speed 5300.03 samples/sec Loss 8.6684 Epoch: 10 Global Step: 51500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:54:12,365-Speed 5479.15 samples/sec Loss 8.6019 Epoch: 10 Global Step: 51550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:54:21,757-Speed 5451.46 samples/sec Loss 8.5573 Epoch: 10 Global Step: 51600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:54:31,060-Speed 5504.47 samples/sec Loss 8.4720 Epoch: 10 Global Step: 51650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:54:40,308-Speed 5536.71 samples/sec Loss 8.4003 Epoch: 10 Global Step: 51700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:54:49,692-Speed 5456.10 samples/sec Loss 8.4156 Epoch: 10 Global Step: 51750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:54:59,076-Speed 5456.32 samples/sec Loss 8.3048 
Epoch: 10 Global Step: 51800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:55:08,763-Speed 5285.87 samples/sec Loss 8.2731 Epoch: 10 Global Step: 51850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:55:18,250-Speed 5397.46 samples/sec Loss 8.2304 Epoch: 10 Global Step: 51900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:55:27,688-Speed 5424.88 samples/sec Loss 8.2310 Epoch: 10 Global Step: 51950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:55:37,058-Speed 5464.75 samples/sec Loss 8.1549 Epoch: 10 Global Step: 52000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:55:53,798-[lfw][52000]XNorm: 22.234551 Training: 2021-03-18 00:55:53,798-[lfw][52000]Accuracy-Flip: 0.99500+-0.00247 Training: 2021-03-18 00:55:53,799-[lfw][52000]Accuracy-Highest: 0.99500 Training: 2021-03-18 00:56:12,329-[cfp_fp][52000]XNorm: 18.333517 Training: 2021-03-18 00:56:12,329-[cfp_fp][52000]Accuracy-Flip: 0.96086+-0.00920 Training: 2021-03-18 00:56:12,330-[cfp_fp][52000]Accuracy-Highest: 0.96086 Training: 2021-03-18 00:56:28,324-[agedb_30][52000]XNorm: 21.485381 Training: 2021-03-18 00:56:28,324-[agedb_30][52000]Accuracy-Flip: 0.96100+-0.00989 Training: 2021-03-18 00:56:28,324-[agedb_30][52000]Accuracy-Highest: 0.96100 Training: 2021-03-18 00:56:37,631-Speed 845.27 samples/sec Loss 8.1557 Epoch: 10 Global Step: 52050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:56:47,321-Speed 5284.03 samples/sec Loss 8.1127 Epoch: 10 Global Step: 52100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:56:56,833-Speed 5383.30 samples/sec Loss 8.0215 Epoch: 10 Global Step: 52150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:57:06,153-Speed 5493.96 samples/sec Loss 7.9526 Epoch: 10 Global Step: 52200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:57:15,678-Speed 5376.09 samples/sec Loss 8.0125 Epoch: 10 Global Step: 52250 Fp16 Grad Scale: 16384 
Required: 4 hours Training: 2021-03-18 00:57:25,015-Speed 5483.47 samples/sec Loss 8.0532 Epoch: 10 Global Step: 52300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:57:34,582-Speed 5352.32 samples/sec Loss 7.9186 Epoch: 10 Global Step: 52350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:57:43,961-Speed 5459.28 samples/sec Loss 7.8808 Epoch: 10 Global Step: 52400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:57:53,597-Speed 5313.69 samples/sec Loss 7.9414 Epoch: 10 Global Step: 52450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:58:03,271-Speed 5293.05 samples/sec Loss 7.7857 Epoch: 10 Global Step: 52500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:58:12,686-Speed 5438.41 samples/sec Loss 7.8393 Epoch: 10 Global Step: 52550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:58:22,573-Speed 5178.86 samples/sec Loss 7.8098 Epoch: 10 Global Step: 52600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:58:31,916-Speed 5480.31 samples/sec Loss 7.7067 Epoch: 10 Global Step: 52650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:58:41,533-Speed 5324.51 samples/sec Loss 7.7131 Epoch: 10 Global Step: 52700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:58:50,955-Speed 5434.21 samples/sec Loss 7.7959 Epoch: 10 Global Step: 52750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:59:00,575-Speed 5322.48 samples/sec Loss 7.7163 Epoch: 10 Global Step: 52800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:59:10,132-Speed 5357.72 samples/sec Loss 7.6745 Epoch: 10 Global Step: 52850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:59:19,693-Speed 5355.35 samples/sec Loss 7.7099 Epoch: 10 Global Step: 52900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:59:29,336-Speed 5309.92 samples/sec Loss 7.6376 Epoch: 10 Global Step: 52950 Fp16 Grad Scale: 16384 Required: 4 
hours Training: 2021-03-18 00:59:38,805-Speed 5407.67 samples/sec Loss 7.6521 Epoch: 10 Global Step: 53000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:59:48,740-Speed 5153.72 samples/sec Loss 7.6251 Epoch: 10 Global Step: 53050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 00:59:58,827-Speed 5076.30 samples/sec Loss 7.6287 Epoch: 10 Global Step: 53100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:00:08,342-Speed 5381.09 samples/sec Loss 7.5902 Epoch: 10 Global Step: 53150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:00:17,556-Speed 5557.01 samples/sec Loss 7.5750 Epoch: 10 Global Step: 53200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:00:27,181-Speed 5320.03 samples/sec Loss 7.5237 Epoch: 10 Global Step: 53250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:00:36,662-Speed 5400.34 samples/sec Loss 7.5750 Epoch: 10 Global Step: 53300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:00:45,895-Speed 5545.99 samples/sec Loss 7.5011 Epoch: 10 Global Step: 53350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:00:55,209-Speed 5497.64 samples/sec Loss 7.5715 Epoch: 10 Global Step: 53400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:01:04,691-Speed 5399.81 samples/sec Loss 7.5324 Epoch: 10 Global Step: 53450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:01:14,003-Speed 5498.94 samples/sec Loss 7.5066 Epoch: 10 Global Step: 53500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:01:23,347-Speed 5479.73 samples/sec Loss 7.5640 Epoch: 10 Global Step: 53550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:01:33,106-Speed 5246.61 samples/sec Loss 7.5112 Epoch: 10 Global Step: 53600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:01:42,443-Speed 5484.26 samples/sec Loss 7.5233 Epoch: 10 Global Step: 53650 Fp16 Grad Scale: 16384 Required: 4 hours 
Training: 2021-03-18 01:01:51,776-Speed 5486.48 samples/sec Loss 7.4657 Epoch: 10 Global Step: 53700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:02:01,169-Speed 5451.00 samples/sec Loss 7.4504 Epoch: 10 Global Step: 53750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:02:10,558-Speed 5453.41 samples/sec Loss 7.4733 Epoch: 10 Global Step: 53800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:02:20,137-Speed 5345.50 samples/sec Loss 7.4854 Epoch: 10 Global Step: 53850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:02:29,599-Speed 5411.43 samples/sec Loss 7.5193 Epoch: 10 Global Step: 53900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:02:38,914-Speed 5497.23 samples/sec Loss 7.5075 Epoch: 10 Global Step: 53950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:02:48,457-Speed 5365.51 samples/sec Loss 7.4681 Epoch: 10 Global Step: 54000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:03:05,266-[lfw][54000]XNorm: 23.264443 Training: 2021-03-18 01:03:05,266-[lfw][54000]Accuracy-Flip: 0.99567+-0.00382 Training: 2021-03-18 01:03:05,271-[lfw][54000]Accuracy-Highest: 0.99567 Training: 2021-03-18 01:03:23,725-[cfp_fp][54000]XNorm: 19.338989 Training: 2021-03-18 01:03:23,726-[cfp_fp][54000]Accuracy-Flip: 0.96257+-0.01063 Training: 2021-03-18 01:03:23,726-[cfp_fp][54000]Accuracy-Highest: 0.96257 Training: 2021-03-18 01:03:39,720-[agedb_30][54000]XNorm: 22.559598 Training: 2021-03-18 01:03:39,720-[agedb_30][54000]Accuracy-Flip: 0.96617+-0.00799 Training: 2021-03-18 01:03:39,720-[agedb_30][54000]Accuracy-Highest: 0.96617 Training: 2021-03-18 01:03:49,005-Speed 845.62 samples/sec Loss 7.4727 Epoch: 10 Global Step: 54050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:03:58,391-Speed 5455.35 samples/sec Loss 7.4443 Epoch: 10 Global Step: 54100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:04:07,829-Speed 5425.08 samples/sec 
Loss 7.4683 Epoch: 10 Global Step: 54150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:04:17,247-Speed 5437.01 samples/sec Loss 7.4588 Epoch: 10 Global Step: 54200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:04:26,863-Speed 5324.46 samples/sec Loss 7.4215 Epoch: 10 Global Step: 54250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:04:36,434-Speed 5350.22 samples/sec Loss 7.4416 Epoch: 10 Global Step: 54300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:04:45,836-Speed 5445.69 samples/sec Loss 7.4293 Epoch: 10 Global Step: 54350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:04:55,296-Speed 5412.75 samples/sec Loss 7.5245 Epoch: 10 Global Step: 54400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:05:04,755-Speed 5413.37 samples/sec Loss 7.4007 Epoch: 10 Global Step: 54450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:05:14,295-Speed 5366.98 samples/sec Loss 7.4417 Epoch: 10 Global Step: 54500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:05:23,572-Speed 5519.47 samples/sec Loss 7.4497 Epoch: 10 Global Step: 54550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:05:33,130-Speed 5357.23 samples/sec Loss 7.3957 Epoch: 10 Global Step: 54600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:05:43,154-Speed 5108.10 samples/sec Loss 7.4520 Epoch: 10 Global Step: 54650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:05:52,526-Speed 5463.39 samples/sec Loss 7.4698 Epoch: 10 Global Step: 54700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:06:01,914-Speed 5454.01 samples/sec Loss 7.3615 Epoch: 10 Global Step: 54750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:06:11,378-Speed 5410.25 samples/sec Loss 7.4695 Epoch: 10 Global Step: 54800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:06:23,663-Speed 4168.11 samples/sec Loss 6.5071 
Epoch: 11 Global Step: 54850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:06:33,640-Speed 5131.96 samples/sec Loss 6.5310 Epoch: 11 Global Step: 54900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:06:43,070-Speed 5429.73 samples/sec Loss 6.6032 Epoch: 11 Global Step: 54950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:06:52,499-Speed 5431.08 samples/sec Loss 6.6152 Epoch: 11 Global Step: 55000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:07:01,816-Speed 5495.79 samples/sec Loss 6.6453 Epoch: 11 Global Step: 55050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:07:11,294-Speed 5402.34 samples/sec Loss 6.6575 Epoch: 11 Global Step: 55100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:07:20,679-Speed 5455.75 samples/sec Loss 6.7166 Epoch: 11 Global Step: 55150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:07:30,262-Speed 5342.99 samples/sec Loss 6.7951 Epoch: 11 Global Step: 55200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:07:40,062-Speed 5225.05 samples/sec Loss 6.8578 Epoch: 11 Global Step: 55250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:07:49,581-Speed 5379.42 samples/sec Loss 6.9094 Epoch: 11 Global Step: 55300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:07:58,967-Speed 5454.89 samples/sec Loss 6.8796 Epoch: 11 Global Step: 55350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:08:08,634-Speed 5296.69 samples/sec Loss 6.9298 Epoch: 11 Global Step: 55400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:08:17,798-Speed 5587.67 samples/sec Loss 6.9532 Epoch: 11 Global Step: 55450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:08:27,586-Speed 5230.97 samples/sec Loss 6.9350 Epoch: 11 Global Step: 55500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:08:37,636-Speed 5094.81 samples/sec Loss 6.9436 Epoch: 11 
Global Step: 55550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:08:47,090-Speed 5416.26 samples/sec Loss 7.0262 Epoch: 11 Global Step: 55600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:08:56,533-Speed 5422.34 samples/sec Loss 7.0514 Epoch: 11 Global Step: 55650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:09:05,763-Speed 5547.70 samples/sec Loss 7.0163 Epoch: 11 Global Step: 55700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:09:15,335-Speed 5349.13 samples/sec Loss 7.0964 Epoch: 11 Global Step: 55750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:09:24,648-Speed 5497.73 samples/sec Loss 7.0537 Epoch: 11 Global Step: 55800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:09:34,056-Speed 5442.82 samples/sec Loss 7.1330 Epoch: 11 Global Step: 55850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:09:43,515-Speed 5413.27 samples/sec Loss 7.1689 Epoch: 11 Global Step: 55900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:09:53,366-Speed 5197.79 samples/sec Loss 7.2100 Epoch: 11 Global Step: 55950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:10:02,925-Speed 5356.29 samples/sec Loss 7.1499 Epoch: 11 Global Step: 56000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:10:19,774-[lfw][56000]XNorm: 23.306469 Training: 2021-03-18 01:10:19,775-[lfw][56000]Accuracy-Flip: 0.99533+-0.00393 Training: 2021-03-18 01:10:19,775-[lfw][56000]Accuracy-Highest: 0.99567 Training: 2021-03-18 01:10:38,315-[cfp_fp][56000]XNorm: 19.337568 Training: 2021-03-18 01:10:38,315-[cfp_fp][56000]Accuracy-Flip: 0.96300+-0.01147 Training: 2021-03-18 01:10:38,315-[cfp_fp][56000]Accuracy-Highest: 0.96300 Training: 2021-03-18 01:10:54,298-[agedb_30][56000]XNorm: 22.607039 Training: 2021-03-18 01:10:54,298-[agedb_30][56000]Accuracy-Flip: 0.96317+-0.00848 Training: 2021-03-18 01:10:54,298-[agedb_30][56000]Accuracy-Highest: 
0.96617 Training: 2021-03-18 01:11:03,513-Speed 845.06 samples/sec Loss 7.2485 Epoch: 11 Global Step: 56050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:11:12,920-Speed 5442.91 samples/sec Loss 7.1521 Epoch: 11 Global Step: 56100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:11:22,449-Speed 5373.56 samples/sec Loss 7.2248 Epoch: 11 Global Step: 56150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:11:31,899-Speed 5418.29 samples/sec Loss 7.2905 Epoch: 11 Global Step: 56200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:11:41,456-Speed 5357.63 samples/sec Loss 7.2721 Epoch: 11 Global Step: 56250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:11:50,926-Speed 5406.79 samples/sec Loss 7.2400 Epoch: 11 Global Step: 56300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:12:00,434-Speed 5385.56 samples/sec Loss 7.3231 Epoch: 11 Global Step: 56350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:12:09,830-Speed 5449.11 samples/sec Loss 7.3807 Epoch: 11 Global Step: 56400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:12:19,274-Speed 5421.92 samples/sec Loss 7.3688 Epoch: 11 Global Step: 56450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:12:29,005-Speed 5261.74 samples/sec Loss 7.4012 Epoch: 11 Global Step: 56500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:12:38,516-Speed 5383.46 samples/sec Loss 7.3435 Epoch: 11 Global Step: 56550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:12:48,191-Speed 5292.46 samples/sec Loss 7.3686 Epoch: 11 Global Step: 56600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:12:57,556-Speed 5467.34 samples/sec Loss 7.3750 Epoch: 11 Global Step: 56650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:13:07,296-Speed 5257.06 samples/sec Loss 7.3663 Epoch: 11 Global Step: 56700 Fp16 Grad Scale: 16384 Required: 4 hours 
Training: 2021-03-18 01:13:16,838-Speed 5366.13 samples/sec Loss 7.3460 Epoch: 11 Global Step: 56750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:13:26,365-Speed 5374.98 samples/sec Loss 7.4599 Epoch: 11 Global Step: 56800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:13:35,995-Speed 5317.26 samples/sec Loss 7.3964 Epoch: 11 Global Step: 56850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:13:45,479-Speed 5398.61 samples/sec Loss 7.3464 Epoch: 11 Global Step: 56900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:13:55,106-Speed 5318.73 samples/sec Loss 7.4930 Epoch: 11 Global Step: 56950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:14:04,436-Speed 5488.01 samples/sec Loss 7.4583 Epoch: 11 Global Step: 57000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:14:13,951-Speed 5381.09 samples/sec Loss 7.4514 Epoch: 11 Global Step: 57050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:14:23,444-Speed 5394.03 samples/sec Loss 7.4897 Epoch: 11 Global Step: 57100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:14:32,961-Speed 5380.01 samples/sec Loss 7.4416 Epoch: 11 Global Step: 57150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:14:42,487-Speed 5374.71 samples/sec Loss 7.5167 Epoch: 11 Global Step: 57200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:14:52,174-Speed 5286.05 samples/sec Loss 7.5578 Epoch: 11 Global Step: 57250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:15:01,775-Speed 5333.05 samples/sec Loss 7.4328 Epoch: 11 Global Step: 57300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:15:11,491-Speed 5269.98 samples/sec Loss 7.5182 Epoch: 11 Global Step: 57350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:15:21,079-Speed 5340.53 samples/sec Loss 7.4960 Epoch: 11 Global Step: 57400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 
2021-03-18 01:15:30,460-Speed 5458.28 samples/sec Loss 7.5708 Epoch: 11 Global Step: 57450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:15:39,890-Speed 5429.86 samples/sec Loss 7.5260 Epoch: 11 Global Step: 57500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:15:49,500-Speed 5328.17 samples/sec Loss 7.5254 Epoch: 11 Global Step: 57550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:15:58,878-Speed 5459.53 samples/sec Loss 7.5855 Epoch: 11 Global Step: 57600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:16:08,400-Speed 5377.76 samples/sec Loss 7.6177 Epoch: 11 Global Step: 57650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:16:18,316-Speed 5163.49 samples/sec Loss 7.5997 Epoch: 11 Global Step: 57700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:16:27,895-Speed 5345.41 samples/sec Loss 7.5393 Epoch: 11 Global Step: 57750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:16:37,322-Speed 5431.43 samples/sec Loss 7.5869 Epoch: 11 Global Step: 57800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:16:46,812-Speed 5395.54 samples/sec Loss 7.5859 Epoch: 11 Global Step: 57850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:16:56,475-Speed 5298.77 samples/sec Loss 7.5482 Epoch: 11 Global Step: 57900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:17:06,189-Speed 5271.44 samples/sec Loss 7.5812 Epoch: 11 Global Step: 57950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:17:16,255-Speed 5087.14 samples/sec Loss 7.6178 Epoch: 11 Global Step: 58000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:17:32,818-[lfw][58000]XNorm: 23.469668 Training: 2021-03-18 01:17:32,818-[lfw][58000]Accuracy-Flip: 0.99617+-0.00422 Training: 2021-03-18 01:17:32,818-[lfw][58000]Accuracy-Highest: 0.99617 Training: 2021-03-18 01:17:51,328-[cfp_fp][58000]XNorm: 19.540299 Training: 2021-03-18 
01:17:51,328-[cfp_fp][58000]Accuracy-Flip: 0.96557+-0.00841 Training: 2021-03-18 01:17:51,328-[cfp_fp][58000]Accuracy-Highest: 0.96557 Training: 2021-03-18 01:18:07,306-[agedb_30][58000]XNorm: 22.522428 Training: 2021-03-18 01:18:07,306-[agedb_30][58000]Accuracy-Flip: 0.96167+-0.01200 Training: 2021-03-18 01:18:07,306-[agedb_30][58000]Accuracy-Highest: 0.96617 Training: 2021-03-18 01:18:16,660-Speed 847.62 samples/sec Loss 7.6260 Epoch: 11 Global Step: 58050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:18:26,184-Speed 5376.24 samples/sec Loss 7.6482 Epoch: 11 Global Step: 58100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:18:35,515-Speed 5487.14 samples/sec Loss 7.6037 Epoch: 11 Global Step: 58150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:18:45,122-Speed 5329.90 samples/sec Loss 7.5994 Epoch: 11 Global Step: 58200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:18:54,608-Speed 5397.48 samples/sec Loss 7.6743 Epoch: 11 Global Step: 58250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:19:04,110-Speed 5389.10 samples/sec Loss 7.6132 Epoch: 11 Global Step: 58300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:19:13,692-Speed 5343.68 samples/sec Loss 7.6193 Epoch: 11 Global Step: 58350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:19:23,250-Speed 5356.87 samples/sec Loss 7.6510 Epoch: 11 Global Step: 58400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:19:32,580-Speed 5488.39 samples/sec Loss 7.6215 Epoch: 11 Global Step: 58450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:19:42,187-Speed 5329.49 samples/sec Loss 7.7366 Epoch: 11 Global Step: 58500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:19:51,651-Speed 5410.75 samples/sec Loss 7.6577 Epoch: 11 Global Step: 58550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:20:01,062-Speed 5440.96 samples/sec Loss 7.6400 
Epoch: 11 Global Step: 58600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:20:10,462-Speed 5446.75 samples/sec Loss 7.6548 Epoch: 11 Global Step: 58650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:20:20,063-Speed 5333.63 samples/sec Loss 7.6951 Epoch: 11 Global Step: 58700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:20:29,661-Speed 5334.46 samples/sec Loss 7.7194 Epoch: 11 Global Step: 58750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:20:39,042-Speed 5458.29 samples/sec Loss 7.6269 Epoch: 11 Global Step: 58800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:20:48,700-Speed 5301.73 samples/sec Loss 7.7003 Epoch: 11 Global Step: 58850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:20:58,161-Speed 5411.68 samples/sec Loss 7.6098 Epoch: 11 Global Step: 58900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:21:07,671-Speed 5384.19 samples/sec Loss 7.7370 Epoch: 11 Global Step: 58950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:21:17,221-Speed 5361.83 samples/sec Loss 7.7352 Epoch: 11 Global Step: 59000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:21:26,644-Speed 5433.54 samples/sec Loss 7.7288 Epoch: 11 Global Step: 59050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:21:36,271-Speed 5319.02 samples/sec Loss 7.6177 Epoch: 11 Global Step: 59100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:21:45,712-Speed 5423.59 samples/sec Loss 7.7223 Epoch: 11 Global Step: 59150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:21:55,271-Speed 5356.29 samples/sec Loss 7.8164 Epoch: 11 Global Step: 59200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:22:04,611-Speed 5482.45 samples/sec Loss 7.7178 Epoch: 11 Global Step: 59250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:22:14,079-Speed 5407.76 samples/sec Loss 7.7730 Epoch: 11 
Global Step: 59300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:22:23,678-Speed 5334.50 samples/sec Loss 7.7683 Epoch: 11 Global Step: 59350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:22:33,333-Speed 5303.32 samples/sec Loss 7.7520 Epoch: 11 Global Step: 59400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:22:42,860-Speed 5374.33 samples/sec Loss 7.7370 Epoch: 11 Global Step: 59450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:22:52,435-Speed 5347.80 samples/sec Loss 7.7492 Epoch: 11 Global Step: 59500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:23:01,900-Speed 5409.54 samples/sec Loss 7.7127 Epoch: 11 Global Step: 59550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:23:11,449-Speed 5362.45 samples/sec Loss 7.7897 Epoch: 11 Global Step: 59600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:23:20,799-Speed 5476.18 samples/sec Loss 7.7753 Epoch: 11 Global Step: 59650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:23:30,175-Speed 5461.13 samples/sec Loss 7.8158 Epoch: 11 Global Step: 59700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:23:39,819-Speed 5309.50 samples/sec Loss 7.7761 Epoch: 11 Global Step: 59750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:23:51,887-Speed 4242.91 samples/sec Loss 7.4112 Epoch: 12 Global Step: 59800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:24:01,900-Speed 5113.45 samples/sec Loss 6.8042 Epoch: 12 Global Step: 59850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:24:11,568-Speed 5296.70 samples/sec Loss 6.9180 Epoch: 12 Global Step: 59900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:24:20,989-Speed 5435.10 samples/sec Loss 6.9415 Epoch: 12 Global Step: 59950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:24:30,561-Speed 5348.94 samples/sec Loss 6.9900 Epoch: 12 Global 
Step: 60000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:24:47,420-[lfw][60000]XNorm: 23.517604 Training: 2021-03-18 01:24:47,421-[lfw][60000]Accuracy-Flip: 0.99500+-0.00380 Training: 2021-03-18 01:24:47,421-[lfw][60000]Accuracy-Highest: 0.99617 Training: 2021-03-18 01:25:06,174-[cfp_fp][60000]XNorm: 19.411756 Training: 2021-03-18 01:25:06,175-[cfp_fp][60000]Accuracy-Flip: 0.96171+-0.01080 Training: 2021-03-18 01:25:06,177-[cfp_fp][60000]Accuracy-Highest: 0.96557 Training: 2021-03-18 01:25:22,203-[agedb_30][60000]XNorm: 22.671222 Training: 2021-03-18 01:25:22,203-[agedb_30][60000]Accuracy-Flip: 0.96717+-0.00813 Training: 2021-03-18 01:25:22,203-[agedb_30][60000]Accuracy-Highest: 0.96717 Training: 2021-03-18 01:25:31,614-Speed 838.63 samples/sec Loss 7.0727 Epoch: 12 Global Step: 60050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:25:41,116-Speed 5388.74 samples/sec Loss 7.0829 Epoch: 12 Global Step: 60100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:25:50,558-Speed 5423.08 samples/sec Loss 7.1460 Epoch: 12 Global Step: 60150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:26:00,102-Speed 5364.80 samples/sec Loss 7.2180 Epoch: 12 Global Step: 60200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:26:09,860-Speed 5247.59 samples/sec Loss 7.2158 Epoch: 12 Global Step: 60250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:26:19,268-Speed 5442.64 samples/sec Loss 7.1806 Epoch: 12 Global Step: 60300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:26:28,839-Speed 5349.75 samples/sec Loss 7.3095 Epoch: 12 Global Step: 60350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:26:38,766-Speed 5157.99 samples/sec Loss 7.2858 Epoch: 12 Global Step: 60400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:26:48,655-Speed 5177.74 samples/sec Loss 7.2758 Epoch: 12 Global Step: 60450 Fp16 Grad Scale: 16384 Required: 4 hours 
Training: 2021-03-18 01:26:58,555-Speed 5172.17 samples/sec Loss 7.3049 Epoch: 12 Global Step: 60500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:27:08,537-Speed 5129.36 samples/sec Loss 7.3982 Epoch: 12 Global Step: 60550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:27:18,106-Speed 5351.46 samples/sec Loss 7.4617 Epoch: 12 Global Step: 60600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:27:27,838-Speed 5261.23 samples/sec Loss 7.4233 Epoch: 12 Global Step: 60650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:27:37,434-Speed 5335.81 samples/sec Loss 7.4482 Epoch: 12 Global Step: 60700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:27:47,111-Speed 5290.98 samples/sec Loss 7.4575 Epoch: 12 Global Step: 60750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:27:56,580-Speed 5407.43 samples/sec Loss 7.5036 Epoch: 12 Global Step: 60800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:28:06,171-Speed 5339.25 samples/sec Loss 7.4684 Epoch: 12 Global Step: 60850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:28:15,973-Speed 5223.55 samples/sec Loss 7.5373 Epoch: 12 Global Step: 60900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:28:25,499-Speed 5375.45 samples/sec Loss 7.5067 Epoch: 12 Global Step: 60950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:28:35,073-Speed 5347.96 samples/sec Loss 7.6146 Epoch: 12 Global Step: 61000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:28:44,504-Speed 5429.23 samples/sec Loss 7.5683 Epoch: 12 Global Step: 61050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:28:54,186-Speed 5288.47 samples/sec Loss 7.6282 Epoch: 12 Global Step: 61100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:29:03,713-Speed 5374.42 samples/sec Loss 7.5847 Epoch: 12 Global Step: 61150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 
2021-03-18 01:29:13,286-Speed 5349.19 samples/sec Loss 7.5846 Epoch: 12 Global Step: 61200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:29:22,789-Speed 5388.47 samples/sec Loss 7.6630 Epoch: 12 Global Step: 61250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:29:32,321-Speed 5371.36 samples/sec Loss 7.6871 Epoch: 12 Global Step: 61300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:29:42,024-Speed 5277.32 samples/sec Loss 7.6350 Epoch: 12 Global Step: 61350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:29:51,479-Speed 5415.22 samples/sec Loss 7.6593 Epoch: 12 Global Step: 61400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:30:01,066-Speed 5340.79 samples/sec Loss 7.7161 Epoch: 12 Global Step: 61450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:30:10,763-Speed 5280.42 samples/sec Loss 7.6569 Epoch: 12 Global Step: 61500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:30:20,289-Speed 5374.80 samples/sec Loss 7.7204 Epoch: 12 Global Step: 61550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:30:29,776-Speed 5397.17 samples/sec Loss 7.6354 Epoch: 12 Global Step: 61600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:30:39,325-Speed 5362.18 samples/sec Loss 7.6926 Epoch: 12 Global Step: 61650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:30:49,029-Speed 5276.86 samples/sec Loss 7.6438 Epoch: 12 Global Step: 61700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:30:58,571-Speed 5365.94 samples/sec Loss 7.7003 Epoch: 12 Global Step: 61750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:31:08,144-Speed 5348.54 samples/sec Loss 7.6954 Epoch: 12 Global Step: 61800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:31:17,763-Speed 5323.28 samples/sec Loss 7.6441 Epoch: 12 Global Step: 61850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 
01:31:27,166-Speed 5445.32 samples/sec Loss 7.7113 Epoch: 12 Global Step: 61900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:31:36,786-Speed 5322.94 samples/sec Loss 7.7230 Epoch: 12 Global Step: 61950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:31:46,303-Speed 5380.12 samples/sec Loss 7.7251 Epoch: 12 Global Step: 62000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:32:02,884-[lfw][62000]XNorm: 22.366530 Training: 2021-03-18 01:32:02,884-[lfw][62000]Accuracy-Flip: 0.99583+-0.00359 Training: 2021-03-18 01:32:02,884-[lfw][62000]Accuracy-Highest: 0.99617 Training: 2021-03-18 01:32:21,368-[cfp_fp][62000]XNorm: 18.562498 Training: 2021-03-18 01:32:21,369-[cfp_fp][62000]Accuracy-Flip: 0.96214+-0.01138 Training: 2021-03-18 01:32:21,369-[cfp_fp][62000]Accuracy-Highest: 0.96557 Training: 2021-03-18 01:32:37,329-[agedb_30][62000]XNorm: 21.446005 Training: 2021-03-18 01:32:37,330-[agedb_30][62000]Accuracy-Flip: 0.96667+-0.00904 Training: 2021-03-18 01:32:37,330-[agedb_30][62000]Accuracy-Highest: 0.96717 Training: 2021-03-18 01:32:46,542-Speed 849.96 samples/sec Loss 7.7515 Epoch: 12 Global Step: 62050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:32:56,006-Speed 5410.27 samples/sec Loss 7.7714 Epoch: 12 Global Step: 62100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:33:05,555-Speed 5361.86 samples/sec Loss 7.8110 Epoch: 12 Global Step: 62150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:33:15,430-Speed 5185.01 samples/sec Loss 7.7850 Epoch: 12 Global Step: 62200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:33:24,811-Speed 5458.68 samples/sec Loss 7.8406 Epoch: 12 Global Step: 62250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:33:34,168-Speed 5472.04 samples/sec Loss 7.8315 Epoch: 12 Global Step: 62300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:33:43,939-Speed 5240.47 samples/sec Loss 7.8206 Epoch: 
12 Global Step: 62350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:33:53,384-Speed 5421.10 samples/sec Loss 7.7982 Epoch: 12 Global Step: 62400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:34:03,037-Speed 5304.46 samples/sec Loss 7.7432 Epoch: 12 Global Step: 62450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:34:12,411-Speed 5462.12 samples/sec Loss 7.6954 Epoch: 12 Global Step: 62500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:34:22,173-Speed 5245.38 samples/sec Loss 7.7999 Epoch: 12 Global Step: 62550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:34:31,672-Speed 5390.49 samples/sec Loss 7.7843 Epoch: 12 Global Step: 62600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:34:41,192-Speed 5378.10 samples/sec Loss 7.8289 Epoch: 12 Global Step: 62650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:34:50,755-Speed 5354.67 samples/sec Loss 7.7947 Epoch: 12 Global Step: 62700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:35:00,366-Speed 5327.28 samples/sec Loss 7.7927 Epoch: 12 Global Step: 62750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:35:09,881-Speed 5381.85 samples/sec Loss 7.8563 Epoch: 12 Global Step: 62800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:35:19,902-Speed 5109.58 samples/sec Loss 7.9089 Epoch: 12 Global Step: 62850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:35:29,598-Speed 5280.82 samples/sec Loss 7.8089 Epoch: 12 Global Step: 62900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:35:39,439-Speed 5202.78 samples/sec Loss 7.8114 Epoch: 12 Global Step: 62950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:35:49,248-Speed 5220.13 samples/sec Loss 7.7653 Epoch: 12 Global Step: 63000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:35:58,907-Speed 5300.86 samples/sec Loss 7.8057 Epoch: 12 Global 
Step: 63050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:36:08,259-Speed 5475.10 samples/sec Loss 7.8493 Epoch: 12 Global Step: 63100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:36:17,858-Speed 5334.44 samples/sec Loss 7.7912 Epoch: 12 Global Step: 63150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:36:27,448-Speed 5338.91 samples/sec Loss 7.8010 Epoch: 12 Global Step: 63200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:36:36,890-Speed 5422.91 samples/sec Loss 7.7483 Epoch: 12 Global Step: 63250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:36:46,411-Speed 5378.26 samples/sec Loss 7.8078 Epoch: 12 Global Step: 63300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:36:56,048-Speed 5312.94 samples/sec Loss 7.8779 Epoch: 12 Global Step: 63350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:37:05,581-Speed 5371.26 samples/sec Loss 7.8183 Epoch: 12 Global Step: 63400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:37:15,356-Speed 5237.99 samples/sec Loss 7.8336 Epoch: 12 Global Step: 63450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:37:24,911-Speed 5358.79 samples/sec Loss 7.8457 Epoch: 12 Global Step: 63500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:37:34,465-Speed 5359.38 samples/sec Loss 7.8519 Epoch: 12 Global Step: 63550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:37:43,913-Speed 5419.81 samples/sec Loss 7.7996 Epoch: 12 Global Step: 63600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:37:53,244-Speed 5487.12 samples/sec Loss 7.8749 Epoch: 12 Global Step: 63650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:38:02,944-Speed 5278.60 samples/sec Loss 7.9674 Epoch: 12 Global Step: 63700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:38:12,510-Speed 5353.06 samples/sec Loss 7.8399 Epoch: 12 Global Step: 63750 
Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:38:22,184-Speed 5292.48 samples/sec Loss 7.8919 Epoch: 12 Global Step: 63800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:38:31,671-Speed 5397.07 samples/sec Loss 7.8303 Epoch: 12 Global Step: 63850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:38:41,472-Speed 5224.73 samples/sec Loss 7.8430 Epoch: 12 Global Step: 63900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:38:51,121-Speed 5306.72 samples/sec Loss 7.7977 Epoch: 12 Global Step: 63950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:39:00,941-Speed 5214.12 samples/sec Loss 7.8277 Epoch: 12 Global Step: 64000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:39:17,566-[lfw][64000]XNorm: 22.878225 Training: 2021-03-18 01:39:17,566-[lfw][64000]Accuracy-Flip: 0.99600+-0.00238 Training: 2021-03-18 01:39:17,566-[lfw][64000]Accuracy-Highest: 0.99617 Training: 2021-03-18 01:39:36,103-[cfp_fp][64000]XNorm: 18.847546 Training: 2021-03-18 01:39:36,103-[cfp_fp][64000]Accuracy-Flip: 0.96400+-0.01069 Training: 2021-03-18 01:39:36,103-[cfp_fp][64000]Accuracy-Highest: 0.96557 Training: 2021-03-18 01:39:52,107-[agedb_30][64000]XNorm: 22.142877 Training: 2021-03-18 01:39:52,107-[agedb_30][64000]Accuracy-Flip: 0.96817+-0.00751 Training: 2021-03-18 01:39:52,107-[agedb_30][64000]Accuracy-Highest: 0.96817 Training: 2021-03-18 01:40:01,501-Speed 845.44 samples/sec Loss 7.8106 Epoch: 12 Global Step: 64050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:40:10,943-Speed 5422.84 samples/sec Loss 7.8117 Epoch: 12 Global Step: 64100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:40:20,652-Speed 5274.08 samples/sec Loss 7.7947 Epoch: 12 Global Step: 64150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:40:30,416-Speed 5243.77 samples/sec Loss 7.8990 Epoch: 12 Global Step: 64200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 
2021-03-18 01:40:40,079-Speed 5298.86 samples/sec Loss 7.8737 Epoch: 12 Global Step: 64250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:40:49,574-Speed 5392.63 samples/sec Loss 7.7968 Epoch: 12 Global Step: 64300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:40:59,311-Speed 5258.73 samples/sec Loss 7.8692 Epoch: 12 Global Step: 64350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:41:08,884-Speed 5349.03 samples/sec Loss 7.8767 Epoch: 12 Global Step: 64400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:41:18,373-Speed 5395.76 samples/sec Loss 7.8879 Epoch: 12 Global Step: 64450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:41:27,888-Speed 5381.45 samples/sec Loss 7.8555 Epoch: 12 Global Step: 64500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:41:37,463-Speed 5347.77 samples/sec Loss 7.8677 Epoch: 12 Global Step: 64550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:41:47,039-Speed 5346.69 samples/sec Loss 7.8649 Epoch: 12 Global Step: 64600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:41:56,624-Speed 5342.43 samples/sec Loss 7.8518 Epoch: 12 Global Step: 64650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:42:06,062-Speed 5425.35 samples/sec Loss 7.8773 Epoch: 12 Global Step: 64700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:42:15,329-Speed 5524.72 samples/sec Loss 7.8537 Epoch: 12 Global Step: 64750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:42:27,965-Speed 4052.31 samples/sec Loss 7.2752 Epoch: 13 Global Step: 64800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:42:37,524-Speed 5356.38 samples/sec Loss 6.9444 Epoch: 13 Global Step: 64850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:42:46,965-Speed 5424.03 samples/sec Loss 7.0207 Epoch: 13 Global Step: 64900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 
01:42:56,398-Speed 5427.65 samples/sec Loss 7.0474 Epoch: 13 Global Step: 64950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:43:05,926-Speed 5374.53 samples/sec Loss 7.1001 Epoch: 13 Global Step: 65000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:43:15,454-Speed 5373.93 samples/sec Loss 7.1413 Epoch: 13 Global Step: 65050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:43:25,014-Speed 5355.87 samples/sec Loss 7.1712 Epoch: 13 Global Step: 65100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:43:34,495-Speed 5400.57 samples/sec Loss 7.2650 Epoch: 13 Global Step: 65150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:43:44,524-Speed 5105.78 samples/sec Loss 7.2751 Epoch: 13 Global Step: 65200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:43:54,094-Speed 5350.45 samples/sec Loss 7.3082 Epoch: 13 Global Step: 65250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:44:03,621-Speed 5374.52 samples/sec Loss 7.3492 Epoch: 13 Global Step: 65300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:44:13,402-Speed 5234.80 samples/sec Loss 7.3897 Epoch: 13 Global Step: 65350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:44:23,633-Speed 5004.49 samples/sec Loss 7.4599 Epoch: 13 Global Step: 65400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:44:33,501-Speed 5189.02 samples/sec Loss 7.4478 Epoch: 13 Global Step: 65450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:44:42,881-Speed 5458.90 samples/sec Loss 7.4997 Epoch: 13 Global Step: 65500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:44:52,504-Speed 5320.91 samples/sec Loss 7.4913 Epoch: 13 Global Step: 65550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:45:01,886-Speed 5457.23 samples/sec Loss 7.5573 Epoch: 13 Global Step: 65600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 
01:45:11,623-Speed 5258.96 samples/sec Loss 7.5001 Epoch: 13 Global Step: 65650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:45:21,238-Speed 5325.06 samples/sec Loss 7.5562 Epoch: 13 Global Step: 65700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:45:30,727-Speed 5396.22 samples/sec Loss 7.5560 Epoch: 13 Global Step: 65750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:45:40,333-Speed 5330.16 samples/sec Loss 7.5978 Epoch: 13 Global Step: 65800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:45:50,037-Speed 5276.60 samples/sec Loss 7.5782 Epoch: 13 Global Step: 65850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:45:59,748-Speed 5273.16 samples/sec Loss 7.5698 Epoch: 13 Global Step: 65900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:46:09,265-Speed 5379.98 samples/sec Loss 7.6414 Epoch: 13 Global Step: 65950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:46:18,759-Speed 5393.26 samples/sec Loss 7.6281 Epoch: 13 Global Step: 66000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:46:35,559-[lfw][66000]XNorm: 22.567085 Training: 2021-03-18 01:46:35,559-[lfw][66000]Accuracy-Flip: 0.99600+-0.00300 Training: 2021-03-18 01:46:35,559-[lfw][66000]Accuracy-Highest: 0.99617 Training: 2021-03-18 01:46:53,983-[cfp_fp][66000]XNorm: 18.746796 Training: 2021-03-18 01:46:53,983-[cfp_fp][66000]Accuracy-Flip: 0.96229+-0.00843 Training: 2021-03-18 01:46:53,983-[cfp_fp][66000]Accuracy-Highest: 0.96557 Training: 2021-03-18 01:47:09,987-[agedb_30][66000]XNorm: 21.750803 Training: 2021-03-18 01:47:09,987-[agedb_30][66000]Accuracy-Flip: 0.97000+-0.00860 Training: 2021-03-18 01:47:09,987-[agedb_30][66000]Accuracy-Highest: 0.97000 Training: 2021-03-18 01:47:19,544-Speed 842.32 samples/sec Loss 7.6000 Epoch: 13 Global Step: 66050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:47:29,050-Speed 5386.28 samples/sec Loss 7.6618 Epoch: 
13 Global Step: 66100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:47:38,606-Speed 5358.68 samples/sec Loss 7.6924 Epoch: 13 Global Step: 66150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:47:48,163-Speed 5357.20 samples/sec Loss 7.6825 Epoch: 13 Global Step: 66200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:47:58,047-Speed 5180.51 samples/sec Loss 7.7217 Epoch: 13 Global Step: 66250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:48:07,620-Speed 5348.56 samples/sec Loss 7.6257 Epoch: 13 Global Step: 66300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:48:17,015-Speed 5450.22 samples/sec Loss 7.6709 Epoch: 13 Global Step: 66350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-18 01:48:26,637-Speed 5322.00 samples/sec Loss 7.7612 Epoch: 13 Global Step: 66400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:48:36,175-Speed 5368.28 samples/sec Loss 7.7459 Epoch: 13 Global Step: 66450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:48:45,659-Speed 5398.84 samples/sec Loss 7.7481 Epoch: 13 Global Step: 66500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:48:55,352-Speed 5282.36 samples/sec Loss 7.6734 Epoch: 13 Global Step: 66550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:49:04,749-Speed 5448.52 samples/sec Loss 7.7556 Epoch: 13 Global Step: 66600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:49:14,576-Speed 5210.65 samples/sec Loss 7.7735 Epoch: 13 Global Step: 66650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:49:24,202-Speed 5319.03 samples/sec Loss 7.7250 Epoch: 13 Global Step: 66700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:49:33,723-Speed 5378.08 samples/sec Loss 7.7373 Epoch: 13 Global Step: 66750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:49:43,368-Speed 5308.74 samples/sec Loss 7.7280 Epoch: 13 Global 
Step: 66800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:49:53,322-Speed 5143.99 samples/sec Loss 7.7100 Epoch: 13 Global Step: 66850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:50:02,738-Speed 5437.80 samples/sec Loss 7.6671 Epoch: 13 Global Step: 66900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:50:12,487-Speed 5252.03 samples/sec Loss 7.6879 Epoch: 13 Global Step: 66950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:50:21,968-Speed 5400.70 samples/sec Loss 7.7360 Epoch: 13 Global Step: 67000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:50:31,696-Speed 5263.39 samples/sec Loss 7.7724 Epoch: 13 Global Step: 67050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:50:41,542-Speed 5200.41 samples/sec Loss 7.7816 Epoch: 13 Global Step: 67100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:50:51,065-Speed 5376.97 samples/sec Loss 7.7451 Epoch: 13 Global Step: 67150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:51:00,741-Speed 5291.41 samples/sec Loss 7.7781 Epoch: 13 Global Step: 67200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:51:10,426-Speed 5286.88 samples/sec Loss 7.8307 Epoch: 13 Global Step: 67250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:51:19,902-Speed 5403.94 samples/sec Loss 7.8135 Epoch: 13 Global Step: 67300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:51:29,204-Speed 5504.48 samples/sec Loss 7.7998 Epoch: 13 Global Step: 67350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:51:38,869-Speed 5297.36 samples/sec Loss 7.8194 Epoch: 13 Global Step: 67400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:51:48,350-Speed 5400.73 samples/sec Loss 7.7814 Epoch: 13 Global Step: 67450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:51:57,789-Speed 5424.56 samples/sec Loss 7.7989 Epoch: 13 Global Step: 67500 
Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:52:07,252-Speed 5411.01 samples/sec Loss 7.7419 Epoch: 13 Global Step: 67550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:52:17,075-Speed 5212.60 samples/sec Loss 7.8139 Epoch: 13 Global Step: 67600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:52:26,332-Speed 5531.55 samples/sec Loss 7.8629 Epoch: 13 Global Step: 67650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:52:35,774-Speed 5422.83 samples/sec Loss 7.7892 Epoch: 13 Global Step: 67700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:52:45,253-Speed 5402.02 samples/sec Loss 7.8759 Epoch: 13 Global Step: 67750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:52:54,901-Speed 5307.35 samples/sec Loss 7.7844 Epoch: 13 Global Step: 67800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:53:05,136-Speed 5002.95 samples/sec Loss 7.8334 Epoch: 13 Global Step: 67850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:53:15,016-Speed 5182.16 samples/sec Loss 7.7785 Epoch: 13 Global Step: 67900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:53:24,843-Speed 5210.81 samples/sec Loss 7.8498 Epoch: 13 Global Step: 67950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:53:34,323-Speed 5401.18 samples/sec Loss 7.8898 Epoch: 13 Global Step: 68000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:53:51,052-[lfw][68000]XNorm: 23.160218 Training: 2021-03-18 01:53:51,052-[lfw][68000]Accuracy-Flip: 0.99583+-0.00327 Training: 2021-03-18 01:53:51,052-[lfw][68000]Accuracy-Highest: 0.99617 Training: 2021-03-18 01:54:09,656-[cfp_fp][68000]XNorm: 19.242803 Training: 2021-03-18 01:54:09,656-[cfp_fp][68000]Accuracy-Flip: 0.96457+-0.00859 Training: 2021-03-18 01:54:09,656-[cfp_fp][68000]Accuracy-Highest: 0.96557 Training: 2021-03-18 01:54:25,766-[agedb_30][68000]XNorm: 22.075210 Training: 2021-03-18 
01:54:25,766-[agedb_30][68000]Accuracy-Flip: 0.96717+-0.00860 Training: 2021-03-18 01:54:25,766-[agedb_30][68000]Accuracy-Highest: 0.97000 Training: 2021-03-18 01:54:35,141-Speed 841.86 samples/sec Loss 7.8292 Epoch: 13 Global Step: 68050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:54:44,579-Speed 5425.63 samples/sec Loss 7.7953 Epoch: 13 Global Step: 68100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:54:54,273-Speed 5281.68 samples/sec Loss 7.7934 Epoch: 13 Global Step: 68150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:55:03,940-Speed 5297.16 samples/sec Loss 7.8690 Epoch: 13 Global Step: 68200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:55:13,537-Speed 5335.02 samples/sec Loss 7.7891 Epoch: 13 Global Step: 68250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:55:23,007-Speed 5407.07 samples/sec Loss 7.7986 Epoch: 13 Global Step: 68300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:55:32,351-Speed 5480.02 samples/sec Loss 7.8368 Epoch: 13 Global Step: 68350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:55:41,811-Speed 5412.60 samples/sec Loss 7.7973 Epoch: 13 Global Step: 68400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:55:51,409-Speed 5334.83 samples/sec Loss 7.8759 Epoch: 13 Global Step: 68450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:56:01,022-Speed 5326.00 samples/sec Loss 7.8207 Epoch: 13 Global Step: 68500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:56:10,611-Speed 5339.81 samples/sec Loss 7.8174 Epoch: 13 Global Step: 68550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:56:20,271-Speed 5300.79 samples/sec Loss 7.8763 Epoch: 13 Global Step: 68600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:56:29,945-Speed 5292.82 samples/sec Loss 7.7571 Epoch: 13 Global Step: 68650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 
2021-03-18 01:56:39,660-Speed 5270.52 samples/sec Loss 7.8191 Epoch: 13 Global Step: 68700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:56:49,421-Speed 5246.04 samples/sec Loss 7.8059 Epoch: 13 Global Step: 68750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:56:59,021-Speed 5333.40 samples/sec Loss 7.8616 Epoch: 13 Global Step: 68800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:57:08,548-Speed 5374.36 samples/sec Loss 7.8443 Epoch: 13 Global Step: 68850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:57:18,113-Speed 5353.57 samples/sec Loss 7.8255 Epoch: 13 Global Step: 68900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:57:27,756-Speed 5309.39 samples/sec Loss 7.8466 Epoch: 13 Global Step: 68950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:57:37,395-Speed 5312.52 samples/sec Loss 7.8438 Epoch: 13 Global Step: 69000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:57:46,861-Speed 5408.99 samples/sec Loss 7.7671 Epoch: 13 Global Step: 69050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:57:56,223-Speed 5469.47 samples/sec Loss 7.8699 Epoch: 13 Global Step: 69100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:58:05,920-Speed 5280.23 samples/sec Loss 7.8654 Epoch: 13 Global Step: 69150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:58:15,622-Speed 5277.95 samples/sec Loss 7.8451 Epoch: 13 Global Step: 69200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:58:25,161-Speed 5367.73 samples/sec Loss 7.9006 Epoch: 13 Global Step: 69250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:58:34,635-Speed 5404.36 samples/sec Loss 7.7922 Epoch: 13 Global Step: 69300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:58:44,513-Speed 5184.03 samples/sec Loss 7.8861 Epoch: 13 Global Step: 69350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 
01:58:54,120-Speed 5329.54 samples/sec Loss 7.8881 Epoch: 13 Global Step: 69400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:59:03,518-Speed 5448.18 samples/sec Loss 7.8116 Epoch: 13 Global Step: 69450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:59:12,985-Speed 5408.78 samples/sec Loss 7.8210 Epoch: 13 Global Step: 69500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:59:22,767-Speed 5234.36 samples/sec Loss 7.9244 Epoch: 13 Global Step: 69550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:59:32,507-Speed 5257.09 samples/sec Loss 7.8313 Epoch: 13 Global Step: 69600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:59:42,218-Speed 5272.83 samples/sec Loss 7.8634 Epoch: 13 Global Step: 69650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 01:59:51,954-Speed 5259.16 samples/sec Loss 7.8788 Epoch: 13 Global Step: 69700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:00:04,444-Speed 4099.30 samples/sec Loss 7.8146 Epoch: 14 Global Step: 69750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:00:14,210-Speed 5243.01 samples/sec Loss 6.9455 Epoch: 14 Global Step: 69800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:00:23,750-Speed 5367.28 samples/sec Loss 6.9952 Epoch: 14 Global Step: 69850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:00:33,335-Speed 5342.20 samples/sec Loss 6.9684 Epoch: 14 Global Step: 69900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:00:43,313-Speed 5131.58 samples/sec Loss 7.0423 Epoch: 14 Global Step: 69950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:00:53,005-Speed 5282.80 samples/sec Loss 7.1039 Epoch: 14 Global Step: 70000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:01:09,468-[lfw][70000]XNorm: 22.417305 Training: 2021-03-18 02:01:09,469-[lfw][70000]Accuracy-Flip: 0.99533+-0.00323 Training: 2021-03-18 
02:01:09,469-[lfw][70000]Accuracy-Highest: 0.99617 Training: 2021-03-18 02:01:28,016-[cfp_fp][70000]XNorm: 18.645977 Training: 2021-03-18 02:01:28,017-[cfp_fp][70000]Accuracy-Flip: 0.96371+-0.01029 Training: 2021-03-18 02:01:28,017-[cfp_fp][70000]Accuracy-Highest: 0.96557 Training: 2021-03-18 02:01:44,240-[agedb_30][70000]XNorm: 21.679720 Training: 2021-03-18 02:01:44,240-[agedb_30][70000]Accuracy-Flip: 0.96800+-0.00806 Training: 2021-03-18 02:01:44,240-[agedb_30][70000]Accuracy-Highest: 0.97000 Training: 2021-03-18 02:01:53,594-Speed 845.05 samples/sec Loss 7.2425 Epoch: 14 Global Step: 70050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:02:03,399-Speed 5222.21 samples/sec Loss 7.1336 Epoch: 14 Global Step: 70100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:02:12,945-Speed 5363.56 samples/sec Loss 7.2497 Epoch: 14 Global Step: 70150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:02:22,432-Speed 5397.47 samples/sec Loss 7.2738 Epoch: 14 Global Step: 70200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:02:31,883-Speed 5417.80 samples/sec Loss 7.3044 Epoch: 14 Global Step: 70250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:02:41,991-Speed 5065.57 samples/sec Loss 7.3963 Epoch: 14 Global Step: 70300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:02:52,157-Speed 5036.98 samples/sec Loss 7.3748 Epoch: 14 Global Step: 70350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:03:02,017-Speed 5193.19 samples/sec Loss 7.4484 Epoch: 14 Global Step: 70400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:03:11,834-Speed 5215.37 samples/sec Loss 7.3241 Epoch: 14 Global Step: 70450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:03:21,392-Speed 5357.65 samples/sec Loss 7.4581 Epoch: 14 Global Step: 70500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:03:30,863-Speed 5405.87 samples/sec Loss 7.4381 Epoch: 
14 Global Step: 70550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:03:40,661-Speed 5225.82 samples/sec Loss 7.4815 Epoch: 14 Global Step: 70600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:03:50,301-Speed 5311.65 samples/sec Loss 7.4711 Epoch: 14 Global Step: 70650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:03:59,917-Speed 5324.80 samples/sec Loss 7.5448 Epoch: 14 Global Step: 70700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:04:09,522-Speed 5330.81 samples/sec Loss 7.4805 Epoch: 14 Global Step: 70750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:04:19,015-Speed 5394.17 samples/sec Loss 7.5301 Epoch: 14 Global Step: 70800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:04:28,667-Speed 5304.52 samples/sec Loss 7.5860 Epoch: 14 Global Step: 70850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:04:38,447-Speed 5235.57 samples/sec Loss 7.5649 Epoch: 14 Global Step: 70900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:04:48,007-Speed 5356.15 samples/sec Loss 7.6490 Epoch: 14 Global Step: 70950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:04:57,465-Speed 5413.66 samples/sec Loss 7.5963 Epoch: 14 Global Step: 71000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:05:07,045-Speed 5344.89 samples/sec Loss 7.6740 Epoch: 14 Global Step: 71050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:05:16,511-Speed 5409.30 samples/sec Loss 7.6255 Epoch: 14 Global Step: 71100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:05:26,355-Speed 5201.29 samples/sec Loss 7.6565 Epoch: 14 Global Step: 71150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:05:36,221-Speed 5189.86 samples/sec Loss 7.6820 Epoch: 14 Global Step: 71200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:05:45,839-Speed 5323.93 samples/sec Loss 7.6723 Epoch: 14 Global 
Step: 71250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:05:55,594-Speed 5248.91 samples/sec Loss 7.6590 Epoch: 14 Global Step: 71300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:06:05,263-Speed 5295.92 samples/sec Loss 7.7073 Epoch: 14 Global Step: 71350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:06:14,790-Speed 5374.13 samples/sec Loss 7.6998 Epoch: 14 Global Step: 71400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:06:24,434-Speed 5309.67 samples/sec Loss 7.6681 Epoch: 14 Global Step: 71450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:06:33,870-Speed 5426.48 samples/sec Loss 7.7078 Epoch: 14 Global Step: 71500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:06:43,186-Speed 5496.45 samples/sec Loss 7.6418 Epoch: 14 Global Step: 71550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:06:53,154-Speed 5136.49 samples/sec Loss 7.6896 Epoch: 14 Global Step: 71600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:07:02,650-Speed 5392.41 samples/sec Loss 7.6467 Epoch: 14 Global Step: 71650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:07:12,290-Speed 5311.57 samples/sec Loss 7.7116 Epoch: 14 Global Step: 71700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:07:21,882-Speed 5338.19 samples/sec Loss 7.7906 Epoch: 14 Global Step: 71750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:07:31,448-Speed 5352.99 samples/sec Loss 7.7023 Epoch: 14 Global Step: 71800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:07:40,839-Speed 5451.99 samples/sec Loss 7.6958 Epoch: 14 Global Step: 71850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:07:50,328-Speed 5395.97 samples/sec Loss 7.7699 Epoch: 14 Global Step: 71900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:08:00,080-Speed 5250.72 samples/sec Loss 7.7825 Epoch: 14 Global Step: 71950 
Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:08:09,673-Speed 5337.62 samples/sec Loss 7.6454 Epoch: 14 Global Step: 72000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:08:26,284-[lfw][72000]XNorm: 22.436213 Training: 2021-03-18 02:08:26,284-[lfw][72000]Accuracy-Flip: 0.99467+-0.00296 Training: 2021-03-18 02:08:26,284-[lfw][72000]Accuracy-Highest: 0.99617 Training: 2021-03-18 02:08:44,743-[cfp_fp][72000]XNorm: 18.671610 Training: 2021-03-18 02:08:44,744-[cfp_fp][72000]Accuracy-Flip: 0.96357+-0.00942 Training: 2021-03-18 02:08:44,744-[cfp_fp][72000]Accuracy-Highest: 0.96557 Training: 2021-03-18 02:09:00,741-[agedb_30][72000]XNorm: 21.739360 Training: 2021-03-18 02:09:00,741-[agedb_30][72000]Accuracy-Flip: 0.96867+-0.00988 Training: 2021-03-18 02:09:00,741-[agedb_30][72000]Accuracy-Highest: 0.97000 Training: 2021-03-18 02:09:10,508-Speed 841.64 samples/sec Loss 7.7366 Epoch: 14 Global Step: 72050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:09:20,232-Speed 5265.32 samples/sec Loss 7.7760 Epoch: 14 Global Step: 72100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:09:29,699-Speed 5408.56 samples/sec Loss 7.6982 Epoch: 14 Global Step: 72150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:09:38,997-Speed 5507.16 samples/sec Loss 7.7598 Epoch: 14 Global Step: 72200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:09:48,614-Speed 5324.04 samples/sec Loss 7.7587 Epoch: 14 Global Step: 72250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:09:58,214-Speed 5333.80 samples/sec Loss 7.7832 Epoch: 14 Global Step: 72300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:10:07,758-Speed 5364.96 samples/sec Loss 7.7204 Epoch: 14 Global Step: 72350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:10:17,251-Speed 5393.78 samples/sec Loss 7.7747 Epoch: 14 Global Step: 72400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 
2021-03-18 02:10:27,126-Speed 5185.21 samples/sec Loss 7.7465 Epoch: 14 Global Step: 72450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:10:36,562-Speed 5426.09 samples/sec Loss 7.7914 Epoch: 14 Global Step: 72500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:10:46,336-Speed 5238.60 samples/sec Loss 7.7411 Epoch: 14 Global Step: 72550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:10:56,027-Speed 5283.70 samples/sec Loss 7.8108 Epoch: 14 Global Step: 72600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:11:05,537-Speed 5384.40 samples/sec Loss 7.8501 Epoch: 14 Global Step: 72650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:11:15,394-Speed 5194.61 samples/sec Loss 7.8081 Epoch: 14 Global Step: 72700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:11:24,884-Speed 5395.17 samples/sec Loss 7.7473 Epoch: 14 Global Step: 72750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:11:34,808-Speed 5159.69 samples/sec Loss 7.8066 Epoch: 14 Global Step: 72800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:11:44,345-Speed 5368.63 samples/sec Loss 7.8042 Epoch: 14 Global Step: 72850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:11:54,182-Speed 5205.11 samples/sec Loss 7.7326 Epoch: 14 Global Step: 72900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:12:03,885-Speed 5277.62 samples/sec Loss 7.7712 Epoch: 14 Global Step: 72950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:12:13,386-Speed 5388.74 samples/sec Loss 7.8342 Epoch: 14 Global Step: 73000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:12:22,851-Speed 5409.95 samples/sec Loss 7.7949 Epoch: 14 Global Step: 73050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:12:32,385-Speed 5370.52 samples/sec Loss 7.7952 Epoch: 14 Global Step: 73100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 
02:12:42,146-Speed 5245.70 samples/sec Loss 7.7514 Epoch: 14 Global Step: 73150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:12:51,526-Speed 5458.75 samples/sec Loss 7.7946 Epoch: 14 Global Step: 73200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:13:01,189-Speed 5299.08 samples/sec Loss 7.8181 Epoch: 14 Global Step: 73250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:13:10,871-Speed 5288.24 samples/sec Loss 7.8097 Epoch: 14 Global Step: 73300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:13:20,513-Speed 5310.24 samples/sec Loss 7.7691 Epoch: 14 Global Step: 73350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:13:30,019-Speed 5386.69 samples/sec Loss 7.8570 Epoch: 14 Global Step: 73400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:13:39,400-Speed 5458.36 samples/sec Loss 7.7839 Epoch: 14 Global Step: 73450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:13:49,009-Speed 5328.66 samples/sec Loss 7.7933 Epoch: 14 Global Step: 73500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:13:58,734-Speed 5265.30 samples/sec Loss 7.8336 Epoch: 14 Global Step: 73550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:14:08,311-Speed 5346.22 samples/sec Loss 7.8401 Epoch: 14 Global Step: 73600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:14:17,760-Speed 5419.15 samples/sec Loss 7.7941 Epoch: 14 Global Step: 73650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:14:27,351-Speed 5338.41 samples/sec Loss 7.7990 Epoch: 14 Global Step: 73700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:14:36,887-Speed 5369.83 samples/sec Loss 7.7737 Epoch: 14 Global Step: 73750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:14:46,393-Speed 5386.30 samples/sec Loss 7.7949 Epoch: 14 Global Step: 73800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 
02:14:55,818-Speed 5432.84 samples/sec Loss 7.7693 Epoch: 14 Global Step: 73850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:15:05,359-Speed 5366.45 samples/sec Loss 7.8434 Epoch: 14 Global Step: 73900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:15:14,738-Speed 5459.64 samples/sec Loss 7.7765 Epoch: 14 Global Step: 73950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:15:24,145-Speed 5442.97 samples/sec Loss 7.7936 Epoch: 14 Global Step: 74000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:15:40,553-[lfw][74000]XNorm: 21.348658 Training: 2021-03-18 02:15:40,553-[lfw][74000]Accuracy-Flip: 0.99467+-0.00407 Training: 2021-03-18 02:15:40,553-[lfw][74000]Accuracy-Highest: 0.99617 Training: 2021-03-18 02:15:58,873-[cfp_fp][74000]XNorm: 17.702283 Training: 2021-03-18 02:15:58,873-[cfp_fp][74000]Accuracy-Flip: 0.96171+-0.01018 Training: 2021-03-18 02:15:58,873-[cfp_fp][74000]Accuracy-Highest: 0.96557 Training: 2021-03-18 02:16:14,738-[agedb_30][74000]XNorm: 20.472569 Training: 2021-03-18 02:16:14,739-[agedb_30][74000]Accuracy-Flip: 0.96200+-0.01090 Training: 2021-03-18 02:16:14,739-[agedb_30][74000]Accuracy-Highest: 0.97000 Training: 2021-03-18 02:16:24,026-Speed 855.04 samples/sec Loss 7.8080 Epoch: 14 Global Step: 74050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:16:33,511-Speed 5398.06 samples/sec Loss 7.7697 Epoch: 14 Global Step: 74100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:16:43,244-Speed 5260.85 samples/sec Loss 7.8128 Epoch: 14 Global Step: 74150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:16:52,860-Speed 5324.54 samples/sec Loss 7.7830 Epoch: 14 Global Step: 74200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:17:02,194-Speed 5485.75 samples/sec Loss 7.8242 Epoch: 14 Global Step: 74250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:17:11,737-Speed 5365.79 samples/sec Loss 7.7433 Epoch: 
14 Global Step: 74300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:17:21,361-Speed 5320.42 samples/sec Loss 7.8665 Epoch: 14 Global Step: 74350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:17:30,853-Speed 5394.19 samples/sec Loss 7.7793 Epoch: 14 Global Step: 74400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:17:40,476-Speed 5320.75 samples/sec Loss 7.8652 Epoch: 14 Global Step: 74450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:17:50,242-Speed 5243.16 samples/sec Loss 7.7552 Epoch: 14 Global Step: 74500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:17:59,850-Speed 5329.24 samples/sec Loss 7.7233 Epoch: 14 Global Step: 74550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:18:09,169-Speed 5494.78 samples/sec Loss 7.7366 Epoch: 14 Global Step: 74600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:18:18,742-Speed 5348.76 samples/sec Loss 7.8114 Epoch: 14 Global Step: 74650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:18:28,479-Speed 5258.34 samples/sec Loss 7.7852 Epoch: 14 Global Step: 74700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:18:40,446-Speed 4278.50 samples/sec Loss 7.4684 Epoch: 15 Global Step: 74750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:18:50,254-Speed 5220.85 samples/sec Loss 6.8818 Epoch: 15 Global Step: 74800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:19:00,136-Speed 5181.44 samples/sec Loss 6.9196 Epoch: 15 Global Step: 74850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:19:09,635-Speed 5390.45 samples/sec Loss 7.0231 Epoch: 15 Global Step: 74900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:19:19,141-Speed 5386.67 samples/sec Loss 7.0600 Epoch: 15 Global Step: 74950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:19:28,728-Speed 5340.83 samples/sec Loss 7.0986 Epoch: 15 Global 
Step: 75000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:19:38,717-Speed 5125.90 samples/sec Loss 7.0973 Epoch: 15 Global Step: 75050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:19:48,443-Speed 5265.27 samples/sec Loss 7.1362 Epoch: 15 Global Step: 75100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:19:58,181-Speed 5257.81 samples/sec Loss 7.2334 Epoch: 15 Global Step: 75150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:20:07,867-Speed 5286.39 samples/sec Loss 7.2482 Epoch: 15 Global Step: 75200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:20:17,549-Speed 5288.53 samples/sec Loss 7.2413 Epoch: 15 Global Step: 75250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:20:27,849-Speed 4971.18 samples/sec Loss 7.3149 Epoch: 15 Global Step: 75300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:20:37,519-Speed 5295.08 samples/sec Loss 7.3803 Epoch: 15 Global Step: 75350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:20:47,349-Speed 5208.90 samples/sec Loss 7.3508 Epoch: 15 Global Step: 75400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:20:56,802-Speed 5416.81 samples/sec Loss 7.3608 Epoch: 15 Global Step: 75450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:21:06,409-Speed 5329.58 samples/sec Loss 7.4230 Epoch: 15 Global Step: 75500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:21:15,900-Speed 5395.06 samples/sec Loss 7.4115 Epoch: 15 Global Step: 75550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:21:25,576-Speed 5291.63 samples/sec Loss 7.4449 Epoch: 15 Global Step: 75600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:21:35,125-Speed 5362.49 samples/sec Loss 7.4973 Epoch: 15 Global Step: 75650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:21:44,723-Speed 5334.39 samples/sec Loss 7.5139 Epoch: 15 Global Step: 75700 
Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:21:54,149-Speed 5432.35 samples/sec Loss 7.5095 Epoch: 15 Global Step: 75750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:22:03,751-Speed 5332.65 samples/sec Loss 7.5075 Epoch: 15 Global Step: 75800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:22:13,282-Speed 5371.90 samples/sec Loss 7.5341 Epoch: 15 Global Step: 75850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:22:23,087-Speed 5222.15 samples/sec Loss 7.5490 Epoch: 15 Global Step: 75900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:22:32,528-Speed 5423.97 samples/sec Loss 7.5978 Epoch: 15 Global Step: 75950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:22:42,312-Speed 5233.45 samples/sec Loss 7.5566 Epoch: 15 Global Step: 76000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:22:58,806-[lfw][76000]XNorm: 23.183258 Training: 2021-03-18 02:22:58,806-[lfw][76000]Accuracy-Flip: 0.99600+-0.00374 Training: 2021-03-18 02:22:58,806-[lfw][76000]Accuracy-Highest: 0.99617 Training: 2021-03-18 02:23:17,292-[cfp_fp][76000]XNorm: 19.348293 Training: 2021-03-18 02:23:17,292-[cfp_fp][76000]Accuracy-Flip: 0.96186+-0.01061 Training: 2021-03-18 02:23:17,295-[cfp_fp][76000]Accuracy-Highest: 0.96557 Training: 2021-03-18 02:23:33,460-[agedb_30][76000]XNorm: 22.452431 Training: 2021-03-18 02:23:33,461-[agedb_30][76000]Accuracy-Flip: 0.96567+-0.00807 Training: 2021-03-18 02:23:33,461-[agedb_30][76000]Accuracy-Highest: 0.97000 Training: 2021-03-18 02:23:43,052-Speed 842.95 samples/sec Loss 7.5399 Epoch: 15 Global Step: 76050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:23:52,589-Speed 5368.49 samples/sec Loss 7.5926 Epoch: 15 Global Step: 76100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:24:02,273-Speed 5287.71 samples/sec Loss 7.6553 Epoch: 15 Global Step: 76150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 
2021-03-18 02:24:11,848-Speed 5347.76 samples/sec Loss 7.5284 Epoch: 15 Global Step: 76200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:24:21,527-Speed 5289.77 samples/sec Loss 7.6439 Epoch: 15 Global Step: 76250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:24:30,987-Speed 5412.69 samples/sec Loss 7.6446 Epoch: 15 Global Step: 76300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:24:40,602-Speed 5325.34 samples/sec Loss 7.6016 Epoch: 15 Global Step: 76350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:24:50,113-Speed 5383.47 samples/sec Loss 7.5937 Epoch: 15 Global Step: 76400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:24:59,701-Speed 5340.36 samples/sec Loss 7.6509 Epoch: 15 Global Step: 76450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:25:09,123-Speed 5434.69 samples/sec Loss 7.6016 Epoch: 15 Global Step: 76500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:25:18,781-Speed 5301.85 samples/sec Loss 7.6369 Epoch: 15 Global Step: 76550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:25:28,276-Speed 5393.14 samples/sec Loss 7.6980 Epoch: 15 Global Step: 76600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:25:37,739-Speed 5410.61 samples/sec Loss 7.5843 Epoch: 15 Global Step: 76650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:25:47,193-Speed 5416.03 samples/sec Loss 7.6832 Epoch: 15 Global Step: 76700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:25:56,711-Speed 5379.67 samples/sec Loss 7.6995 Epoch: 15 Global Step: 76750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:26:06,150-Speed 5424.51 samples/sec Loss 7.7182 Epoch: 15 Global Step: 76800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:26:15,567-Speed 5437.24 samples/sec Loss 7.6446 Epoch: 15 Global Step: 76850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 
02:26:25,054-Speed 5397.22 samples/sec Loss 7.6895 Epoch: 15 Global Step: 76900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:26:34,920-Speed 5189.81 samples/sec Loss 7.6803 Epoch: 15 Global Step: 76950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:26:44,451-Speed 5372.54 samples/sec Loss 7.6828 Epoch: 15 Global Step: 77000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:26:53,994-Speed 5365.40 samples/sec Loss 7.7169 Epoch: 15 Global Step: 77050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:27:03,457-Speed 5411.10 samples/sec Loss 7.7006 Epoch: 15 Global Step: 77100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:27:12,912-Speed 5415.05 samples/sec Loss 7.7095 Epoch: 15 Global Step: 77150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:27:22,569-Speed 5302.04 samples/sec Loss 7.6637 Epoch: 15 Global Step: 77200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:27:32,472-Speed 5170.79 samples/sec Loss 7.6663 Epoch: 15 Global Step: 77250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:27:42,052-Speed 5344.98 samples/sec Loss 7.7147 Epoch: 15 Global Step: 77300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:27:51,727-Speed 5292.01 samples/sec Loss 7.6949 Epoch: 15 Global Step: 77350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:28:01,373-Speed 5308.13 samples/sec Loss 7.7246 Epoch: 15 Global Step: 77400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:28:10,971-Speed 5334.56 samples/sec Loss 7.7117 Epoch: 15 Global Step: 77450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:28:20,575-Speed 5331.45 samples/sec Loss 7.7330 Epoch: 15 Global Step: 77500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:28:30,410-Speed 5206.32 samples/sec Loss 7.7385 Epoch: 15 Global Step: 77550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 
02:28:39,834-Speed 5433.16 samples/sec Loss 7.7369 Epoch: 15 Global Step: 77600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:28:49,488-Speed 5304.29 samples/sec Loss 7.7215 Epoch: 15 Global Step: 77650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:28:58,889-Speed 5446.33 samples/sec Loss 7.6814 Epoch: 15 Global Step: 77700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:29:08,865-Speed 5132.42 samples/sec Loss 7.7531 Epoch: 15 Global Step: 77750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:29:18,874-Speed 5115.91 samples/sec Loss 7.7939 Epoch: 15 Global Step: 77800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:29:28,500-Speed 5319.24 samples/sec Loss 7.7485 Epoch: 15 Global Step: 77850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:29:38,257-Speed 5247.55 samples/sec Loss 7.7344 Epoch: 15 Global Step: 77900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:29:47,540-Speed 5516.23 samples/sec Loss 7.7511 Epoch: 15 Global Step: 77950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:29:56,840-Speed 5505.45 samples/sec Loss 7.7772 Epoch: 15 Global Step: 78000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:30:13,533-[lfw][78000]XNorm: 22.949616 Training: 2021-03-18 02:30:13,533-[lfw][78000]Accuracy-Flip: 0.99483+-0.00329 Training: 2021-03-18 02:30:13,535-[lfw][78000]Accuracy-Highest: 0.99617 Training: 2021-03-18 02:30:32,943-[cfp_fp][78000]XNorm: 19.020593 Training: 2021-03-18 02:30:32,944-[cfp_fp][78000]Accuracy-Flip: 0.96271+-0.01087 Training: 2021-03-18 02:30:32,944-[cfp_fp][78000]Accuracy-Highest: 0.96557 Training: 2021-03-18 02:30:48,969-[agedb_30][78000]XNorm: 22.132261 Training: 2021-03-18 02:30:48,970-[agedb_30][78000]Accuracy-Flip: 0.96367+-0.00859 Training: 2021-03-18 02:30:48,970-[agedb_30][78000]Accuracy-Highest: 0.97000 Training: 2021-03-18 02:30:58,364-Speed 832.20 samples/sec Loss 7.7352 Epoch: 
15 Global Step: 78050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:31:07,718-Speed 5474.09 samples/sec Loss 7.7741 Epoch: 15 Global Step: 78100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:31:17,110-Speed 5452.39 samples/sec Loss 7.7546 Epoch: 15 Global Step: 78150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:31:26,647-Speed 5368.93 samples/sec Loss 7.8324 Epoch: 15 Global Step: 78200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:31:36,064-Speed 5437.27 samples/sec Loss 7.7380 Epoch: 15 Global Step: 78250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:31:45,640-Speed 5347.04 samples/sec Loss 7.7672 Epoch: 15 Global Step: 78300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:31:55,189-Speed 5362.39 samples/sec Loss 7.7482 Epoch: 15 Global Step: 78350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:32:04,947-Speed 5247.09 samples/sec Loss 7.7486 Epoch: 15 Global Step: 78400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:32:14,458-Speed 5383.48 samples/sec Loss 7.7629 Epoch: 15 Global Step: 78450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:32:24,183-Speed 5265.42 samples/sec Loss 7.7506 Epoch: 15 Global Step: 78500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:32:33,784-Speed 5332.85 samples/sec Loss 7.7744 Epoch: 15 Global Step: 78550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:32:43,309-Speed 5375.71 samples/sec Loss 7.7718 Epoch: 15 Global Step: 78600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:32:52,765-Speed 5414.79 samples/sec Loss 7.7822 Epoch: 15 Global Step: 78650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:33:02,223-Speed 5413.84 samples/sec Loss 7.7659 Epoch: 15 Global Step: 78700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:33:12,139-Speed 5163.65 samples/sec Loss 7.7572 Epoch: 15 Global 
Step: 78750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:33:21,734-Speed 5336.84 samples/sec Loss 7.7189 Epoch: 15 Global Step: 78800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:33:31,307-Speed 5348.48 samples/sec Loss 7.7977 Epoch: 15 Global Step: 78850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:33:41,069-Speed 5244.99 samples/sec Loss 7.8360 Epoch: 15 Global Step: 78900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:33:50,523-Speed 5416.66 samples/sec Loss 7.7361 Epoch: 15 Global Step: 78950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:34:00,176-Speed 5304.58 samples/sec Loss 7.7761 Epoch: 15 Global Step: 79000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:34:10,125-Speed 5146.43 samples/sec Loss 7.7554 Epoch: 15 Global Step: 79050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:34:19,462-Speed 5483.65 samples/sec Loss 7.7454 Epoch: 15 Global Step: 79100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:34:29,104-Speed 5310.46 samples/sec Loss 7.7688 Epoch: 15 Global Step: 79150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:34:38,983-Speed 5183.10 samples/sec Loss 7.7055 Epoch: 15 Global Step: 79200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:34:48,631-Speed 5307.40 samples/sec Loss 7.7802 Epoch: 15 Global Step: 79250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:34:58,186-Speed 5358.61 samples/sec Loss 7.7520 Epoch: 15 Global Step: 79300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:35:07,752-Speed 5352.90 samples/sec Loss 7.8259 Epoch: 15 Global Step: 79350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:35:17,434-Speed 5288.61 samples/sec Loss 7.7621 Epoch: 15 Global Step: 79400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:35:27,090-Speed 5302.70 samples/sec Loss 7.7731 Epoch: 15 Global Step: 79450 
Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:35:36,540-Speed 5417.96 samples/sec Loss 7.7787 Epoch: 15 Global Step: 79500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:35:45,912-Speed 5463.62 samples/sec Loss 7.7834 Epoch: 15 Global Step: 79550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:35:55,516-Speed 5331.56 samples/sec Loss 7.7805 Epoch: 15 Global Step: 79600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:36:05,180-Speed 5298.35 samples/sec Loss 7.7949 Epoch: 15 Global Step: 79650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:36:14,647-Speed 5408.63 samples/sec Loss 7.6868 Epoch: 15 Global Step: 79700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:36:26,737-Speed 4235.20 samples/sec Loss 6.8833 Epoch: 16 Global Step: 79750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:36:36,302-Speed 5352.97 samples/sec Loss 6.2981 Epoch: 16 Global Step: 79800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:36:46,053-Speed 5251.29 samples/sec Loss 6.2689 Epoch: 16 Global Step: 79850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:36:55,717-Speed 5298.52 samples/sec Loss 6.0761 Epoch: 16 Global Step: 79900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:37:05,641-Speed 5159.81 samples/sec Loss 6.0966 Epoch: 16 Global Step: 79950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:37:15,584-Speed 5150.03 samples/sec Loss 6.0296 Epoch: 16 Global Step: 80000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:37:31,870-[lfw][80000]XNorm: 22.974997 Training: 2021-03-18 02:37:31,871-[lfw][80000]Accuracy-Flip: 0.99667+-0.00289 Training: 2021-03-18 02:37:31,871-[lfw][80000]Accuracy-Highest: 0.99667 Training: 2021-03-18 02:37:50,424-[cfp_fp][80000]XNorm: 19.149097 Training: 2021-03-18 02:37:50,424-[cfp_fp][80000]Accuracy-Flip: 0.97100+-0.00582 Training: 2021-03-18 
02:37:50,425-[cfp_fp][80000]Accuracy-Highest: 0.97100 Training: 2021-03-18 02:38:06,447-[agedb_30][80000]XNorm: 22.196231 Training: 2021-03-18 02:38:06,447-[agedb_30][80000]Accuracy-Flip: 0.97250+-0.00834 Training: 2021-03-18 02:38:06,447-[agedb_30][80000]Accuracy-Highest: 0.97250 Training: 2021-03-18 02:38:16,318-Speed 843.03 samples/sec Loss 6.0335 Epoch: 16 Global Step: 80050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:38:25,880-Speed 5354.51 samples/sec Loss 5.9979 Epoch: 16 Global Step: 80100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:38:35,801-Speed 5161.46 samples/sec Loss 5.9914 Epoch: 16 Global Step: 80150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:38:45,853-Speed 5093.55 samples/sec Loss 5.9376 Epoch: 16 Global Step: 80200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:38:55,569-Speed 5270.14 samples/sec Loss 5.8792 Epoch: 16 Global Step: 80250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:39:05,070-Speed 5389.54 samples/sec Loss 5.9013 Epoch: 16 Global Step: 80300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:39:14,859-Speed 5230.37 samples/sec Loss 5.9088 Epoch: 16 Global Step: 80350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:39:24,447-Speed 5340.51 samples/sec Loss 5.7887 Epoch: 16 Global Step: 80400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:39:33,961-Speed 5381.81 samples/sec Loss 5.8597 Epoch: 16 Global Step: 80450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:39:43,440-Speed 5401.67 samples/sec Loss 5.7879 Epoch: 16 Global Step: 80500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:39:53,102-Speed 5299.55 samples/sec Loss 5.8125 Epoch: 16 Global Step: 80550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:40:02,687-Speed 5342.12 samples/sec Loss 5.7220 Epoch: 16 Global Step: 80600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 
2021-03-18 02:40:12,474-Speed 5231.95 samples/sec Loss 5.7617 Epoch: 16 Global Step: 80650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:40:21,857-Speed 5456.68 samples/sec Loss 5.7264 Epoch: 16 Global Step: 80700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:40:31,351-Speed 5393.27 samples/sec Loss 5.6628 Epoch: 16 Global Step: 80750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:40:40,844-Speed 5393.94 samples/sec Loss 5.6603 Epoch: 16 Global Step: 80800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:40:50,446-Speed 5332.28 samples/sec Loss 5.6979 Epoch: 16 Global Step: 80850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:40:59,881-Speed 5426.92 samples/sec Loss 5.6398 Epoch: 16 Global Step: 80900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:41:09,369-Speed 5396.92 samples/sec Loss 5.6474 Epoch: 16 Global Step: 80950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:41:18,808-Speed 5424.40 samples/sec Loss 5.5998 Epoch: 16 Global Step: 81000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:41:28,316-Speed 5385.20 samples/sec Loss 5.5790 Epoch: 16 Global Step: 81050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:41:37,789-Speed 5405.38 samples/sec Loss 5.6221 Epoch: 16 Global Step: 81100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:41:47,307-Speed 5379.76 samples/sec Loss 5.5986 Epoch: 16 Global Step: 81150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:41:56,793-Speed 5397.60 samples/sec Loss 5.5846 Epoch: 16 Global Step: 81200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:42:06,502-Speed 5274.01 samples/sec Loss 5.6416 Epoch: 16 Global Step: 81250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:42:16,004-Speed 5388.60 samples/sec Loss 5.5701 Epoch: 16 Global Step: 81300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 
02:42:25,541-Speed 5368.71 samples/sec Loss 5.4873 Epoch: 16 Global Step: 81350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:42:35,017-Speed 5403.76 samples/sec Loss 5.5295 Epoch: 16 Global Step: 81400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:42:44,522-Speed 5386.89 samples/sec Loss 5.5649 Epoch: 16 Global Step: 81450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:42:54,060-Speed 5368.10 samples/sec Loss 5.4859 Epoch: 16 Global Step: 81500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:43:03,689-Speed 5317.90 samples/sec Loss 5.4974 Epoch: 16 Global Step: 81550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:43:13,308-Speed 5322.88 samples/sec Loss 5.5868 Epoch: 16 Global Step: 81600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:43:22,827-Speed 5379.31 samples/sec Loss 5.5157 Epoch: 16 Global Step: 81650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:43:32,645-Speed 5215.00 samples/sec Loss 5.4912 Epoch: 16 Global Step: 81700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:43:42,496-Speed 5197.72 samples/sec Loss 5.5167 Epoch: 16 Global Step: 81750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:43:51,899-Speed 5445.17 samples/sec Loss 5.4886 Epoch: 16 Global Step: 81800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:44:01,536-Speed 5313.25 samples/sec Loss 5.5425 Epoch: 16 Global Step: 81850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:44:11,570-Speed 5103.03 samples/sec Loss 5.4779 Epoch: 16 Global Step: 81900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:44:21,183-Speed 5326.25 samples/sec Loss 5.4471 Epoch: 16 Global Step: 81950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:44:30,673-Speed 5395.93 samples/sec Loss 5.4559 Epoch: 16 Global Step: 82000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 
02:44:47,386-[lfw][82000]XNorm: 23.041957 Training: 2021-03-18 02:44:47,387-[lfw][82000]Accuracy-Flip: 0.99683+-0.00283 Training: 2021-03-18 02:44:47,387-[lfw][82000]Accuracy-Highest: 0.99683 Training: 2021-03-18 02:45:05,950-[cfp_fp][82000]XNorm: 19.219411 Training: 2021-03-18 02:45:05,950-[cfp_fp][82000]Accuracy-Flip: 0.97243+-0.00572 Training: 2021-03-18 02:45:05,950-[cfp_fp][82000]Accuracy-Highest: 0.97243 Training: 2021-03-18 02:45:21,986-[agedb_30][82000]XNorm: 22.262662 Training: 2021-03-18 02:45:21,986-[agedb_30][82000]Accuracy-Flip: 0.97333+-0.00810 Training: 2021-03-18 02:45:21,986-[agedb_30][82000]Accuracy-Highest: 0.97333 Training: 2021-03-18 02:45:31,415-Speed 842.91 samples/sec Loss 5.4508 Epoch: 16 Global Step: 82050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:45:40,991-Speed 5347.37 samples/sec Loss 5.4702 Epoch: 16 Global Step: 82100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:45:50,670-Speed 5289.86 samples/sec Loss 5.3749 Epoch: 16 Global Step: 82150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:46:00,109-Speed 5424.63 samples/sec Loss 5.4205 Epoch: 16 Global Step: 82200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:46:09,640-Speed 5372.27 samples/sec Loss 5.4538 Epoch: 16 Global Step: 82250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:46:19,430-Speed 5230.28 samples/sec Loss 5.3952 Epoch: 16 Global Step: 82300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:46:29,079-Speed 5307.03 samples/sec Loss 5.4419 Epoch: 16 Global Step: 82350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:46:38,606-Speed 5374.69 samples/sec Loss 5.4127 Epoch: 16 Global Step: 82400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:46:48,095-Speed 5395.74 samples/sec Loss 5.4102 Epoch: 16 Global Step: 82450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:46:57,854-Speed 5247.20 samples/sec Loss 5.3542 Epoch: 
16 Global Step: 82500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:47:07,082-Speed 5548.53 samples/sec Loss 5.3806 Epoch: 16 Global Step: 82550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:47:16,697-Speed 5325.48 samples/sec Loss 5.3644 Epoch: 16 Global Step: 82600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:47:26,308-Speed 5327.32 samples/sec Loss 5.3298 Epoch: 16 Global Step: 82650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:47:35,747-Speed 5424.56 samples/sec Loss 5.3968 Epoch: 16 Global Step: 82700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:47:45,582-Speed 5206.54 samples/sec Loss 5.3187 Epoch: 16 Global Step: 82750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:47:55,303-Speed 5267.35 samples/sec Loss 5.3140 Epoch: 16 Global Step: 82800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:48:05,207-Speed 5169.59 samples/sec Loss 5.3805 Epoch: 16 Global Step: 82850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:48:14,800-Speed 5337.73 samples/sec Loss 5.3127 Epoch: 16 Global Step: 82900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:48:24,327-Speed 5374.24 samples/sec Loss 5.3009 Epoch: 16 Global Step: 82950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:48:33,855-Speed 5374.01 samples/sec Loss 5.2530 Epoch: 16 Global Step: 83000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-18 02:48:43,594-Speed 5257.87 samples/sec Loss 5.3639 Epoch: 16 Global Step: 83050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:48:53,265-Speed 5294.92 samples/sec Loss 5.3250 Epoch: 16 Global Step: 83100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:49:02,751-Speed 5397.59 samples/sec Loss 5.2787 Epoch: 16 Global Step: 83150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:49:12,213-Speed 5411.22 samples/sec Loss 5.2313 Epoch: 16 Global 
Step: 83200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:49:21,780-Speed 5352.15 samples/sec Loss 5.3664 Epoch: 16 Global Step: 83250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:49:31,530-Speed 5251.97 samples/sec Loss 5.2461 Epoch: 16 Global Step: 83300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:49:41,196-Speed 5296.97 samples/sec Loss 5.2870 Epoch: 16 Global Step: 83350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:49:50,580-Speed 5456.66 samples/sec Loss 5.2960 Epoch: 16 Global Step: 83400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:50:00,011-Speed 5429.16 samples/sec Loss 5.2541 Epoch: 16 Global Step: 83450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:50:09,529-Speed 5379.67 samples/sec Loss 5.1823 Epoch: 16 Global Step: 83500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:50:19,130-Speed 5333.08 samples/sec Loss 5.1732 Epoch: 16 Global Step: 83550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:50:29,074-Speed 5149.42 samples/sec Loss 5.2525 Epoch: 16 Global Step: 83600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:50:38,916-Speed 5202.20 samples/sec Loss 5.2268 Epoch: 16 Global Step: 83650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:50:48,474-Speed 5357.47 samples/sec Loss 5.2278 Epoch: 16 Global Step: 83700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:50:58,093-Speed 5322.76 samples/sec Loss 5.2049 Epoch: 16 Global Step: 83750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:51:07,839-Speed 5253.90 samples/sec Loss 5.2256 Epoch: 16 Global Step: 83800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:51:17,311-Speed 5405.75 samples/sec Loss 5.2160 Epoch: 16 Global Step: 83850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:51:26,868-Speed 5357.49 samples/sec Loss 5.2173 Epoch: 16 Global Step: 83900 
Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:51:36,265-Speed 5448.80 samples/sec Loss 5.2656 Epoch: 16 Global Step: 83950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:51:45,660-Speed 5450.21 samples/sec Loss 5.2025 Epoch: 16 Global Step: 84000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:52:02,699-[lfw][84000]XNorm: 23.074762 Training: 2021-03-18 02:52:02,699-[lfw][84000]Accuracy-Flip: 0.99650+-0.00302 Training: 2021-03-18 02:52:02,699-[lfw][84000]Accuracy-Highest: 0.99683 Training: 2021-03-18 02:52:21,913-[cfp_fp][84000]XNorm: 19.281466 Training: 2021-03-18 02:52:21,913-[cfp_fp][84000]Accuracy-Flip: 0.97429+-0.00691 Training: 2021-03-18 02:52:21,913-[cfp_fp][84000]Accuracy-Highest: 0.97429 Training: 2021-03-18 02:52:37,905-[agedb_30][84000]XNorm: 22.407952 Training: 2021-03-18 02:52:37,905-[agedb_30][84000]Accuracy-Flip: 0.97083+-0.00668 Training: 2021-03-18 02:52:37,905-[agedb_30][84000]Accuracy-Highest: 0.97333 Training: 2021-03-18 02:52:47,389-Speed 829.44 samples/sec Loss 5.2061 Epoch: 16 Global Step: 84050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:52:56,819-Speed 5429.94 samples/sec Loss 5.2040 Epoch: 16 Global Step: 84100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:53:06,231-Speed 5440.02 samples/sec Loss 5.2456 Epoch: 16 Global Step: 84150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:53:15,798-Speed 5352.01 samples/sec Loss 5.2018 Epoch: 16 Global Step: 84200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:53:25,294-Speed 5392.57 samples/sec Loss 5.1800 Epoch: 16 Global Step: 84250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:53:34,955-Speed 5299.77 samples/sec Loss 5.2328 Epoch: 16 Global Step: 84300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:53:44,600-Speed 5309.00 samples/sec Loss 5.1349 Epoch: 16 Global Step: 84350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-18 02:53:54,662-Speed 5088.71 samples/sec Loss 5.1550 Epoch: 16 Global Step: 84400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:54:04,344-Speed 5288.36 samples/sec Loss 5.1796 Epoch: 16 Global Step: 84450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:54:13,913-Speed 5351.14 samples/sec Loss 5.1775 Epoch: 16 Global Step: 84500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:54:23,393-Speed 5401.26 samples/sec Loss 5.2042 Epoch: 16 Global Step: 84550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:54:33,013-Speed 5322.82 samples/sec Loss 5.1418 Epoch: 16 Global Step: 84600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:54:42,281-Speed 5524.71 samples/sec Loss 5.1744 Epoch: 16 Global Step: 84650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:54:54,550-Speed 4173.40 samples/sec Loss 5.0753 Epoch: 17 Global Step: 84700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:55:04,532-Speed 5129.87 samples/sec Loss 4.7379 Epoch: 17 Global Step: 84750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:55:14,281-Speed 5251.92 samples/sec Loss 4.6589 Epoch: 17 Global Step: 84800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:55:23,784-Speed 5388.05 samples/sec Loss 4.6772 Epoch: 17 Global Step: 84850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:55:33,365-Speed 5344.58 samples/sec Loss 4.6633 Epoch: 17 Global Step: 84900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:55:42,852-Speed 5397.10 samples/sec Loss 4.6752 Epoch: 17 Global Step: 84950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:55:53,035-Speed 5028.45 samples/sec Loss 4.7081 Epoch: 17 Global Step: 85000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:56:02,771-Speed 5258.82 samples/sec Loss 4.7157 Epoch: 17 Global Step: 85050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
02:56:12,315-Speed 5365.16 samples/sec Loss 4.7047 Epoch: 17 Global Step: 85100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:56:22,087-Speed 5239.64 samples/sec Loss 4.7150 Epoch: 17 Global Step: 85150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:56:31,708-Speed 5322.37 samples/sec Loss 4.7422 Epoch: 17 Global Step: 85200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:56:41,992-Speed 4978.77 samples/sec Loss 4.6993 Epoch: 17 Global Step: 85250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:56:52,112-Speed 5059.33 samples/sec Loss 4.6700 Epoch: 17 Global Step: 85300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:57:01,519-Speed 5443.41 samples/sec Loss 4.7055 Epoch: 17 Global Step: 85350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:57:11,199-Speed 5289.67 samples/sec Loss 4.7232 Epoch: 17 Global Step: 85400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:57:20,745-Speed 5363.68 samples/sec Loss 4.7292 Epoch: 17 Global Step: 85450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:57:30,226-Speed 5400.95 samples/sec Loss 4.7154 Epoch: 17 Global Step: 85500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:57:39,704-Speed 5402.02 samples/sec Loss 4.6834 Epoch: 17 Global Step: 85550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:57:49,348-Speed 5309.66 samples/sec Loss 4.6873 Epoch: 17 Global Step: 85600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:57:58,863-Speed 5381.61 samples/sec Loss 4.7055 Epoch: 17 Global Step: 85650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:58:08,492-Speed 5317.15 samples/sec Loss 4.7067 Epoch: 17 Global Step: 85700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:58:18,118-Speed 5319.19 samples/sec Loss 4.7428 Epoch: 17 Global Step: 85750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
02:58:27,677-Speed 5356.98 samples/sec Loss 4.7827 Epoch: 17 Global Step: 85800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:58:37,323-Speed 5308.13 samples/sec Loss 4.6860 Epoch: 17 Global Step: 85850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:58:47,089-Speed 5243.01 samples/sec Loss 4.7205 Epoch: 17 Global Step: 85900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:58:56,603-Speed 5381.84 samples/sec Loss 4.7406 Epoch: 17 Global Step: 85950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:59:06,532-Speed 5156.74 samples/sec Loss 4.7417 Epoch: 17 Global Step: 86000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 02:59:22,977-[lfw][86000]XNorm: 23.198682 Training: 2021-03-18 02:59:22,977-[lfw][86000]Accuracy-Flip: 0.99750+-0.00271 Training: 2021-03-18 02:59:22,978-[lfw][86000]Accuracy-Highest: 0.99750 Training: 2021-03-18 02:59:41,388-[cfp_fp][86000]XNorm: 19.358367 Training: 2021-03-18 02:59:41,388-[cfp_fp][86000]Accuracy-Flip: 0.97600+-0.00615 Training: 2021-03-18 02:59:41,388-[cfp_fp][86000]Accuracy-Highest: 0.97600 Training: 2021-03-18 02:59:57,248-[agedb_30][86000]XNorm: 22.413416 Training: 2021-03-18 02:59:57,249-[agedb_30][86000]Accuracy-Flip: 0.97450+-0.00719 Training: 2021-03-18 02:59:57,249-[agedb_30][86000]Accuracy-Highest: 0.97450 Training: 2021-03-18 03:00:06,685-Speed 851.18 samples/sec Loss 4.7271 Epoch: 17 Global Step: 86050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:00:16,040-Speed 5473.19 samples/sec Loss 4.7048 Epoch: 17 Global Step: 86100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:00:25,528-Speed 5397.01 samples/sec Loss 4.7271 Epoch: 17 Global Step: 86150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:00:35,099-Speed 5349.89 samples/sec Loss 4.6562 Epoch: 17 Global Step: 86200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:00:44,740-Speed 5310.89 samples/sec Loss 4.7254 Epoch: 
17 Global Step: 86250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:00:54,332-Speed 5337.90 samples/sec Loss 4.7599 Epoch: 17 Global Step: 86300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:01:03,830-Speed 5391.06 samples/sec Loss 4.7659 Epoch: 17 Global Step: 86350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:01:13,414-Speed 5342.12 samples/sec Loss 4.7870 Epoch: 17 Global Step: 86400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:01:22,927-Speed 5383.03 samples/sec Loss 4.7549 Epoch: 17 Global Step: 86450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:01:32,384-Speed 5414.11 samples/sec Loss 4.6834 Epoch: 17 Global Step: 86500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:01:41,968-Speed 5342.62 samples/sec Loss 4.7588 Epoch: 17 Global Step: 86550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:01:51,409-Speed 5423.52 samples/sec Loss 4.7154 Epoch: 17 Global Step: 86600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:02:00,774-Speed 5467.24 samples/sec Loss 4.8085 Epoch: 17 Global Step: 86650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:02:10,156-Speed 5457.66 samples/sec Loss 4.7306 Epoch: 17 Global Step: 86700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:02:19,775-Speed 5323.25 samples/sec Loss 4.7454 Epoch: 17 Global Step: 86750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:02:29,578-Speed 5223.15 samples/sec Loss 4.7572 Epoch: 17 Global Step: 86800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:02:39,387-Speed 5220.40 samples/sec Loss 4.7160 Epoch: 17 Global Step: 86850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:02:48,831-Speed 5422.00 samples/sec Loss 4.7472 Epoch: 17 Global Step: 86900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:02:58,337-Speed 5386.22 samples/sec Loss 4.7688 Epoch: 17 Global 
Step: 86950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:03:07,824-Speed 5396.96 samples/sec Loss 4.7346 Epoch: 17 Global Step: 87000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:03:17,315-Speed 5395.18 samples/sec Loss 4.7504 Epoch: 17 Global Step: 87050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:03:26,899-Speed 5342.32 samples/sec Loss 4.7566 Epoch: 17 Global Step: 87100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:03:36,317-Speed 5436.68 samples/sec Loss 4.7313 Epoch: 17 Global Step: 87150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:03:45,983-Speed 5297.60 samples/sec Loss 4.7409 Epoch: 17 Global Step: 87200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:03:55,404-Speed 5434.86 samples/sec Loss 4.7336 Epoch: 17 Global Step: 87250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:04:04,835-Speed 5429.28 samples/sec Loss 4.7240 Epoch: 17 Global Step: 87300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:04:14,345-Speed 5384.25 samples/sec Loss 4.7506 Epoch: 17 Global Step: 87350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:04:23,896-Speed 5360.99 samples/sec Loss 4.7563 Epoch: 17 Global Step: 87400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:04:33,511-Speed 5325.08 samples/sec Loss 4.7613 Epoch: 17 Global Step: 87450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:04:43,391-Speed 5182.53 samples/sec Loss 4.7441 Epoch: 17 Global Step: 87500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:04:52,879-Speed 5396.75 samples/sec Loss 4.7482 Epoch: 17 Global Step: 87550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:05:02,293-Speed 5438.91 samples/sec Loss 4.7661 Epoch: 17 Global Step: 87600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:05:12,025-Speed 5261.79 samples/sec Loss 4.7582 Epoch: 17 Global Step: 87650 
Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:05:22,056-Speed 5104.08 samples/sec Loss 4.7468 Epoch: 17 Global Step: 87700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:05:31,503-Speed 5420.27 samples/sec Loss 4.7657 Epoch: 17 Global Step: 87750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:05:41,612-Speed 5065.04 samples/sec Loss 4.7372 Epoch: 17 Global Step: 87800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:05:51,294-Speed 5288.36 samples/sec Loss 4.7762 Epoch: 17 Global Step: 87850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:06:00,934-Speed 5311.71 samples/sec Loss 4.7935 Epoch: 17 Global Step: 87900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:06:10,406-Speed 5405.31 samples/sec Loss 4.7365 Epoch: 17 Global Step: 87950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:06:20,011-Speed 5330.94 samples/sec Loss 4.7641 Epoch: 17 Global Step: 88000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:06:36,763-[lfw][88000]XNorm: 23.320189 Training: 2021-03-18 03:06:36,764-[lfw][88000]Accuracy-Flip: 0.99700+-0.00296 Training: 2021-03-18 03:06:36,764-[lfw][88000]Accuracy-Highest: 0.99750 Training: 2021-03-18 03:06:55,363-[cfp_fp][88000]XNorm: 19.485874 Training: 2021-03-18 03:06:55,363-[cfp_fp][88000]Accuracy-Flip: 0.97557+-0.00677 Training: 2021-03-18 03:06:55,363-[cfp_fp][88000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:07:11,310-[agedb_30][88000]XNorm: 22.512356 Training: 2021-03-18 03:07:11,310-[agedb_30][88000]Accuracy-Flip: 0.97133+-0.00737 Training: 2021-03-18 03:07:11,311-[agedb_30][88000]Accuracy-Highest: 0.97450 Training: 2021-03-18 03:07:20,835-Speed 841.79 samples/sec Loss 4.6662 Epoch: 17 Global Step: 88050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:07:30,308-Speed 5404.98 samples/sec Loss 4.7430 Epoch: 17 Global Step: 88100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-18 03:07:39,887-Speed 5345.17 samples/sec Loss 4.7856 Epoch: 17 Global Step: 88150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:07:49,395-Speed 5385.58 samples/sec Loss 4.7832 Epoch: 17 Global Step: 88200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:07:58,915-Speed 5378.28 samples/sec Loss 4.7710 Epoch: 17 Global Step: 88250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:08:08,390-Speed 5404.31 samples/sec Loss 4.7456 Epoch: 17 Global Step: 88300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:08:17,901-Speed 5383.56 samples/sec Loss 4.7655 Epoch: 17 Global Step: 88350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:08:27,330-Speed 5430.28 samples/sec Loss 4.7477 Epoch: 17 Global Step: 88400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:08:36,933-Speed 5332.26 samples/sec Loss 4.7600 Epoch: 17 Global Step: 88450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:08:46,566-Speed 5315.01 samples/sec Loss 4.6993 Epoch: 17 Global Step: 88500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:08:56,056-Speed 5395.57 samples/sec Loss 4.7858 Epoch: 17 Global Step: 88550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:09:05,845-Speed 5230.67 samples/sec Loss 4.7346 Epoch: 17 Global Step: 88600 Fp16 Grad Scale: 8192 Required: 2 hours Training: 2021-03-18 03:09:15,451-Speed 5330.67 samples/sec Loss 4.7551 Epoch: 17 Global Step: 88650 Fp16 Grad Scale: 8192 Required: 2 hours Training: 2021-03-18 03:09:24,911-Speed 5412.29 samples/sec Loss 4.7053 Epoch: 17 Global Step: 88700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:09:34,480-Speed 5351.02 samples/sec Loss 4.7435 Epoch: 17 Global Step: 88750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:09:44,091-Speed 5327.55 samples/sec Loss 4.7121 Epoch: 17 Global Step: 88800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
03:09:53,373-Speed 5516.45 samples/sec Loss 4.8137 Epoch: 17 Global Step: 88850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:10:02,953-Speed 5344.84 samples/sec Loss 4.7551 Epoch: 17 Global Step: 88900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:10:12,345-Speed 5452.11 samples/sec Loss 4.7684 Epoch: 17 Global Step: 88950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:10:21,867-Speed 5377.20 samples/sec Loss 4.7576 Epoch: 17 Global Step: 89000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:10:31,488-Speed 5321.89 samples/sec Loss 4.7501 Epoch: 17 Global Step: 89050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:10:41,119-Speed 5316.78 samples/sec Loss 4.7814 Epoch: 17 Global Step: 89100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:10:50,916-Speed 5225.97 samples/sec Loss 4.6842 Epoch: 17 Global Step: 89150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:11:00,507-Speed 5338.88 samples/sec Loss 4.7238 Epoch: 17 Global Step: 89200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:11:09,989-Speed 5399.91 samples/sec Loss 4.6924 Epoch: 17 Global Step: 89250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:11:19,492-Speed 5388.07 samples/sec Loss 4.7466 Epoch: 17 Global Step: 89300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:11:29,091-Speed 5334.20 samples/sec Loss 4.7792 Epoch: 17 Global Step: 89350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:11:38,697-Speed 5330.11 samples/sec Loss 4.7340 Epoch: 17 Global Step: 89400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:11:48,340-Speed 5310.14 samples/sec Loss 4.7306 Epoch: 17 Global Step: 89450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:11:57,990-Speed 5306.17 samples/sec Loss 4.7439 Epoch: 17 Global Step: 89500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
03:12:07,499-Speed 5384.41 samples/sec Loss 4.7848 Epoch: 17 Global Step: 89550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:12:17,405-Speed 5169.26 samples/sec Loss 4.7370 Epoch: 17 Global Step: 89600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:12:26,860-Speed 5415.32 samples/sec Loss 4.7623 Epoch: 17 Global Step: 89650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:12:39,057-Speed 4197.92 samples/sec Loss 4.5405 Epoch: 18 Global Step: 89700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:12:49,084-Speed 5106.53 samples/sec Loss 4.2687 Epoch: 18 Global Step: 89750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:12:58,701-Speed 5324.49 samples/sec Loss 4.3138 Epoch: 18 Global Step: 89800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:13:08,200-Speed 5390.19 samples/sec Loss 4.3244 Epoch: 18 Global Step: 89850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:13:17,853-Speed 5304.47 samples/sec Loss 4.3353 Epoch: 18 Global Step: 89900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:13:27,461-Speed 5329.64 samples/sec Loss 4.3639 Epoch: 18 Global Step: 89950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:13:37,606-Speed 5047.05 samples/sec Loss 4.3028 Epoch: 18 Global Step: 90000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:13:54,015-[lfw][90000]XNorm: 23.084364 Training: 2021-03-18 03:13:54,016-[lfw][90000]Accuracy-Flip: 0.99683+-0.00311 Training: 2021-03-18 03:13:54,016-[lfw][90000]Accuracy-Highest: 0.99750 Training: 2021-03-18 03:14:12,480-[cfp_fp][90000]XNorm: 19.300769 Training: 2021-03-18 03:14:12,480-[cfp_fp][90000]Accuracy-Flip: 0.97429+-0.00673 Training: 2021-03-18 03:14:12,480-[cfp_fp][90000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:14:28,493-[agedb_30][90000]XNorm: 22.247687 Training: 2021-03-18 03:14:28,493-[agedb_30][90000]Accuracy-Flip: 0.97333+-0.00662 Training: 
2021-03-18 03:14:28,496-[agedb_30][90000]Accuracy-Highest: 0.97450 Training: 2021-03-18 03:14:38,070-Speed 846.80 samples/sec Loss 4.3640 Epoch: 18 Global Step: 90050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:14:47,709-Speed 5312.37 samples/sec Loss 4.3364 Epoch: 18 Global Step: 90100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:14:57,863-Speed 5042.68 samples/sec Loss 4.3334 Epoch: 18 Global Step: 90150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:15:07,705-Speed 5202.61 samples/sec Loss 4.3200 Epoch: 18 Global Step: 90200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:15:17,808-Speed 5067.85 samples/sec Loss 4.3580 Epoch: 18 Global Step: 90250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:15:27,342-Speed 5370.78 samples/sec Loss 4.3704 Epoch: 18 Global Step: 90300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:15:37,152-Speed 5219.54 samples/sec Loss 4.3649 Epoch: 18 Global Step: 90350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:15:46,656-Speed 5387.58 samples/sec Loss 4.3176 Epoch: 18 Global Step: 90400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:15:56,212-Speed 5358.51 samples/sec Loss 4.4733 Epoch: 18 Global Step: 90450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:16:05,961-Speed 5251.88 samples/sec Loss 4.3830 Epoch: 18 Global Step: 90500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:16:15,509-Speed 5362.46 samples/sec Loss 4.3663 Epoch: 18 Global Step: 90550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:16:24,944-Speed 5426.99 samples/sec Loss 4.3561 Epoch: 18 Global Step: 90600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:16:34,504-Speed 5356.32 samples/sec Loss 4.3820 Epoch: 18 Global Step: 90650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:16:44,082-Speed 5346.20 samples/sec Loss 4.3617 Epoch: 18 
Global Step: 90700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:16:53,625-Speed 5365.55 samples/sec Loss 4.3983 Epoch: 18 Global Step: 90750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:17:03,210-Speed 5341.80 samples/sec Loss 4.4154 Epoch: 18 Global Step: 90800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:17:12,839-Speed 5317.52 samples/sec Loss 4.3656 Epoch: 18 Global Step: 90850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:17:22,329-Speed 5395.43 samples/sec Loss 4.4179 Epoch: 18 Global Step: 90900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:17:31,620-Speed 5511.39 samples/sec Loss 4.4485 Epoch: 18 Global Step: 90950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:17:41,212-Speed 5338.05 samples/sec Loss 4.4272 Epoch: 18 Global Step: 91000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:17:50,777-Speed 5352.99 samples/sec Loss 4.4120 Epoch: 18 Global Step: 91050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:18:00,236-Speed 5413.49 samples/sec Loss 4.4542 Epoch: 18 Global Step: 91100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:18:09,791-Speed 5358.44 samples/sec Loss 4.4099 Epoch: 18 Global Step: 91150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:18:19,340-Speed 5362.19 samples/sec Loss 4.4668 Epoch: 18 Global Step: 91200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:18:28,855-Speed 5381.42 samples/sec Loss 4.4171 Epoch: 18 Global Step: 91250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:18:38,457-Speed 5332.73 samples/sec Loss 4.4843 Epoch: 18 Global Step: 91300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:18:48,093-Speed 5313.52 samples/sec Loss 4.4902 Epoch: 18 Global Step: 91350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:18:57,828-Speed 5259.82 samples/sec Loss 4.4115 Epoch: 18 Global 
Step: 91400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:19:07,351-Speed 5376.84 samples/sec Loss 4.3779 Epoch: 18 Global Step: 91450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:19:16,866-Speed 5381.61 samples/sec Loss 4.4387 Epoch: 18 Global Step: 91500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:19:26,648-Speed 5233.95 samples/sec Loss 4.4417 Epoch: 18 Global Step: 91550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:19:36,155-Speed 5385.78 samples/sec Loss 4.4753 Epoch: 18 Global Step: 91600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:19:45,532-Speed 5460.90 samples/sec Loss 4.4182 Epoch: 18 Global Step: 91650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:19:55,233-Speed 5277.86 samples/sec Loss 4.4913 Epoch: 18 Global Step: 91700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:20:04,867-Speed 5315.10 samples/sec Loss 4.4752 Epoch: 18 Global Step: 91750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:20:14,914-Speed 5096.05 samples/sec Loss 4.4678 Epoch: 18 Global Step: 91800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:20:24,621-Speed 5275.34 samples/sec Loss 4.4805 Epoch: 18 Global Step: 91850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:20:34,329-Speed 5274.34 samples/sec Loss 4.4972 Epoch: 18 Global Step: 91900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:20:43,974-Speed 5308.66 samples/sec Loss 4.4880 Epoch: 18 Global Step: 91950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:20:53,620-Speed 5308.18 samples/sec Loss 4.4415 Epoch: 18 Global Step: 92000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:21:10,076-[lfw][92000]XNorm: 23.302465 Training: 2021-03-18 03:21:10,077-[lfw][92000]Accuracy-Flip: 0.99717+-0.00248 Training: 2021-03-18 03:21:10,077-[lfw][92000]Accuracy-Highest: 0.99750 Training: 2021-03-18 
03:21:28,566-[cfp_fp][92000]XNorm: 19.490075 Training: 2021-03-18 03:21:28,566-[cfp_fp][92000]Accuracy-Flip: 0.97429+-0.00589 Training: 2021-03-18 03:21:28,566-[cfp_fp][92000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:21:44,627-[agedb_30][92000]XNorm: 22.528682 Training: 2021-03-18 03:21:44,627-[agedb_30][92000]Accuracy-Flip: 0.97100+-0.00696 Training: 2021-03-18 03:21:44,627-[agedb_30][92000]Accuracy-Highest: 0.97450 Training: 2021-03-18 03:21:53,973-Speed 848.35 samples/sec Loss 4.4717 Epoch: 18 Global Step: 92050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:22:03,515-Speed 5366.69 samples/sec Loss 4.4879 Epoch: 18 Global Step: 92100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:22:12,781-Speed 5525.66 samples/sec Loss 4.4531 Epoch: 18 Global Step: 92150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:22:22,513-Speed 5261.73 samples/sec Loss 4.4486 Epoch: 18 Global Step: 92200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:22:31,938-Speed 5432.27 samples/sec Loss 4.4929 Epoch: 18 Global Step: 92250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:22:41,726-Speed 5231.47 samples/sec Loss 4.4796 Epoch: 18 Global Step: 92300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:22:51,356-Speed 5316.67 samples/sec Loss 4.4745 Epoch: 18 Global Step: 92350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:23:00,928-Speed 5349.67 samples/sec Loss 4.4815 Epoch: 18 Global Step: 92400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:23:10,793-Speed 5190.23 samples/sec Loss 4.5252 Epoch: 18 Global Step: 92450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:23:20,317-Speed 5376.10 samples/sec Loss 4.5326 Epoch: 18 Global Step: 92500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:23:30,142-Speed 5211.84 samples/sec Loss 4.4934 Epoch: 18 Global Step: 92550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-18 03:23:39,693-Speed 5360.96 samples/sec Loss 4.5394 Epoch: 18 Global Step: 92600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:23:49,164-Speed 5406.06 samples/sec Loss 4.5431 Epoch: 18 Global Step: 92650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:23:59,498-Speed 4955.23 samples/sec Loss 4.5133 Epoch: 18 Global Step: 92700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:24:09,201-Speed 5277.15 samples/sec Loss 4.5580 Epoch: 18 Global Step: 92750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:24:18,881-Speed 5289.26 samples/sec Loss 4.5537 Epoch: 18 Global Step: 92800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:24:28,349-Speed 5408.16 samples/sec Loss 4.5061 Epoch: 18 Global Step: 92850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:24:37,905-Speed 5358.32 samples/sec Loss 4.5401 Epoch: 18 Global Step: 92900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:24:47,430-Speed 5375.62 samples/sec Loss 4.4572 Epoch: 18 Global Step: 92950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:24:56,958-Speed 5373.74 samples/sec Loss 4.5153 Epoch: 18 Global Step: 93000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:25:06,543-Speed 5342.24 samples/sec Loss 4.5039 Epoch: 18 Global Step: 93050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:25:16,287-Speed 5254.98 samples/sec Loss 4.5948 Epoch: 18 Global Step: 93100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:25:25,878-Speed 5338.67 samples/sec Loss 4.5028 Epoch: 18 Global Step: 93150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:25:35,319-Speed 5423.43 samples/sec Loss 4.5727 Epoch: 18 Global Step: 93200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:25:44,994-Speed 5292.23 samples/sec Loss 4.5455 Epoch: 18 Global Step: 93250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
03:25:54,480-Speed 5397.70 samples/sec Loss 4.5533 Epoch: 18 Global Step: 93300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:26:03,971-Speed 5395.01 samples/sec Loss 4.5409 Epoch: 18 Global Step: 93350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:26:13,796-Speed 5211.18 samples/sec Loss 4.5769 Epoch: 18 Global Step: 93400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:26:23,578-Speed 5234.63 samples/sec Loss 4.5367 Epoch: 18 Global Step: 93450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:26:33,015-Speed 5425.55 samples/sec Loss 4.5521 Epoch: 18 Global Step: 93500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:26:42,376-Speed 5469.73 samples/sec Loss 4.5556 Epoch: 18 Global Step: 93550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:26:51,842-Speed 5409.30 samples/sec Loss 4.5249 Epoch: 18 Global Step: 93600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:27:01,377-Speed 5370.00 samples/sec Loss 4.5857 Epoch: 18 Global Step: 93650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:27:10,975-Speed 5335.17 samples/sec Loss 4.5997 Epoch: 18 Global Step: 93700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:27:20,554-Speed 5344.93 samples/sec Loss 4.5785 Epoch: 18 Global Step: 93750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:27:30,008-Speed 5416.34 samples/sec Loss 4.5328 Epoch: 18 Global Step: 93800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:27:39,602-Speed 5337.01 samples/sec Loss 4.6261 Epoch: 18 Global Step: 93850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:27:49,170-Speed 5351.35 samples/sec Loss 4.5710 Epoch: 18 Global Step: 93900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:27:58,811-Speed 5311.17 samples/sec Loss 4.5312 Epoch: 18 Global Step: 93950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
03:28:08,364-Speed 5359.86 samples/sec Loss 4.5794 Epoch: 18 Global Step: 94000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:28:25,249-[lfw][94000]XNorm: 23.208978 Training: 2021-03-18 03:28:25,250-[lfw][94000]Accuracy-Flip: 0.99733+-0.00226 Training: 2021-03-18 03:28:25,250-[lfw][94000]Accuracy-Highest: 0.99750 Training: 2021-03-18 03:28:43,694-[cfp_fp][94000]XNorm: 19.425955 Training: 2021-03-18 03:28:43,694-[cfp_fp][94000]Accuracy-Flip: 0.97600+-0.00592 Training: 2021-03-18 03:28:43,694-[cfp_fp][94000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:28:59,679-[agedb_30][94000]XNorm: 22.448849 Training: 2021-03-18 03:28:59,679-[agedb_30][94000]Accuracy-Flip: 0.97333+-0.00820 Training: 2021-03-18 03:28:59,679-[agedb_30][94000]Accuracy-Highest: 0.97450 Training: 2021-03-18 03:29:09,157-Speed 842.22 samples/sec Loss 4.5548 Epoch: 18 Global Step: 94050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:29:18,811-Speed 5303.86 samples/sec Loss 4.5590 Epoch: 18 Global Step: 94100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:29:28,271-Speed 5412.75 samples/sec Loss 4.6126 Epoch: 18 Global Step: 94150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:29:38,060-Speed 5230.32 samples/sec Loss 4.5392 Epoch: 18 Global Step: 94200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:29:47,614-Speed 5359.60 samples/sec Loss 4.5919 Epoch: 18 Global Step: 94250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:29:56,996-Speed 5457.87 samples/sec Loss 4.6126 Epoch: 18 Global Step: 94300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:30:06,922-Speed 5158.04 samples/sec Loss 4.5392 Epoch: 18 Global Step: 94350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:30:16,509-Speed 5340.77 samples/sec Loss 4.5135 Epoch: 18 Global Step: 94400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:30:26,014-Speed 5387.03 samples/sec Loss 4.6169 Epoch: 
18 Global Step: 94450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:30:35,540-Speed 5375.10 samples/sec Loss 4.6228 Epoch: 18 Global Step: 94500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:30:44,958-Speed 5437.00 samples/sec Loss 4.5674 Epoch: 18 Global Step: 94550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:30:54,512-Speed 5359.07 samples/sec Loss 4.5723 Epoch: 18 Global Step: 94600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:31:04,079-Speed 5352.12 samples/sec Loss 4.6072 Epoch: 18 Global Step: 94650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:31:17,881-Speed 3709.81 samples/sec Loss 4.1819 Epoch: 19 Global Step: 94700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:31:27,586-Speed 5276.11 samples/sec Loss 4.1977 Epoch: 19 Global Step: 94750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:31:37,429-Speed 5201.99 samples/sec Loss 4.1046 Epoch: 19 Global Step: 94800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:31:47,077-Speed 5307.48 samples/sec Loss 4.1507 Epoch: 19 Global Step: 94850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:31:56,899-Speed 5212.80 samples/sec Loss 4.1753 Epoch: 19 Global Step: 94900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:32:06,916-Speed 5111.59 samples/sec Loss 4.1478 Epoch: 19 Global Step: 94950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:32:16,641-Speed 5265.26 samples/sec Loss 4.1232 Epoch: 19 Global Step: 95000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:32:26,372-Speed 5261.94 samples/sec Loss 4.1774 Epoch: 19 Global Step: 95050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:32:35,825-Speed 5416.45 samples/sec Loss 4.1607 Epoch: 19 Global Step: 95100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:32:45,403-Speed 5345.98 samples/sec Loss 4.1471 Epoch: 19 Global 
Step: 95150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:32:55,284-Speed 5182.29 samples/sec Loss 4.1984 Epoch: 19 Global Step: 95200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:33:05,157-Speed 5185.87 samples/sec Loss 4.2341 Epoch: 19 Global Step: 95250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:33:14,957-Speed 5225.25 samples/sec Loss 4.1866 Epoch: 19 Global Step: 95300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:33:24,702-Speed 5254.34 samples/sec Loss 4.1538 Epoch: 19 Global Step: 95350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:33:34,257-Speed 5358.45 samples/sec Loss 4.1952 Epoch: 19 Global Step: 95400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:33:43,739-Speed 5400.37 samples/sec Loss 4.2417 Epoch: 19 Global Step: 95450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:33:53,188-Speed 5418.71 samples/sec Loss 4.2639 Epoch: 19 Global Step: 95500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:34:02,786-Speed 5334.48 samples/sec Loss 4.2009 Epoch: 19 Global Step: 95550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:34:12,469-Speed 5288.03 samples/sec Loss 4.1994 Epoch: 19 Global Step: 95600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:34:22,114-Speed 5308.60 samples/sec Loss 4.2366 Epoch: 19 Global Step: 95650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:34:31,734-Speed 5323.04 samples/sec Loss 4.2254 Epoch: 19 Global Step: 95700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:34:41,148-Speed 5438.50 samples/sec Loss 4.2994 Epoch: 19 Global Step: 95750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:34:50,901-Speed 5250.36 samples/sec Loss 4.1788 Epoch: 19 Global Step: 95800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:35:00,735-Speed 5206.56 samples/sec Loss 4.3118 Epoch: 19 Global Step: 95850 
Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:35:10,239-Speed 5387.30 samples/sec Loss 4.2219 Epoch: 19 Global Step: 95900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:35:19,999-Speed 5246.49 samples/sec Loss 4.3086 Epoch: 19 Global Step: 95950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:35:29,514-Speed 5381.12 samples/sec Loss 4.2461 Epoch: 19 Global Step: 96000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:35:46,023-[lfw][96000]XNorm: 23.145661 Training: 2021-03-18 03:35:46,023-[lfw][96000]Accuracy-Flip: 0.99683+-0.00311 Training: 2021-03-18 03:35:46,023-[lfw][96000]Accuracy-Highest: 0.99750 Training: 2021-03-18 03:36:04,494-[cfp_fp][96000]XNorm: 19.291632 Training: 2021-03-18 03:36:04,495-[cfp_fp][96000]Accuracy-Flip: 0.97314+-0.00736 Training: 2021-03-18 03:36:04,495-[cfp_fp][96000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:36:20,447-[agedb_30][96000]XNorm: 22.353049 Training: 2021-03-18 03:36:20,447-[agedb_30][96000]Accuracy-Flip: 0.97317+-0.00765 Training: 2021-03-18 03:36:20,448-[agedb_30][96000]Accuracy-Highest: 0.97450 Training: 2021-03-18 03:36:30,109-Speed 844.96 samples/sec Loss 4.2879 Epoch: 19 Global Step: 96050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:36:39,559-Speed 5418.67 samples/sec Loss 4.2941 Epoch: 19 Global Step: 96100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:36:49,242-Speed 5287.80 samples/sec Loss 4.3511 Epoch: 19 Global Step: 96150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:36:58,454-Speed 5558.31 samples/sec Loss 4.3032 Epoch: 19 Global Step: 96200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:37:07,923-Speed 5407.40 samples/sec Loss 4.2551 Epoch: 19 Global Step: 96250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:37:17,746-Speed 5212.25 samples/sec Loss 4.3173 Epoch: 19 Global Step: 96300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-18 03:37:27,409-Speed 5298.94 samples/sec Loss 4.3610 Epoch: 19 Global Step: 96350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:37:36,948-Speed 5367.75 samples/sec Loss 4.2786 Epoch: 19 Global Step: 96400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:37:46,821-Speed 5186.56 samples/sec Loss 4.2912 Epoch: 19 Global Step: 96450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:37:56,476-Speed 5303.02 samples/sec Loss 4.3615 Epoch: 19 Global Step: 96500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:38:06,035-Speed 5356.31 samples/sec Loss 4.3083 Epoch: 19 Global Step: 96550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:38:15,563-Speed 5374.21 samples/sec Loss 4.3130 Epoch: 19 Global Step: 96600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:38:24,944-Speed 5457.76 samples/sec Loss 4.2749 Epoch: 19 Global Step: 96650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:38:34,467-Speed 5377.28 samples/sec Loss 4.3110 Epoch: 19 Global Step: 96700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:38:43,916-Speed 5418.46 samples/sec Loss 4.3673 Epoch: 19 Global Step: 96750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:38:53,777-Speed 5192.52 samples/sec Loss 4.3696 Epoch: 19 Global Step: 96800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:39:03,451-Speed 5292.91 samples/sec Loss 4.2901 Epoch: 19 Global Step: 96850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:39:13,148-Speed 5280.63 samples/sec Loss 4.3746 Epoch: 19 Global Step: 96900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:39:22,574-Speed 5432.37 samples/sec Loss 4.3610 Epoch: 19 Global Step: 96950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:39:32,102-Speed 5373.46 samples/sec Loss 4.3701 Epoch: 19 Global Step: 97000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
03:39:41,851-Speed 5252.62 samples/sec Loss 4.3560 Epoch: 19 Global Step: 97050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:39:51,281-Speed 5429.61 samples/sec Loss 4.3821 Epoch: 19 Global Step: 97100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:40:00,830-Speed 5362.34 samples/sec Loss 4.3825 Epoch: 19 Global Step: 97150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:40:10,257-Speed 5431.65 samples/sec Loss 4.3893 Epoch: 19 Global Step: 97200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:40:19,993-Speed 5259.24 samples/sec Loss 4.3900 Epoch: 19 Global Step: 97250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:40:29,568-Speed 5347.52 samples/sec Loss 4.3631 Epoch: 19 Global Step: 97300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:40:39,165-Speed 5335.45 samples/sec Loss 4.3846 Epoch: 19 Global Step: 97350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:40:48,752-Speed 5340.62 samples/sec Loss 4.3721 Epoch: 19 Global Step: 97400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:40:58,604-Speed 5197.45 samples/sec Loss 4.3701 Epoch: 19 Global Step: 97450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:41:08,118-Speed 5381.79 samples/sec Loss 4.4269 Epoch: 19 Global Step: 97500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:41:17,658-Speed 5366.93 samples/sec Loss 4.3850 Epoch: 19 Global Step: 97550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:41:27,510-Speed 5197.27 samples/sec Loss 4.3848 Epoch: 19 Global Step: 97600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:41:37,186-Speed 5291.97 samples/sec Loss 4.3819 Epoch: 19 Global Step: 97650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:41:47,096-Speed 5166.54 samples/sec Loss 4.4443 Epoch: 19 Global Step: 97700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 
03:41:56,798-Speed 5278.11 samples/sec Loss 4.4251 Epoch: 19 Global Step: 97750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:42:06,544-Speed 5253.41 samples/sec Loss 4.3929 Epoch: 19 Global Step: 97800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:42:16,458-Speed 5164.73 samples/sec Loss 4.4310 Epoch: 19 Global Step: 97850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:42:26,302-Speed 5201.28 samples/sec Loss 4.4019 Epoch: 19 Global Step: 97900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:42:35,862-Speed 5356.45 samples/sec Loss 4.4458 Epoch: 19 Global Step: 97950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:42:45,534-Speed 5293.86 samples/sec Loss 4.4478 Epoch: 19 Global Step: 98000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:43:02,394-[lfw][98000]XNorm: 23.312278 Training: 2021-03-18 03:43:02,394-[lfw][98000]Accuracy-Flip: 0.99633+-0.00306 Training: 2021-03-18 03:43:02,394-[lfw][98000]Accuracy-Highest: 0.99750 Training: 2021-03-18 03:43:21,070-[cfp_fp][98000]XNorm: 19.522450 Training: 2021-03-18 03:43:21,070-[cfp_fp][98000]Accuracy-Flip: 0.97543+-0.00660 Training: 2021-03-18 03:43:21,070-[cfp_fp][98000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:43:37,201-[agedb_30][98000]XNorm: 22.537810 Training: 2021-03-18 03:43:37,201-[agedb_30][98000]Accuracy-Flip: 0.97433+-0.00727 Training: 2021-03-18 03:43:37,201-[agedb_30][98000]Accuracy-Highest: 0.97450 Training: 2021-03-18 03:43:46,377-Speed 841.51 samples/sec Loss 4.4758 Epoch: 19 Global Step: 98050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:43:55,788-Speed 5441.45 samples/sec Loss 4.4590 Epoch: 19 Global Step: 98100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:44:05,361-Speed 5348.69 samples/sec Loss 4.4515 Epoch: 19 Global Step: 98150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:44:14,819-Speed 5413.73 samples/sec Loss 4.4177 Epoch: 
19 Global Step: 98200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:44:24,370-Speed 5361.06 samples/sec Loss 4.4315 Epoch: 19 Global Step: 98250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:44:33,894-Speed 5376.36 samples/sec Loss 4.4220 Epoch: 19 Global Step: 98300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:44:43,621-Speed 5263.71 samples/sec Loss 4.4448 Epoch: 19 Global Step: 98350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:44:53,143-Speed 5377.52 samples/sec Loss 4.4609 Epoch: 19 Global Step: 98400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:45:02,889-Speed 5253.76 samples/sec Loss 4.4623 Epoch: 19 Global Step: 98450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:45:12,612-Speed 5266.30 samples/sec Loss 4.4475 Epoch: 19 Global Step: 98500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:45:22,029-Speed 5437.21 samples/sec Loss 4.4729 Epoch: 19 Global Step: 98550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:45:31,578-Speed 5362.32 samples/sec Loss 4.4411 Epoch: 19 Global Step: 98600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:45:41,130-Speed 5360.47 samples/sec Loss 4.4605 Epoch: 19 Global Step: 98650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:45:50,572-Speed 5422.66 samples/sec Loss 4.4904 Epoch: 19 Global Step: 98700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:45:59,993-Speed 5434.90 samples/sec Loss 4.4776 Epoch: 19 Global Step: 98750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:46:09,498-Speed 5387.00 samples/sec Loss 4.4732 Epoch: 19 Global Step: 98800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:46:18,859-Speed 5470.05 samples/sec Loss 4.4603 Epoch: 19 Global Step: 98850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:46:28,553-Speed 5282.18 samples/sec Loss 4.4572 Epoch: 19 Global 
Step: 98900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:46:38,346-Speed 5228.43 samples/sec Loss 4.4937 Epoch: 19 Global Step: 98950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:46:47,746-Speed 5447.31 samples/sec Loss 4.4951 Epoch: 19 Global Step: 99000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:46:57,435-Speed 5284.19 samples/sec Loss 4.4687 Epoch: 19 Global Step: 99050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:47:07,206-Speed 5240.76 samples/sec Loss 4.5164 Epoch: 19 Global Step: 99100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:47:16,739-Speed 5370.95 samples/sec Loss 4.4803 Epoch: 19 Global Step: 99150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:47:26,290-Speed 5360.91 samples/sec Loss 4.4700 Epoch: 19 Global Step: 99200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:47:36,052-Speed 5245.56 samples/sec Loss 4.5211 Epoch: 19 Global Step: 99250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:47:45,514-Speed 5410.99 samples/sec Loss 4.4607 Epoch: 19 Global Step: 99300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:47:55,048-Speed 5370.80 samples/sec Loss 4.5605 Epoch: 19 Global Step: 99350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:48:04,715-Speed 5296.76 samples/sec Loss 4.5189 Epoch: 19 Global Step: 99400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:48:14,586-Speed 5187.10 samples/sec Loss 4.5147 Epoch: 19 Global Step: 99450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:48:24,358-Speed 5240.03 samples/sec Loss 4.4666 Epoch: 19 Global Step: 99500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:48:33,800-Speed 5423.06 samples/sec Loss 4.5129 Epoch: 19 Global Step: 99550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:48:43,341-Speed 5366.36 samples/sec Loss 4.4876 Epoch: 19 Global Step: 99600 
Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-18 03:48:56,021-Speed 4038.26 samples/sec Loss 4.3440 Epoch: 20 Global Step: 99650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:49:05,949-Speed 5157.62 samples/sec Loss 4.0097 Epoch: 20 Global Step: 99700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:49:15,931-Speed 5129.37 samples/sec Loss 4.0237 Epoch: 20 Global Step: 99750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:49:25,458-Speed 5374.49 samples/sec Loss 4.0306 Epoch: 20 Global Step: 99800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:49:35,172-Speed 5271.49 samples/sec Loss 4.0476 Epoch: 20 Global Step: 99850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:49:44,778-Speed 5330.51 samples/sec Loss 4.0392 Epoch: 20 Global Step: 99900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:49:54,879-Speed 5068.80 samples/sec Loss 4.1116 Epoch: 20 Global Step: 99950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:50:04,620-Speed 5256.67 samples/sec Loss 4.0990 Epoch: 20 Global Step: 100000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:50:21,424-[lfw][100000]XNorm: 22.889142 Training: 2021-03-18 03:50:21,424-[lfw][100000]Accuracy-Flip: 0.99717+-0.00211 Training: 2021-03-18 03:50:21,424-[lfw][100000]Accuracy-Highest: 0.99750 Training: 2021-03-18 03:50:39,911-[cfp_fp][100000]XNorm: 19.180854 Training: 2021-03-18 03:50:39,912-[cfp_fp][100000]Accuracy-Flip: 0.97514+-0.00671 Training: 2021-03-18 03:50:39,912-[cfp_fp][100000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:50:55,932-[agedb_30][100000]XNorm: 22.151279 Training: 2021-03-18 03:50:55,932-[agedb_30][100000]Accuracy-Flip: 0.97583+-0.00708 Training: 2021-03-18 03:50:55,932-[agedb_30][100000]Accuracy-Highest: 0.97583 Training: 2021-03-18 03:51:05,420-Speed 842.12 samples/sec Loss 4.0360 Epoch: 20 Global Step: 100050 Fp16 Grad Scale: 16384 Required: 1 hours 
Training: 2021-03-18 03:51:15,083-Speed 5299.03 samples/sec Loss 4.0634 Epoch: 20 Global Step: 100100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:51:25,246-Speed 5038.05 samples/sec Loss 4.0622 Epoch: 20 Global Step: 100150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:51:34,925-Speed 5290.35 samples/sec Loss 4.1190 Epoch: 20 Global Step: 100200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:51:45,104-Speed 5029.90 samples/sec Loss 4.0869 Epoch: 20 Global Step: 100250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:51:54,959-Speed 5196.01 samples/sec Loss 4.1450 Epoch: 20 Global Step: 100300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:52:04,645-Speed 5286.32 samples/sec Loss 4.1440 Epoch: 20 Global Step: 100350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:52:13,962-Speed 5495.48 samples/sec Loss 4.0823 Epoch: 20 Global Step: 100400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:52:23,703-Speed 5256.33 samples/sec Loss 4.1179 Epoch: 20 Global Step: 100450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:52:33,262-Speed 5356.89 samples/sec Loss 4.1623 Epoch: 20 Global Step: 100500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:52:42,774-Speed 5383.12 samples/sec Loss 4.1370 Epoch: 20 Global Step: 100550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:52:52,307-Speed 5371.02 samples/sec Loss 4.1420 Epoch: 20 Global Step: 100600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:53:01,868-Speed 5355.53 samples/sec Loss 4.1445 Epoch: 20 Global Step: 100650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:53:11,561-Speed 5282.78 samples/sec Loss 4.0976 Epoch: 20 Global Step: 100700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:53:21,084-Speed 5376.62 samples/sec Loss 4.1439 Epoch: 20 Global Step: 100750 Fp16 Grad Scale: 16384 Required: 1 
hours Training: 2021-03-18 03:53:30,520-Speed 5425.98 samples/sec Loss 4.1762 Epoch: 20 Global Step: 100800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:53:40,029-Speed 5384.93 samples/sec Loss 4.2152 Epoch: 20 Global Step: 100850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:53:49,589-Speed 5356.43 samples/sec Loss 4.1503 Epoch: 20 Global Step: 100900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:53:58,986-Speed 5448.95 samples/sec Loss 4.2219 Epoch: 20 Global Step: 100950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:54:08,757-Speed 5239.92 samples/sec Loss 4.1518 Epoch: 20 Global Step: 101000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:54:18,460-Speed 5276.93 samples/sec Loss 4.2130 Epoch: 20 Global Step: 101050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:54:28,135-Speed 5292.79 samples/sec Loss 4.1863 Epoch: 20 Global Step: 101100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:54:37,676-Speed 5366.69 samples/sec Loss 4.2240 Epoch: 20 Global Step: 101150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:54:47,111-Speed 5426.97 samples/sec Loss 4.2724 Epoch: 20 Global Step: 101200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:54:56,492-Speed 5458.49 samples/sec Loss 4.2403 Epoch: 20 Global Step: 101250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:55:06,007-Speed 5380.90 samples/sec Loss 4.2597 Epoch: 20 Global Step: 101300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:55:15,376-Speed 5465.37 samples/sec Loss 4.2219 Epoch: 20 Global Step: 101350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:55:25,160-Speed 5233.25 samples/sec Loss 4.2314 Epoch: 20 Global Step: 101400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:55:34,779-Speed 5323.06 samples/sec Loss 4.2210 Epoch: 20 Global Step: 101450 Fp16 Grad Scale: 16384 Required: 
1 hours Training: 2021-03-18 03:55:44,457-Speed 5290.59 samples/sec Loss 4.2377 Epoch: 20 Global Step: 101500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:55:53,967-Speed 5384.43 samples/sec Loss 4.2644 Epoch: 20 Global Step: 101550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:56:03,345-Speed 5459.68 samples/sec Loss 4.3013 Epoch: 20 Global Step: 101600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:56:12,762-Speed 5437.58 samples/sec Loss 4.2987 Epoch: 20 Global Step: 101650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:56:22,328-Speed 5352.86 samples/sec Loss 4.2465 Epoch: 20 Global Step: 101700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:56:31,699-Speed 5463.55 samples/sec Loss 4.2526 Epoch: 20 Global Step: 101750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:56:41,065-Speed 5467.10 samples/sec Loss 4.2849 Epoch: 20 Global Step: 101800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:56:50,710-Speed 5308.75 samples/sec Loss 4.2526 Epoch: 20 Global Step: 101850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:57:00,201-Speed 5394.94 samples/sec Loss 4.2647 Epoch: 20 Global Step: 101900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:57:09,661-Speed 5412.40 samples/sec Loss 4.3480 Epoch: 20 Global Step: 101950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:57:19,325-Speed 5298.46 samples/sec Loss 4.3152 Epoch: 20 Global Step: 102000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:57:35,826-[lfw][102000]XNorm: 22.971302 Training: 2021-03-18 03:57:35,826-[lfw][102000]Accuracy-Flip: 0.99650+-0.00273 Training: 2021-03-18 03:57:35,826-[lfw][102000]Accuracy-Highest: 0.99750 Training: 2021-03-18 03:57:54,194-[cfp_fp][102000]XNorm: 19.231684 Training: 2021-03-18 03:57:54,194-[cfp_fp][102000]Accuracy-Flip: 0.97500+-0.00662 Training: 2021-03-18 
03:57:54,195-[cfp_fp][102000]Accuracy-Highest: 0.97600 Training: 2021-03-18 03:58:10,182-[agedb_30][102000]XNorm: 22.145101 Training: 2021-03-18 03:58:10,182-[agedb_30][102000]Accuracy-Flip: 0.97583+-0.00783 Training: 2021-03-18 03:58:10,182-[agedb_30][102000]Accuracy-Highest: 0.97583 Training: 2021-03-18 03:58:19,566-Speed 849.93 samples/sec Loss 4.2855 Epoch: 20 Global Step: 102050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:58:29,226-Speed 5300.77 samples/sec Loss 4.3157 Epoch: 20 Global Step: 102100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:58:38,704-Speed 5402.26 samples/sec Loss 4.3477 Epoch: 20 Global Step: 102150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:58:48,410-Speed 5275.19 samples/sec Loss 4.3198 Epoch: 20 Global Step: 102200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:58:57,952-Speed 5366.16 samples/sec Loss 4.3235 Epoch: 20 Global Step: 102250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:59:07,471-Speed 5379.03 samples/sec Loss 4.3240 Epoch: 20 Global Step: 102300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:59:17,047-Speed 5347.15 samples/sec Loss 4.2859 Epoch: 20 Global Step: 102350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:59:26,902-Speed 5195.60 samples/sec Loss 4.3556 Epoch: 20 Global Step: 102400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:59:36,466-Speed 5353.41 samples/sec Loss 4.3700 Epoch: 20 Global Step: 102450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:59:46,097-Speed 5316.56 samples/sec Loss 4.3469 Epoch: 20 Global Step: 102500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 03:59:55,513-Speed 5437.97 samples/sec Loss 4.3745 Epoch: 20 Global Step: 102550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:00:05,337-Speed 5211.90 samples/sec Loss 4.3540 Epoch: 20 Global Step: 102600 Fp16 Grad Scale: 16384 Required: 1 
hours Training: 2021-03-18 04:00:15,257-Speed 5161.54 samples/sec Loss 4.3214 Epoch: 20 Global Step: 102650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:00:24,810-Speed 5360.43 samples/sec Loss 4.3655 Epoch: 20 Global Step: 102700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:00:34,467-Speed 5302.24 samples/sec Loss 4.3316 Epoch: 20 Global Step: 102750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:00:44,147-Speed 5289.22 samples/sec Loss 4.3281 Epoch: 20 Global Step: 102800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:00:54,174-Speed 5106.59 samples/sec Loss 4.4014 Epoch: 20 Global Step: 102850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:01:03,755-Speed 5344.19 samples/sec Loss 4.3723 Epoch: 20 Global Step: 102900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:01:13,297-Speed 5365.98 samples/sec Loss 4.3572 Epoch: 20 Global Step: 102950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:01:22,806-Speed 5385.21 samples/sec Loss 4.3487 Epoch: 20 Global Step: 103000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:01:32,279-Speed 5405.21 samples/sec Loss 4.3637 Epoch: 20 Global Step: 103050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:01:41,925-Speed 5308.18 samples/sec Loss 4.3941 Epoch: 20 Global Step: 103100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:01:51,326-Speed 5446.63 samples/sec Loss 4.3655 Epoch: 20 Global Step: 103150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:02:00,885-Speed 5356.32 samples/sec Loss 4.3219 Epoch: 20 Global Step: 103200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:02:10,374-Speed 5396.14 samples/sec Loss 4.4034 Epoch: 20 Global Step: 103250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:02:19,676-Speed 5504.90 samples/sec Loss 4.4308 Epoch: 20 Global Step: 103300 Fp16 Grad Scale: 16384 Required: 
1 hours Training: 2021-03-18 04:02:29,103-Speed 5431.60 samples/sec Loss 4.4642 Epoch: 20 Global Step: 103350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:02:38,716-Speed 5326.54 samples/sec Loss 4.4064 Epoch: 20 Global Step: 103400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:02:48,321-Speed 5331.03 samples/sec Loss 4.3977 Epoch: 20 Global Step: 103450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:02:58,007-Speed 5286.42 samples/sec Loss 4.3866 Epoch: 20 Global Step: 103500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:03:07,495-Speed 5396.41 samples/sec Loss 4.4292 Epoch: 20 Global Step: 103550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:03:17,002-Speed 5385.93 samples/sec Loss 4.4315 Epoch: 20 Global Step: 103600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:03:26,624-Speed 5321.32 samples/sec Loss 4.3798 Epoch: 20 Global Step: 103650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:03:36,147-Speed 5376.95 samples/sec Loss 4.4507 Epoch: 20 Global Step: 103700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:03:45,451-Speed 5502.80 samples/sec Loss 4.4485 Epoch: 20 Global Step: 103750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:03:54,763-Speed 5499.00 samples/sec Loss 4.4816 Epoch: 20 Global Step: 103800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:04:04,230-Speed 5408.60 samples/sec Loss 4.4685 Epoch: 20 Global Step: 103850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:04:13,851-Speed 5322.02 samples/sec Loss 4.4569 Epoch: 20 Global Step: 103900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:04:23,458-Speed 5329.77 samples/sec Loss 4.4198 Epoch: 20 Global Step: 103950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:04:33,142-Speed 5287.27 samples/sec Loss 4.4500 Epoch: 20 Global Step: 104000 Fp16 Grad Scale: 16384 
Required: 1 hours Training: 2021-03-18 04:04:49,697-[lfw][104000]XNorm: 23.137297 Training: 2021-03-18 04:04:49,697-[lfw][104000]Accuracy-Flip: 0.99700+-0.00245 Training: 2021-03-18 04:04:49,698-[lfw][104000]Accuracy-Highest: 0.99750 Training: 2021-03-18 04:05:08,249-[cfp_fp][104000]XNorm: 19.391811 Training: 2021-03-18 04:05:08,250-[cfp_fp][104000]Accuracy-Flip: 0.97443+-0.00580 Training: 2021-03-18 04:05:08,250-[cfp_fp][104000]Accuracy-Highest: 0.97600 Training: 2021-03-18 04:05:24,285-[agedb_30][104000]XNorm: 22.354291 Training: 2021-03-18 04:05:24,285-[agedb_30][104000]Accuracy-Flip: 0.97500+-0.00703 Training: 2021-03-18 04:05:24,285-[agedb_30][104000]Accuracy-Highest: 0.97583 Training: 2021-03-18 04:05:33,758-Speed 844.68 samples/sec Loss 4.4467 Epoch: 20 Global Step: 104050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:05:43,191-Speed 5427.67 samples/sec Loss 4.4451 Epoch: 20 Global Step: 104100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:05:53,150-Speed 5141.48 samples/sec Loss 4.4130 Epoch: 20 Global Step: 104150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:06:03,030-Speed 5182.58 samples/sec Loss 4.4418 Epoch: 20 Global Step: 104200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:06:12,627-Speed 5335.10 samples/sec Loss 4.4518 Epoch: 20 Global Step: 104250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:06:22,268-Speed 5311.01 samples/sec Loss 4.4563 Epoch: 20 Global Step: 104300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:06:31,819-Speed 5360.93 samples/sec Loss 4.5083 Epoch: 20 Global Step: 104350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:06:41,486-Speed 5296.96 samples/sec Loss 4.4433 Epoch: 20 Global Step: 104400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:06:51,039-Speed 5359.81 samples/sec Loss 4.4853 Epoch: 20 Global Step: 104450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 
04:07:00,501-Speed 5411.36 samples/sec Loss 4.4899 Epoch: 20 Global Step: 104500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:07:09,927-Speed 5432.05 samples/sec Loss 4.4371 Epoch: 20 Global Step: 104550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:07:19,580-Speed 5304.35 samples/sec Loss 4.4733 Epoch: 20 Global Step: 104600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:07:32,071-Speed 4099.13 samples/sec Loss 4.1952 Epoch: 21 Global Step: 104650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:07:41,933-Speed 5192.02 samples/sec Loss 3.8984 Epoch: 21 Global Step: 104700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:07:51,445-Speed 5383.29 samples/sec Loss 3.8688 Epoch: 21 Global Step: 104750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:08:01,096-Speed 5305.22 samples/sec Loss 3.8725 Epoch: 21 Global Step: 104800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:08:10,803-Speed 5275.34 samples/sec Loss 3.8638 Epoch: 21 Global Step: 104850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:08:20,708-Speed 5169.49 samples/sec Loss 3.8634 Epoch: 21 Global Step: 104900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:08:30,124-Speed 5437.74 samples/sec Loss 3.9130 Epoch: 21 Global Step: 104950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:08:39,583-Speed 5413.54 samples/sec Loss 3.9031 Epoch: 21 Global Step: 105000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:08:49,153-Speed 5350.35 samples/sec Loss 3.8893 Epoch: 21 Global Step: 105050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:08:58,998-Speed 5200.60 samples/sec Loss 3.8601 Epoch: 21 Global Step: 105100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:09:08,824-Speed 5211.33 samples/sec Loss 3.8866 Epoch: 21 Global Step: 105150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 
2021-03-18 04:09:18,614-Speed 5230.18 samples/sec Loss 3.8196 Epoch: 21 Global Step: 105200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:09:28,507-Speed 5175.68 samples/sec Loss 3.8640 Epoch: 21 Global Step: 105250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:09:37,884-Speed 5460.65 samples/sec Loss 3.8942 Epoch: 21 Global Step: 105300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:09:47,625-Speed 5256.05 samples/sec Loss 3.8413 Epoch: 21 Global Step: 105350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:09:57,325-Speed 5278.81 samples/sec Loss 3.8747 Epoch: 21 Global Step: 105400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:10:06,800-Speed 5403.94 samples/sec Loss 3.8691 Epoch: 21 Global Step: 105450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:10:16,363-Speed 5354.43 samples/sec Loss 3.8666 Epoch: 21 Global Step: 105500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:10:25,922-Speed 5356.34 samples/sec Loss 3.8627 Epoch: 21 Global Step: 105550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:10:35,520-Speed 5334.81 samples/sec Loss 3.8767 Epoch: 21 Global Step: 105600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:10:45,200-Speed 5289.93 samples/sec Loss 3.8964 Epoch: 21 Global Step: 105650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:10:54,891-Speed 5283.57 samples/sec Loss 3.8079 Epoch: 21 Global Step: 105700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:11:04,603-Speed 5272.30 samples/sec Loss 3.8673 Epoch: 21 Global Step: 105750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:11:14,172-Speed 5350.90 samples/sec Loss 3.8949 Epoch: 21 Global Step: 105800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:11:23,738-Speed 5352.25 samples/sec Loss 3.8306 Epoch: 21 Global Step: 105850 Fp16 Grad Scale: 16384 Required: 1 hours 
Training: 2021-03-18 04:11:33,133-Speed 5450.30 samples/sec Loss 3.8344 Epoch: 21 Global Step: 105900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:11:42,546-Speed 5439.50 samples/sec Loss 3.9001 Epoch: 21 Global Step: 105950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:11:51,944-Speed 5448.03 samples/sec Loss 3.8910 Epoch: 21 Global Step: 106000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:12:08,656-[lfw][106000]XNorm: 23.184172 Training: 2021-03-18 04:12:08,656-[lfw][106000]Accuracy-Flip: 0.99733+-0.00213 Training: 2021-03-18 04:12:08,656-[lfw][106000]Accuracy-Highest: 0.99750 Training: 2021-03-18 04:12:27,310-[cfp_fp][106000]XNorm: 19.470251 Training: 2021-03-18 04:12:27,310-[cfp_fp][106000]Accuracy-Flip: 0.97729+-0.00559 Training: 2021-03-18 04:12:27,310-[cfp_fp][106000]Accuracy-Highest: 0.97729 Training: 2021-03-18 04:12:43,344-[agedb_30][106000]XNorm: 22.385181 Training: 2021-03-18 04:12:43,345-[agedb_30][106000]Accuracy-Flip: 0.97617+-0.00753 Training: 2021-03-18 04:12:43,345-[agedb_30][106000]Accuracy-Highest: 0.97617 Training: 2021-03-18 04:12:52,621-Speed 843.82 samples/sec Loss 3.9006 Epoch: 21 Global Step: 106050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:13:02,084-Speed 5411.22 samples/sec Loss 3.9502 Epoch: 21 Global Step: 106100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:13:11,764-Speed 5289.55 samples/sec Loss 3.8408 Epoch: 21 Global Step: 106150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:13:21,364-Speed 5333.63 samples/sec Loss 3.8643 Epoch: 21 Global Step: 106200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:13:31,265-Speed 5171.93 samples/sec Loss 3.8849 Epoch: 21 Global Step: 106250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:13:40,996-Speed 5261.90 samples/sec Loss 3.8736 Epoch: 21 Global Step: 106300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:13:50,650-Speed 
5303.34 samples/sec Loss 3.8714 Epoch: 21 Global Step: 106350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:14:00,257-Speed 5330.22 samples/sec Loss 3.8902 Epoch: 21 Global Step: 106400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:14:09,725-Speed 5407.73 samples/sec Loss 3.8753 Epoch: 21 Global Step: 106450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:14:19,414-Speed 5284.79 samples/sec Loss 3.8641 Epoch: 21 Global Step: 106500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:14:29,045-Speed 5316.70 samples/sec Loss 3.8602 Epoch: 21 Global Step: 106550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:14:38,566-Speed 5378.16 samples/sec Loss 3.8905 Epoch: 21 Global Step: 106600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:14:48,194-Speed 5317.83 samples/sec Loss 3.8701 Epoch: 21 Global Step: 106650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:14:57,515-Speed 5493.49 samples/sec Loss 3.8748 Epoch: 21 Global Step: 106700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:15:07,142-Speed 5318.57 samples/sec Loss 3.8867 Epoch: 21 Global Step: 106750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:15:16,606-Speed 5410.38 samples/sec Loss 3.8488 Epoch: 21 Global Step: 106800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:15:26,068-Speed 5411.33 samples/sec Loss 3.8855 Epoch: 21 Global Step: 106850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:15:35,589-Speed 5377.93 samples/sec Loss 3.8720 Epoch: 21 Global Step: 106900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:15:45,401-Speed 5218.58 samples/sec Loss 3.8922 Epoch: 21 Global Step: 106950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:15:54,978-Speed 5346.46 samples/sec Loss 3.9084 Epoch: 21 Global Step: 107000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 
04:16:04,550-Speed 5349.31 samples/sec Loss 3.8443 Epoch: 21 Global Step: 107050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:16:14,214-Speed 5298.57 samples/sec Loss 3.8522 Epoch: 21 Global Step: 107100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:16:23,882-Speed 5296.36 samples/sec Loss 3.8997 Epoch: 21 Global Step: 107150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:16:33,400-Speed 5379.45 samples/sec Loss 3.9064 Epoch: 21 Global Step: 107200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:16:42,822-Speed 5434.24 samples/sec Loss 3.8444 Epoch: 21 Global Step: 107250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:16:52,333-Speed 5383.78 samples/sec Loss 3.8755 Epoch: 21 Global Step: 107300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:17:01,847-Speed 5381.52 samples/sec Loss 3.8502 Epoch: 21 Global Step: 107350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:17:11,585-Speed 5258.00 samples/sec Loss 3.8719 Epoch: 21 Global Step: 107400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:17:21,017-Speed 5429.09 samples/sec Loss 3.8452 Epoch: 21 Global Step: 107450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:17:30,482-Speed 5409.56 samples/sec Loss 3.8602 Epoch: 21 Global Step: 107500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:17:40,700-Speed 5011.12 samples/sec Loss 3.8759 Epoch: 21 Global Step: 107550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:17:50,457-Speed 5248.00 samples/sec Loss 3.8764 Epoch: 21 Global Step: 107600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:18:00,236-Speed 5235.71 samples/sec Loss 3.9088 Epoch: 21 Global Step: 107650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:18:09,916-Speed 5289.46 samples/sec Loss 3.8665 Epoch: 21 Global Step: 107700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 
2021-03-18 04:18:19,492-Speed 5347.31 samples/sec Loss 3.9179 Epoch: 21 Global Step: 107750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:18:29,017-Speed 5375.71 samples/sec Loss 3.8662 Epoch: 21 Global Step: 107800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:18:38,413-Speed 5449.78 samples/sec Loss 3.8654 Epoch: 21 Global Step: 107850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:18:48,315-Speed 5170.51 samples/sec Loss 3.8649 Epoch: 21 Global Step: 107900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:18:57,856-Speed 5366.63 samples/sec Loss 3.8633 Epoch: 21 Global Step: 107950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:19:07,413-Speed 5358.03 samples/sec Loss 3.8253 Epoch: 21 Global Step: 108000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:19:24,075-[lfw][108000]XNorm: 23.158916 Training: 2021-03-18 04:19:24,075-[lfw][108000]Accuracy-Flip: 0.99717+-0.00236 Training: 2021-03-18 04:19:24,075-[lfw][108000]Accuracy-Highest: 0.99750 Training: 2021-03-18 04:19:42,710-[cfp_fp][108000]XNorm: 19.464138 Training: 2021-03-18 04:19:42,710-[cfp_fp][108000]Accuracy-Flip: 0.97714+-0.00606 Training: 2021-03-18 04:19:42,711-[cfp_fp][108000]Accuracy-Highest: 0.97729 Training: 2021-03-18 04:19:58,741-[agedb_30][108000]XNorm: 22.374326 Training: 2021-03-18 04:19:58,741-[agedb_30][108000]Accuracy-Flip: 0.97617+-0.00764 Training: 2021-03-18 04:19:58,741-[agedb_30][108000]Accuracy-Highest: 0.97617 Training: 2021-03-18 04:20:08,194-Speed 842.38 samples/sec Loss 3.8096 Epoch: 21 Global Step: 108050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:20:17,860-Speed 5296.69 samples/sec Loss 3.8599 Epoch: 21 Global Step: 108100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:20:27,305-Speed 5421.59 samples/sec Loss 3.9046 Epoch: 21 Global Step: 108150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:20:37,027-Speed 5266.91 
samples/sec Loss 3.8447 Epoch: 21 Global Step: 108200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:20:46,703-Speed 5291.37 samples/sec Loss 3.9147 Epoch: 21 Global Step: 108250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:20:56,192-Speed 5396.28 samples/sec Loss 3.8762 Epoch: 21 Global Step: 108300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:21:05,921-Speed 5262.85 samples/sec Loss 3.8906 Epoch: 21 Global Step: 108350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:21:15,455-Speed 5371.03 samples/sec Loss 3.8686 Epoch: 21 Global Step: 108400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:21:24,917-Speed 5411.22 samples/sec Loss 3.8823 Epoch: 21 Global Step: 108450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:21:34,514-Speed 5335.49 samples/sec Loss 3.9366 Epoch: 21 Global Step: 108500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:21:43,994-Speed 5401.44 samples/sec Loss 3.8353 Epoch: 21 Global Step: 108550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:21:53,500-Speed 5386.16 samples/sec Loss 3.8538 Epoch: 21 Global Step: 108600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:22:02,989-Speed 5396.19 samples/sec Loss 3.8433 Epoch: 21 Global Step: 108650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:22:12,750-Speed 5245.82 samples/sec Loss 3.9009 Epoch: 21 Global Step: 108700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:22:22,234-Speed 5398.93 samples/sec Loss 3.9210 Epoch: 21 Global Step: 108750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:22:31,833-Speed 5334.35 samples/sec Loss 3.8696 Epoch: 21 Global Step: 108800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:22:41,376-Speed 5365.96 samples/sec Loss 3.8682 Epoch: 21 Global Step: 108850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:22:51,260-Speed 
5180.03 samples/sec Loss 3.8918 Epoch: 21 Global Step: 108900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:23:00,879-Speed 5323.48 samples/sec Loss 3.8661 Epoch: 21 Global Step: 108950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:23:10,399-Speed 5378.46 samples/sec Loss 3.9102 Epoch: 21 Global Step: 109000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:23:19,962-Speed 5354.00 samples/sec Loss 3.8904 Epoch: 21 Global Step: 109050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:23:29,287-Speed 5491.33 samples/sec Loss 3.9267 Epoch: 21 Global Step: 109100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:23:39,398-Speed 5063.78 samples/sec Loss 3.8793 Epoch: 21 Global Step: 109150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:23:48,835-Speed 5426.26 samples/sec Loss 3.9119 Epoch: 21 Global Step: 109200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:23:58,369-Speed 5370.24 samples/sec Loss 3.8795 Epoch: 21 Global Step: 109250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:24:07,790-Speed 5435.14 samples/sec Loss 3.8682 Epoch: 21 Global Step: 109300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:24:17,127-Speed 5484.02 samples/sec Loss 3.9177 Epoch: 21 Global Step: 109350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:24:26,615-Speed 5396.57 samples/sec Loss 3.8572 Epoch: 21 Global Step: 109400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:24:36,016-Speed 5446.60 samples/sec Loss 3.8934 Epoch: 21 Global Step: 109450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:24:45,477-Speed 5411.55 samples/sec Loss 3.8918 Epoch: 21 Global Step: 109500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:24:54,961-Speed 5399.09 samples/sec Loss 3.9005 Epoch: 21 Global Step: 109550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 
04:25:04,599-Speed 5312.47 samples/sec Loss 3.9077 Epoch: 21 Global Step: 109600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:25:17,169-Speed 4073.60 samples/sec Loss 3.7768 Epoch: 22 Global Step: 109650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:25:27,229-Speed 5089.40 samples/sec Loss 3.8192 Epoch: 22 Global Step: 109700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:25:36,907-Speed 5291.20 samples/sec Loss 3.7777 Epoch: 22 Global Step: 109750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:25:46,460-Speed 5359.88 samples/sec Loss 3.7730 Epoch: 22 Global Step: 109800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:25:56,032-Speed 5348.95 samples/sec Loss 3.8506 Epoch: 22 Global Step: 109850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:26:05,565-Speed 5371.49 samples/sec Loss 3.8312 Epoch: 22 Global Step: 109900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:26:15,622-Speed 5090.95 samples/sec Loss 3.8304 Epoch: 22 Global Step: 109950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:26:25,126-Speed 5387.56 samples/sec Loss 3.7780 Epoch: 22 Global Step: 110000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:26:42,074-[lfw][110000]XNorm: 23.207412 Training: 2021-03-18 04:26:42,074-[lfw][110000]Accuracy-Flip: 0.99750+-0.00227 Training: 2021-03-18 04:26:42,074-[lfw][110000]Accuracy-Highest: 0.99750 Training: 2021-03-18 04:27:00,556-[cfp_fp][110000]XNorm: 19.496690 Training: 2021-03-18 04:27:00,556-[cfp_fp][110000]Accuracy-Flip: 0.97557+-0.00587 Training: 2021-03-18 04:27:00,556-[cfp_fp][110000]Accuracy-Highest: 0.97729 Training: 2021-03-18 04:27:16,519-[agedb_30][110000]XNorm: 22.403991 Training: 2021-03-18 04:27:16,519-[agedb_30][110000]Accuracy-Flip: 0.97483+-0.00765 Training: 2021-03-18 04:27:16,519-[agedb_30][110000]Accuracy-Highest: 0.97617 Training: 2021-03-18 04:27:25,826-Speed 843.50 samples/sec 
Loss 3.7928 Epoch: 22 Global Step: 110050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:27:35,720-Speed 5175.19 samples/sec Loss 3.8492 Epoch: 22 Global Step: 110100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:27:45,282-Speed 5354.96 samples/sec Loss 3.8465 Epoch: 22 Global Step: 110150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:27:55,011-Speed 5262.82 samples/sec Loss 3.8572 Epoch: 22 Global Step: 110200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:28:04,986-Speed 5132.96 samples/sec Loss 3.8544 Epoch: 22 Global Step: 110250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:28:14,533-Speed 5363.57 samples/sec Loss 3.7955 Epoch: 22 Global Step: 110300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:28:24,056-Speed 5376.78 samples/sec Loss 3.7794 Epoch: 22 Global Step: 110350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:28:33,628-Speed 5349.38 samples/sec Loss 3.8172 Epoch: 22 Global Step: 110400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:28:43,434-Speed 5221.77 samples/sec Loss 3.8344 Epoch: 22 Global Step: 110450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:28:53,120-Speed 5286.19 samples/sec Loss 3.8411 Epoch: 22 Global Step: 110500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:29:02,768-Speed 5307.06 samples/sec Loss 3.8233 Epoch: 22 Global Step: 110550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:29:12,521-Speed 5250.03 samples/sec Loss 3.8353 Epoch: 22 Global Step: 110600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:29:22,025-Speed 5387.79 samples/sec Loss 3.8482 Epoch: 22 Global Step: 110650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:29:31,582-Speed 5357.39 samples/sec Loss 3.7931 Epoch: 22 Global Step: 110700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:29:41,442-Speed 5193.33 
samples/sec Loss 3.7943 Epoch: 22 Global Step: 110750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:29:50,900-Speed 5413.73 samples/sec Loss 3.7583 Epoch: 22 Global Step: 110800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:30:00,280-Speed 5458.67 samples/sec Loss 3.8189 Epoch: 22 Global Step: 110850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:30:09,958-Speed 5290.33 samples/sec Loss 3.7617 Epoch: 22 Global Step: 110900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:30:19,986-Speed 5106.39 samples/sec Loss 3.8325 Epoch: 22 Global Step: 110950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:30:29,400-Speed 5438.98 samples/sec Loss 3.8238 Epoch: 22 Global Step: 111000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:30:39,048-Speed 5306.94 samples/sec Loss 3.8233 Epoch: 22 Global Step: 111050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:30:48,760-Speed 5272.46 samples/sec Loss 3.8168 Epoch: 22 Global Step: 111100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:30:58,126-Speed 5466.54 samples/sec Loss 3.8411 Epoch: 22 Global Step: 111150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:31:07,703-Speed 5346.66 samples/sec Loss 3.8247 Epoch: 22 Global Step: 111200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:31:17,154-Speed 5417.63 samples/sec Loss 3.8221 Epoch: 22 Global Step: 111250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:31:26,640-Speed 5397.99 samples/sec Loss 3.8418 Epoch: 22 Global Step: 111300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:31:36,539-Speed 5172.58 samples/sec Loss 3.7563 Epoch: 22 Global Step: 111350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:31:46,038-Speed 5390.34 samples/sec Loss 3.8173 Epoch: 22 Global Step: 111400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:31:55,677-Speed 
5311.88 samples/sec Loss 3.8075 Epoch: 22 Global Step: 111450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:32:05,255-Speed 5346.18 samples/sec Loss 3.8040 Epoch: 22 Global Step: 111500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:32:14,862-Speed 5329.71 samples/sec Loss 3.7916 Epoch: 22 Global Step: 111550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:32:24,461-Speed 5334.17 samples/sec Loss 3.8490 Epoch: 22 Global Step: 111600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:32:33,883-Speed 5434.76 samples/sec Loss 3.8272 Epoch: 22 Global Step: 111650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:32:43,524-Speed 5310.97 samples/sec Loss 3.8239 Epoch: 22 Global Step: 111700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:32:53,004-Speed 5401.00 samples/sec Loss 3.8033 Epoch: 22 Global Step: 111750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:33:02,512-Speed 5385.26 samples/sec Loss 3.8307 Epoch: 22 Global Step: 111800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:33:12,207-Speed 5281.60 samples/sec Loss 3.8661 Epoch: 22 Global Step: 111850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:33:21,913-Speed 5275.46 samples/sec Loss 3.7894 Epoch: 22 Global Step: 111900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:33:31,637-Speed 5265.41 samples/sec Loss 3.8192 Epoch: 22 Global Step: 111950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:33:41,234-Speed 5335.57 samples/sec Loss 3.7970 Epoch: 22 Global Step: 112000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:33:57,674-[lfw][112000]XNorm: 23.142803 Training: 2021-03-18 04:33:57,674-[lfw][112000]Accuracy-Flip: 0.99767+-0.00238 Training: 2021-03-18 04:33:57,674-[lfw][112000]Accuracy-Highest: 0.99767 Training: 2021-03-18 04:34:16,112-[cfp_fp][112000]XNorm: 19.450851 Training: 2021-03-18 
04:34:16,112-[cfp_fp][112000]Accuracy-Flip: 0.97714+-0.00557 Training: 2021-03-18 04:34:16,112-[cfp_fp][112000]Accuracy-Highest: 0.97729 Training: 2021-03-18 04:34:32,026-[agedb_30][112000]XNorm: 22.378808 Training: 2021-03-18 04:34:32,026-[agedb_30][112000]Accuracy-Flip: 0.97550+-0.00742 Training: 2021-03-18 04:34:32,026-[agedb_30][112000]Accuracy-Highest: 0.97617 Training: 2021-03-18 04:34:41,498-Speed 849.60 samples/sec Loss 3.8572 Epoch: 22 Global Step: 112050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:34:51,101-Speed 5331.78 samples/sec Loss 3.8177 Epoch: 22 Global Step: 112100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:35:00,528-Speed 5431.69 samples/sec Loss 3.8644 Epoch: 22 Global Step: 112150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:35:10,145-Speed 5324.31 samples/sec Loss 3.8486 Epoch: 22 Global Step: 112200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:35:19,733-Speed 5340.72 samples/sec Loss 3.8414 Epoch: 22 Global Step: 112250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:35:29,306-Speed 5348.38 samples/sec Loss 3.7666 Epoch: 22 Global Step: 112300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:35:38,762-Speed 5414.83 samples/sec Loss 3.8356 Epoch: 22 Global Step: 112350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:35:48,838-Speed 5081.63 samples/sec Loss 3.8138 Epoch: 22 Global Step: 112400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:35:58,315-Speed 5402.97 samples/sec Loss 3.8515 Epoch: 22 Global Step: 112450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:36:07,801-Speed 5397.73 samples/sec Loss 3.8478 Epoch: 22 Global Step: 112500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:36:17,474-Speed 5293.53 samples/sec Loss 3.8386 Epoch: 22 Global Step: 112550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:36:27,673-Speed 5020.27 samples/sec 
Loss 3.8182 Epoch: 22 Global Step: 112600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:36:37,757-Speed 5077.99 samples/sec Loss 3.8407 Epoch: 22 Global Step: 112650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:36:47,744-Speed 5126.80 samples/sec Loss 3.8803 Epoch: 22 Global Step: 112700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:36:57,861-Speed 5060.82 samples/sec Loss 3.8513 Epoch: 22 Global Step: 112750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:37:07,404-Speed 5365.74 samples/sec Loss 3.8806 Epoch: 22 Global Step: 112800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:37:17,323-Speed 5162.06 samples/sec Loss 3.8836 Epoch: 22 Global Step: 112850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:37:26,993-Speed 5295.06 samples/sec Loss 3.8990 Epoch: 22 Global Step: 112900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:37:36,584-Speed 5338.50 samples/sec Loss 3.9070 Epoch: 22 Global Step: 112950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:37:46,269-Speed 5286.94 samples/sec Loss 3.8280 Epoch: 22 Global Step: 113000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:37:55,815-Speed 5363.69 samples/sec Loss 3.8192 Epoch: 22 Global Step: 113050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:38:05,444-Speed 5317.68 samples/sec Loss 3.9045 Epoch: 22 Global Step: 113100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:38:14,803-Speed 5471.03 samples/sec Loss 3.8297 Epoch: 22 Global Step: 113150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:38:24,376-Speed 5348.58 samples/sec Loss 3.8189 Epoch: 22 Global Step: 113200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:38:33,882-Speed 5386.39 samples/sec Loss 3.8597 Epoch: 22 Global Step: 113250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:38:43,383-Speed 5389.52 
samples/sec Loss 3.8740 Epoch: 22 Global Step: 113300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:38:53,043-Speed 5300.24 samples/sec Loss 3.8390 Epoch: 22 Global Step: 113350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:39:02,753-Speed 5273.34 samples/sec Loss 3.8734 Epoch: 22 Global Step: 113400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:39:12,307-Speed 5359.70 samples/sec Loss 3.8251 Epoch: 22 Global Step: 113450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:39:21,796-Speed 5395.75 samples/sec Loss 3.8789 Epoch: 22 Global Step: 113500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:39:31,433-Speed 5313.57 samples/sec Loss 3.8370 Epoch: 22 Global Step: 113550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:39:40,891-Speed 5414.11 samples/sec Loss 3.7833 Epoch: 22 Global Step: 113600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:39:50,499-Speed 5328.93 samples/sec Loss 3.8364 Epoch: 22 Global Step: 113650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:39:59,998-Speed 5390.84 samples/sec Loss 3.8679 Epoch: 22 Global Step: 113700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:40:09,727-Speed 5262.49 samples/sec Loss 3.8728 Epoch: 22 Global Step: 113750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:40:19,529-Speed 5223.97 samples/sec Loss 3.8549 Epoch: 22 Global Step: 113800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:40:29,119-Speed 5338.88 samples/sec Loss 3.8424 Epoch: 22 Global Step: 113850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:40:38,636-Speed 5380.49 samples/sec Loss 3.8675 Epoch: 22 Global Step: 113900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:40:48,442-Speed 5221.31 samples/sec Loss 3.8817 Epoch: 22 Global Step: 113950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:40:57,978-Speed 
5369.28 samples/sec Loss 3.8245 Epoch: 22 Global Step: 114000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:41:14,586-[lfw][114000]XNorm: 23.094175 Training: 2021-03-18 04:41:14,586-[lfw][114000]Accuracy-Flip: 0.99733+-0.00249 Training: 2021-03-18 04:41:14,586-[lfw][114000]Accuracy-Highest: 0.99767 Training: 2021-03-18 04:41:33,067-[cfp_fp][114000]XNorm: 19.410398 Training: 2021-03-18 04:41:33,068-[cfp_fp][114000]Accuracy-Flip: 0.97686+-0.00538 Training: 2021-03-18 04:41:33,068-[cfp_fp][114000]Accuracy-Highest: 0.97729 Training: 2021-03-18 04:41:49,129-[agedb_30][114000]XNorm: 22.325174 Training: 2021-03-18 04:41:49,129-[agedb_30][114000]Accuracy-Flip: 0.97550+-0.00734 Training: 2021-03-18 04:41:49,130-[agedb_30][114000]Accuracy-Highest: 0.97617 Training: 2021-03-18 04:41:58,799-Speed 841.83 samples/sec Loss 3.8736 Epoch: 22 Global Step: 114050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:42:08,238-Speed 5424.68 samples/sec Loss 3.8592 Epoch: 22 Global Step: 114100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:42:17,730-Speed 5394.15 samples/sec Loss 3.8742 Epoch: 22 Global Step: 114150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:42:27,596-Speed 5190.23 samples/sec Loss 3.8558 Epoch: 22 Global Step: 114200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:42:37,030-Speed 5427.05 samples/sec Loss 3.8409 Epoch: 22 Global Step: 114250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:42:46,493-Speed 5411.26 samples/sec Loss 3.8244 Epoch: 22 Global Step: 114300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:42:55,844-Speed 5475.92 samples/sec Loss 3.8825 Epoch: 22 Global Step: 114350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:43:05,315-Speed 5406.08 samples/sec Loss 3.7984 Epoch: 22 Global Step: 114400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:43:14,922-Speed 5329.80 samples/sec Loss 3.8997 Epoch: 22 
Global Step: 114450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:43:24,265-Speed 5480.27 samples/sec Loss 3.9083 Epoch: 22 Global Step: 114500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:43:33,708-Speed 5422.78 samples/sec Loss 3.8525 Epoch: 22 Global Step: 114550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:43:45,857-Speed 4214.41 samples/sec Loss 3.8491 Epoch: 23 Global Step: 114600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:43:55,614-Speed 5248.31 samples/sec Loss 3.7628 Epoch: 23 Global Step: 114650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:44:05,404-Speed 5230.16 samples/sec Loss 3.8220 Epoch: 23 Global Step: 114700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:44:14,984-Speed 5344.69 samples/sec Loss 3.7806 Epoch: 23 Global Step: 114750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:44:24,517-Speed 5370.95 samples/sec Loss 3.8354 Epoch: 23 Global Step: 114800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:44:34,237-Speed 5267.63 samples/sec Loss 3.7895 Epoch: 23 Global Step: 114850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:44:43,735-Speed 5391.15 samples/sec Loss 3.7705 Epoch: 23 Global Step: 114900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:44:53,484-Speed 5251.99 samples/sec Loss 3.8052 Epoch: 23 Global Step: 114950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:45:03,274-Speed 5230.56 samples/sec Loss 3.8073 Epoch: 23 Global Step: 115000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:45:12,987-Speed 5271.34 samples/sec Loss 3.7742 Epoch: 23 Global Step: 115050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:45:22,541-Speed 5359.36 samples/sec Loss 3.7962 Epoch: 23 Global Step: 115100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:45:32,538-Speed 5122.01 samples/sec Loss 3.8299 Epoch: 
23 Global Step: 115150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:45:42,343-Speed 5222.28 samples/sec Loss 3.7765 Epoch: 23 Global Step: 115200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:45:52,028-Speed 5286.76 samples/sec Loss 3.7916 Epoch: 23 Global Step: 115250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:46:01,656-Speed 5318.08 samples/sec Loss 3.7799 Epoch: 23 Global Step: 115300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:46:11,274-Speed 5323.50 samples/sec Loss 3.8002 Epoch: 23 Global Step: 115350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:46:21,057-Speed 5233.85 samples/sec Loss 3.7844 Epoch: 23 Global Step: 115400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:46:30,657-Speed 5333.98 samples/sec Loss 3.8125 Epoch: 23 Global Step: 115450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:46:40,341-Speed 5287.57 samples/sec Loss 3.7756 Epoch: 23 Global Step: 115500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:46:49,973-Speed 5315.74 samples/sec Loss 3.8194 Epoch: 23 Global Step: 115550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:46:59,612-Speed 5312.18 samples/sec Loss 3.8386 Epoch: 23 Global Step: 115600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:47:09,347-Speed 5259.44 samples/sec Loss 3.8171 Epoch: 23 Global Step: 115650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:47:18,701-Speed 5474.01 samples/sec Loss 3.8033 Epoch: 23 Global Step: 115700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:47:28,269-Speed 5351.81 samples/sec Loss 3.7610 Epoch: 23 Global Step: 115750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:47:37,742-Speed 5404.85 samples/sec Loss 3.8145 Epoch: 23 Global Step: 115800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:47:47,339-Speed 5335.52 samples/sec Loss 3.7937 
Epoch: 23 Global Step: 115850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:47:56,726-Speed 5454.74 samples/sec Loss 3.7691 Epoch: 23 Global Step: 115900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:48:06,388-Speed 5299.30 samples/sec Loss 3.8396 Epoch: 23 Global Step: 115950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:48:16,014-Speed 5319.43 samples/sec Loss 3.8202 Epoch: 23 Global Step: 116000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:48:32,404-[lfw][116000]XNorm: 23.128889 Training: 2021-03-18 04:48:32,404-[lfw][116000]Accuracy-Flip: 0.99717+-0.00248 Training: 2021-03-18 04:48:32,404-[lfw][116000]Accuracy-Highest: 0.99767 Training: 2021-03-18 04:48:50,830-[cfp_fp][116000]XNorm: 19.438901 Training: 2021-03-18 04:48:50,831-[cfp_fp][116000]Accuracy-Flip: 0.97657+-0.00565 Training: 2021-03-18 04:48:50,831-[cfp_fp][116000]Accuracy-Highest: 0.97729 Training: 2021-03-18 04:49:06,794-[agedb_30][116000]XNorm: 22.391299 Training: 2021-03-18 04:49:06,794-[agedb_30][116000]Accuracy-Flip: 0.97433+-0.00704 Training: 2021-03-18 04:49:06,794-[agedb_30][116000]Accuracy-Highest: 0.97617 Training: 2021-03-18 04:49:16,105-Speed 852.05 samples/sec Loss 3.8281 Epoch: 23 Global Step: 116050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:49:25,346-Speed 5540.67 samples/sec Loss 3.7452 Epoch: 23 Global Step: 116100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:49:34,927-Speed 5344.66 samples/sec Loss 3.8213 Epoch: 23 Global Step: 116150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:49:44,682-Speed 5248.44 samples/sec Loss 3.7882 Epoch: 23 Global Step: 116200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:49:54,253-Speed 5349.92 samples/sec Loss 3.7890 Epoch: 23 Global Step: 116250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-18 04:50:03,921-Speed 5296.26 samples/sec Loss 3.7972 Epoch: 23 Global Step: 116300 Fp16 Grad 
Scale: 16384 Required: 0 hours Training: 2021-03-18 04:50:13,643-Speed 5266.99 samples/sec Loss 3.7797 Epoch: 23 Global Step: 116350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:50:23,160-Speed 5380.30 samples/sec Loss 3.7813 Epoch: 23 Global Step: 116400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:50:32,847-Speed 5285.65 samples/sec Loss 3.7701 Epoch: 23 Global Step: 116450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:50:42,344-Speed 5391.87 samples/sec Loss 3.7377 Epoch: 23 Global Step: 116500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:50:51,742-Speed 5448.11 samples/sec Loss 3.8310 Epoch: 23 Global Step: 116550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:51:01,435-Speed 5282.20 samples/sec Loss 3.8193 Epoch: 23 Global Step: 116600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:51:11,284-Speed 5199.25 samples/sec Loss 3.8384 Epoch: 23 Global Step: 116650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:51:21,082-Speed 5225.82 samples/sec Loss 3.7957 Epoch: 23 Global Step: 116700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:51:30,470-Speed 5454.05 samples/sec Loss 3.7931 Epoch: 23 Global Step: 116750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:51:39,991-Speed 5377.60 samples/sec Loss 3.8117 Epoch: 23 Global Step: 116800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:51:49,685-Speed 5281.88 samples/sec Loss 3.7883 Epoch: 23 Global Step: 116850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:51:59,486-Speed 5224.67 samples/sec Loss 3.8262 Epoch: 23 Global Step: 116900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:52:08,910-Speed 5432.81 samples/sec Loss 3.7708 Epoch: 23 Global Step: 116950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:52:18,638-Speed 5263.75 samples/sec Loss 3.7927 Epoch: 23 Global Step: 117000 Fp16 
Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:52:28,064-Speed 5432.32 samples/sec Loss 3.7972 Epoch: 23 Global Step: 117050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:52:37,547-Speed 5399.03 samples/sec Loss 3.8290 Epoch: 23 Global Step: 117100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:52:46,841-Speed 5509.71 samples/sec Loss 3.8431 Epoch: 23 Global Step: 117150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:52:56,501-Speed 5300.57 samples/sec Loss 3.7966 Epoch: 23 Global Step: 117200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:53:06,324-Speed 5212.14 samples/sec Loss 3.8541 Epoch: 23 Global Step: 117250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:53:15,780-Speed 5414.99 samples/sec Loss 3.7748 Epoch: 23 Global Step: 117300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:53:25,827-Speed 5096.52 samples/sec Loss 3.7880 Epoch: 23 Global Step: 117350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:53:35,547-Speed 5267.64 samples/sec Loss 3.8186 Epoch: 23 Global Step: 117400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:53:45,061-Speed 5381.94 samples/sec Loss 3.8058 Epoch: 23 Global Step: 117450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:53:54,450-Speed 5453.92 samples/sec Loss 3.8718 Epoch: 23 Global Step: 117500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:54:04,073-Speed 5320.55 samples/sec Loss 3.8107 Epoch: 23 Global Step: 117550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:54:13,475-Speed 5445.80 samples/sec Loss 3.8086 Epoch: 23 Global Step: 117600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:54:23,566-Speed 5074.04 samples/sec Loss 3.8191 Epoch: 23 Global Step: 117650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:54:33,364-Speed 5226.26 samples/sec Loss 3.8227 Epoch: 23 Global Step: 117700 
Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:54:42,850-Speed 5397.47 samples/sec Loss 3.8599 Epoch: 23 Global Step: 117750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:54:52,497-Speed 5307.62 samples/sec Loss 3.8486 Epoch: 23 Global Step: 117800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:55:02,110-Speed 5326.78 samples/sec Loss 3.7816 Epoch: 23 Global Step: 117850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:55:11,581-Speed 5406.26 samples/sec Loss 3.8090 Epoch: 23 Global Step: 117900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:55:21,359-Speed 5236.69 samples/sec Loss 3.8257 Epoch: 23 Global Step: 117950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:55:30,816-Speed 5414.07 samples/sec Loss 3.8035 Epoch: 23 Global Step: 118000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:55:47,374-[lfw][118000]XNorm: 23.115796 Training: 2021-03-18 04:55:47,374-[lfw][118000]Accuracy-Flip: 0.99750+-0.00271 Training: 2021-03-18 04:55:47,375-[lfw][118000]Accuracy-Highest: 0.99767 Training: 2021-03-18 04:56:05,782-[cfp_fp][118000]XNorm: 19.432631 Training: 2021-03-18 04:56:05,782-[cfp_fp][118000]Accuracy-Flip: 0.97686+-0.00622 Training: 2021-03-18 04:56:05,782-[cfp_fp][118000]Accuracy-Highest: 0.97729 Training: 2021-03-18 04:56:21,781-[agedb_30][118000]XNorm: 22.341703 Training: 2021-03-18 04:56:21,781-[agedb_30][118000]Accuracy-Flip: 0.97650+-0.00713 Training: 2021-03-18 04:56:21,781-[agedb_30][118000]Accuracy-Highest: 0.97650 Training: 2021-03-18 04:56:31,203-Speed 847.88 samples/sec Loss 3.8213 Epoch: 23 Global Step: 118050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:56:40,969-Speed 5243.23 samples/sec Loss 3.8199 Epoch: 23 Global Step: 118100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:56:50,428-Speed 5413.01 samples/sec Loss 3.8257 Epoch: 23 Global Step: 118150 Fp16 Grad Scale: 16384 Required: 0 hours 
Training: 2021-03-18 04:57:00,050-Speed 5321.37 samples/sec Loss 3.8556 Epoch: 23 Global Step: 118200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:57:09,510-Speed 5412.31 samples/sec Loss 3.8331 Epoch: 23 Global Step: 118250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:57:19,106-Speed 5335.78 samples/sec Loss 3.8092 Epoch: 23 Global Step: 118300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:57:28,674-Speed 5351.91 samples/sec Loss 3.8254 Epoch: 23 Global Step: 118350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:57:38,273-Speed 5333.81 samples/sec Loss 3.8263 Epoch: 23 Global Step: 118400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:57:47,917-Speed 5309.43 samples/sec Loss 3.8170 Epoch: 23 Global Step: 118450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:57:57,558-Speed 5311.57 samples/sec Loss 3.7671 Epoch: 23 Global Step: 118500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:58:07,121-Speed 5354.08 samples/sec Loss 3.8314 Epoch: 23 Global Step: 118550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:58:16,451-Speed 5488.14 samples/sec Loss 3.8068 Epoch: 23 Global Step: 118600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:58:26,250-Speed 5225.48 samples/sec Loss 3.8161 Epoch: 23 Global Step: 118650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:58:35,681-Speed 5428.90 samples/sec Loss 3.8516 Epoch: 23 Global Step: 118700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:58:45,441-Speed 5246.35 samples/sec Loss 3.8380 Epoch: 23 Global Step: 118750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:58:55,172-Speed 5261.74 samples/sec Loss 3.8574 Epoch: 23 Global Step: 118800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:59:04,732-Speed 5356.02 samples/sec Loss 3.8402 Epoch: 23 Global Step: 118850 Fp16 Grad Scale: 16384 Required: 0 
hours Training: 2021-03-18 04:59:14,222-Speed 5395.20 samples/sec Loss 3.8294 Epoch: 23 Global Step: 118900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:59:23,750-Speed 5373.93 samples/sec Loss 3.8366 Epoch: 23 Global Step: 118950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:59:33,329-Speed 5345.60 samples/sec Loss 3.8437 Epoch: 23 Global Step: 119000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:59:43,221-Speed 5176.57 samples/sec Loss 3.8531 Epoch: 23 Global Step: 119050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 04:59:52,621-Speed 5446.79 samples/sec Loss 3.8239 Epoch: 23 Global Step: 119100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:00:02,246-Speed 5320.20 samples/sec Loss 3.8309 Epoch: 23 Global Step: 119150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:00:11,742-Speed 5392.03 samples/sec Loss 3.8204 Epoch: 23 Global Step: 119200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:00:21,236-Speed 5393.18 samples/sec Loss 3.8538 Epoch: 23 Global Step: 119250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:00:30,801-Speed 5352.81 samples/sec Loss 3.8529 Epoch: 23 Global Step: 119300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:00:40,269-Speed 5408.41 samples/sec Loss 3.8300 Epoch: 23 Global Step: 119350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:00:49,980-Speed 5272.56 samples/sec Loss 3.7990 Epoch: 23 Global Step: 119400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:00:59,622-Speed 5310.44 samples/sec Loss 3.8710 Epoch: 23 Global Step: 119450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:01:09,229-Speed 5330.24 samples/sec Loss 3.8040 Epoch: 23 Global Step: 119500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:01:18,704-Speed 5403.74 samples/sec Loss 3.8379 Epoch: 23 Global Step: 119550 Fp16 Grad Scale: 16384 Required: 
0 hours Training: 2021-03-18 05:01:30,938-Speed 4185.36 samples/sec Loss 3.8038 Epoch: 24 Global Step: 119600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:01:40,635-Speed 5280.55 samples/sec Loss 3.7930 Epoch: 24 Global Step: 119650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:01:50,248-Speed 5326.29 samples/sec Loss 3.7623 Epoch: 24 Global Step: 119700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:02:00,095-Speed 5200.01 samples/sec Loss 3.7544 Epoch: 24 Global Step: 119750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:02:09,512-Speed 5437.28 samples/sec Loss 3.7913 Epoch: 24 Global Step: 119800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:02:19,276-Speed 5244.57 samples/sec Loss 3.7663 Epoch: 24 Global Step: 119850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:02:28,960-Speed 5286.91 samples/sec Loss 3.7352 Epoch: 24 Global Step: 119900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:02:38,547-Speed 5341.28 samples/sec Loss 3.8149 Epoch: 24 Global Step: 119950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:02:48,269-Speed 5266.90 samples/sec Loss 3.8154 Epoch: 24 Global Step: 120000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:03:04,779-[lfw][120000]XNorm: 23.052363 Training: 2021-03-18 05:03:04,780-[lfw][120000]Accuracy-Flip: 0.99717+-0.00259 Training: 2021-03-18 05:03:04,780-[lfw][120000]Accuracy-Highest: 0.99767 Training: 2021-03-18 05:03:23,340-[cfp_fp][120000]XNorm: 19.378179 Training: 2021-03-18 05:03:23,340-[cfp_fp][120000]Accuracy-Flip: 0.97614+-0.00616 Training: 2021-03-18 05:03:23,340-[cfp_fp][120000]Accuracy-Highest: 0.97729 Training: 2021-03-18 05:03:39,483-[agedb_30][120000]XNorm: 22.271415 Training: 2021-03-18 05:03:39,483-[agedb_30][120000]Accuracy-Flip: 0.97750+-0.00684 Training: 2021-03-18 05:03:39,483-[agedb_30][120000]Accuracy-Highest: 0.97750 Training: 2021-03-18 
05:03:48,980-Speed 843.35 samples/sec Loss 3.8194 Epoch: 24 Global Step: 120050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:03:58,653-Speed 5293.28 samples/sec Loss 3.7377 Epoch: 24 Global Step: 120100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:04:08,966-Speed 4964.90 samples/sec Loss 3.7563 Epoch: 24 Global Step: 120150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:04:19,207-Speed 4999.80 samples/sec Loss 3.7757 Epoch: 24 Global Step: 120200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:04:28,694-Speed 5397.13 samples/sec Loss 3.7210 Epoch: 24 Global Step: 120250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:04:38,168-Speed 5404.30 samples/sec Loss 3.7491 Epoch: 24 Global Step: 120300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:04:47,608-Speed 5424.04 samples/sec Loss 3.7105 Epoch: 24 Global Step: 120350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:04:57,052-Speed 5422.22 samples/sec Loss 3.7843 Epoch: 24 Global Step: 120400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:05:06,631-Speed 5345.22 samples/sec Loss 3.7530 Epoch: 24 Global Step: 120450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:05:16,151-Speed 5378.48 samples/sec Loss 3.7707 Epoch: 24 Global Step: 120500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:05:25,770-Speed 5323.11 samples/sec Loss 3.7877 Epoch: 24 Global Step: 120550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:05:35,352-Speed 5343.89 samples/sec Loss 3.7810 Epoch: 24 Global Step: 120600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:05:44,690-Speed 5483.10 samples/sec Loss 3.7791 Epoch: 24 Global Step: 120650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:05:54,044-Speed 5473.64 samples/sec Loss 3.7927 Epoch: 24 Global Step: 120700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 
2021-03-18 05:06:03,695-Speed 5305.84 samples/sec Loss 3.7388 Epoch: 24 Global Step: 120750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:06:13,423-Speed 5263.73 samples/sec Loss 3.7816 Epoch: 24 Global Step: 120800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:06:23,103-Speed 5289.18 samples/sec Loss 3.8032 Epoch: 24 Global Step: 120850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:06:32,636-Speed 5371.57 samples/sec Loss 3.7513 Epoch: 24 Global Step: 120900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:06:42,026-Speed 5452.91 samples/sec Loss 3.7765 Epoch: 24 Global Step: 120950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:06:51,589-Speed 5354.64 samples/sec Loss 3.7266 Epoch: 24 Global Step: 121000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:07:01,159-Speed 5350.62 samples/sec Loss 3.8072 Epoch: 24 Global Step: 121050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:07:10,774-Speed 5325.36 samples/sec Loss 3.7939 Epoch: 24 Global Step: 121100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:07:20,575-Speed 5224.24 samples/sec Loss 3.7868 Epoch: 24 Global Step: 121150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:07:30,116-Speed 5366.90 samples/sec Loss 3.8017 Epoch: 24 Global Step: 121200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:07:39,452-Speed 5484.47 samples/sec Loss 3.7740 Epoch: 24 Global Step: 121250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:07:49,069-Speed 5324.32 samples/sec Loss 3.7645 Epoch: 24 Global Step: 121300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:07:58,564-Speed 5392.40 samples/sec Loss 3.8289 Epoch: 24 Global Step: 121350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:08:08,059-Speed 5392.85 samples/sec Loss 3.8170 Epoch: 24 Global Step: 121400 Fp16 Grad Scale: 16384 Required: 0 hours 
Training: 2021-03-18 05:08:17,687-Speed 5317.91 samples/sec Loss 3.7884 Epoch: 24 Global Step: 121450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:08:27,582-Speed 5175.10 samples/sec Loss 3.7851 Epoch: 24 Global Step: 121500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:08:37,475-Speed 5175.43 samples/sec Loss 3.7614 Epoch: 24 Global Step: 121550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:08:47,059-Speed 5342.87 samples/sec Loss 3.8344 Epoch: 24 Global Step: 121600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:08:56,642-Speed 5342.95 samples/sec Loss 3.7831 Epoch: 24 Global Step: 121650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:09:06,233-Speed 5338.62 samples/sec Loss 3.7681 Epoch: 24 Global Step: 121700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:09:15,955-Speed 5266.51 samples/sec Loss 3.8203 Epoch: 24 Global Step: 121750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:09:25,372-Speed 5437.27 samples/sec Loss 3.7733 Epoch: 24 Global Step: 121800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:09:34,704-Speed 5487.06 samples/sec Loss 3.7579 Epoch: 24 Global Step: 121850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:09:44,012-Speed 5501.30 samples/sec Loss 3.7998 Epoch: 24 Global Step: 121900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:09:53,511-Speed 5390.50 samples/sec Loss 3.7988 Epoch: 24 Global Step: 121950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:10:03,040-Speed 5373.56 samples/sec Loss 3.7869 Epoch: 24 Global Step: 122000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:10:19,431-[lfw][122000]XNorm: 23.121585 Training: 2021-03-18 05:10:19,431-[lfw][122000]Accuracy-Flip: 0.99650+-0.00229 Training: 2021-03-18 05:10:19,431-[lfw][122000]Accuracy-Highest: 0.99767 Training: 2021-03-18 05:10:37,955-[cfp_fp][122000]XNorm: 19.436992 
Training: 2021-03-18 05:10:37,955-[cfp_fp][122000]Accuracy-Flip: 0.97686+-0.00514 Training: 2021-03-18 05:10:37,955-[cfp_fp][122000]Accuracy-Highest: 0.97729 Training: 2021-03-18 05:10:53,899-[agedb_30][122000]XNorm: 22.357297 Training: 2021-03-18 05:10:53,899-[agedb_30][122000]Accuracy-Flip: 0.97550+-0.00806 Training: 2021-03-18 05:10:53,899-[agedb_30][122000]Accuracy-Highest: 0.97750 Training: 2021-03-18 05:11:03,597-Speed 845.49 samples/sec Loss 3.8047 Epoch: 24 Global Step: 122050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:11:13,185-Speed 5340.62 samples/sec Loss 3.8122 Epoch: 24 Global Step: 122100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:11:22,843-Speed 5301.37 samples/sec Loss 3.8122 Epoch: 24 Global Step: 122150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:11:32,401-Speed 5356.93 samples/sec Loss 3.7922 Epoch: 24 Global Step: 122200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:11:42,325-Speed 5159.61 samples/sec Loss 3.8009 Epoch: 24 Global Step: 122250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:11:51,939-Speed 5325.67 samples/sec Loss 3.7769 Epoch: 24 Global Step: 122300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:12:01,557-Speed 5324.06 samples/sec Loss 3.8253 Epoch: 24 Global Step: 122350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:12:11,104-Speed 5363.03 samples/sec Loss 3.8050 Epoch: 24 Global Step: 122400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:12:20,575-Speed 5406.48 samples/sec Loss 3.7759 Epoch: 24 Global Step: 122450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:12:30,132-Speed 5357.94 samples/sec Loss 3.8321 Epoch: 24 Global Step: 122500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:12:39,650-Speed 5379.32 samples/sec Loss 3.8073 Epoch: 24 Global Step: 122550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 
05:12:49,182-Speed 5371.55 samples/sec Loss 3.7761 Epoch: 24 Global Step: 122600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:12:59,394-Speed 5014.34 samples/sec Loss 3.7685 Epoch: 24 Global Step: 122650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:13:09,151-Speed 5247.78 samples/sec Loss 3.8174 Epoch: 24 Global Step: 122700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:13:18,627-Speed 5403.41 samples/sec Loss 3.8498 Epoch: 24 Global Step: 122750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:13:28,183-Speed 5358.39 samples/sec Loss 3.7917 Epoch: 24 Global Step: 122800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:13:37,781-Speed 5334.47 samples/sec Loss 3.7912 Epoch: 24 Global Step: 122850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:13:47,348-Speed 5352.17 samples/sec Loss 3.7620 Epoch: 24 Global Step: 122900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:13:56,944-Speed 5335.87 samples/sec Loss 3.7866 Epoch: 24 Global Step: 122950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:14:06,439-Speed 5392.61 samples/sec Loss 3.8782 Epoch: 24 Global Step: 123000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:14:16,166-Speed 5263.73 samples/sec Loss 3.7925 Epoch: 24 Global Step: 123050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:14:25,861-Speed 5281.62 samples/sec Loss 3.7585 Epoch: 24 Global Step: 123100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:14:35,403-Speed 5366.38 samples/sec Loss 3.8007 Epoch: 24 Global Step: 123150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:14:45,076-Speed 5293.16 samples/sec Loss 3.7538 Epoch: 24 Global Step: 123200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:14:54,598-Speed 5377.85 samples/sec Loss 3.8320 Epoch: 24 Global Step: 123250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 
2021-03-18 05:15:04,404-Speed 5221.66 samples/sec Loss 3.8298 Epoch: 24 Global Step: 123300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:15:13,858-Speed 5415.69 samples/sec Loss 3.8224 Epoch: 24 Global Step: 123350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:15:23,335-Speed 5403.16 samples/sec Loss 3.8362 Epoch: 24 Global Step: 123400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:15:32,732-Speed 5449.07 samples/sec Loss 3.7524 Epoch: 24 Global Step: 123450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:15:42,267-Speed 5370.01 samples/sec Loss 3.7884 Epoch: 24 Global Step: 123500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:15:51,676-Speed 5442.12 samples/sec Loss 3.8348 Epoch: 24 Global Step: 123550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:16:01,205-Speed 5373.42 samples/sec Loss 3.8172 Epoch: 24 Global Step: 123600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:16:10,850-Speed 5308.88 samples/sec Loss 3.7981 Epoch: 24 Global Step: 123650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:16:20,516-Speed 5297.38 samples/sec Loss 3.7696 Epoch: 24 Global Step: 123700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:16:30,072-Speed 5357.75 samples/sec Loss 3.8284 Epoch: 24 Global Step: 123750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:16:39,593-Speed 5378.06 samples/sec Loss 3.7580 Epoch: 24 Global Step: 123800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:16:48,907-Speed 5497.38 samples/sec Loss 3.8293 Epoch: 24 Global Step: 123850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:16:58,486-Speed 5345.19 samples/sec Loss 3.8350 Epoch: 24 Global Step: 123900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:17:07,928-Speed 5423.26 samples/sec Loss 3.8089 Epoch: 24 Global Step: 123950 Fp16 Grad Scale: 16384 Required: 0 hours 
Training: 2021-03-18 05:17:17,723-Speed 5227.17 samples/sec Loss 3.8049 Epoch: 24 Global Step: 124000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:17:34,164-[lfw][124000]XNorm: 23.157121 Training: 2021-03-18 05:17:34,164-[lfw][124000]Accuracy-Flip: 0.99767+-0.00238 Training: 2021-03-18 05:17:34,164-[lfw][124000]Accuracy-Highest: 0.99767 Training: 2021-03-18 05:17:52,674-[cfp_fp][124000]XNorm: 19.492861 Training: 2021-03-18 05:17:52,674-[cfp_fp][124000]Accuracy-Flip: 0.97686+-0.00538 Training: 2021-03-18 05:17:52,674-[cfp_fp][124000]Accuracy-Highest: 0.97729 Training: 2021-03-18 05:18:08,660-[agedb_30][124000]XNorm: 22.417105 Training: 2021-03-18 05:18:08,660-[agedb_30][124000]Accuracy-Flip: 0.97767+-0.00684 Training: 2021-03-18 05:18:08,660-[agedb_30][124000]Accuracy-Highest: 0.97767 Training: 2021-03-18 05:18:18,338-Speed 844.69 samples/sec Loss 3.8614 Epoch: 24 Global Step: 124050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:18:28,119-Speed 5235.38 samples/sec Loss 3.8173 Epoch: 24 Global Step: 124100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:18:37,591-Speed 5405.87 samples/sec Loss 3.8140 Epoch: 24 Global Step: 124150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:18:46,982-Speed 5452.26 samples/sec Loss 3.8536 Epoch: 24 Global Step: 124200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:18:56,525-Speed 5365.95 samples/sec Loss 3.8259 Epoch: 24 Global Step: 124250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:19:06,019-Speed 5393.08 samples/sec Loss 3.8518 Epoch: 24 Global Step: 124300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:19:15,766-Speed 5253.46 samples/sec Loss 3.8040 Epoch: 24 Global Step: 124350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:19:25,440-Speed 5292.54 samples/sec Loss 3.8297 Epoch: 24 Global Step: 124400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:19:34,834-Speed 
5450.77 samples/sec Loss 3.8408 Epoch: 24 Global Step: 124450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:19:44,441-Speed 5329.80 samples/sec Loss 3.7828 Epoch: 24 Global Step: 124500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-18 05:19:53,960-Speed 5379.39 samples/sec Loss 3.8385 Epoch: 24 Global Step: 124550 Fp16 Grad Scale: 16384 Required: -0 hours