Training: 2021-03-18 20:30:23,706-rank_id: 0
Training: 2021-03-18 20:30:47,191-softmax weight init successfully!
Training: 2021-03-18 20:30:47,191-softmax weight mom init successfully!
Training: 2021-03-18 20:30:47,193-Total Step is: 333821
Training: 2021-03-18 20:31:28,779-Reducer buckets have been rebuilt in this iteration.
Training: 2021-03-18 20:31:50,881-Speed 4740.90 samples/sec Loss 47.6037 Epoch: 0 Global Step: 100 Fp16 Grad Scale: 256 Required: 24 hours
Training: 2021-03-18 20:32:03,477-Speed 4064.97 samples/sec Loss 46.8602 Epoch: 0 Global Step: 150 Fp16 Grad Scale: 256 Required: 24 hours
Training: 2021-03-18 20:32:14,353-Speed 4707.96 samples/sec Loss 45.6986 Epoch: 0 Global Step: 200 Fp16 Grad Scale: 512 Required: 23 hours
Training: 2021-03-18 20:32:24,447-Speed 5072.71 samples/sec Loss 44.6894 Epoch: 0 Global Step: 250 Fp16 Grad Scale: 512 Required: 22 hours
Training: 2021-03-18 20:32:34,294-Speed 5199.65 samples/sec Loss 43.7215 Epoch: 0 Global Step: 300 Fp16 Grad Scale: 1024 Required: 22 hours
Training: 2021-03-18 20:32:44,180-Speed 5179.49 samples/sec Loss 42.8750 Epoch: 0 Global Step: 350 Fp16 Grad Scale: 1024 Required: 21 hours
Training: 2021-03-18 20:32:53,965-Speed 5233.10 samples/sec Loss 42.2005 Epoch: 0 Global Step: 400 Fp16 Grad Scale: 2048 Required: 21 hours
Training: 2021-03-18 20:33:06,479-Speed 4091.62 samples/sec Loss 41.5163 Epoch: 0 Global Step: 450 Fp16 Grad Scale: 2048 Required: 21 hours
Training: 2021-03-18 20:33:19,447-Speed 3948.24 samples/sec Loss 40.6360 Epoch: 0 Global Step: 500 Fp16 Grad Scale: 4096 Required: 21 hours
Training: 2021-03-18 20:33:29,144-Speed 5280.71 samples/sec Loss 39.8819 Epoch: 0 Global Step: 550 Fp16 Grad Scale: 4096 Required: 21 hours
Training: 2021-03-18 20:33:39,037-Speed 5175.59 samples/sec Loss 39.2520 Epoch: 0 Global Step: 600 Fp16 Grad Scale: 8192 Required: 21 hours
Training: 2021-03-18 20:33:48,996-Speed 5141.20 samples/sec Loss 38.6933 Epoch: 0 Global Step: 650 Fp16 Grad Scale: 8192 Required: 21 hours
Training: 2021-03-18 20:33:58,952-Speed 5143.18 samples/sec Loss 38.3791 Epoch: 0 Global Step: 700 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:34:09,273-Speed 4960.71 samples/sec Loss 38.0443 Epoch: 0 Global Step: 750 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:34:19,956-Speed 4793.05 samples/sec Loss 37.7210 Epoch: 0 Global Step: 800 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:34:29,692-Speed 5259.46 samples/sec Loss 37.4834 Epoch: 0 Global Step: 850 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:34:39,483-Speed 5229.21 samples/sec Loss 37.2058 Epoch: 0 Global Step: 900 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:34:49,045-Speed 5355.20 samples/sec Loss 36.9106 Epoch: 0 Global Step: 950 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:34:58,787-Speed 5256.09 samples/sec Loss 36.6307 Epoch: 0 Global Step: 1000 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:35:08,481-Speed 5281.67 samples/sec Loss 36.4154 Epoch: 0 Global Step: 1050 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:35:18,012-Speed 5372.22 samples/sec Loss 36.1118 Epoch: 0 Global Step: 1100 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:35:27,885-Speed 5186.20 samples/sec Loss 35.8640 Epoch: 0 Global Step: 1150 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:35:37,765-Speed 5182.34 samples/sec Loss 35.6026 Epoch: 0 Global Step: 1200 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:35:47,637-Speed 5186.72 samples/sec Loss 35.3376 Epoch: 0 Global Step: 1250 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:35:57,429-Speed 5229.28 samples/sec Loss 35.0578 Epoch: 0 Global Step: 1300 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:36:07,230-Speed 5224.12 samples/sec Loss 34.7629 Epoch: 0 Global Step: 1350 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:36:17,071-Speed 5202.88 samples/sec Loss 34.4761 Epoch: 0 Global Step: 1400 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:36:26,758-Speed 5285.72 samples/sec Loss 34.2617 Epoch: 0 Global Step: 1450 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:36:36,213-Speed 5415.80 samples/sec Loss 33.8768 Epoch: 0 Global Step: 1500 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:36:46,037-Speed 5211.61 samples/sec Loss 33.6715 Epoch: 0 Global Step: 1550 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:36:55,901-Speed 5191.37 samples/sec Loss 33.3441 Epoch: 0 Global Step: 1600 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:37:05,656-Speed 5248.84 samples/sec Loss 33.0042 Epoch: 0 Global Step: 1650 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:37:15,495-Speed 5203.89 samples/sec Loss 32.6679 Epoch: 0 Global Step: 1700 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:37:25,113-Speed 5323.55 samples/sec Loss 32.4158 Epoch: 0 Global Step: 1750 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:37:35,091-Speed 5131.84 samples/sec Loss 32.0835 Epoch: 0 Global Step: 1800 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:37:44,934-Speed 5201.75 samples/sec Loss 31.7975 Epoch: 0 Global Step: 1850 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:37:54,907-Speed 5134.53 samples/sec Loss 31.4862 Epoch: 0 Global Step: 1900 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:38:04,856-Speed 5146.31 samples/sec Loss 31.1196 Epoch: 0 Global Step: 1950 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:38:15,484-Speed 4817.57 samples/sec Loss 30.8769 Epoch: 0 Global Step: 2000 Fp16 Grad Scale: 16384 Required: 19 hours
Training: 2021-03-18 20:38:36,400-[lfw][2000]XNorm: 24.752521
Training: 2021-03-18 20:38:36,400-[lfw][2000]Accuracy-Flip: 0.94583+-0.00886
Training: 2021-03-18 20:38:36,401-[lfw][2000]Accuracy-Highest: 0.94583
Training: 2021-03-18 20:38:58,545-[cfp_fp][2000]XNorm: 22.946548
Training: 2021-03-18 20:38:58,545-[cfp_fp][2000]Accuracy-Flip: 0.71643+-0.02246
Training: 2021-03-18 20:38:58,545-[cfp_fp][2000]Accuracy-Highest: 0.71643
Training: 2021-03-18 20:39:17,462-[agedb_30][2000]XNorm: 23.033810
Training: 2021-03-18 20:39:17,463-[agedb_30][2000]Accuracy-Flip: 0.71700+-0.02095
Training: 2021-03-18 20:39:17,463-[agedb_30][2000]Accuracy-Highest: 0.71700
Training: 2021-03-18 20:39:27,704-Speed 708.96 samples/sec Loss 30.5622 Epoch: 0 Global Step: 2050 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-18 20:39:37,164-Speed 5412.94 samples/sec Loss 30.2676 Epoch: 0 Global Step: 2100 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-18 20:39:48,758-Speed 4415.92 samples/sec Loss 29.9723 Epoch: 0 Global Step: 2150 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-18 20:39:59,493-Speed 4769.90 samples/sec Loss 29.6720 Epoch: 0 Global Step: 2200 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-18 20:40:09,947-Speed 4898.00 samples/sec Loss 29.3153 Epoch: 0 Global Step: 2250 Fp16 Grad Scale: 16384 Required: 22 hours
Training: 2021-03-18 20:40:20,923-Speed 4664.71 samples/sec Loss 29.0087 Epoch: 0 Global Step: 2300 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:40:30,575-Speed 5304.82 samples/sec Loss 28.7210 Epoch: 0 Global Step: 2350 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:40:40,282-Speed 5275.11 samples/sec Loss 28.4231 Epoch: 0 Global Step: 2400 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:40:50,067-Speed 5232.59 samples/sec Loss 28.1193 Epoch: 0 Global Step: 2450 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:40:59,753-Speed 5286.40 samples/sec Loss 27.8739 Epoch: 0 Global Step: 2500 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:41:09,606-Speed 5196.82 samples/sec Loss 27.6067 Epoch: 0 Global Step: 2550 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:41:19,607-Speed 5119.81 samples/sec Loss 27.3931 Epoch: 0 Global Step: 2600 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:41:29,383-Speed 5237.28 samples/sec Loss 26.9791 Epoch: 0 Global Step: 2650 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:41:39,075-Speed 5283.17 samples/sec Loss 26.8111 Epoch: 0 Global Step: 2700 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:41:48,822-Speed 5253.31 samples/sec Loss 26.5103 Epoch: 0 Global Step: 2750 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:41:59,210-Speed 4928.74 samples/sec Loss 26.2840 Epoch: 0 Global Step: 2800 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:42:09,161-Speed 5145.84 samples/sec Loss 26.0561 Epoch: 0 Global Step: 2850 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:42:19,030-Speed 5188.28 samples/sec Loss 25.7415 Epoch: 0 Global Step: 2900 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:42:28,986-Speed 5143.00 samples/sec Loss 25.4467 Epoch: 0 Global Step: 2950 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:42:39,032-Speed 5096.89 samples/sec Loss 25.1397 Epoch: 0 Global Step: 3000 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:42:48,938-Speed 5169.14 samples/sec Loss 24.9719 Epoch: 0 Global Step: 3050 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:42:59,017-Speed 5080.09 samples/sec Loss 24.6753 Epoch: 0 Global Step: 3100 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:43:09,020-Speed 5118.63 samples/sec Loss 24.4738 Epoch: 0 Global Step: 3150 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:43:18,698-Speed 5290.90 samples/sec Loss 24.2083 Epoch: 0 Global Step: 3200 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:43:28,545-Speed 5199.66 samples/sec Loss 23.9974 Epoch: 0 Global Step: 3250 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:43:38,283-Speed 5258.50 samples/sec Loss 23.7606 Epoch: 0 Global Step: 3300 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:43:48,312-Speed 5105.55 samples/sec Loss 23.4927 Epoch: 0 Global Step: 3350 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:43:58,220-Speed 5167.81 samples/sec Loss 23.3398 Epoch: 0 Global Step: 3400 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:44:08,138-Speed 5162.85 samples/sec Loss 23.0142 Epoch: 0 Global Step: 3450 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:44:17,887-Speed 5251.84 samples/sec Loss 22.8641 Epoch: 0 Global Step: 3500 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:44:27,722-Speed 5206.80 samples/sec Loss 22.6730 Epoch: 0 Global Step: 3550 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:44:37,685-Speed 5139.16 samples/sec Loss 22.3941 Epoch: 0 Global Step: 3600 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:44:47,436-Speed 5250.64 samples/sec Loss 22.2487 Epoch: 0 Global Step: 3650 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:44:57,161-Speed 5265.31 samples/sec Loss 22.0098 Epoch: 0 Global Step: 3700 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:45:07,159-Speed 5121.14 samples/sec Loss 21.8604 Epoch: 0 Global Step: 3750 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:45:17,041-Speed 5181.84 samples/sec Loss 21.7141 Epoch: 0 Global Step: 3800 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:45:26,705-Speed 5298.08 samples/sec Loss 21.5171 Epoch: 0 Global Step: 3850 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:45:36,884-Speed 5030.01 samples/sec Loss 21.3417 Epoch: 0 Global Step: 3900 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:45:46,955-Speed 5084.70 samples/sec Loss 21.1086 Epoch: 0 Global Step: 3950 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:45:56,629-Speed 5292.58 samples/sec Loss 20.8994 Epoch: 0 Global Step: 4000 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:46:14,189-[lfw][4000]XNorm: 22.901893
Training: 2021-03-18 20:46:14,189-[lfw][4000]Accuracy-Flip: 0.97950+-0.00522
Training: 2021-03-18 20:46:14,189-[lfw][4000]Accuracy-Highest: 0.97950
Training: 2021-03-18 20:46:33,036-[cfp_fp][4000]XNorm: 19.400593
Training: 2021-03-18 20:46:33,036-[cfp_fp][4000]Accuracy-Flip: 0.81514+-0.01336
Training: 2021-03-18 20:46:33,036-[cfp_fp][4000]Accuracy-Highest: 0.81514
Training: 2021-03-18 20:46:49,165-[agedb_30][4000]XNorm: 21.843157
Training: 2021-03-18 20:46:49,165-[agedb_30][4000]Accuracy-Flip: 0.86133+-0.02279
Training: 2021-03-18 20:46:49,165-[agedb_30][4000]Accuracy-Highest: 0.86133
Training: 2021-03-18 20:46:59,038-Speed 820.40 samples/sec Loss 20.7498 Epoch: 0 Global Step: 4050 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:47:08,749-Speed 5272.78 samples/sec Loss 20.4459 Epoch: 0 Global Step: 4100 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:47:18,865-Speed 5061.87 samples/sec Loss 20.3752 Epoch: 0 Global Step: 4150 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:47:29,085-Speed 5009.94 samples/sec Loss 20.2801 Epoch: 0 Global Step: 4200 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:47:38,977-Speed 5176.18 samples/sec Loss 20.1078 Epoch: 0 Global Step: 4250 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:47:48,714-Speed 5258.51 samples/sec Loss 19.9832 Epoch: 0 Global Step: 4300 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:47:58,416-Speed 5277.80 samples/sec Loss 19.7258 Epoch: 0 Global Step: 4350 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:48:08,848-Speed 4908.25 samples/sec Loss 19.5932 Epoch: 0 Global Step: 4400 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:48:18,663-Speed 5216.50 samples/sec Loss 19.5112 Epoch: 0 Global Step: 4450 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:48:29,406-Speed 4766.20 samples/sec Loss 19.3521 Epoch: 0 Global Step: 4500 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:48:39,320-Speed 5164.84 samples/sec Loss 19.1955 Epoch: 0 Global Step: 4550 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:48:49,153-Speed 5207.47 samples/sec Loss 18.8905 Epoch: 0 Global Step: 4600 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:48:59,548-Speed 4925.45 samples/sec Loss 18.7766 Epoch: 0 Global Step: 4650 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:49:11,817-Speed 4173.58 samples/sec Loss 18.7464 Epoch: 0 Global Step: 4700 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:49:21,529-Speed 5271.96 samples/sec Loss 18.5666 Epoch: 0 Global Step: 4750 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:49:32,152-Speed 4820.34 samples/sec Loss 18.4852 Epoch: 0 Global Step: 4800 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:49:41,831-Speed 5289.84 samples/sec Loss 18.3035 Epoch: 0 Global Step: 4850 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:49:51,762-Speed 5155.85 samples/sec Loss 18.2501 Epoch: 0 Global Step: 4900 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:50:01,848-Speed 5076.90 samples/sec Loss 18.1802 Epoch: 0 Global Step: 4950 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:50:11,715-Speed 5189.19 samples/sec Loss 17.9996 Epoch: 0 Global Step: 5000 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:50:21,645-Speed 5156.56 samples/sec Loss 17.8760 Epoch: 0 Global Step: 5050 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:50:31,524-Speed 5182.82 samples/sec Loss 17.8149 Epoch: 0 Global Step: 5100 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:50:41,417-Speed 5175.69 samples/sec Loss 17.6527 Epoch: 0 Global Step: 5150 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:50:51,409-Speed 5124.61 samples/sec Loss 17.5263 Epoch: 0 Global Step: 5200 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:51:01,404-Speed 5122.72 samples/sec Loss 17.3724 Epoch: 0 Global Step: 5250 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:51:11,395-Speed 5125.04 samples/sec Loss 17.2759 Epoch: 0 Global Step: 5300 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:51:21,305-Speed 5166.47 samples/sec Loss 17.1260 Epoch: 0 Global Step: 5350 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:51:32,114-Speed 4737.27 samples/sec Loss 16.9709 Epoch: 0 Global Step: 5400 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:51:42,039-Speed 5158.96 samples/sec Loss 16.9204 Epoch: 0 Global Step: 5450 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:51:51,939-Speed 5171.88 samples/sec Loss 16.8440 Epoch: 0 Global Step: 5500 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:52:01,734-Speed 5227.46 samples/sec Loss 16.8150 Epoch: 0 Global Step: 5550 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:52:11,455-Speed 5267.37 samples/sec Loss 16.6460 Epoch: 0 Global Step: 5600 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:52:21,460-Speed 5117.94 samples/sec Loss 16.5205 Epoch: 0 Global Step: 5650 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:52:31,201-Speed 5256.22 samples/sec Loss 16.4966 Epoch: 0 Global Step: 5700 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:52:41,297-Speed 5071.63 samples/sec Loss 16.2411 Epoch: 0 Global Step: 5750 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:52:51,087-Speed 5230.52 samples/sec Loss 16.2903 Epoch: 0 Global Step: 5800 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:53:00,897-Speed 5219.33 samples/sec Loss 16.1719 Epoch: 0 Global Step: 5850 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:53:10,799-Speed 5171.10 samples/sec Loss 16.1766 Epoch: 0 Global Step: 5900 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:53:20,935-Speed 5051.36 samples/sec Loss 16.0127 Epoch: 0 Global Step: 5950 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:53:30,832-Speed 5173.77 samples/sec Loss 15.9002 Epoch: 0 Global Step: 6000 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:53:48,249-[lfw][6000]XNorm: 22.620626
Training: 2021-03-18 20:53:48,250-[lfw][6000]Accuracy-Flip: 0.98650+-0.00411
Training: 2021-03-18 20:53:48,250-[lfw][6000]Accuracy-Highest: 0.98650
Training: 2021-03-18 20:54:09,272-[cfp_fp][6000]XNorm: 19.997630
Training: 2021-03-18 20:54:09,272-[cfp_fp][6000]Accuracy-Flip: 0.85557+-0.01577
Training: 2021-03-18 20:54:09,272-[cfp_fp][6000]Accuracy-Highest: 0.85557
Training: 2021-03-18 20:54:27,385-[agedb_30][6000]XNorm: 21.959913
Training: 2021-03-18 20:54:27,385-[agedb_30][6000]Accuracy-Flip: 0.88583+-0.01639
Training: 2021-03-18 20:54:27,385-[agedb_30][6000]Accuracy-Highest: 0.88583
Training: 2021-03-18 20:54:37,348-Speed 769.75 samples/sec Loss 15.8990 Epoch: 0 Global Step: 6050 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:54:47,402-Speed 5092.51 samples/sec Loss 15.7358 Epoch: 0 Global Step: 6100 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:54:57,158-Speed 5248.15 samples/sec Loss 15.6314 Epoch: 0 Global Step: 6150 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:55:07,228-Speed 5084.95 samples/sec Loss 15.5187 Epoch: 0 Global Step: 6200 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:55:17,236-Speed 5116.47 samples/sec Loss 15.5104 Epoch: 0 Global Step: 6250 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:55:27,168-Speed 5155.36 samples/sec Loss 15.3552 Epoch: 0 Global Step: 6300 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:55:37,048-Speed 5182.48 samples/sec Loss 15.3183 Epoch: 0 Global Step: 6350 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:55:46,814-Speed 5243.07 samples/sec Loss 15.2034 Epoch: 0 Global Step: 6400 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:55:56,780-Speed 5137.54 samples/sec Loss 15.1989 Epoch: 0 Global Step: 6450 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:56:06,610-Speed 5208.94 samples/sec Loss 15.1543 Epoch: 0 Global Step: 6500 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:56:16,450-Speed 5203.64 samples/sec Loss 15.0492 Epoch: 0 Global Step: 6550 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:56:26,372-Speed 5160.46 samples/sec Loss 14.9987 Epoch: 0 Global Step: 6600 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:56:36,248-Speed 5184.82 samples/sec Loss 14.9528 Epoch: 0 Global Step: 6650 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:56:46,464-Speed 5011.94 samples/sec Loss 14.9148 Epoch: 0 Global Step: 6700 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:56:56,198-Speed 5260.19 samples/sec Loss 14.7864 Epoch: 0 Global Step: 6750 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:57:06,155-Speed 5142.27 samples/sec Loss 14.6415 Epoch: 0 Global Step: 6800 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:57:15,950-Speed 5227.25 samples/sec Loss 14.6452 Epoch: 0 Global Step: 6850 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:57:25,966-Speed 5112.33 samples/sec Loss 14.6271 Epoch: 0 Global Step: 6900 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:57:36,042-Speed 5082.02 samples/sec Loss 14.4526 Epoch: 0 Global Step: 6950 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:57:46,232-Speed 5024.92 samples/sec Loss 14.5207 Epoch: 0 Global Step: 7000 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 20:57:56,107-Speed 5185.11 samples/sec Loss 14.4057 Epoch: 0 Global Step: 7050 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:58:05,862-Speed 5248.64 samples/sec Loss 14.4120 Epoch: 0 Global Step: 7100 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:58:16,413-Speed 4853.11 samples/sec Loss 14.3119 Epoch: 0 Global Step: 7150 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:58:26,389-Speed 5132.57 samples/sec Loss 14.1586 Epoch: 0 Global Step: 7200 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:58:37,055-Speed 4800.21 samples/sec Loss 14.1544 Epoch: 0 Global Step: 7250 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:58:46,877-Speed 5213.18 samples/sec Loss 14.1219 Epoch: 0 Global Step: 7300 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:58:57,070-Speed 5023.68 samples/sec Loss 14.1041 Epoch: 0 Global Step: 7350 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:59:08,519-Speed 4472.13 samples/sec Loss 14.0156 Epoch: 0 Global Step: 7400 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:59:18,352-Speed 5207.39 samples/sec Loss 13.9528 Epoch: 0 Global Step: 7450 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:59:29,205-Speed 4718.15 samples/sec Loss 13.8998 Epoch: 0 Global Step: 7500 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:59:40,657-Speed 4471.00 samples/sec Loss 13.8102 Epoch: 0 Global Step: 7550 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 20:59:50,494-Speed 5204.76 samples/sec Loss 13.8049 Epoch: 0 Global Step: 7600 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:00:00,270-Speed 5237.92 samples/sec Loss 13.6281 Epoch: 0 Global Step: 7650 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:00:10,230-Speed 5140.52 samples/sec Loss 13.7471 Epoch: 0 Global Step: 7700 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:00:20,278-Speed 5096.13 samples/sec Loss 13.6346 Epoch: 0 Global Step: 7750 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:00:30,372-Speed 5072.57 samples/sec Loss 13.5836 Epoch: 0 Global Step: 7800 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:00:40,305-Speed 5155.02 samples/sec Loss 13.4843 Epoch: 0 Global Step: 7850 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:00:50,270-Speed 5138.51 samples/sec Loss 13.5196 Epoch: 0 Global Step: 7900 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:01:00,248-Speed 5131.26 samples/sec Loss 13.3686 Epoch: 0 Global Step: 7950 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:01:10,199-Speed 5145.91 samples/sec Loss 13.3224 Epoch: 0 Global Step: 8000 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:01:27,866-[lfw][8000]XNorm: 24.967895
Training: 2021-03-18 21:01:27,866-[lfw][8000]Accuracy-Flip: 0.98900+-0.00416
Training: 2021-03-18 21:01:27,867-[lfw][8000]Accuracy-Highest: 0.98900
Training: 2021-03-18 21:01:46,576-[cfp_fp][8000]XNorm: 21.016498
Training: 2021-03-18 21:01:46,576-[cfp_fp][8000]Accuracy-Flip: 0.84700+-0.01322
Training: 2021-03-18 21:01:46,576-[cfp_fp][8000]Accuracy-Highest: 0.85557
Training: 2021-03-18 21:02:02,758-[agedb_30][8000]XNorm: 23.840335
Training: 2021-03-18 21:02:02,759-[agedb_30][8000]Accuracy-Flip: 0.89567+-0.01814
Training: 2021-03-18 21:02:02,759-[agedb_30][8000]Accuracy-Highest: 0.89567
Training: 2021-03-18 21:02:12,409-Speed 823.03 samples/sec Loss 13.3670 Epoch: 0 Global Step: 8050 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:02:22,126-Speed 5269.03 samples/sec Loss 13.3086 Epoch: 0 Global Step: 8100 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:02:32,108-Speed 5129.92 samples/sec Loss 13.2323 Epoch: 0 Global Step: 8150 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:02:42,825-Speed 4777.70 samples/sec Loss 13.1966 Epoch: 0 Global Step: 8200 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:02:52,757-Speed 5155.47 samples/sec Loss 13.2897 Epoch: 0 Global Step: 8250 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:03:02,849-Speed 5073.50 samples/sec Loss 13.1524 Epoch: 0 Global Step: 8300 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:03:12,725-Speed 5184.52 samples/sec Loss 13.0422 Epoch: 0 Global Step: 8350 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:03:22,725-Speed 5120.33 samples/sec Loss 13.0609 Epoch: 0 Global Step: 8400 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:03:32,761-Speed 5101.82 samples/sec Loss 13.0058 Epoch: 0 Global Step: 8450 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:03:42,923-Speed 5038.69 samples/sec Loss 13.0131 Epoch: 0 Global Step: 8500 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:03:52,879-Speed 5143.29 samples/sec Loss 12.9356 Epoch: 0 Global Step: 8550 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:04:02,889-Speed 5115.42 samples/sec Loss 12.8524 Epoch: 0 Global Step: 8600 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:04:12,910-Speed 5109.69 samples/sec Loss 12.8712 Epoch: 0 Global Step: 8650 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:04:22,727-Speed 5215.68 samples/sec Loss 12.8047 Epoch: 0 Global Step: 8700 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:04:32,679-Speed 5145.05 samples/sec Loss 12.7604 Epoch: 0 Global Step: 8750 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:04:42,512-Speed 5206.96 samples/sec Loss 12.7238 Epoch: 0 Global Step: 8800 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:04:52,546-Speed 5102.97 samples/sec Loss 12.6221 Epoch: 0 Global Step: 8850 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:05:02,471-Speed 5159.02 samples/sec Loss 12.5781 Epoch: 0 Global Step: 8900 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:05:12,650-Speed 5030.59 samples/sec Loss 12.6816 Epoch: 0 Global Step: 8950 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:05:22,719-Speed 5084.95 samples/sec Loss 12.5711 Epoch: 0 Global Step: 9000 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:05:32,630-Speed 5166.25 samples/sec Loss 12.5673 Epoch: 0 Global Step: 9050 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:05:42,685-Speed 5092.64 samples/sec Loss 12.5511 Epoch: 0 Global Step: 9100 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:05:53,077-Speed 4926.72 samples/sec Loss 12.4096 Epoch: 0 Global Step: 9150 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:06:03,155-Speed 5080.92 samples/sec Loss 12.4977 Epoch: 0 Global Step: 9200 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:06:13,086-Speed 5155.78 samples/sec Loss 12.4244 Epoch: 0 Global Step: 9250 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:06:22,964-Speed 5183.48 samples/sec Loss 12.4380 Epoch: 0 Global Step: 9300 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:06:32,941-Speed 5131.85 samples/sec Loss 12.4022 Epoch: 0 Global Step: 9350 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:06:42,769-Speed 5210.12 samples/sec Loss 12.3757 Epoch: 0 Global Step: 9400 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:06:52,534-Speed 5243.50 samples/sec Loss 12.1484 Epoch: 0 Global Step: 9450 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:07:02,638-Speed 5067.74 samples/sec Loss 12.2434 Epoch: 0 Global Step: 9500 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:07:12,599-Speed 5140.01 samples/sec Loss 12.2338 Epoch: 0 Global Step: 9550 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:07:22,649-Speed 5094.87 samples/sec Loss 12.2116 Epoch: 0 Global Step: 9600 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:07:32,647-Speed 5121.15 samples/sec Loss 12.1406 Epoch: 0 Global Step: 9650 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:07:42,699-Speed 5094.10 samples/sec Loss 12.1069 Epoch: 0 Global Step: 9700 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:07:52,682-Speed 5129.04 samples/sec Loss 12.0904 Epoch: 0 Global Step: 9750 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:08:02,514-Speed 5207.83 samples/sec Loss 12.0768 Epoch: 0 Global Step: 9800 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:08:12,649-Speed 5051.98 samples/sec Loss 12.0352 Epoch: 0 Global Step: 9850 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:08:22,747-Speed 5071.10 samples/sec Loss 12.0011 Epoch: 0 Global Step: 9900 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:08:32,589-Speed 5202.34 samples/sec Loss 11.9687 Epoch: 0 Global Step: 9950 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:08:43,415-Speed 4729.74 samples/sec Loss 11.9409 Epoch: 0 Global Step: 10000 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:09:00,292-[lfw][10000]XNorm: 25.340126
Training: 2021-03-18 21:09:00,292-[lfw][10000]Accuracy-Flip: 0.99033+-0.00552
Training: 2021-03-18 21:09:00,293-[lfw][10000]Accuracy-Highest: 0.99033
Training: 2021-03-18 21:09:18,942-[cfp_fp][10000]XNorm: 20.310380
Training: 2021-03-18 21:09:18,942-[cfp_fp][10000]Accuracy-Flip: 0.86129+-0.01728
Training: 2021-03-18 21:09:18,942-[cfp_fp][10000]Accuracy-Highest: 0.86129
Training: 2021-03-18 21:09:35,055-[agedb_30][10000]XNorm: 23.250223
Training: 2021-03-18 21:09:35,055-[agedb_30][10000]Accuracy-Flip: 0.91450+-0.01474
Training: 2021-03-18 21:09:35,055-[agedb_30][10000]Accuracy-Highest: 0.91450
Training: 2021-03-18 21:09:44,804-Speed 834.04 samples/sec Loss 11.8988 Epoch: 0 Global Step: 10050 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:09:55,031-Speed 5006.40 samples/sec Loss 11.8992 Epoch: 0 Global Step: 10100 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:10:05,658-Speed 4818.53 samples/sec Loss 11.8905 Epoch: 0 Global Step: 10150 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:10:16,294-Speed 4814.57 samples/sec Loss 11.7881 Epoch: 0 Global Step: 10200 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:10:26,319-Speed 5107.69 samples/sec Loss 11.8801 Epoch: 0 Global Step: 10250 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:10:36,220-Speed 5171.10 samples/sec Loss 11.7818 Epoch: 0 Global Step: 10300 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:10:46,907-Speed 4791.48 samples/sec Loss 11.7439 Epoch: 0 Global Step: 10350 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:10:58,353-Speed 4473.47 samples/sec Loss 11.7078 Epoch: 0 Global Step: 10400 Fp16 Grad Scale: 16384 Required: 21 hours
Training: 2021-03-18 21:11:09,133-Speed 4749.85 samples/sec Loss 11.6534 Epoch: 0 Global Step: 10450 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:11:19,336-Speed 5018.61 samples/sec Loss 11.6512 Epoch: 0 Global Step: 10500 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:11:29,634-Speed 4971.91 samples/sec Loss 11.6286 Epoch: 0 Global Step: 10550 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:11:39,733-Speed 5070.13 samples/sec Loss 11.5843 Epoch: 0 Global Step: 10600 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:11:49,587-Speed 5196.03 samples/sec Loss 11.5656 Epoch: 0 Global Step: 10650 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:11:59,675-Speed 5075.73 samples/sec Loss 11.6056 Epoch: 0 Global Step: 10700 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:12:09,890-Speed 5012.66 samples/sec Loss 11.5548 Epoch: 0 Global Step: 10750 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:12:20,036-Speed 5046.63 samples/sec Loss 11.5055 Epoch: 0 Global Step: 10800 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:12:30,404-Speed 4938.30 samples/sec Loss 11.4807 Epoch: 0 Global Step: 10850 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:12:40,545-Speed 5049.16 samples/sec Loss 11.5059 Epoch: 0 Global Step: 10900 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:12:50,584-Speed 5100.57 samples/sec Loss 11.4646 Epoch: 0 Global Step: 10950 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:13:00,454-Speed 5187.51 samples/sec Loss 11.4116 Epoch: 0 Global Step: 11000 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:13:10,371-Speed 5163.12 samples/sec Loss 11.5037 Epoch: 0 Global Step: 11050 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:13:20,355-Speed 5128.44 samples/sec Loss 11.3718 Epoch: 0 Global Step: 11100 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:13:30,269-Speed 5165.05 samples/sec Loss 11.3600 Epoch: 0 Global Step: 11150 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:13:41,032-Speed 4757.24 samples/sec Loss 11.3830 Epoch: 0 Global Step: 11200 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:13:51,210-Speed 5030.84 samples/sec Loss 11.3240 Epoch: 0 Global Step: 11250 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:14:01,079-Speed 5188.35 samples/sec Loss 11.3291 Epoch: 0 Global Step: 11300 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:14:11,116-Speed 5101.08 samples/sec Loss 11.3714 Epoch: 0 Global Step: 11350 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:14:21,309-Speed 5023.52 samples/sec Loss 11.3253 Epoch: 0 Global Step: 11400 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:14:31,378-Speed 5085.25 samples/sec Loss 11.2582 Epoch: 0 Global Step: 11450 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:14:41,361-Speed 5128.89 samples/sec Loss 11.2621 Epoch: 0 Global Step: 11500 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:14:51,267-Speed 5168.80 samples/sec Loss 11.2019 Epoch: 0 Global Step: 11550 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:15:01,519-Speed 4994.65 samples/sec Loss 11.2789 Epoch: 0 Global Step: 11600 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:15:11,383-Speed 5190.56 samples/sec Loss 11.1830 Epoch: 0 Global Step: 11650 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:15:21,366-Speed 5129.37 samples/sec Loss 11.1642 Epoch: 0 Global Step: 11700 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:15:31,338-Speed 5134.38 samples/sec Loss 11.1952 Epoch: 0 Global Step: 11750 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:15:41,318-Speed 5130.53 samples/sec Loss 11.1924 Epoch: 0 Global Step: 11800 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:15:51,250-Speed 5155.37 samples/sec Loss 11.1057 Epoch: 0 Global Step: 11850 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:16:01,252-Speed 5119.12 samples/sec Loss 11.1035 Epoch: 0 Global Step: 11900 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:16:11,212-Speed 5141.25 samples/sec Loss 11.0728 Epoch: 0 Global Step: 11950 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:16:21,158-Speed 5148.16 samples/sec Loss 11.0694 Epoch: 0 Global Step: 12000 Fp16 Grad Scale: 16384 Required: 20 hours
Training: 2021-03-18 21:16:37,835-[lfw][12000]XNorm: 24.199877
Training: 2021-03-18 21:16:37,836-[lfw][12000]Accuracy-Flip: 0.99100+-0.00484
Training: 2021-03-18 21:16:37,836-[lfw][12000]Accuracy-Highest: 0.99100
Training: 2021-03-18 21:16:56,572-[cfp_fp][12000]XNorm: 19.965677
Training: 2021-03-18 21:16:56,572-[cfp_fp][12000]Accuracy-Flip:
0.87500+-0.01730 Training: 2021-03-18 21:16:56,572-[cfp_fp][12000]Accuracy-Highest: 0.87500 Training: 2021-03-18 21:17:12,704-[agedb_30][12000]XNorm: 23.137930 Training: 2021-03-18 21:17:12,704-[agedb_30][12000]Accuracy-Flip: 0.91433+-0.01834 Training: 2021-03-18 21:17:12,704-[agedb_30][12000]Accuracy-Highest: 0.91450 Training: 2021-03-18 21:17:22,466-Speed 835.13 samples/sec Loss 11.0989 Epoch: 0 Global Step: 12050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:17:32,340-Speed 5185.46 samples/sec Loss 11.0355 Epoch: 0 Global Step: 12100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:17:42,492-Speed 5043.64 samples/sec Loss 11.0358 Epoch: 0 Global Step: 12150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:17:52,484-Speed 5124.28 samples/sec Loss 10.9930 Epoch: 0 Global Step: 12200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:18:02,829-Speed 4949.39 samples/sec Loss 11.0556 Epoch: 0 Global Step: 12250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:18:12,932-Speed 5068.32 samples/sec Loss 10.9775 Epoch: 0 Global Step: 12300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:18:22,913-Speed 5129.75 samples/sec Loss 10.9511 Epoch: 0 Global Step: 12350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:18:33,074-Speed 5039.37 samples/sec Loss 10.9140 Epoch: 0 Global Step: 12400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:18:42,863-Speed 5230.53 samples/sec Loss 10.9241 Epoch: 0 Global Step: 12450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:18:52,711-Speed 5199.59 samples/sec Loss 10.9372 Epoch: 0 Global Step: 12500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:19:02,755-Speed 5097.41 samples/sec Loss 10.8982 Epoch: 0 Global Step: 12550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:19:12,621-Speed 5190.12 samples/sec Loss 10.9040 Epoch: 0 Global Step: 12600 Fp16 
Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:19:22,414-Speed 5228.58 samples/sec Loss 10.8765 Epoch: 0 Global Step: 12650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:19:32,202-Speed 5231.34 samples/sec Loss 10.8792 Epoch: 0 Global Step: 12700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:19:42,252-Speed 5094.54 samples/sec Loss 10.8520 Epoch: 0 Global Step: 12750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:19:52,349-Speed 5071.26 samples/sec Loss 10.9099 Epoch: 0 Global Step: 12800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:20:02,581-Speed 5004.56 samples/sec Loss 10.8684 Epoch: 0 Global Step: 12850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:20:12,617-Speed 5101.64 samples/sec Loss 10.8194 Epoch: 0 Global Step: 12900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:20:23,148-Speed 4862.24 samples/sec Loss 10.8205 Epoch: 0 Global Step: 12950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:20:33,431-Speed 4979.52 samples/sec Loss 10.8254 Epoch: 0 Global Step: 13000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:20:43,389-Speed 5141.99 samples/sec Loss 10.7455 Epoch: 0 Global Step: 13050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:20:53,672-Speed 4979.20 samples/sec Loss 10.6688 Epoch: 0 Global Step: 13100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:21:04,701-Speed 4642.48 samples/sec Loss 10.7334 Epoch: 0 Global Step: 13150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:21:16,180-Speed 4460.60 samples/sec Loss 10.6677 Epoch: 0 Global Step: 13200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:21:26,572-Speed 4927.20 samples/sec Loss 10.7248 Epoch: 0 Global Step: 13250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:21:36,487-Speed 5164.26 samples/sec Loss 10.7776 Epoch: 0 Global Step: 13300 
Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:21:47,298-Speed 4736.55 samples/sec Loss 10.7254 Epoch: 0 Global Step: 13350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:21:58,955-Speed 4392.38 samples/sec Loss 10.6153 Epoch: 0 Global Step: 13400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:22:08,903-Speed 5147.11 samples/sec Loss 10.6091 Epoch: 0 Global Step: 13450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:22:18,842-Speed 5151.50 samples/sec Loss 10.5570 Epoch: 0 Global Step: 13500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:22:28,743-Speed 5171.37 samples/sec Loss 10.5979 Epoch: 0 Global Step: 13550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:22:38,630-Speed 5179.44 samples/sec Loss 10.6835 Epoch: 0 Global Step: 13600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:22:48,461-Speed 5208.36 samples/sec Loss 10.5725 Epoch: 0 Global Step: 13650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:22:58,275-Speed 5217.41 samples/sec Loss 10.6225 Epoch: 0 Global Step: 13700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:23:08,188-Speed 5165.09 samples/sec Loss 10.6188 Epoch: 0 Global Step: 13750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:23:18,145-Speed 5142.39 samples/sec Loss 10.5803 Epoch: 0 Global Step: 13800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:23:28,140-Speed 5122.75 samples/sec Loss 10.6290 Epoch: 0 Global Step: 13850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:23:38,031-Speed 5176.77 samples/sec Loss 10.5988 Epoch: 0 Global Step: 13900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:23:47,868-Speed 5205.15 samples/sec Loss 10.5637 Epoch: 0 Global Step: 13950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:23:57,720-Speed 5197.36 samples/sec Loss 10.4480 Epoch: 0 Global Step: 
14000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:24:14,451-[lfw][14000]XNorm: 23.449534 Training: 2021-03-18 21:24:14,451-[lfw][14000]Accuracy-Flip: 0.99317+-0.00311 Training: 2021-03-18 21:24:14,451-[lfw][14000]Accuracy-Highest: 0.99317 Training: 2021-03-18 21:24:33,067-[cfp_fp][14000]XNorm: 19.417011 Training: 2021-03-18 21:24:33,067-[cfp_fp][14000]Accuracy-Flip: 0.88571+-0.01826 Training: 2021-03-18 21:24:33,068-[cfp_fp][14000]Accuracy-Highest: 0.88571 Training: 2021-03-18 21:24:49,317-[agedb_30][14000]XNorm: 21.743210 Training: 2021-03-18 21:24:49,317-[agedb_30][14000]Accuracy-Flip: 0.92583+-0.01677 Training: 2021-03-18 21:24:49,317-[agedb_30][14000]Accuracy-Highest: 0.92583 Training: 2021-03-18 21:24:58,928-Speed 836.49 samples/sec Loss 10.5325 Epoch: 0 Global Step: 14050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:25:08,677-Speed 5252.23 samples/sec Loss 10.5375 Epoch: 0 Global Step: 14100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:25:19,098-Speed 4913.39 samples/sec Loss 10.5046 Epoch: 0 Global Step: 14150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:25:28,898-Speed 5224.82 samples/sec Loss 10.5729 Epoch: 0 Global Step: 14200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:25:38,645-Speed 5253.50 samples/sec Loss 10.4735 Epoch: 0 Global Step: 14250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:25:48,553-Speed 5167.91 samples/sec Loss 10.4655 Epoch: 0 Global Step: 14300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:25:58,510-Speed 5142.03 samples/sec Loss 10.4194 Epoch: 0 Global Step: 14350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:26:08,292-Speed 5234.66 samples/sec Loss 10.5567 Epoch: 0 Global Step: 14400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:26:18,015-Speed 5266.29 samples/sec Loss 10.4896 Epoch: 0 Global Step: 14450 Fp16 Grad Scale: 16384 Required: 20 hours 
Training: 2021-03-18 21:26:27,871-Speed 5194.97 samples/sec Loss 10.5154 Epoch: 0 Global Step: 14500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:26:37,949-Speed 5080.60 samples/sec Loss 10.4794 Epoch: 0 Global Step: 14550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:26:47,946-Speed 5121.89 samples/sec Loss 10.4171 Epoch: 0 Global Step: 14600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:26:57,929-Speed 5128.95 samples/sec Loss 10.3977 Epoch: 0 Global Step: 14650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:27:07,802-Speed 5185.92 samples/sec Loss 10.3757 Epoch: 0 Global Step: 14700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:27:17,737-Speed 5154.21 samples/sec Loss 10.4156 Epoch: 0 Global Step: 14750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:27:27,452-Speed 5270.37 samples/sec Loss 10.3924 Epoch: 0 Global Step: 14800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:27:37,448-Speed 5122.44 samples/sec Loss 10.3896 Epoch: 0 Global Step: 14850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:27:47,146-Speed 5279.90 samples/sec Loss 10.3861 Epoch: 0 Global Step: 14900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:27:57,085-Speed 5151.39 samples/sec Loss 10.3294 Epoch: 0 Global Step: 14950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:28:06,764-Speed 5290.52 samples/sec Loss 10.3458 Epoch: 0 Global Step: 15000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:28:16,485-Speed 5267.23 samples/sec Loss 10.3902 Epoch: 0 Global Step: 15050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:28:26,345-Speed 5192.84 samples/sec Loss 10.3730 Epoch: 0 Global Step: 15100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:28:36,258-Speed 5165.57 samples/sec Loss 10.2724 Epoch: 0 Global Step: 15150 Fp16 Grad Scale: 16384 Required: 20 
hours Training: 2021-03-18 21:28:45,961-Speed 5276.92 samples/sec Loss 10.2465 Epoch: 0 Global Step: 15200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:28:55,848-Speed 5179.06 samples/sec Loss 10.2833 Epoch: 0 Global Step: 15250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:29:06,049-Speed 5019.55 samples/sec Loss 10.2836 Epoch: 0 Global Step: 15300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:29:16,001-Speed 5145.22 samples/sec Loss 10.3401 Epoch: 0 Global Step: 15350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:29:25,867-Speed 5189.66 samples/sec Loss 10.2342 Epoch: 0 Global Step: 15400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:29:35,850-Speed 5128.81 samples/sec Loss 10.2685 Epoch: 0 Global Step: 15450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:29:45,647-Speed 5226.41 samples/sec Loss 10.2163 Epoch: 0 Global Step: 15500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:29:55,316-Speed 5295.62 samples/sec Loss 10.2078 Epoch: 0 Global Step: 15550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:30:05,095-Speed 5235.83 samples/sec Loss 10.2442 Epoch: 0 Global Step: 15600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:30:14,957-Speed 5192.06 samples/sec Loss 10.2440 Epoch: 0 Global Step: 15650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:30:24,563-Speed 5330.25 samples/sec Loss 10.2394 Epoch: 0 Global Step: 15700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:30:34,315-Speed 5250.42 samples/sec Loss 10.1763 Epoch: 0 Global Step: 15750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:30:44,309-Speed 5123.61 samples/sec Loss 10.1868 Epoch: 0 Global Step: 15800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:30:54,043-Speed 5260.15 samples/sec Loss 10.1748 Epoch: 0 Global Step: 15850 Fp16 Grad Scale: 16384 Required: 
20 hours Training: 2021-03-18 21:31:04,773-Speed 4772.07 samples/sec Loss 10.1614 Epoch: 0 Global Step: 15900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:31:14,393-Speed 5322.41 samples/sec Loss 10.1389 Epoch: 0 Global Step: 15950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:31:24,371-Speed 5131.57 samples/sec Loss 10.1350 Epoch: 0 Global Step: 16000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:31:41,184-[lfw][16000]XNorm: 25.473226 Training: 2021-03-18 21:31:41,184-[lfw][16000]Accuracy-Flip: 0.99117+-0.00495 Training: 2021-03-18 21:31:41,184-[lfw][16000]Accuracy-Highest: 0.99317 Training: 2021-03-18 21:31:59,896-[cfp_fp][16000]XNorm: 20.771643 Training: 2021-03-18 21:31:59,896-[cfp_fp][16000]Accuracy-Flip: 0.86200+-0.01125 Training: 2021-03-18 21:31:59,896-[cfp_fp][16000]Accuracy-Highest: 0.88571 Training: 2021-03-18 21:32:15,922-[agedb_30][16000]XNorm: 24.362290 Training: 2021-03-18 21:32:15,923-[agedb_30][16000]Accuracy-Flip: 0.92717+-0.01588 Training: 2021-03-18 21:32:15,923-[agedb_30][16000]Accuracy-Highest: 0.92717 Training: 2021-03-18 21:32:25,400-Speed 838.96 samples/sec Loss 10.1570 Epoch: 0 Global Step: 16050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:32:35,478-Speed 5080.42 samples/sec Loss 10.1628 Epoch: 0 Global Step: 16100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:32:45,274-Speed 5226.97 samples/sec Loss 10.1398 Epoch: 0 Global Step: 16150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:32:56,670-Speed 4493.31 samples/sec Loss 10.0965 Epoch: 0 Global Step: 16200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:33:07,279-Speed 4826.25 samples/sec Loss 10.1399 Epoch: 0 Global Step: 16250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:33:17,169-Speed 5177.00 samples/sec Loss 10.1529 Epoch: 0 Global Step: 16300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:33:26,855-Speed 
5286.39 samples/sec Loss 10.1026 Epoch: 0 Global Step: 16350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:33:38,152-Speed 4532.36 samples/sec Loss 10.0578 Epoch: 0 Global Step: 16400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:33:48,816-Speed 4801.49 samples/sec Loss 10.1074 Epoch: 0 Global Step: 16450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:33:58,760-Speed 5149.19 samples/sec Loss 10.0766 Epoch: 0 Global Step: 16500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:34:08,529-Speed 5241.36 samples/sec Loss 10.1598 Epoch: 0 Global Step: 16550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:34:18,313-Speed 5233.29 samples/sec Loss 10.0586 Epoch: 0 Global Step: 16600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:34:28,296-Speed 5129.04 samples/sec Loss 10.0441 Epoch: 0 Global Step: 16650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:34:42,482-Speed 3609.33 samples/sec Loss 9.9033 Epoch: 1 Global Step: 16700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:34:52,941-Speed 4895.49 samples/sec Loss 9.2378 Epoch: 1 Global Step: 16750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:35:03,075-Speed 5052.84 samples/sec Loss 9.2752 Epoch: 1 Global Step: 16800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:35:13,068-Speed 5123.85 samples/sec Loss 9.2554 Epoch: 1 Global Step: 16850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:35:23,095-Speed 5106.78 samples/sec Loss 9.2711 Epoch: 1 Global Step: 16900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:35:32,992-Speed 5173.14 samples/sec Loss 9.3474 Epoch: 1 Global Step: 16950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:35:42,845-Speed 5197.00 samples/sec Loss 9.3222 Epoch: 1 Global Step: 17000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:35:52,711-Speed 
5189.63 samples/sec Loss 9.3043 Epoch: 1 Global Step: 17050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:36:02,472-Speed 5245.77 samples/sec Loss 9.3192 Epoch: 1 Global Step: 17100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:36:13,148-Speed 4796.17 samples/sec Loss 9.3939 Epoch: 1 Global Step: 17150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:36:22,830-Speed 5288.44 samples/sec Loss 9.3649 Epoch: 1 Global Step: 17200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:36:32,905-Speed 5081.84 samples/sec Loss 9.3586 Epoch: 1 Global Step: 17250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:36:42,988-Speed 5078.50 samples/sec Loss 9.4313 Epoch: 1 Global Step: 17300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:36:52,836-Speed 5199.26 samples/sec Loss 9.4742 Epoch: 1 Global Step: 17350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:37:02,850-Speed 5113.00 samples/sec Loss 9.3981 Epoch: 1 Global Step: 17400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:37:12,585-Speed 5259.70 samples/sec Loss 9.3912 Epoch: 1 Global Step: 17450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:37:22,554-Speed 5136.35 samples/sec Loss 9.4842 Epoch: 1 Global Step: 17500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:37:32,493-Speed 5151.47 samples/sec Loss 9.4869 Epoch: 1 Global Step: 17550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:37:42,241-Speed 5252.88 samples/sec Loss 9.4436 Epoch: 1 Global Step: 17600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:37:52,320-Speed 5079.89 samples/sec Loss 9.4482 Epoch: 1 Global Step: 17650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:38:02,387-Speed 5086.53 samples/sec Loss 9.4349 Epoch: 1 Global Step: 17700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:38:12,375-Speed 5126.42 
samples/sec Loss 9.4183 Epoch: 1 Global Step: 17750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:38:22,447-Speed 5083.36 samples/sec Loss 9.4145 Epoch: 1 Global Step: 17800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:38:32,118-Speed 5294.75 samples/sec Loss 9.5365 Epoch: 1 Global Step: 17850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:38:42,128-Speed 5114.99 samples/sec Loss 9.4536 Epoch: 1 Global Step: 17900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:38:52,093-Speed 5138.54 samples/sec Loss 9.4050 Epoch: 1 Global Step: 17950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:39:02,059-Speed 5137.74 samples/sec Loss 9.4249 Epoch: 1 Global Step: 18000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:39:18,970-[lfw][18000]XNorm: 24.127735 Training: 2021-03-18 21:39:18,970-[lfw][18000]Accuracy-Flip: 0.98967+-0.00379 Training: 2021-03-18 21:39:18,970-[lfw][18000]Accuracy-Highest: 0.99317 Training: 2021-03-18 21:39:37,710-[cfp_fp][18000]XNorm: 19.175911 Training: 2021-03-18 21:39:37,710-[cfp_fp][18000]Accuracy-Flip: 0.87843+-0.01508 Training: 2021-03-18 21:39:37,710-[cfp_fp][18000]Accuracy-Highest: 0.88571 Training: 2021-03-18 21:39:53,903-[agedb_30][18000]XNorm: 23.031235 Training: 2021-03-18 21:39:53,904-[agedb_30][18000]Accuracy-Flip: 0.92600+-0.01334 Training: 2021-03-18 21:39:53,904-[agedb_30][18000]Accuracy-Highest: 0.92717 Training: 2021-03-18 21:40:03,534-Speed 832.87 samples/sec Loss 9.5033 Epoch: 1 Global Step: 18050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:40:13,349-Speed 5216.80 samples/sec Loss 9.5273 Epoch: 1 Global Step: 18100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:40:23,320-Speed 5135.04 samples/sec Loss 9.5325 Epoch: 1 Global Step: 18150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:40:33,544-Speed 5008.51 samples/sec Loss 9.4518 Epoch: 1 Global Step: 18200 Fp16 
Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:40:43,313-Speed 5241.54 samples/sec Loss 9.4965 Epoch: 1 Global Step: 18250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:40:53,137-Speed 5211.78 samples/sec Loss 9.4966 Epoch: 1 Global Step: 18300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:41:03,057-Speed 5161.58 samples/sec Loss 9.4903 Epoch: 1 Global Step: 18350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:41:12,909-Speed 5197.37 samples/sec Loss 9.4878 Epoch: 1 Global Step: 18400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:41:22,698-Speed 5230.68 samples/sec Loss 9.4632 Epoch: 1 Global Step: 18450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:41:32,643-Speed 5148.91 samples/sec Loss 9.5276 Epoch: 1 Global Step: 18500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:41:42,602-Speed 5141.01 samples/sec Loss 9.5050 Epoch: 1 Global Step: 18550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:41:52,649-Speed 5096.34 samples/sec Loss 9.5599 Epoch: 1 Global Step: 18600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:42:02,735-Speed 5076.75 samples/sec Loss 9.5565 Epoch: 1 Global Step: 18650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:42:12,602-Speed 5189.40 samples/sec Loss 9.5531 Epoch: 1 Global Step: 18700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:42:22,451-Speed 5199.00 samples/sec Loss 9.5140 Epoch: 1 Global Step: 18750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:42:32,408-Speed 5142.00 samples/sec Loss 9.5179 Epoch: 1 Global Step: 18800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:42:42,419-Speed 5114.87 samples/sec Loss 9.4838 Epoch: 1 Global Step: 18850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:42:52,322-Speed 5170.46 samples/sec Loss 9.5298 Epoch: 1 Global Step: 18900 Fp16 Grad Scale: 
16384 Required: 20 hours Training: 2021-03-18 21:43:03,377-Speed 4631.43 samples/sec Loss 9.5128 Epoch: 1 Global Step: 18950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:43:13,642-Speed 4988.63 samples/sec Loss 9.5283 Epoch: 1 Global Step: 19000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:43:23,476-Speed 5206.31 samples/sec Loss 9.5212 Epoch: 1 Global Step: 19050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:43:33,331-Speed 5195.80 samples/sec Loss 9.5351 Epoch: 1 Global Step: 19100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:43:43,094-Speed 5244.81 samples/sec Loss 9.5273 Epoch: 1 Global Step: 19150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:43:52,796-Speed 5277.68 samples/sec Loss 9.5484 Epoch: 1 Global Step: 19200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:44:03,973-Speed 4580.75 samples/sec Loss 9.5923 Epoch: 1 Global Step: 19250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:44:14,006-Speed 5103.45 samples/sec Loss 9.5510 Epoch: 1 Global Step: 19300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:44:24,740-Speed 4770.35 samples/sec Loss 9.5235 Epoch: 1 Global Step: 19350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:44:34,545-Speed 5222.20 samples/sec Loss 9.5522 Epoch: 1 Global Step: 19400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:44:45,327-Speed 4748.88 samples/sec Loss 9.4882 Epoch: 1 Global Step: 19450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:44:57,140-Speed 4334.63 samples/sec Loss 9.5426 Epoch: 1 Global Step: 19500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:45:06,966-Speed 5210.75 samples/sec Loss 9.5446 Epoch: 1 Global Step: 19550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:45:16,876-Speed 5166.62 samples/sec Loss 9.5202 Epoch: 1 Global Step: 19600 Fp16 Grad Scale: 16384 
Required: 20 hours Training: 2021-03-18 21:45:26,744-Speed 5188.73 samples/sec Loss 9.5154 Epoch: 1 Global Step: 19650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:45:36,837-Speed 5073.11 samples/sec Loss 9.5565 Epoch: 1 Global Step: 19700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:45:46,721-Speed 5180.71 samples/sec Loss 9.5595 Epoch: 1 Global Step: 19750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:45:56,565-Speed 5201.20 samples/sec Loss 9.5473 Epoch: 1 Global Step: 19800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:46:06,443-Speed 5183.79 samples/sec Loss 9.5713 Epoch: 1 Global Step: 19850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:46:16,662-Speed 5010.20 samples/sec Loss 9.5263 Epoch: 1 Global Step: 19900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:46:26,990-Speed 4957.84 samples/sec Loss 9.4873 Epoch: 1 Global Step: 19950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:46:36,832-Speed 5202.79 samples/sec Loss 9.4862 Epoch: 1 Global Step: 20000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:46:53,498-[lfw][20000]XNorm: 24.773519 Training: 2021-03-18 21:46:53,499-[lfw][20000]Accuracy-Flip: 0.99267+-0.00300 Training: 2021-03-18 21:46:53,499-[lfw][20000]Accuracy-Highest: 0.99317 Training: 2021-03-18 21:47:12,145-[cfp_fp][20000]XNorm: 20.126243 Training: 2021-03-18 21:47:12,145-[cfp_fp][20000]Accuracy-Flip: 0.86829+-0.00998 Training: 2021-03-18 21:47:12,145-[cfp_fp][20000]Accuracy-Highest: 0.88571 Training: 2021-03-18 21:47:28,381-[agedb_30][20000]XNorm: 23.211160 Training: 2021-03-18 21:47:28,381-[agedb_30][20000]Accuracy-Flip: 0.92650+-0.01477 Training: 2021-03-18 21:47:28,381-[agedb_30][20000]Accuracy-Highest: 0.92717 Training: 2021-03-18 21:47:37,976-Speed 837.38 samples/sec Loss 9.5250 Epoch: 1 Global Step: 20050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:47:47,720-Speed 
5254.69 samples/sec Loss 9.5029 Epoch: 1 Global Step: 20100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:47:57,488-Speed 5241.71 samples/sec Loss 9.5376 Epoch: 1 Global Step: 20150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:48:07,416-Speed 5157.49 samples/sec Loss 9.5583 Epoch: 1 Global Step: 20200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:48:18,281-Speed 4712.37 samples/sec Loss 9.5684 Epoch: 1 Global Step: 20250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:48:28,267-Speed 5127.91 samples/sec Loss 9.5613 Epoch: 1 Global Step: 20300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:48:38,308-Speed 5099.38 samples/sec Loss 9.5278 Epoch: 1 Global Step: 20350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:48:48,345-Speed 5101.22 samples/sec Loss 9.5559 Epoch: 1 Global Step: 20400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:48:58,430-Speed 5077.10 samples/sec Loss 9.5534 Epoch: 1 Global Step: 20450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:49:08,122-Speed 5282.98 samples/sec Loss 9.6088 Epoch: 1 Global Step: 20500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:49:18,274-Speed 5043.51 samples/sec Loss 9.5124 Epoch: 1 Global Step: 20550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:49:28,050-Speed 5237.68 samples/sec Loss 9.5705 Epoch: 1 Global Step: 20600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:49:37,773-Speed 5266.27 samples/sec Loss 9.5725 Epoch: 1 Global Step: 20650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:49:47,400-Speed 5318.54 samples/sec Loss 9.5034 Epoch: 1 Global Step: 20700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:49:57,354-Speed 5144.18 samples/sec Loss 9.5058 Epoch: 1 Global Step: 20750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:50:07,129-Speed 5238.28 
samples/sec Loss 9.5144 Epoch: 1 Global Step: 20800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:50:17,034-Speed 5169.13 samples/sec Loss 9.5788 Epoch: 1 Global Step: 20850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:50:27,052-Speed 5111.34 samples/sec Loss 9.5658 Epoch: 1 Global Step: 20900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:50:36,902-Speed 5198.42 samples/sec Loss 9.5121 Epoch: 1 Global Step: 20950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:50:46,976-Speed 5082.78 samples/sec Loss 9.4818 Epoch: 1 Global Step: 21000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:50:57,026-Speed 5094.94 samples/sec Loss 9.5110 Epoch: 1 Global Step: 21050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:51:06,797-Speed 5240.21 samples/sec Loss 9.4329 Epoch: 1 Global Step: 21100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:51:16,731-Speed 5154.15 samples/sec Loss 9.4729 Epoch: 1 Global Step: 21150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:51:26,529-Speed 5226.07 samples/sec Loss 9.4671 Epoch: 1 Global Step: 21200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:51:36,630-Speed 5069.16 samples/sec Loss 9.4987 Epoch: 1 Global Step: 21250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:51:46,548-Speed 5162.65 samples/sec Loss 9.5237 Epoch: 1 Global Step: 21300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:51:56,491-Speed 5149.59 samples/sec Loss 9.5104 Epoch: 1 Global Step: 21350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:52:06,351-Speed 5193.12 samples/sec Loss 9.4677 Epoch: 1 Global Step: 21400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:52:16,483-Speed 5053.35 samples/sec Loss 9.4437 Epoch: 1 Global Step: 21450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:52:26,325-Speed 5202.74 samples/sec 
Loss 9.5036 Epoch: 1 Global Step: 21500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:52:36,211-Speed 5179.59 samples/sec Loss 9.4332 Epoch: 1 Global Step: 21550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:52:46,063-Speed 5196.90 samples/sec Loss 9.4390 Epoch: 1 Global Step: 21600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:52:55,917-Speed 5196.47 samples/sec Loss 9.5102 Epoch: 1 Global Step: 21650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:53:05,753-Speed 5205.70 samples/sec Loss 9.4649 Epoch: 1 Global Step: 21700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:53:15,887-Speed 5052.65 samples/sec Loss 9.5169 Epoch: 1 Global Step: 21750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:53:25,874-Speed 5126.72 samples/sec Loss 9.5144 Epoch: 1 Global Step: 21800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:53:36,091-Speed 5011.61 samples/sec Loss 9.4545 Epoch: 1 Global Step: 21850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:53:46,185-Speed 5072.91 samples/sec Loss 9.5148 Epoch: 1 Global Step: 21900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:53:56,070-Speed 5179.51 samples/sec Loss 9.5342 Epoch: 1 Global Step: 21950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:54:05,779-Speed 5273.70 samples/sec Loss 9.4521 Epoch: 1 Global Step: 22000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:54:22,753-[lfw][22000]XNorm: 24.335207 Training: 2021-03-18 21:54:22,753-[lfw][22000]Accuracy-Flip: 0.99317+-0.00444 Training: 2021-03-18 21:54:22,753-[lfw][22000]Accuracy-Highest: 0.99317 Training: 2021-03-18 21:54:41,569-[cfp_fp][22000]XNorm: 19.993129 Training: 2021-03-18 21:54:41,569-[cfp_fp][22000]Accuracy-Flip: 0.89400+-0.01487 Training: 2021-03-18 21:54:41,569-[cfp_fp][22000]Accuracy-Highest: 0.89400 Training: 2021-03-18 21:54:57,776-[agedb_30][22000]XNorm: 
22.850339 Training: 2021-03-18 21:54:57,777-[agedb_30][22000]Accuracy-Flip: 0.93317+-0.01433 Training: 2021-03-18 21:54:57,777-[agedb_30][22000]Accuracy-Highest: 0.93317 Training: 2021-03-18 21:55:08,430-Speed 817.24 samples/sec Loss 9.5000 Epoch: 1 Global Step: 22050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:55:18,175-Speed 5254.26 samples/sec Loss 9.5096 Epoch: 1 Global Step: 22100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:55:27,944-Speed 5241.26 samples/sec Loss 9.5474 Epoch: 1 Global Step: 22150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:55:37,777-Speed 5207.79 samples/sec Loss 9.4411 Epoch: 1 Global Step: 22200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:55:47,725-Speed 5146.65 samples/sec Loss 9.4901 Epoch: 1 Global Step: 22250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:55:58,467-Speed 4766.45 samples/sec Loss 9.4639 Epoch: 1 Global Step: 22300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:56:08,406-Speed 5151.89 samples/sec Loss 9.4204 Epoch: 1 Global Step: 22350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:56:18,742-Speed 4953.67 samples/sec Loss 9.4163 Epoch: 1 Global Step: 22400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:56:29,255-Speed 4870.63 samples/sec Loss 9.3982 Epoch: 1 Global Step: 22450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:56:39,221-Speed 5137.53 samples/sec Loss 9.4837 Epoch: 1 Global Step: 22500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:56:49,067-Speed 5200.52 samples/sec Loss 9.4730 Epoch: 1 Global Step: 22550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:57:01,576-Speed 4093.31 samples/sec Loss 9.4257 Epoch: 1 Global Step: 22600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:57:11,173-Speed 5335.67 samples/sec Loss 9.4654 Epoch: 1 Global Step: 22650 Fp16 Grad Scale: 16384 
Required: 20 hours Training: 2021-03-18 21:57:21,054-Speed 5182.14 samples/sec Loss 9.3877 Epoch: 1 Global Step: 22700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:57:30,960-Speed 5168.56 samples/sec Loss 9.3928 Epoch: 1 Global Step: 22750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:57:40,789-Speed 5209.86 samples/sec Loss 9.3700 Epoch: 1 Global Step: 22800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:57:50,790-Speed 5119.43 samples/sec Loss 9.4804 Epoch: 1 Global Step: 22850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:58:00,573-Speed 5234.27 samples/sec Loss 9.4305 Epoch: 1 Global Step: 22900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:58:10,647-Speed 5082.79 samples/sec Loss 9.4630 Epoch: 1 Global Step: 22950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:58:20,655-Speed 5116.08 samples/sec Loss 9.4157 Epoch: 1 Global Step: 23000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:58:30,456-Speed 5224.25 samples/sec Loss 9.4654 Epoch: 1 Global Step: 23050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:58:40,393-Speed 5153.01 samples/sec Loss 9.4268 Epoch: 1 Global Step: 23100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:58:50,374-Speed 5130.11 samples/sec Loss 9.4502 Epoch: 1 Global Step: 23150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:59:00,600-Speed 5006.67 samples/sec Loss 9.4196 Epoch: 1 Global Step: 23200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:59:10,345-Speed 5254.26 samples/sec Loss 9.3814 Epoch: 1 Global Step: 23250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:59:20,390-Speed 5097.62 samples/sec Loss 9.3780 Epoch: 1 Global Step: 23300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:59:30,284-Speed 5175.03 samples/sec Loss 9.4051 Epoch: 1 Global Step: 23350 Fp16 Grad Scale: 16384 Required: 
20 hours Training: 2021-03-18 21:59:41,157-Speed 4709.03 samples/sec Loss 9.4551 Epoch: 1 Global Step: 23400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 21:59:50,971-Speed 5217.47 samples/sec Loss 9.3740 Epoch: 1 Global Step: 23450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:00:00,998-Speed 5106.28 samples/sec Loss 9.4201 Epoch: 1 Global Step: 23500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:00:10,812-Speed 5217.74 samples/sec Loss 9.4150 Epoch: 1 Global Step: 23550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:00:20,817-Speed 5117.97 samples/sec Loss 9.4028 Epoch: 1 Global Step: 23600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:00:30,700-Speed 5180.59 samples/sec Loss 9.3834 Epoch: 1 Global Step: 23650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:00:40,944-Speed 4998.72 samples/sec Loss 9.3124 Epoch: 1 Global Step: 23700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:00:51,013-Speed 5085.04 samples/sec Loss 9.4098 Epoch: 1 Global Step: 23750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:01:01,080-Speed 5086.14 samples/sec Loss 9.4303 Epoch: 1 Global Step: 23800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:01:11,084-Speed 5118.46 samples/sec Loss 9.4000 Epoch: 1 Global Step: 23850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:01:21,024-Speed 5151.32 samples/sec Loss 9.4260 Epoch: 1 Global Step: 23900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:01:30,802-Speed 5236.35 samples/sec Loss 9.3740 Epoch: 1 Global Step: 23950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:01:40,836-Speed 5103.32 samples/sec Loss 9.3684 Epoch: 1 Global Step: 24000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:01:57,628-[lfw][24000]XNorm: 23.898153 Training: 2021-03-18 22:01:57,628-[lfw][24000]Accuracy-Flip: 0.99100+-0.00410 
Training: 2021-03-18 22:01:57,629-[lfw][24000]Accuracy-Highest: 0.99317 Training: 2021-03-18 22:02:16,269-[cfp_fp][24000]XNorm: 19.179093 Training: 2021-03-18 22:02:16,269-[cfp_fp][24000]Accuracy-Flip: 0.88371+-0.01569 Training: 2021-03-18 22:02:16,270-[cfp_fp][24000]Accuracy-Highest: 0.89400 Training: 2021-03-18 22:02:32,421-[agedb_30][24000]XNorm: 22.627844 Training: 2021-03-18 22:02:32,421-[agedb_30][24000]Accuracy-Flip: 0.93650+-0.01099 Training: 2021-03-18 22:02:32,421-[agedb_30][24000]Accuracy-Highest: 0.93650 Training: 2021-03-18 22:02:42,176-Speed 834.70 samples/sec Loss 9.3857 Epoch: 1 Global Step: 24050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:02:52,221-Speed 5097.40 samples/sec Loss 9.3562 Epoch: 1 Global Step: 24100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:03:02,338-Speed 5061.21 samples/sec Loss 9.3662 Epoch: 1 Global Step: 24150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:03:12,396-Speed 5090.71 samples/sec Loss 9.3147 Epoch: 1 Global Step: 24200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:03:22,391-Speed 5122.74 samples/sec Loss 9.2625 Epoch: 1 Global Step: 24250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:03:32,311-Speed 5161.90 samples/sec Loss 9.3650 Epoch: 1 Global Step: 24300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:03:42,336-Speed 5107.37 samples/sec Loss 9.3806 Epoch: 1 Global Step: 24350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:03:52,407-Speed 5084.59 samples/sec Loss 9.3891 Epoch: 1 Global Step: 24400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:04:02,346-Speed 5151.56 samples/sec Loss 9.3451 Epoch: 1 Global Step: 24450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:04:12,405-Speed 5090.60 samples/sec Loss 9.3873 Epoch: 1 Global Step: 24500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:04:22,560-Speed 5042.07 
samples/sec Loss 9.3450 Epoch: 1 Global Step: 24550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:04:32,457-Speed 5173.44 samples/sec Loss 9.3717 Epoch: 1 Global Step: 24600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:04:42,612-Speed 5042.36 samples/sec Loss 9.3560 Epoch: 1 Global Step: 24650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-18 22:04:52,504-Speed 5176.07 samples/sec Loss 9.3713 Epoch: 1 Global Step: 24700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:05:02,545-Speed 5099.46 samples/sec Loss 9.3638 Epoch: 1 Global Step: 24750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:05:12,365-Speed 5214.36 samples/sec Loss 9.3255 Epoch: 1 Global Step: 24800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:05:22,472-Speed 5065.67 samples/sec Loss 9.3615 Epoch: 1 Global Step: 24850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:05:32,378-Speed 5168.97 samples/sec Loss 9.3244 Epoch: 1 Global Step: 24900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:05:42,522-Speed 5047.60 samples/sec Loss 9.3954 Epoch: 1 Global Step: 24950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:05:52,403-Speed 5181.95 samples/sec Loss 9.4026 Epoch: 1 Global Step: 25000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:06:03,215-Speed 4735.70 samples/sec Loss 9.3537 Epoch: 1 Global Step: 25050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:06:13,367-Speed 5043.64 samples/sec Loss 9.3008 Epoch: 1 Global Step: 25100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:06:23,393-Speed 5106.93 samples/sec Loss 9.2747 Epoch: 1 Global Step: 25150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:06:33,339-Speed 5148.15 samples/sec Loss 9.3099 Epoch: 1 Global Step: 25200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:06:43,296-Speed 5142.43 samples/sec 
Loss 9.3040 Epoch: 1 Global Step: 25250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:06:53,230-Speed 5153.93 samples/sec Loss 9.3080 Epoch: 1 Global Step: 25300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:07:03,997-Speed 4755.90 samples/sec Loss 9.3197 Epoch: 1 Global Step: 25350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:07:13,697-Speed 5278.82 samples/sec Loss 9.3214 Epoch: 1 Global Step: 25400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:07:23,581-Speed 5180.28 samples/sec Loss 9.3714 Epoch: 1 Global Step: 25450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:07:33,480-Speed 5172.60 samples/sec Loss 9.3713 Epoch: 1 Global Step: 25500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:07:44,180-Speed 4785.31 samples/sec Loss 9.3442 Epoch: 1 Global Step: 25550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:07:54,731-Speed 4853.03 samples/sec Loss 9.3531 Epoch: 1 Global Step: 25600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:08:04,548-Speed 5215.86 samples/sec Loss 9.2982 Epoch: 1 Global Step: 25650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:08:15,386-Speed 4724.09 samples/sec Loss 9.2858 Epoch: 1 Global Step: 25700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:08:26,635-Speed 4551.67 samples/sec Loss 9.3279 Epoch: 1 Global Step: 25750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:08:36,559-Speed 5159.61 samples/sec Loss 9.2735 Epoch: 1 Global Step: 25800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:08:46,566-Speed 5116.66 samples/sec Loss 9.2983 Epoch: 1 Global Step: 25850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:08:56,354-Speed 5231.59 samples/sec Loss 9.2997 Epoch: 1 Global Step: 25900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:09:06,322-Speed 5136.32 samples/sec Loss 9.3050 
Epoch: 1 Global Step: 25950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:09:16,149-Speed 5210.43 samples/sec Loss 9.2719 Epoch: 1 Global Step: 26000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:09:33,260-[lfw][26000]XNorm: 24.897368 Training: 2021-03-18 22:09:33,261-[lfw][26000]Accuracy-Flip: 0.99250+-0.00410 Training: 2021-03-18 22:09:33,262-[lfw][26000]Accuracy-Highest: 0.99317 Training: 2021-03-18 22:09:51,894-[cfp_fp][26000]XNorm: 19.727668 Training: 2021-03-18 22:09:51,894-[cfp_fp][26000]Accuracy-Flip: 0.88386+-0.01314 Training: 2021-03-18 22:09:51,894-[cfp_fp][26000]Accuracy-Highest: 0.89400 Training: 2021-03-18 22:10:07,988-[agedb_30][26000]XNorm: 23.355218 Training: 2021-03-18 22:10:07,988-[agedb_30][26000]Accuracy-Flip: 0.93583+-0.01596 Training: 2021-03-18 22:10:07,988-[agedb_30][26000]Accuracy-Highest: 0.93650 Training: 2021-03-18 22:10:17,603-Speed 833.15 samples/sec Loss 9.2610 Epoch: 1 Global Step: 26050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:10:27,552-Speed 5146.71 samples/sec Loss 9.3161 Epoch: 1 Global Step: 26100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:10:37,502-Speed 5146.07 samples/sec Loss 9.3247 Epoch: 1 Global Step: 26150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:10:47,325-Speed 5212.87 samples/sec Loss 9.3229 Epoch: 1 Global Step: 26200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:10:57,102-Speed 5237.19 samples/sec Loss 9.2987 Epoch: 1 Global Step: 26250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:11:07,251-Speed 5045.24 samples/sec Loss 9.3019 Epoch: 1 Global Step: 26300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:11:17,139-Speed 5177.88 samples/sec Loss 9.2735 Epoch: 1 Global Step: 26350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:11:27,069-Speed 5156.80 samples/sec Loss 9.2749 Epoch: 1 Global Step: 26400 Fp16 Grad Scale: 16384 
Required: 19 hours Training: 2021-03-18 22:11:37,007-Speed 5152.12 samples/sec Loss 9.2635 Epoch: 1 Global Step: 26450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:11:47,558-Speed 4852.88 samples/sec Loss 9.2916 Epoch: 1 Global Step: 26500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:11:57,507-Speed 5146.42 samples/sec Loss 9.2646 Epoch: 1 Global Step: 26550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:12:07,381-Speed 5185.52 samples/sec Loss 9.3223 Epoch: 1 Global Step: 26600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:12:17,278-Speed 5173.65 samples/sec Loss 9.2506 Epoch: 1 Global Step: 26650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:12:27,136-Speed 5194.29 samples/sec Loss 9.2965 Epoch: 1 Global Step: 26700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:12:37,218-Speed 5078.66 samples/sec Loss 9.2591 Epoch: 1 Global Step: 26750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:12:47,142-Speed 5159.33 samples/sec Loss 9.3497 Epoch: 1 Global Step: 26800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:12:57,266-Speed 5058.04 samples/sec Loss 9.3127 Epoch: 1 Global Step: 26850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:13:07,118-Speed 5196.86 samples/sec Loss 9.3157 Epoch: 1 Global Step: 26900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:13:17,045-Speed 5158.14 samples/sec Loss 9.2446 Epoch: 1 Global Step: 26950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:13:26,942-Speed 5173.76 samples/sec Loss 9.2903 Epoch: 1 Global Step: 27000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:13:36,771-Speed 5209.26 samples/sec Loss 9.2848 Epoch: 1 Global Step: 27050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:13:46,860-Speed 5075.28 samples/sec Loss 9.2753 Epoch: 1 Global Step: 27100 Fp16 Grad Scale: 16384 Required: 
19 hours Training: 2021-03-18 22:13:56,690-Speed 5208.78 samples/sec Loss 9.2400 Epoch: 1 Global Step: 27150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:14:06,809-Speed 5060.09 samples/sec Loss 9.3187 Epoch: 1 Global Step: 27200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:14:16,859-Speed 5095.11 samples/sec Loss 9.2503 Epoch: 1 Global Step: 27250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:14:26,895-Speed 5102.10 samples/sec Loss 9.2562 Epoch: 1 Global Step: 27300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:14:37,014-Speed 5059.98 samples/sec Loss 9.2804 Epoch: 1 Global Step: 27350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:14:47,053-Speed 5100.75 samples/sec Loss 9.2515 Epoch: 1 Global Step: 27400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:14:57,266-Speed 5013.16 samples/sec Loss 9.2441 Epoch: 1 Global Step: 27450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:15:07,267-Speed 5120.35 samples/sec Loss 9.2968 Epoch: 1 Global Step: 27500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:15:17,183-Speed 5163.31 samples/sec Loss 9.3101 Epoch: 1 Global Step: 27550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:15:27,157-Speed 5133.56 samples/sec Loss 9.2734 Epoch: 1 Global Step: 27600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:15:37,097-Speed 5151.47 samples/sec Loss 9.2262 Epoch: 1 Global Step: 27650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:15:47,270-Speed 5033.12 samples/sec Loss 9.2196 Epoch: 1 Global Step: 27700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:15:57,472-Speed 5019.03 samples/sec Loss 9.1735 Epoch: 1 Global Step: 27750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:16:07,527-Speed 5092.16 samples/sec Loss 9.2634 Epoch: 1 Global Step: 27800 Fp16 Grad Scale: 16384 Required: 19 hours 
Training: 2021-03-18 22:16:17,370-Speed 5201.90 samples/sec Loss 9.2241 Epoch: 1 Global Step: 27850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:16:27,319-Speed 5146.75 samples/sec Loss 9.2422 Epoch: 1 Global Step: 27900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:16:37,182-Speed 5191.45 samples/sec Loss 9.2236 Epoch: 1 Global Step: 27950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:16:47,220-Speed 5101.05 samples/sec Loss 9.2587 Epoch: 1 Global Step: 28000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:17:04,073-[lfw][28000]XNorm: 23.185539 Training: 2021-03-18 22:17:04,073-[lfw][28000]Accuracy-Flip: 0.99083+-0.00410 Training: 2021-03-18 22:17:04,073-[lfw][28000]Accuracy-Highest: 0.99317 Training: 2021-03-18 22:17:22,748-[cfp_fp][28000]XNorm: 18.596823 Training: 2021-03-18 22:17:22,749-[cfp_fp][28000]Accuracy-Flip: 0.88686+-0.01230 Training: 2021-03-18 22:17:22,749-[cfp_fp][28000]Accuracy-Highest: 0.89400 Training: 2021-03-18 22:17:38,811-[agedb_30][28000]XNorm: 21.818582 Training: 2021-03-18 22:17:38,812-[agedb_30][28000]Accuracy-Flip: 0.92417+-0.01223 Training: 2021-03-18 22:17:38,812-[agedb_30][28000]Accuracy-Highest: 0.93650 Training: 2021-03-18 22:17:48,568-Speed 834.58 samples/sec Loss 9.2145 Epoch: 1 Global Step: 28050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:17:58,625-Speed 5091.51 samples/sec Loss 9.2179 Epoch: 1 Global Step: 28100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:18:09,423-Speed 4741.70 samples/sec Loss 9.1930 Epoch: 1 Global Step: 28150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:18:19,380-Speed 5142.84 samples/sec Loss 9.1685 Epoch: 1 Global Step: 28200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:18:29,279-Speed 5172.34 samples/sec Loss 9.2514 Epoch: 1 Global Step: 28250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:18:39,241-Speed 5140.20 samples/sec 
Loss 9.2219 Epoch: 1 Global Step: 28300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:18:49,088-Speed 5199.76 samples/sec Loss 9.2511 Epoch: 1 Global Step: 28350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:18:59,103-Speed 5112.38 samples/sec Loss 9.2109 Epoch: 1 Global Step: 28400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:19:09,160-Speed 5091.30 samples/sec Loss 9.2214 Epoch: 1 Global Step: 28450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:19:19,869-Speed 4781.00 samples/sec Loss 9.1692 Epoch: 1 Global Step: 28500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:19:29,516-Speed 5307.74 samples/sec Loss 9.2153 Epoch: 1 Global Step: 28550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:19:40,175-Speed 4803.88 samples/sec Loss 9.1986 Epoch: 1 Global Step: 28600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:19:50,156-Speed 5129.95 samples/sec Loss 9.2619 Epoch: 1 Global Step: 28650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:20:01,160-Speed 4653.20 samples/sec Loss 9.1874 Epoch: 1 Global Step: 28700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:20:11,019-Speed 5193.51 samples/sec Loss 9.1677 Epoch: 1 Global Step: 28750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:20:21,209-Speed 5025.01 samples/sec Loss 9.2083 Epoch: 1 Global Step: 28800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:20:32,913-Speed 4374.67 samples/sec Loss 9.1569 Epoch: 1 Global Step: 28850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:20:42,910-Speed 5121.75 samples/sec Loss 9.2197 Epoch: 1 Global Step: 28900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:20:53,955-Speed 4635.81 samples/sec Loss 9.1814 Epoch: 1 Global Step: 28950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:21:04,059-Speed 5067.63 samples/sec Loss 9.1303 
Epoch: 1 Global Step: 29000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:21:14,104-Speed 5097.31 samples/sec Loss 9.2149 Epoch: 1 Global Step: 29050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:21:24,051-Speed 5147.90 samples/sec Loss 9.1691 Epoch: 1 Global Step: 29100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:21:33,899-Speed 5199.14 samples/sec Loss 9.1559 Epoch: 1 Global Step: 29150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:21:43,923-Speed 5107.95 samples/sec Loss 9.2110 Epoch: 1 Global Step: 29200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:21:53,968-Speed 5097.96 samples/sec Loss 9.2189 Epoch: 1 Global Step: 29250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:22:04,018-Speed 5094.58 samples/sec Loss 9.2372 Epoch: 1 Global Step: 29300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:22:14,058-Speed 5100.05 samples/sec Loss 9.2042 Epoch: 1 Global Step: 29350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:22:23,801-Speed 5255.11 samples/sec Loss 9.1820 Epoch: 1 Global Step: 29400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:22:33,734-Speed 5154.84 samples/sec Loss 9.1377 Epoch: 1 Global Step: 29450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:22:43,748-Speed 5113.36 samples/sec Loss 9.2420 Epoch: 1 Global Step: 29500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:22:53,909-Speed 5039.03 samples/sec Loss 9.1625 Epoch: 1 Global Step: 29550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:23:03,579-Speed 5295.05 samples/sec Loss 9.1758 Epoch: 1 Global Step: 29600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:23:14,712-Speed 4599.54 samples/sec Loss 9.2016 Epoch: 1 Global Step: 29650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:23:25,227-Speed 4869.69 samples/sec Loss 9.1416 Epoch: 1 
Global Step: 29700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:23:35,334-Speed 5065.69 samples/sec Loss 9.1906 Epoch: 1 Global Step: 29750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:23:45,150-Speed 5216.28 samples/sec Loss 9.1805 Epoch: 1 Global Step: 29800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:23:55,331-Speed 5029.26 samples/sec Loss 9.1746 Epoch: 1 Global Step: 29850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:24:05,216-Speed 5179.95 samples/sec Loss 9.1559 Epoch: 1 Global Step: 29900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:24:15,331-Speed 5062.16 samples/sec Loss 9.1606 Epoch: 1 Global Step: 29950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:24:25,397-Speed 5087.08 samples/sec Loss 9.2082 Epoch: 1 Global Step: 30000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:24:42,318-[lfw][30000]XNorm: 24.988940 Training: 2021-03-18 22:24:42,318-[lfw][30000]Accuracy-Flip: 0.99333+-0.00435 Training: 2021-03-18 22:24:42,318-[lfw][30000]Accuracy-Highest: 0.99333 Training: 2021-03-18 22:25:01,167-[cfp_fp][30000]XNorm: 19.930955 Training: 2021-03-18 22:25:01,167-[cfp_fp][30000]Accuracy-Flip: 0.89786+-0.01152 Training: 2021-03-18 22:25:01,167-[cfp_fp][30000]Accuracy-Highest: 0.89786 Training: 2021-03-18 22:25:17,253-[agedb_30][30000]XNorm: 23.382117 Training: 2021-03-18 22:25:17,254-[agedb_30][30000]Accuracy-Flip: 0.93333+-0.00879 Training: 2021-03-18 22:25:17,254-[agedb_30][30000]Accuracy-Highest: 0.93650 Training: 2021-03-18 22:25:26,801-Speed 833.82 samples/sec Loss 9.1492 Epoch: 1 Global Step: 30050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:25:36,533-Speed 5261.33 samples/sec Loss 9.1870 Epoch: 1 Global Step: 30100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:25:46,466-Speed 5155.02 samples/sec Loss 9.1702 Epoch: 1 Global Step: 30150 Fp16 Grad Scale: 16384 Required: 19 
hours Training: 2021-03-18 22:25:56,760-Speed 4973.71 samples/sec Loss 9.1019 Epoch: 1 Global Step: 30200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:26:06,883-Speed 5058.20 samples/sec Loss 9.1936 Epoch: 1 Global Step: 30250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:26:16,837-Speed 5144.01 samples/sec Loss 9.1783 Epoch: 1 Global Step: 30300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:26:26,654-Speed 5215.50 samples/sec Loss 9.1651 Epoch: 1 Global Step: 30350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:26:36,850-Speed 5022.00 samples/sec Loss 9.1350 Epoch: 1 Global Step: 30400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:26:46,568-Speed 5268.68 samples/sec Loss 9.1488 Epoch: 1 Global Step: 30450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:26:56,713-Speed 5047.28 samples/sec Loss 9.1741 Epoch: 1 Global Step: 30500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:27:06,572-Speed 5193.58 samples/sec Loss 9.1073 Epoch: 1 Global Step: 30550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:27:16,506-Speed 5154.45 samples/sec Loss 9.1355 Epoch: 1 Global Step: 30600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:27:26,309-Speed 5223.15 samples/sec Loss 9.1279 Epoch: 1 Global Step: 30650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:27:35,989-Speed 5289.50 samples/sec Loss 9.1392 Epoch: 1 Global Step: 30700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:27:46,067-Speed 5081.11 samples/sec Loss 9.1694 Epoch: 1 Global Step: 30750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:27:55,974-Speed 5168.46 samples/sec Loss 9.2038 Epoch: 1 Global Step: 30800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:28:05,690-Speed 5269.71 samples/sec Loss 9.1231 Epoch: 1 Global Step: 30850 Fp16 Grad Scale: 16384 Required: 19 hours 
Training: 2021-03-18 22:28:15,506-Speed 5215.96 samples/sec Loss 9.1559 Epoch: 1 Global Step: 30900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:28:25,461-Speed 5143.68 samples/sec Loss 9.1141 Epoch: 1 Global Step: 30950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:28:35,393-Speed 5155.50 samples/sec Loss 9.1079 Epoch: 1 Global Step: 31000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:28:45,675-Speed 4979.57 samples/sec Loss 9.1660 Epoch: 1 Global Step: 31050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:28:55,870-Speed 5022.55 samples/sec Loss 9.0913 Epoch: 1 Global Step: 31100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:29:05,842-Speed 5134.69 samples/sec Loss 9.0669 Epoch: 1 Global Step: 31150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:29:15,756-Speed 5164.89 samples/sec Loss 9.1173 Epoch: 1 Global Step: 31200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:29:25,604-Speed 5199.35 samples/sec Loss 9.1381 Epoch: 1 Global Step: 31250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:29:35,855-Speed 4994.59 samples/sec Loss 9.1541 Epoch: 1 Global Step: 31300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:29:46,795-Speed 4680.41 samples/sec Loss 9.1723 Epoch: 1 Global Step: 31350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:29:56,783-Speed 5126.30 samples/sec Loss 9.1001 Epoch: 1 Global Step: 31400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:30:06,560-Speed 5237.36 samples/sec Loss 9.1252 Epoch: 1 Global Step: 31450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:30:16,635-Speed 5082.07 samples/sec Loss 9.1083 Epoch: 1 Global Step: 31500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:30:26,715-Speed 5079.77 samples/sec Loss 9.1335 Epoch: 1 Global Step: 31550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 
2021-03-18 22:30:36,862-Speed 5046.48 samples/sec Loss 9.1196 Epoch: 1 Global Step: 31600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:30:46,936-Speed 5082.51 samples/sec Loss 9.0854 Epoch: 1 Global Step: 31650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:30:57,662-Speed 4773.77 samples/sec Loss 9.0742 Epoch: 1 Global Step: 31700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:31:07,581-Speed 5162.40 samples/sec Loss 9.1318 Epoch: 1 Global Step: 31750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:31:18,508-Speed 4685.72 samples/sec Loss 9.0977 Epoch: 1 Global Step: 31800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:31:28,426-Speed 5162.99 samples/sec Loss 9.1478 Epoch: 1 Global Step: 31850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:31:39,179-Speed 4761.86 samples/sec Loss 9.1439 Epoch: 1 Global Step: 31900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:31:49,695-Speed 4868.60 samples/sec Loss 9.1395 Epoch: 1 Global Step: 31950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:31:59,812-Speed 5061.40 samples/sec Loss 9.0701 Epoch: 1 Global Step: 32000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:32:16,465-[lfw][32000]XNorm: 22.389624 Training: 2021-03-18 22:32:16,465-[lfw][32000]Accuracy-Flip: 0.99250+-0.00410 Training: 2021-03-18 22:32:16,465-[lfw][32000]Accuracy-Highest: 0.99333 Training: 2021-03-18 22:32:35,136-[cfp_fp][32000]XNorm: 18.021461 Training: 2021-03-18 22:32:35,136-[cfp_fp][32000]Accuracy-Flip: 0.88314+-0.01179 Training: 2021-03-18 22:32:35,136-[cfp_fp][32000]Accuracy-Highest: 0.89786 Training: 2021-03-18 22:32:51,214-[agedb_30][32000]XNorm: 21.435182 Training: 2021-03-18 22:32:51,215-[agedb_30][32000]Accuracy-Flip: 0.93300+-0.01227 Training: 2021-03-18 22:32:51,215-[agedb_30][32000]Accuracy-Highest: 0.93650 Training: 2021-03-18 22:33:02,291-Speed 819.48 samples/sec Loss 
9.1349 Epoch: 1 Global Step: 32050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:33:13,061-Speed 4754.36 samples/sec Loss 9.1245 Epoch: 1 Global Step: 32100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:33:22,934-Speed 5186.21 samples/sec Loss 9.1248 Epoch: 1 Global Step: 32150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:33:32,796-Speed 5191.95 samples/sec Loss 9.0542 Epoch: 1 Global Step: 32200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:33:42,587-Speed 5229.64 samples/sec Loss 9.1218 Epoch: 1 Global Step: 32250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:33:52,688-Speed 5069.38 samples/sec Loss 9.0877 Epoch: 1 Global Step: 32300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:34:02,914-Speed 5006.77 samples/sec Loss 9.0614 Epoch: 1 Global Step: 32350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:34:13,278-Speed 4940.46 samples/sec Loss 9.1279 Epoch: 1 Global Step: 32400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:34:23,414-Speed 5051.63 samples/sec Loss 9.1055 Epoch: 1 Global Step: 32450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:34:33,336-Speed 5160.82 samples/sec Loss 9.1618 Epoch: 1 Global Step: 32500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:34:43,612-Speed 4982.80 samples/sec Loss 9.1293 Epoch: 1 Global Step: 32550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:34:53,630-Speed 5111.07 samples/sec Loss 8.9976 Epoch: 1 Global Step: 32600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:35:03,722-Speed 5073.39 samples/sec Loss 9.0613 Epoch: 1 Global Step: 32650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:35:13,624-Speed 5170.94 samples/sec Loss 9.1315 Epoch: 1 Global Step: 32700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:35:23,774-Speed 5044.91 samples/sec Loss 9.0931 
Epoch: 1 Global Step: 32750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:35:34,701-Speed 4686.00 samples/sec Loss 9.0753 Epoch: 1 Global Step: 32800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:35:44,771-Speed 5084.62 samples/sec Loss 9.0731 Epoch: 1 Global Step: 32850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:35:54,997-Speed 5006.71 samples/sec Loss 9.0463 Epoch: 1 Global Step: 32900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:36:05,134-Speed 5051.39 samples/sec Loss 9.0362 Epoch: 1 Global Step: 32950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:36:14,769-Speed 5314.46 samples/sec Loss 9.0371 Epoch: 1 Global Step: 33000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:36:24,889-Speed 5059.35 samples/sec Loss 9.0254 Epoch: 1 Global Step: 33050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:36:34,910-Speed 5109.72 samples/sec Loss 9.0047 Epoch: 1 Global Step: 33100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:36:45,316-Speed 4920.63 samples/sec Loss 9.0320 Epoch: 1 Global Step: 33150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:36:55,438-Speed 5058.36 samples/sec Loss 9.1225 Epoch: 1 Global Step: 33200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:37:05,310-Speed 5187.03 samples/sec Loss 9.0860 Epoch: 1 Global Step: 33250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:37:15,367-Speed 5090.97 samples/sec Loss 9.0897 Epoch: 1 Global Step: 33300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:37:25,574-Speed 5016.75 samples/sec Loss 9.0536 Epoch: 1 Global Step: 33350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:37:48,633-Speed 2220.43 samples/sec Loss 8.7573 Epoch: 2 Global Step: 33400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:37:59,732-Speed 4613.07 samples/sec Loss 8.2417 Epoch: 2 
Global Step: 33450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:38:10,342-Speed 4826.37 samples/sec Loss 8.3054 Epoch: 2 Global Step: 33500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:38:20,694-Speed 4946.12 samples/sec Loss 8.3365 Epoch: 2 Global Step: 33550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:38:31,414-Speed 4776.31 samples/sec Loss 8.3653 Epoch: 2 Global Step: 33600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:38:41,977-Speed 4847.44 samples/sec Loss 8.4078 Epoch: 2 Global Step: 33650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:38:52,426-Speed 4900.65 samples/sec Loss 8.3360 Epoch: 2 Global Step: 33700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:39:02,865-Speed 4904.93 samples/sec Loss 8.3592 Epoch: 2 Global Step: 33750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:39:13,423-Speed 4849.80 samples/sec Loss 8.3856 Epoch: 2 Global Step: 33800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:39:23,696-Speed 4983.99 samples/sec Loss 8.4409 Epoch: 2 Global Step: 33850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:39:34,197-Speed 4876.03 samples/sec Loss 8.4895 Epoch: 2 Global Step: 33900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:39:44,489-Speed 4974.97 samples/sec Loss 8.4769 Epoch: 2 Global Step: 33950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:39:55,087-Speed 4831.43 samples/sec Loss 8.4674 Epoch: 2 Global Step: 34000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:40:12,217-[lfw][34000]XNorm: 21.837708 Training: 2021-03-18 22:40:12,217-[lfw][34000]Accuracy-Flip: 0.99150+-0.00524 Training: 2021-03-18 22:40:12,218-[lfw][34000]Accuracy-Highest: 0.99333 Training: 2021-03-18 22:40:31,221-[cfp_fp][34000]XNorm: 18.026195 Training: 2021-03-18 22:40:31,221-[cfp_fp][34000]Accuracy-Flip: 0.89500+-0.01452 Training: 
2021-03-18 22:40:31,222-[cfp_fp][34000]Accuracy-Highest: 0.89786 Training: 2021-03-18 22:40:47,577-[agedb_30][34000]XNorm: 21.185281 Training: 2021-03-18 22:40:47,577-[agedb_30][34000]Accuracy-Flip: 0.92900+-0.01566 Training: 2021-03-18 22:40:47,577-[agedb_30][34000]Accuracy-Highest: 0.93650 Training: 2021-03-18 22:40:57,444-Speed 821.09 samples/sec Loss 8.5388 Epoch: 2 Global Step: 34050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:41:07,918-Speed 4888.76 samples/sec Loss 8.5051 Epoch: 2 Global Step: 34100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:41:18,180-Speed 4989.69 samples/sec Loss 8.4694 Epoch: 2 Global Step: 34150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:41:28,606-Speed 4910.84 samples/sec Loss 8.5420 Epoch: 2 Global Step: 34200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:41:38,805-Speed 5020.50 samples/sec Loss 8.5471 Epoch: 2 Global Step: 34250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:41:49,275-Speed 4890.71 samples/sec Loss 8.5634 Epoch: 2 Global Step: 34300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:42:00,059-Speed 4748.02 samples/sec Loss 8.5525 Epoch: 2 Global Step: 34350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:42:10,347-Speed 4977.10 samples/sec Loss 8.5779 Epoch: 2 Global Step: 34400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:42:20,859-Speed 4870.94 samples/sec Loss 8.6402 Epoch: 2 Global Step: 34450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:42:32,095-Speed 4556.94 samples/sec Loss 8.6407 Epoch: 2 Global Step: 34500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:42:42,519-Speed 4912.15 samples/sec Loss 8.6228 Epoch: 2 Global Step: 34550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:42:52,853-Speed 4954.85 samples/sec Loss 8.6501 Epoch: 2 Global Step: 34600 Fp16 Grad Scale: 16384 Required: 19 hours 
Training: 2021-03-18 22:43:03,354-Speed 4876.11 samples/sec Loss 8.6201 Epoch: 2 Global Step: 34650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:43:13,845-Speed 4880.63 samples/sec Loss 8.5995 Epoch: 2 Global Step: 34700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:43:24,481-Speed 4814.07 samples/sec Loss 8.7048 Epoch: 2 Global Step: 34750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:43:34,887-Speed 4920.93 samples/sec Loss 8.6777 Epoch: 2 Global Step: 34800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:43:45,509-Speed 4820.27 samples/sec Loss 8.6574 Epoch: 2 Global Step: 34850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:43:56,708-Speed 4572.28 samples/sec Loss 8.6962 Epoch: 2 Global Step: 34900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:44:06,933-Speed 5007.56 samples/sec Loss 8.6488 Epoch: 2 Global Step: 34950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:44:18,028-Speed 4614.87 samples/sec Loss 8.7220 Epoch: 2 Global Step: 35000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:44:28,465-Speed 4905.84 samples/sec Loss 8.6762 Epoch: 2 Global Step: 35050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:44:39,419-Speed 4674.43 samples/sec Loss 8.7192 Epoch: 2 Global Step: 35100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:44:50,819-Speed 4491.94 samples/sec Loss 8.7402 Epoch: 2 Global Step: 35150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:45:01,085-Speed 4987.22 samples/sec Loss 8.6942 Epoch: 2 Global Step: 35200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:45:13,064-Speed 4274.40 samples/sec Loss 8.6363 Epoch: 2 Global Step: 35250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:45:24,185-Speed 4604.47 samples/sec Loss 8.6942 Epoch: 2 Global Step: 35300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 
2021-03-18 22:45:35,000-Speed 4734.19 samples/sec Loss 8.7258 Epoch: 2 Global Step: 35350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:45:45,280-Speed 4981.14 samples/sec Loss 8.7594 Epoch: 2 Global Step: 35400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:45:55,819-Speed 4858.17 samples/sec Loss 8.7321 Epoch: 2 Global Step: 35450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:46:05,916-Speed 5071.06 samples/sec Loss 8.7951 Epoch: 2 Global Step: 35500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:46:16,596-Speed 4794.31 samples/sec Loss 8.8412 Epoch: 2 Global Step: 35550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:46:26,868-Speed 4984.84 samples/sec Loss 8.8089 Epoch: 2 Global Step: 35600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:46:37,195-Speed 4958.45 samples/sec Loss 8.7530 Epoch: 2 Global Step: 35650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:46:47,398-Speed 5018.40 samples/sec Loss 8.8385 Epoch: 2 Global Step: 35700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:46:57,660-Speed 4989.45 samples/sec Loss 8.7291 Epoch: 2 Global Step: 35750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:47:07,953-Speed 4974.83 samples/sec Loss 8.7545 Epoch: 2 Global Step: 35800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:47:18,293-Speed 4951.58 samples/sec Loss 8.7569 Epoch: 2 Global Step: 35850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:47:28,583-Speed 4976.64 samples/sec Loss 8.8039 Epoch: 2 Global Step: 35900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:47:38,368-Speed 5233.14 samples/sec Loss 8.7699 Epoch: 2 Global Step: 35950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:47:49,685-Speed 4524.31 samples/sec Loss 8.8026 Epoch: 2 Global Step: 36000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 
22:48:06,221-[lfw][36000]XNorm: 23.048993 Training: 2021-03-18 22:48:06,222-[lfw][36000]Accuracy-Flip: 0.99267+-0.00327 Training: 2021-03-18 22:48:06,222-[lfw][36000]Accuracy-Highest: 0.99333 Training: 2021-03-18 22:48:25,201-[cfp_fp][36000]XNorm: 18.494992 Training: 2021-03-18 22:48:25,201-[cfp_fp][36000]Accuracy-Flip: 0.88729+-0.01296 Training: 2021-03-18 22:48:25,201-[cfp_fp][36000]Accuracy-Highest: 0.89786 Training: 2021-03-18 22:48:41,513-[agedb_30][36000]XNorm: 21.643645 Training: 2021-03-18 22:48:41,514-[agedb_30][36000]Accuracy-Flip: 0.93500+-0.01511 Training: 2021-03-18 22:48:41,514-[agedb_30][36000]Accuracy-Highest: 0.93650 Training: 2021-03-18 22:48:51,533-Speed 827.86 samples/sec Loss 8.7739 Epoch: 2 Global Step: 36050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:49:02,042-Speed 4872.56 samples/sec Loss 8.8216 Epoch: 2 Global Step: 36100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:49:12,292-Speed 4995.32 samples/sec Loss 8.7869 Epoch: 2 Global Step: 36150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:49:22,555-Speed 4988.90 samples/sec Loss 8.8568 Epoch: 2 Global Step: 36200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:49:32,741-Speed 5026.81 samples/sec Loss 8.8487 Epoch: 2 Global Step: 36250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:49:42,911-Speed 5035.07 samples/sec Loss 8.8677 Epoch: 2 Global Step: 36300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:49:53,256-Speed 4949.26 samples/sec Loss 8.8388 Epoch: 2 Global Step: 36350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:50:03,580-Speed 4959.61 samples/sec Loss 8.8154 Epoch: 2 Global Step: 36400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:50:13,811-Speed 5004.96 samples/sec Loss 8.8305 Epoch: 2 Global Step: 36450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:50:23,949-Speed 5050.85 samples/sec Loss 8.8356 Epoch: 2 
Global Step: 36500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:50:34,183-Speed 5003.33 samples/sec Loss 8.7968 Epoch: 2 Global Step: 36550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:50:44,382-Speed 5020.07 samples/sec Loss 8.8155 Epoch: 2 Global Step: 36600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:50:54,819-Speed 4906.09 samples/sec Loss 8.8944 Epoch: 2 Global Step: 36650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:51:05,271-Speed 4899.02 samples/sec Loss 8.8430 Epoch: 2 Global Step: 36700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:51:15,893-Speed 4820.47 samples/sec Loss 8.8848 Epoch: 2 Global Step: 36750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:51:26,049-Speed 5041.64 samples/sec Loss 8.8607 Epoch: 2 Global Step: 36800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:51:36,663-Speed 4823.94 samples/sec Loss 8.8379 Epoch: 2 Global Step: 36850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:51:46,860-Speed 5021.47 samples/sec Loss 8.8939 Epoch: 2 Global Step: 36900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:51:57,095-Speed 5002.78 samples/sec Loss 8.8620 Epoch: 2 Global Step: 36950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:52:07,380-Speed 4978.33 samples/sec Loss 8.8135 Epoch: 2 Global Step: 37000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:52:17,767-Speed 4929.68 samples/sec Loss 8.8443 Epoch: 2 Global Step: 37050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:52:28,030-Speed 4989.07 samples/sec Loss 8.8594 Epoch: 2 Global Step: 37100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:52:38,409-Speed 4933.57 samples/sec Loss 8.8636 Epoch: 2 Global Step: 37150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:52:48,498-Speed 5075.06 samples/sec Loss 8.9499 Epoch: 2 Global 
Step: 37200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:52:59,154-Speed 4805.35 samples/sec Loss 8.8554 Epoch: 2 Global Step: 37250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:53:09,509-Speed 4944.85 samples/sec Loss 8.8646 Epoch: 2 Global Step: 37300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:53:19,750-Speed 5000.15 samples/sec Loss 8.8332 Epoch: 2 Global Step: 37350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:53:30,098-Speed 4948.08 samples/sec Loss 8.8608 Epoch: 2 Global Step: 37400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:53:40,454-Speed 4944.08 samples/sec Loss 8.8729 Epoch: 2 Global Step: 37450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:53:50,690-Speed 5002.22 samples/sec Loss 8.9470 Epoch: 2 Global Step: 37500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:54:01,179-Speed 4881.94 samples/sec Loss 8.8880 Epoch: 2 Global Step: 37550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:54:11,477-Speed 4972.17 samples/sec Loss 8.8559 Epoch: 2 Global Step: 37600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:54:21,889-Speed 4917.79 samples/sec Loss 8.8483 Epoch: 2 Global Step: 37650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:54:33,136-Speed 4552.70 samples/sec Loss 8.8455 Epoch: 2 Global Step: 37700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:54:43,562-Speed 4911.10 samples/sec Loss 8.8695 Epoch: 2 Global Step: 37750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:54:53,884-Speed 4960.71 samples/sec Loss 8.9062 Epoch: 2 Global Step: 37800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:55:04,331-Speed 4901.34 samples/sec Loss 8.8235 Epoch: 2 Global Step: 37850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:55:14,339-Speed 5115.95 samples/sec Loss 8.9172 Epoch: 2 Global Step: 37900 
Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:55:24,277-Speed 5152.41 samples/sec Loss 8.8768 Epoch: 2 Global Step: 37950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:55:35,297-Speed 4646.64 samples/sec Loss 8.8779 Epoch: 2 Global Step: 38000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:55:52,172-[lfw][38000]XNorm: 22.747334 Training: 2021-03-18 22:55:52,172-[lfw][38000]Accuracy-Flip: 0.99367+-0.00348 Training: 2021-03-18 22:55:52,172-[lfw][38000]Accuracy-Highest: 0.99367 Training: 2021-03-18 22:56:10,944-[cfp_fp][38000]XNorm: 18.470395 Training: 2021-03-18 22:56:10,944-[cfp_fp][38000]Accuracy-Flip: 0.89700+-0.01612 Training: 2021-03-18 22:56:10,944-[cfp_fp][38000]Accuracy-Highest: 0.89786 Training: 2021-03-18 22:56:27,330-[agedb_30][38000]XNorm: 21.221100 Training: 2021-03-18 22:56:27,331-[agedb_30][38000]Accuracy-Flip: 0.93467+-0.01308 Training: 2021-03-18 22:56:27,331-[agedb_30][38000]Accuracy-Highest: 0.93650 Training: 2021-03-18 22:56:37,049-Speed 829.13 samples/sec Loss 8.8666 Epoch: 2 Global Step: 38050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:56:47,306-Speed 4992.33 samples/sec Loss 8.9405 Epoch: 2 Global Step: 38100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:56:57,514-Speed 5016.09 samples/sec Loss 8.8570 Epoch: 2 Global Step: 38150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:57:08,561-Speed 4634.89 samples/sec Loss 8.9363 Epoch: 2 Global Step: 38200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:57:18,887-Speed 4958.64 samples/sec Loss 8.8555 Epoch: 2 Global Step: 38250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:57:29,078-Speed 5024.44 samples/sec Loss 8.8823 Epoch: 2 Global Step: 38300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:57:40,722-Speed 4397.44 samples/sec Loss 8.9340 Epoch: 2 Global Step: 38350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 
2021-03-18 22:57:51,207-Speed 4883.56 samples/sec Loss 8.9358 Epoch: 2 Global Step: 38400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:58:03,100-Speed 4305.40 samples/sec Loss 8.9166 Epoch: 2 Global Step: 38450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:58:13,313-Speed 5013.39 samples/sec Loss 8.8857 Epoch: 2 Global Step: 38500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:58:23,467-Speed 5042.45 samples/sec Loss 8.8865 Epoch: 2 Global Step: 38550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:58:33,626-Speed 5040.29 samples/sec Loss 8.9098 Epoch: 2 Global Step: 38600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:58:43,861-Speed 5003.12 samples/sec Loss 8.8486 Epoch: 2 Global Step: 38650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:58:54,223-Speed 4941.54 samples/sec Loss 8.9879 Epoch: 2 Global Step: 38700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:59:04,559-Speed 4953.59 samples/sec Loss 8.9614 Epoch: 2 Global Step: 38750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:59:14,863-Speed 4969.17 samples/sec Loss 8.8810 Epoch: 2 Global Step: 38800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:59:24,952-Speed 5075.42 samples/sec Loss 8.9266 Epoch: 2 Global Step: 38850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:59:35,406-Speed 4897.88 samples/sec Loss 8.8568 Epoch: 2 Global Step: 38900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:59:45,815-Speed 4919.51 samples/sec Loss 8.8649 Epoch: 2 Global Step: 38950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 22:59:56,236-Speed 4913.12 samples/sec Loss 8.9228 Epoch: 2 Global Step: 39000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:00:06,531-Speed 4973.92 samples/sec Loss 8.9214 Epoch: 2 Global Step: 39050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 
23:00:17,195-Speed 4801.59 samples/sec Loss 8.9153 Epoch: 2 Global Step: 39100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:00:27,354-Speed 5040.10 samples/sec Loss 8.8692 Epoch: 2 Global Step: 39150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:00:38,573-Speed 4564.10 samples/sec Loss 8.9373 Epoch: 2 Global Step: 39200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:00:48,694-Speed 5059.05 samples/sec Loss 8.9641 Epoch: 2 Global Step: 39250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:00:59,169-Speed 4888.05 samples/sec Loss 8.9494 Epoch: 2 Global Step: 39300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:01:09,322-Speed 5043.29 samples/sec Loss 8.9434 Epoch: 2 Global Step: 39350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:01:19,325-Speed 5118.39 samples/sec Loss 8.9386 Epoch: 2 Global Step: 39400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:01:29,354-Speed 5105.49 samples/sec Loss 8.9440 Epoch: 2 Global Step: 39450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:01:39,589-Speed 5003.16 samples/sec Loss 8.9036 Epoch: 2 Global Step: 39500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:01:49,582-Speed 5123.55 samples/sec Loss 8.8948 Epoch: 2 Global Step: 39550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:01:59,655-Speed 5083.44 samples/sec Loss 8.8952 Epoch: 2 Global Step: 39600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:02:09,615-Speed 5140.79 samples/sec Loss 8.9216 Epoch: 2 Global Step: 39650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:02:19,659-Speed 5097.87 samples/sec Loss 8.8693 Epoch: 2 Global Step: 39700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:02:29,984-Speed 4959.16 samples/sec Loss 8.9149 Epoch: 2 Global Step: 39750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 
23:02:40,247-Speed 4989.37 samples/sec Loss 8.8604 Epoch: 2 Global Step: 39800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:02:50,578-Speed 4956.08 samples/sec Loss 8.9366 Epoch: 2 Global Step: 39850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:03:01,390-Speed 4735.94 samples/sec Loss 8.9044 Epoch: 2 Global Step: 39900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:03:11,392-Speed 5119.26 samples/sec Loss 8.9557 Epoch: 2 Global Step: 39950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:03:21,578-Speed 5026.88 samples/sec Loss 8.9127 Epoch: 2 Global Step: 40000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:03:38,455-[lfw][40000]XNorm: 23.776284 Training: 2021-03-18 23:03:38,455-[lfw][40000]Accuracy-Flip: 0.99333+-0.00435 Training: 2021-03-18 23:03:38,456-[lfw][40000]Accuracy-Highest: 0.99367 Training: 2021-03-18 23:03:59,300-[cfp_fp][40000]XNorm: 19.172226 Training: 2021-03-18 23:03:59,300-[cfp_fp][40000]Accuracy-Flip: 0.88614+-0.01252 Training: 2021-03-18 23:03:59,301-[cfp_fp][40000]Accuracy-Highest: 0.89786 Training: 2021-03-18 23:04:15,508-[agedb_30][40000]XNorm: 22.613576 Training: 2021-03-18 23:04:15,508-[agedb_30][40000]Accuracy-Flip: 0.93550+-0.01368 Training: 2021-03-18 23:04:15,508-[agedb_30][40000]Accuracy-Highest: 0.93650 Training: 2021-03-18 23:04:25,602-Speed 799.71 samples/sec Loss 8.9228 Epoch: 2 Global Step: 40050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:04:35,445-Speed 5202.03 samples/sec Loss 8.9686 Epoch: 2 Global Step: 40100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:04:45,666-Speed 5009.51 samples/sec Loss 8.9038 Epoch: 2 Global Step: 40150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:04:55,957-Speed 4975.56 samples/sec Loss 8.8891 Epoch: 2 Global Step: 40200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:05:06,070-Speed 5063.22 samples/sec Loss 8.9342 Epoch: 2 
Global Step: 40250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:05:16,370-Speed 4970.93 samples/sec Loss 8.8627 Epoch: 2 Global Step: 40300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:05:26,344-Speed 5134.05 samples/sec Loss 8.8876 Epoch: 2 Global Step: 40350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:05:36,899-Speed 4850.80 samples/sec Loss 8.9619 Epoch: 2 Global Step: 40400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:05:47,539-Speed 4812.50 samples/sec Loss 8.9168 Epoch: 2 Global Step: 40450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:05:57,794-Speed 4993.15 samples/sec Loss 8.9649 Epoch: 2 Global Step: 40500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:06:08,248-Speed 4897.88 samples/sec Loss 8.9151 Epoch: 2 Global Step: 40550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:06:18,744-Speed 4878.32 samples/sec Loss 8.9400 Epoch: 2 Global Step: 40600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:06:28,865-Speed 5058.87 samples/sec Loss 8.9620 Epoch: 2 Global Step: 40650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:06:39,060-Speed 5022.26 samples/sec Loss 8.9244 Epoch: 2 Global Step: 40700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:06:49,127-Speed 5086.27 samples/sec Loss 8.9571 Epoch: 2 Global Step: 40750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:06:59,321-Speed 5022.85 samples/sec Loss 8.8593 Epoch: 2 Global Step: 40800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:07:09,516-Speed 5022.28 samples/sec Loss 8.9218 Epoch: 2 Global Step: 40850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:07:20,407-Speed 4701.37 samples/sec Loss 8.9136 Epoch: 2 Global Step: 40900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:07:30,660-Speed 4994.06 samples/sec Loss 8.9364 Epoch: 2 Global 
Step: 40950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:07:40,774-Speed 5062.75 samples/sec Loss 8.9193 Epoch: 2 Global Step: 41000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:07:50,896-Speed 5058.74 samples/sec Loss 8.9004 Epoch: 2 Global Step: 41050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:08:01,028-Speed 5053.50 samples/sec Loss 8.9028 Epoch: 2 Global Step: 41100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:08:11,426-Speed 4924.45 samples/sec Loss 8.8768 Epoch: 2 Global Step: 41150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:08:21,866-Speed 4904.34 samples/sec Loss 8.9157 Epoch: 2 Global Step: 41200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:08:31,875-Speed 5115.91 samples/sec Loss 8.9378 Epoch: 2 Global Step: 41250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:08:42,784-Speed 4693.52 samples/sec Loss 8.9262 Epoch: 2 Global Step: 41300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:08:53,039-Speed 4992.88 samples/sec Loss 8.9144 Epoch: 2 Global Step: 41350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:09:04,202-Speed 4587.30 samples/sec Loss 8.9581 Epoch: 2 Global Step: 41400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:09:14,498-Speed 4973.04 samples/sec Loss 8.8940 Epoch: 2 Global Step: 41450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:09:24,805-Speed 4967.59 samples/sec Loss 8.9078 Epoch: 2 Global Step: 41500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:09:35,034-Speed 5005.56 samples/sec Loss 8.9414 Epoch: 2 Global Step: 41550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:09:45,796-Speed 4757.95 samples/sec Loss 8.9256 Epoch: 2 Global Step: 41600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:09:57,469-Speed 4386.35 samples/sec Loss 8.9230 Epoch: 2 Global Step: 41650 
Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:10:08,593-Speed 4603.14 samples/sec Loss 8.9546 Epoch: 2 Global Step: 41700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:10:18,806-Speed 5014.07 samples/sec Loss 8.9292 Epoch: 2 Global Step: 41750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:10:28,901-Speed 5072.15 samples/sec Loss 8.8989 Epoch: 2 Global Step: 41800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:10:39,080-Speed 5030.46 samples/sec Loss 8.8478 Epoch: 2 Global Step: 41850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:10:49,378-Speed 4972.08 samples/sec Loss 8.9639 Epoch: 2 Global Step: 41900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:10:59,806-Speed 4910.21 samples/sec Loss 8.9425 Epoch: 2 Global Step: 41950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:11:10,677-Speed 4709.92 samples/sec Loss 8.9119 Epoch: 2 Global Step: 42000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:11:27,686-[lfw][42000]XNorm: 23.710975 Training: 2021-03-18 23:11:27,687-[lfw][42000]Accuracy-Flip: 0.99350+-0.00273 Training: 2021-03-18 23:11:27,687-[lfw][42000]Accuracy-Highest: 0.99367 Training: 2021-03-18 23:11:46,635-[cfp_fp][42000]XNorm: 18.679264 Training: 2021-03-18 23:11:46,635-[cfp_fp][42000]Accuracy-Flip: 0.88000+-0.01337 Training: 2021-03-18 23:11:46,635-[cfp_fp][42000]Accuracy-Highest: 0.89786 Training: 2021-03-18 23:12:03,032-[agedb_30][42000]XNorm: 22.334844 Training: 2021-03-18 23:12:03,033-[agedb_30][42000]Accuracy-Flip: 0.93067+-0.00886 Training: 2021-03-18 23:12:03,033-[agedb_30][42000]Accuracy-Highest: 0.93650 Training: 2021-03-18 23:12:13,065-Speed 820.69 samples/sec Loss 8.9337 Epoch: 2 Global Step: 42050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:12:23,235-Speed 5034.81 samples/sec Loss 8.9429 Epoch: 2 Global Step: 42100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 
2021-03-18 23:12:33,343-Speed 5065.42 samples/sec Loss 8.9832 Epoch: 2 Global Step: 42150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:12:43,512-Speed 5035.38 samples/sec Loss 8.9086 Epoch: 2 Global Step: 42200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:12:54,002-Speed 4880.94 samples/sec Loss 8.9057 Epoch: 2 Global Step: 42250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:13:04,071-Speed 5085.39 samples/sec Loss 8.9017 Epoch: 2 Global Step: 42300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:13:14,189-Speed 5061.02 samples/sec Loss 8.8568 Epoch: 2 Global Step: 42350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:13:24,459-Speed 4985.42 samples/sec Loss 8.9978 Epoch: 2 Global Step: 42400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:13:35,438-Speed 4663.87 samples/sec Loss 8.8922 Epoch: 2 Global Step: 42450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:13:45,630-Speed 5024.14 samples/sec Loss 8.8277 Epoch: 2 Global Step: 42500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:13:56,109-Speed 4886.30 samples/sec Loss 8.8723 Epoch: 2 Global Step: 42550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:14:06,373-Speed 4988.45 samples/sec Loss 8.8977 Epoch: 2 Global Step: 42600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:14:16,464-Speed 5074.17 samples/sec Loss 8.9660 Epoch: 2 Global Step: 42650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:14:26,833-Speed 4937.82 samples/sec Loss 8.9258 Epoch: 2 Global Step: 42700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:14:36,843-Speed 5115.19 samples/sec Loss 8.8446 Epoch: 2 Global Step: 42750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:14:46,980-Speed 5051.28 samples/sec Loss 8.8859 Epoch: 2 Global Step: 42800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 
23:14:57,258-Speed 4981.99 samples/sec Loss 8.9079 Epoch: 2 Global Step: 42850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:15:07,303-Speed 5097.32 samples/sec Loss 8.9083 Epoch: 2 Global Step: 42900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-18 23:15:17,434-Speed 5054.26 samples/sec Loss 8.9305 Epoch: 2 Global Step: 42950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:15:27,439-Speed 5117.64 samples/sec Loss 8.9220 Epoch: 2 Global Step: 43000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:15:37,707-Speed 4986.83 samples/sec Loss 8.9268 Epoch: 2 Global Step: 43050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:15:47,885-Speed 5030.68 samples/sec Loss 8.8954 Epoch: 2 Global Step: 43100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:15:58,016-Speed 5053.93 samples/sec Loss 8.9196 Epoch: 2 Global Step: 43150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:16:08,168-Speed 5043.83 samples/sec Loss 8.9040 Epoch: 2 Global Step: 43200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:16:18,299-Speed 5053.78 samples/sec Loss 8.9475 Epoch: 2 Global Step: 43250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:16:28,485-Speed 5027.21 samples/sec Loss 8.8446 Epoch: 2 Global Step: 43300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:16:38,470-Speed 5127.88 samples/sec Loss 8.9500 Epoch: 2 Global Step: 43350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:16:48,882-Speed 4917.82 samples/sec Loss 8.9449 Epoch: 2 Global Step: 43400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:16:58,945-Speed 5087.80 samples/sec Loss 8.8946 Epoch: 2 Global Step: 43450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:17:09,310-Speed 4940.10 samples/sec Loss 8.8962 Epoch: 2 Global Step: 43500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 
23:17:19,741-Speed 4908.73 samples/sec Loss 8.9141 Epoch: 2 Global Step: 43550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:17:30,046-Speed 4968.69 samples/sec Loss 8.9248 Epoch: 2 Global Step: 43600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:17:40,350-Speed 4969.55 samples/sec Loss 8.9411 Epoch: 2 Global Step: 43650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:17:51,026-Speed 4796.09 samples/sec Loss 8.9430 Epoch: 2 Global Step: 43700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:18:01,344-Speed 4962.12 samples/sec Loss 8.8642 Epoch: 2 Global Step: 43750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:18:11,846-Speed 4876.06 samples/sec Loss 8.8869 Epoch: 2 Global Step: 43800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:18:21,962-Speed 5061.15 samples/sec Loss 8.9219 Epoch: 2 Global Step: 43850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:18:32,169-Speed 5016.64 samples/sec Loss 8.9312 Epoch: 2 Global Step: 43900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:18:42,274-Speed 5067.20 samples/sec Loss 8.8929 Epoch: 2 Global Step: 43950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:18:52,457-Speed 5028.06 samples/sec Loss 8.8536 Epoch: 2 Global Step: 44000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:19:09,224-[lfw][44000]XNorm: 22.331200 Training: 2021-03-18 23:19:09,224-[lfw][44000]Accuracy-Flip: 0.99317+-0.00456 Training: 2021-03-18 23:19:09,224-[lfw][44000]Accuracy-Highest: 0.99367 Training: 2021-03-18 23:19:28,237-[cfp_fp][44000]XNorm: 17.896400 Training: 2021-03-18 23:19:28,237-[cfp_fp][44000]Accuracy-Flip: 0.88400+-0.01983 Training: 2021-03-18 23:19:28,238-[cfp_fp][44000]Accuracy-Highest: 0.89786 Training: 2021-03-18 23:19:44,516-[agedb_30][44000]XNorm: 21.129000 Training: 2021-03-18 23:19:44,517-[agedb_30][44000]Accuracy-Flip: 0.92800+-0.01251 Training: 
2021-03-18 23:19:44,517-[agedb_30][44000]Accuracy-Highest: 0.93650 Training: 2021-03-18 23:19:54,604-Speed 823.87 samples/sec Loss 8.8760 Epoch: 2 Global Step: 44050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:20:04,658-Speed 5092.83 samples/sec Loss 8.8804 Epoch: 2 Global Step: 44100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:20:16,140-Speed 4459.37 samples/sec Loss 8.9200 Epoch: 2 Global Step: 44150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:20:26,351-Speed 5014.96 samples/sec Loss 8.8594 Epoch: 2 Global Step: 44200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:20:36,475-Speed 5057.35 samples/sec Loss 8.8647 Epoch: 2 Global Step: 44250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:20:46,858-Speed 4931.81 samples/sec Loss 8.9538 Epoch: 2 Global Step: 44300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:20:57,358-Speed 4876.21 samples/sec Loss 8.9427 Epoch: 2 Global Step: 44350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:21:07,807-Speed 4900.67 samples/sec Loss 8.8997 Epoch: 2 Global Step: 44400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:21:19,472-Speed 4389.35 samples/sec Loss 8.9301 Epoch: 2 Global Step: 44450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:21:29,566-Speed 5072.78 samples/sec Loss 8.8919 Epoch: 2 Global Step: 44500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:21:40,073-Speed 4873.37 samples/sec Loss 8.9008 Epoch: 2 Global Step: 44550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:21:50,509-Speed 4906.27 samples/sec Loss 8.8786 Epoch: 2 Global Step: 44600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:22:00,676-Speed 5036.13 samples/sec Loss 8.8825 Epoch: 2 Global Step: 44650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:22:10,888-Speed 5013.95 samples/sec Loss 8.9357 Epoch: 2 Global 
Step: 44700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:22:21,310-Speed 4913.09 samples/sec Loss 8.8788 Epoch: 2 Global Step: 44750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:22:32,195-Speed 4704.18 samples/sec Loss 8.8248 Epoch: 2 Global Step: 44800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:22:44,194-Speed 4267.23 samples/sec Loss 8.9249 Epoch: 2 Global Step: 44850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:22:55,238-Speed 4636.21 samples/sec Loss 8.8644 Epoch: 2 Global Step: 44900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:23:05,678-Speed 4904.60 samples/sec Loss 8.9064 Epoch: 2 Global Step: 44950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:23:16,271-Speed 4833.49 samples/sec Loss 8.9358 Epoch: 2 Global Step: 45000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:23:26,972-Speed 4785.12 samples/sec Loss 8.8496 Epoch: 2 Global Step: 45050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:23:37,502-Speed 4862.73 samples/sec Loss 8.8463 Epoch: 2 Global Step: 45100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:23:47,588-Speed 5076.56 samples/sec Loss 8.8054 Epoch: 2 Global Step: 45150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:23:57,729-Speed 5049.14 samples/sec Loss 8.9323 Epoch: 2 Global Step: 45200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:24:08,181-Speed 4898.89 samples/sec Loss 8.9263 Epoch: 2 Global Step: 45250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:24:18,632-Speed 4899.69 samples/sec Loss 8.8537 Epoch: 2 Global Step: 45300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:24:28,737-Speed 5067.09 samples/sec Loss 8.8877 Epoch: 2 Global Step: 45350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:24:39,175-Speed 4905.40 samples/sec Loss 8.9344 Epoch: 2 Global Step: 45400 
Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:24:49,267-Speed 5073.49 samples/sec Loss 8.8824 Epoch: 2 Global Step: 45450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:24:59,311-Speed 5097.86 samples/sec Loss 8.8720 Epoch: 2 Global Step: 45500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:25:09,626-Speed 4963.95 samples/sec Loss 8.9259 Epoch: 2 Global Step: 45550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:25:19,813-Speed 5026.41 samples/sec Loss 8.9118 Epoch: 2 Global Step: 45600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:25:30,150-Speed 4953.40 samples/sec Loss 8.8644 Epoch: 2 Global Step: 45650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:25:41,225-Speed 4623.62 samples/sec Loss 8.8962 Epoch: 2 Global Step: 45700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:25:51,689-Speed 4893.27 samples/sec Loss 8.8382 Epoch: 2 Global Step: 45750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:26:01,944-Speed 4992.66 samples/sec Loss 8.8424 Epoch: 2 Global Step: 45800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:26:12,294-Speed 4947.22 samples/sec Loss 8.8762 Epoch: 2 Global Step: 45850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:26:22,422-Speed 5055.62 samples/sec Loss 8.8818 Epoch: 2 Global Step: 45900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:26:32,780-Speed 4943.32 samples/sec Loss 8.9527 Epoch: 2 Global Step: 45950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:26:43,189-Speed 4919.36 samples/sec Loss 8.9176 Epoch: 2 Global Step: 46000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:26:59,904-[lfw][46000]XNorm: 23.485040 Training: 2021-03-18 23:26:59,904-[lfw][46000]Accuracy-Flip: 0.99133+-0.00521 Training: 2021-03-18 23:26:59,904-[lfw][46000]Accuracy-Highest: 0.99367 Training: 2021-03-18 
23:27:18,569-[cfp_fp][46000]XNorm: 18.815770 Training: 2021-03-18 23:27:18,569-[cfp_fp][46000]Accuracy-Flip: 0.89100+-0.01583 Training: 2021-03-18 23:27:18,569-[cfp_fp][46000]Accuracy-Highest: 0.89786 Training: 2021-03-18 23:27:34,558-[agedb_30][46000]XNorm: 21.973313 Training: 2021-03-18 23:27:34,558-[agedb_30][46000]Accuracy-Flip: 0.93350+-0.01669 Training: 2021-03-18 23:27:34,559-[agedb_30][46000]Accuracy-Highest: 0.93650 Training: 2021-03-18 23:27:44,340-Speed 837.28 samples/sec Loss 8.8467 Epoch: 2 Global Step: 46050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:27:54,362-Speed 5109.16 samples/sec Loss 8.8560 Epoch: 2 Global Step: 46100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:28:04,943-Speed 4839.24 samples/sec Loss 8.8624 Epoch: 2 Global Step: 46150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:28:15,069-Speed 5056.90 samples/sec Loss 8.9292 Epoch: 2 Global Step: 46200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:28:25,779-Speed 4780.77 samples/sec Loss 8.8853 Epoch: 2 Global Step: 46250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:28:36,597-Speed 4733.10 samples/sec Loss 8.8902 Epoch: 2 Global Step: 46300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:28:46,795-Speed 5020.80 samples/sec Loss 8.8755 Epoch: 2 Global Step: 46350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:28:57,072-Speed 4982.47 samples/sec Loss 8.8119 Epoch: 2 Global Step: 46400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:29:07,485-Speed 4917.08 samples/sec Loss 8.8715 Epoch: 2 Global Step: 46450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:29:17,766-Speed 4980.67 samples/sec Loss 8.8513 Epoch: 2 Global Step: 46500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:29:28,196-Speed 4908.80 samples/sec Loss 8.8940 Epoch: 2 Global Step: 46550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 
2021-03-18 23:29:38,388-Speed 5024.03 samples/sec Loss 8.8336 Epoch: 2 Global Step: 46600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:29:48,726-Speed 4953.17 samples/sec Loss 8.8787 Epoch: 2 Global Step: 46650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:29:59,096-Speed 4937.55 samples/sec Loss 8.8700 Epoch: 2 Global Step: 46700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:30:09,335-Speed 5000.65 samples/sec Loss 8.8558 Epoch: 2 Global Step: 46750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:30:19,632-Speed 4972.90 samples/sec Loss 8.9152 Epoch: 2 Global Step: 46800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:30:29,701-Speed 5085.58 samples/sec Loss 8.8724 Epoch: 2 Global Step: 46850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:30:39,574-Speed 5186.09 samples/sec Loss 8.8030 Epoch: 2 Global Step: 46900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:30:49,289-Speed 5270.32 samples/sec Loss 8.8553 Epoch: 2 Global Step: 46950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:30:59,378-Speed 5075.25 samples/sec Loss 8.8790 Epoch: 2 Global Step: 47000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:31:10,087-Speed 4781.52 samples/sec Loss 8.9189 Epoch: 2 Global Step: 47050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:31:20,354-Speed 4987.05 samples/sec Loss 8.8201 Epoch: 2 Global Step: 47100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:31:30,726-Speed 4936.45 samples/sec Loss 8.8547 Epoch: 2 Global Step: 47150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:31:40,779-Speed 5093.31 samples/sec Loss 8.8248 Epoch: 2 Global Step: 47200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:31:51,103-Speed 4960.05 samples/sec Loss 8.9330 Epoch: 2 Global Step: 47250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 
23:32:01,264-Speed 5038.72 samples/sec Loss 8.8917 Epoch: 2 Global Step: 47300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:32:12,250-Speed 4660.92 samples/sec Loss 8.9249 Epoch: 2 Global Step: 47350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:32:22,207-Speed 5142.46 samples/sec Loss 8.9170 Epoch: 2 Global Step: 47400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:32:32,503-Speed 4973.00 samples/sec Loss 8.8523 Epoch: 2 Global Step: 47450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:32:42,659-Speed 5041.98 samples/sec Loss 8.8430 Epoch: 2 Global Step: 47500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:32:52,665-Speed 5117.38 samples/sec Loss 8.8630 Epoch: 2 Global Step: 47550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:33:02,882-Speed 5011.28 samples/sec Loss 8.8698 Epoch: 2 Global Step: 47600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:33:12,906-Speed 5108.56 samples/sec Loss 8.8815 Epoch: 2 Global Step: 47650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:33:24,526-Speed 4406.35 samples/sec Loss 8.8838 Epoch: 2 Global Step: 47700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:33:34,704-Speed 5030.73 samples/sec Loss 8.8728 Epoch: 2 Global Step: 47750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:33:44,807-Speed 5068.31 samples/sec Loss 8.9432 Epoch: 2 Global Step: 47800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:33:54,927-Speed 5059.16 samples/sec Loss 8.8972 Epoch: 2 Global Step: 47850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:34:05,186-Speed 4991.28 samples/sec Loss 8.8764 Epoch: 2 Global Step: 47900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:34:15,481-Speed 4973.86 samples/sec Loss 8.9020 Epoch: 2 Global Step: 47950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 
23:34:25,431-Speed 5145.66 samples/sec Loss 8.7990 Epoch: 2 Global Step: 48000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:34:42,192-[lfw][48000]XNorm: 23.747966 Training: 2021-03-18 23:34:42,193-[lfw][48000]Accuracy-Flip: 0.99317+-0.00369 Training: 2021-03-18 23:34:42,193-[lfw][48000]Accuracy-Highest: 0.99367 Training: 2021-03-18 23:35:01,121-[cfp_fp][48000]XNorm: 18.794321 Training: 2021-03-18 23:35:01,121-[cfp_fp][48000]Accuracy-Flip: 0.89243+-0.01619 Training: 2021-03-18 23:35:01,121-[cfp_fp][48000]Accuracy-Highest: 0.89786 Training: 2021-03-18 23:35:17,302-[agedb_30][48000]XNorm: 22.710558 Training: 2021-03-18 23:35:17,302-[agedb_30][48000]Accuracy-Flip: 0.93183+-0.01452 Training: 2021-03-18 23:35:17,302-[agedb_30][48000]Accuracy-Highest: 0.93650 Training: 2021-03-18 23:35:28,267-Speed 814.84 samples/sec Loss 8.8023 Epoch: 2 Global Step: 48050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:35:39,052-Speed 4747.46 samples/sec Loss 8.8696 Epoch: 2 Global Step: 48100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:35:50,886-Speed 4326.83 samples/sec Loss 8.8623 Epoch: 2 Global Step: 48150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:36:00,837-Speed 5145.45 samples/sec Loss 8.8925 Epoch: 2 Global Step: 48200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:36:10,942-Speed 5067.18 samples/sec Loss 8.8191 Epoch: 2 Global Step: 48250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:36:21,122-Speed 5029.70 samples/sec Loss 8.8759 Epoch: 2 Global Step: 48300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:36:31,236-Speed 5063.01 samples/sec Loss 8.8715 Epoch: 2 Global Step: 48350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:36:41,528-Speed 4975.47 samples/sec Loss 8.8312 Epoch: 2 Global Step: 48400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:36:52,092-Speed 4846.91 samples/sec Loss 8.8321 Epoch: 2 
Global Step: 48450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:37:02,098-Speed 5117.46 samples/sec Loss 8.8558 Epoch: 2 Global Step: 48500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:37:12,631-Speed 4861.00 samples/sec Loss 8.8500 Epoch: 2 Global Step: 48550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:37:22,964-Speed 4955.49 samples/sec Loss 8.8524 Epoch: 2 Global Step: 48600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:37:33,238-Speed 4983.38 samples/sec Loss 8.9161 Epoch: 2 Global Step: 48650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:37:43,393-Speed 5042.12 samples/sec Loss 8.8415 Epoch: 2 Global Step: 48700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:37:53,235-Speed 5202.66 samples/sec Loss 8.8754 Epoch: 2 Global Step: 48750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:38:03,351-Speed 5061.82 samples/sec Loss 8.7869 Epoch: 2 Global Step: 48800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:38:13,458-Speed 5065.71 samples/sec Loss 8.8708 Epoch: 2 Global Step: 48850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:38:24,227-Speed 4755.06 samples/sec Loss 8.8368 Epoch: 2 Global Step: 48900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:38:34,748-Speed 4866.68 samples/sec Loss 8.8390 Epoch: 2 Global Step: 48950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:38:44,953-Speed 5017.40 samples/sec Loss 8.8679 Epoch: 2 Global Step: 49000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:38:55,204-Speed 4995.30 samples/sec Loss 8.8356 Epoch: 2 Global Step: 49050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:39:05,534-Speed 4956.47 samples/sec Loss 8.8359 Epoch: 2 Global Step: 49100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:39:15,620-Speed 5076.78 samples/sec Loss 8.8809 Epoch: 2 Global 
Step: 49150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:39:25,762-Speed 5048.46 samples/sec Loss 8.8744 Epoch: 2 Global Step: 49200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:39:36,446-Speed 4792.40 samples/sec Loss 8.8240 Epoch: 2 Global Step: 49250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:39:46,907-Speed 4894.72 samples/sec Loss 8.9074 Epoch: 2 Global Step: 49300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:39:57,398-Speed 4881.00 samples/sec Loss 8.8399 Epoch: 2 Global Step: 49350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:40:07,795-Speed 4924.59 samples/sec Loss 8.8901 Epoch: 2 Global Step: 49400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:40:18,713-Speed 4689.84 samples/sec Loss 8.8719 Epoch: 2 Global Step: 49450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:40:28,953-Speed 5000.60 samples/sec Loss 8.8065 Epoch: 2 Global Step: 49500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:40:39,000-Speed 5096.51 samples/sec Loss 8.8768 Epoch: 2 Global Step: 49550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:40:49,456-Speed 4896.67 samples/sec Loss 8.8715 Epoch: 2 Global Step: 49600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:40:59,672-Speed 5012.24 samples/sec Loss 8.8895 Epoch: 2 Global Step: 49650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:41:09,900-Speed 5006.04 samples/sec Loss 8.8626 Epoch: 2 Global Step: 49700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:41:20,065-Speed 5037.25 samples/sec Loss 8.9059 Epoch: 2 Global Step: 49750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:41:30,522-Speed 4896.70 samples/sec Loss 8.8822 Epoch: 2 Global Step: 49800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:41:40,533-Speed 5114.45 samples/sec Loss 8.8409 Epoch: 2 Global Step: 49850 
Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:41:50,599-Speed 5087.31 samples/sec Loss 8.8029 Epoch: 2 Global Step: 49900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:42:00,689-Speed 5074.27 samples/sec Loss 8.8091 Epoch: 2 Global Step: 49950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:42:10,942-Speed 4994.46 samples/sec Loss 8.8686 Epoch: 2 Global Step: 50000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:42:27,874-[lfw][50000]XNorm: 22.338836 Training: 2021-03-18 23:42:27,875-[lfw][50000]Accuracy-Flip: 0.99100+-0.00473 Training: 2021-03-18 23:42:27,875-[lfw][50000]Accuracy-Highest: 0.99367 Training: 2021-03-18 23:42:46,802-[cfp_fp][50000]XNorm: 17.801734 Training: 2021-03-18 23:42:46,802-[cfp_fp][50000]Accuracy-Flip: 0.89800+-0.01660 Training: 2021-03-18 23:42:46,802-[cfp_fp][50000]Accuracy-Highest: 0.89800 Training: 2021-03-18 23:43:03,047-[agedb_30][50000]XNorm: 20.945497 Training: 2021-03-18 23:43:03,048-[agedb_30][50000]Accuracy-Flip: 0.93450+-0.01745 Training: 2021-03-18 23:43:03,048-[agedb_30][50000]Accuracy-Highest: 0.93650 Training: 2021-03-18 23:43:13,175-Speed 822.73 samples/sec Loss 8.8866 Epoch: 2 Global Step: 50050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:43:35,691-Speed 2273.98 samples/sec Loss 8.4093 Epoch: 3 Global Step: 50100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:43:46,767-Speed 4622.77 samples/sec Loss 8.0430 Epoch: 3 Global Step: 50150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:43:57,132-Speed 4939.80 samples/sec Loss 8.0625 Epoch: 3 Global Step: 50200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:44:07,422-Speed 4976.45 samples/sec Loss 8.0788 Epoch: 3 Global Step: 50250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:44:17,815-Speed 4926.49 samples/sec Loss 8.0895 Epoch: 3 Global Step: 50300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 
2021-03-18 23:44:27,708-Speed 5175.65 samples/sec Loss 8.1099 Epoch: 3 Global Step: 50350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:44:37,764-Speed 5091.76 samples/sec Loss 8.1923 Epoch: 3 Global Step: 50400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:44:47,656-Speed 5176.74 samples/sec Loss 8.1476 Epoch: 3 Global Step: 50450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:44:57,725-Speed 5085.01 samples/sec Loss 8.1568 Epoch: 3 Global Step: 50500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:45:07,956-Speed 5004.74 samples/sec Loss 8.2240 Epoch: 3 Global Step: 50550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:45:19,044-Speed 4618.11 samples/sec Loss 8.2592 Epoch: 3 Global Step: 50600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:45:28,857-Speed 5218.00 samples/sec Loss 8.2767 Epoch: 3 Global Step: 50650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:45:38,829-Speed 5134.72 samples/sec Loss 8.1761 Epoch: 3 Global Step: 50700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:45:49,012-Speed 5028.44 samples/sec Loss 8.2297 Epoch: 3 Global Step: 50750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:45:59,025-Speed 5113.52 samples/sec Loss 8.2234 Epoch: 3 Global Step: 50800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:46:08,968-Speed 5149.91 samples/sec Loss 8.3135 Epoch: 3 Global Step: 50850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:46:18,965-Speed 5121.59 samples/sec Loss 8.3214 Epoch: 3 Global Step: 50900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:46:29,513-Speed 4854.49 samples/sec Loss 8.3052 Epoch: 3 Global Step: 50950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:46:40,377-Speed 4712.87 samples/sec Loss 8.3800 Epoch: 3 Global Step: 51000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 
23:46:50,234-Speed 5194.71 samples/sec Loss 8.3099 Epoch: 3 Global Step: 51050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:47:00,422-Speed 5025.83 samples/sec Loss 8.4041 Epoch: 3 Global Step: 51100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:47:10,358-Speed 5153.45 samples/sec Loss 8.4168 Epoch: 3 Global Step: 51150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:47:20,582-Speed 5007.86 samples/sec Loss 8.3969 Epoch: 3 Global Step: 51200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:47:31,696-Speed 4606.98 samples/sec Loss 8.3837 Epoch: 3 Global Step: 51250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:47:41,720-Speed 5108.41 samples/sec Loss 8.4055 Epoch: 3 Global Step: 51300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:47:52,365-Speed 4809.82 samples/sec Loss 8.4003 Epoch: 3 Global Step: 51350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:48:03,890-Speed 4443.04 samples/sec Loss 8.4914 Epoch: 3 Global Step: 51400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:48:13,527-Speed 5312.93 samples/sec Loss 8.4899 Epoch: 3 Global Step: 51450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:48:23,592-Speed 5087.60 samples/sec Loss 8.4875 Epoch: 3 Global Step: 51500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:48:33,695-Speed 5068.23 samples/sec Loss 8.4152 Epoch: 3 Global Step: 51550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:48:43,692-Speed 5121.68 samples/sec Loss 8.4908 Epoch: 3 Global Step: 51600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:48:53,679-Speed 5127.14 samples/sec Loss 8.4699 Epoch: 3 Global Step: 51650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:49:03,725-Speed 5096.85 samples/sec Loss 8.4324 Epoch: 3 Global Step: 51700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 
23:49:13,968-Speed 4998.95 samples/sec Loss 8.4571 Epoch: 3 Global Step: 51750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:49:23,904-Speed 5153.01 samples/sec Loss 8.4726 Epoch: 3 Global Step: 51800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:49:33,910-Speed 5117.32 samples/sec Loss 8.5282 Epoch: 3 Global Step: 51850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:49:43,822-Speed 5165.56 samples/sec Loss 8.5316 Epoch: 3 Global Step: 51900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:49:53,951-Speed 5054.86 samples/sec Loss 8.5234 Epoch: 3 Global Step: 51950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:50:03,907-Speed 5142.98 samples/sec Loss 8.4685 Epoch: 3 Global Step: 52000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:50:20,652-[lfw][52000]XNorm: 21.497889 Training: 2021-03-18 23:50:20,653-[lfw][52000]Accuracy-Flip: 0.99400+-0.00318 Training: 2021-03-18 23:50:20,653-[lfw][52000]Accuracy-Highest: 0.99400 Training: 2021-03-18 23:50:39,321-[cfp_fp][52000]XNorm: 17.060642 Training: 2021-03-18 23:50:39,321-[cfp_fp][52000]Accuracy-Flip: 0.89986+-0.01659 Training: 2021-03-18 23:50:39,321-[cfp_fp][52000]Accuracy-Highest: 0.89986 Training: 2021-03-18 23:50:55,540-[agedb_30][52000]XNorm: 20.328255 Training: 2021-03-18 23:50:55,541-[agedb_30][52000]Accuracy-Flip: 0.93617+-0.01003 Training: 2021-03-18 23:50:55,541-[agedb_30][52000]Accuracy-Highest: 0.93650 Training: 2021-03-18 23:51:06,352-Speed 819.93 samples/sec Loss 8.5442 Epoch: 3 Global Step: 52050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:51:16,308-Speed 5143.04 samples/sec Loss 8.5717 Epoch: 3 Global Step: 52100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:51:26,643-Speed 4954.67 samples/sec Loss 8.5441 Epoch: 3 Global Step: 52150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:51:36,543-Speed 5172.10 samples/sec Loss 8.5512 Epoch: 3 
Global Step: 52200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:51:46,453-Speed 5166.83 samples/sec Loss 8.5009 Epoch: 3 Global Step: 52250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:51:56,277-Speed 5211.80 samples/sec Loss 8.5409 Epoch: 3 Global Step: 52300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:52:06,240-Speed 5139.13 samples/sec Loss 8.5705 Epoch: 3 Global Step: 52350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:52:16,039-Speed 5225.52 samples/sec Loss 8.5380 Epoch: 3 Global Step: 52400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:52:26,053-Speed 5112.92 samples/sec Loss 8.5254 Epoch: 3 Global Step: 52450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:52:36,024-Speed 5135.53 samples/sec Loss 8.5337 Epoch: 3 Global Step: 52500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:52:46,001-Speed 5131.99 samples/sec Loss 8.5913 Epoch: 3 Global Step: 52550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:52:55,979-Speed 5131.90 samples/sec Loss 8.5900 Epoch: 3 Global Step: 52600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:53:05,891-Speed 5165.43 samples/sec Loss 8.6103 Epoch: 3 Global Step: 52650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:53:15,913-Speed 5109.06 samples/sec Loss 8.6017 Epoch: 3 Global Step: 52700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:53:25,929-Speed 5112.35 samples/sec Loss 8.5967 Epoch: 3 Global Step: 52750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:53:36,174-Speed 4997.47 samples/sec Loss 8.5783 Epoch: 3 Global Step: 52800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:53:46,025-Speed 5198.08 samples/sec Loss 8.6738 Epoch: 3 Global Step: 52850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:53:56,160-Speed 5052.07 samples/sec Loss 8.5806 Epoch: 3 Global 
Step: 52900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:54:06,144-Speed 5128.34 samples/sec Loss 8.6150 Epoch: 3 Global Step: 52950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:54:16,162-Speed 5111.24 samples/sec Loss 8.6017 Epoch: 3 Global Step: 53000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:54:25,998-Speed 5206.18 samples/sec Loss 8.6130 Epoch: 3 Global Step: 53050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:54:35,830-Speed 5207.72 samples/sec Loss 8.5881 Epoch: 3 Global Step: 53100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:54:45,665-Speed 5206.43 samples/sec Loss 8.6395 Epoch: 3 Global Step: 53150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:54:55,431-Speed 5242.70 samples/sec Loss 8.6061 Epoch: 3 Global Step: 53200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:55:05,424-Speed 5123.80 samples/sec Loss 8.6213 Epoch: 3 Global Step: 53250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:55:15,255-Speed 5208.35 samples/sec Loss 8.6470 Epoch: 3 Global Step: 53300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:55:25,061-Speed 5221.98 samples/sec Loss 8.6805 Epoch: 3 Global Step: 53350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:55:35,126-Speed 5087.01 samples/sec Loss 8.7030 Epoch: 3 Global Step: 53400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:55:45,078-Speed 5145.05 samples/sec Loss 8.6897 Epoch: 3 Global Step: 53450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:55:54,927-Speed 5198.75 samples/sec Loss 8.6658 Epoch: 3 Global Step: 53500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:56:05,076-Speed 5045.20 samples/sec Loss 8.6789 Epoch: 3 Global Step: 53550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:56:14,966-Speed 5177.22 samples/sec Loss 8.6070 Epoch: 3 Global Step: 53600 
Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:56:24,905-Speed 5151.47 samples/sec Loss 8.6904 Epoch: 3 Global Step: 53650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:56:34,744-Speed 5204.12 samples/sec Loss 8.6402 Epoch: 3 Global Step: 53700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:56:44,806-Speed 5088.80 samples/sec Loss 8.6485 Epoch: 3 Global Step: 53750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:56:54,674-Speed 5188.97 samples/sec Loss 8.5798 Epoch: 3 Global Step: 53800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:57:04,526-Speed 5197.34 samples/sec Loss 8.6468 Epoch: 3 Global Step: 53850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:57:15,238-Speed 4779.76 samples/sec Loss 8.6488 Epoch: 3 Global Step: 53900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:57:25,148-Speed 5167.07 samples/sec Loss 8.6298 Epoch: 3 Global Step: 53950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:57:35,107-Speed 5140.89 samples/sec Loss 8.6361 Epoch: 3 Global Step: 54000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:57:51,900-[lfw][54000]XNorm: 24.413838 Training: 2021-03-18 23:57:51,900-[lfw][54000]Accuracy-Flip: 0.99433+-0.00335 Training: 2021-03-18 23:57:51,900-[lfw][54000]Accuracy-Highest: 0.99433 Training: 2021-03-18 23:58:10,566-[cfp_fp][54000]XNorm: 19.341941 Training: 2021-03-18 23:58:10,566-[cfp_fp][54000]Accuracy-Flip: 0.89829+-0.01575 Training: 2021-03-18 23:58:10,566-[cfp_fp][54000]Accuracy-Highest: 0.89986 Training: 2021-03-18 23:58:26,777-[agedb_30][54000]XNorm: 22.510003 Training: 2021-03-18 23:58:26,778-[agedb_30][54000]Accuracy-Flip: 0.93067+-0.01381 Training: 2021-03-18 23:58:26,778-[agedb_30][54000]Accuracy-Highest: 0.93650 Training: 2021-03-18 23:58:36,531-Speed 833.56 samples/sec Loss 8.6558 Epoch: 3 Global Step: 54050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 
2021-03-18 23:58:46,368-Speed 5204.97 samples/sec Loss 8.6634 Epoch: 3 Global Step: 54100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:58:56,352-Speed 5128.64 samples/sec Loss 8.6781 Epoch: 3 Global Step: 54150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:59:06,961-Speed 4826.55 samples/sec Loss 8.6576 Epoch: 3 Global Step: 54200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:59:17,699-Speed 4768.28 samples/sec Loss 8.7010 Epoch: 3 Global Step: 54250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:59:27,420-Speed 5267.56 samples/sec Loss 8.6678 Epoch: 3 Global Step: 54300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:59:37,463-Speed 5098.18 samples/sec Loss 8.7142 Epoch: 3 Global Step: 54350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:59:47,466-Speed 5119.11 samples/sec Loss 8.6897 Epoch: 3 Global Step: 54400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-18 23:59:57,415-Speed 5146.16 samples/sec Loss 8.7070 Epoch: 3 Global Step: 54450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:00:08,436-Speed 4646.20 samples/sec Loss 8.6535 Epoch: 3 Global Step: 54500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:00:18,879-Speed 4902.92 samples/sec Loss 8.6722 Epoch: 3 Global Step: 54550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:00:29,898-Speed 4646.87 samples/sec Loss 8.7126 Epoch: 3 Global Step: 54600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:00:40,831-Speed 4683.06 samples/sec Loss 8.7468 Epoch: 3 Global Step: 54650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:00:50,866-Speed 5102.62 samples/sec Loss 8.6798 Epoch: 3 Global Step: 54700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:01:00,890-Speed 5108.17 samples/sec Loss 8.6908 Epoch: 3 Global Step: 54750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 
00:01:10,826-Speed 5153.19 samples/sec Loss 8.7065 Epoch: 3 Global Step: 54800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:01:20,725-Speed 5172.55 samples/sec Loss 8.6563 Epoch: 3 Global Step: 54850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:01:30,598-Speed 5186.26 samples/sec Loss 8.6741 Epoch: 3 Global Step: 54900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:01:40,339-Speed 5256.41 samples/sec Loss 8.6376 Epoch: 3 Global Step: 54950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:01:50,401-Speed 5088.88 samples/sec Loss 8.6978 Epoch: 3 Global Step: 55000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:02:00,582-Speed 5029.07 samples/sec Loss 8.6305 Epoch: 3 Global Step: 55050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:02:10,526-Speed 5149.14 samples/sec Loss 8.7378 Epoch: 3 Global Step: 55100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:02:20,437-Speed 5166.11 samples/sec Loss 8.7159 Epoch: 3 Global Step: 55150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:02:30,369-Speed 5155.53 samples/sec Loss 8.7241 Epoch: 3 Global Step: 55200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:02:40,338-Speed 5136.30 samples/sec Loss 8.7383 Epoch: 3 Global Step: 55250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:02:50,267-Speed 5157.14 samples/sec Loss 8.7340 Epoch: 3 Global Step: 55300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:03:01,290-Speed 4644.71 samples/sec Loss 8.7947 Epoch: 3 Global Step: 55350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:03:11,455-Speed 5037.37 samples/sec Loss 8.6664 Epoch: 3 Global Step: 55400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:03:21,418-Speed 5139.33 samples/sec Loss 8.7433 Epoch: 3 Global Step: 55450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 
00:03:31,440-Speed 5109.07 samples/sec Loss 8.7322 Epoch: 3 Global Step: 55500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:03:41,169-Speed 5262.79 samples/sec Loss 8.7194 Epoch: 3 Global Step: 55550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:03:51,264-Speed 5072.05 samples/sec Loss 8.7491 Epoch: 3 Global Step: 55600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:04:01,313-Speed 5095.45 samples/sec Loss 8.7428 Epoch: 3 Global Step: 55650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:04:11,248-Speed 5154.11 samples/sec Loss 8.6817 Epoch: 3 Global Step: 55700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:04:21,137-Speed 5177.43 samples/sec Loss 8.7391 Epoch: 3 Global Step: 55750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:04:31,056-Speed 5162.54 samples/sec Loss 8.7363 Epoch: 3 Global Step: 55800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:04:41,099-Speed 5098.43 samples/sec Loss 8.7124 Epoch: 3 Global Step: 55850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:04:51,142-Speed 5098.27 samples/sec Loss 8.7383 Epoch: 3 Global Step: 55900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:05:01,244-Speed 5068.34 samples/sec Loss 8.7310 Epoch: 3 Global Step: 55950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:05:11,370-Speed 5056.67 samples/sec Loss 8.7620 Epoch: 3 Global Step: 56000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:05:28,157-[lfw][56000]XNorm: 23.187419 Training: 2021-03-19 00:05:28,158-[lfw][56000]Accuracy-Flip: 0.99417+-0.00310 Training: 2021-03-19 00:05:28,158-[lfw][56000]Accuracy-Highest: 0.99433 Training: 2021-03-19 00:05:46,883-[cfp_fp][56000]XNorm: 18.337508 Training: 2021-03-19 00:05:46,883-[cfp_fp][56000]Accuracy-Flip: 0.89386+-0.01501 Training: 2021-03-19 00:05:46,883-[cfp_fp][56000]Accuracy-Highest: 0.89986 Training: 2021-03-19 
00:06:02,982-[agedb_30][56000]XNorm: 22.123894 Training: 2021-03-19 00:06:02,982-[agedb_30][56000]Accuracy-Flip: 0.93817+-0.01018 Training: 2021-03-19 00:06:02,982-[agedb_30][56000]Accuracy-Highest: 0.93817 Training: 2021-03-19 00:06:13,052-Speed 830.08 samples/sec Loss 8.7593 Epoch: 3 Global Step: 56050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:06:22,816-Speed 5243.60 samples/sec Loss 8.7093 Epoch: 3 Global Step: 56100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:06:32,915-Speed 5070.31 samples/sec Loss 8.7339 Epoch: 3 Global Step: 56150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:06:43,062-Speed 5046.21 samples/sec Loss 8.6789 Epoch: 3 Global Step: 56200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:06:53,006-Speed 5149.21 samples/sec Loss 8.7771 Epoch: 3 Global Step: 56250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:07:03,032-Speed 5107.06 samples/sec Loss 8.6890 Epoch: 3 Global Step: 56300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:07:12,976-Speed 5149.10 samples/sec Loss 8.7614 Epoch: 3 Global Step: 56350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:07:22,829-Speed 5196.49 samples/sec Loss 8.7062 Epoch: 3 Global Step: 56400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:07:32,718-Speed 5177.90 samples/sec Loss 8.7645 Epoch: 3 Global Step: 56450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:07:42,595-Speed 5184.33 samples/sec Loss 8.7299 Epoch: 3 Global Step: 56500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:07:52,811-Speed 5012.18 samples/sec Loss 8.7506 Epoch: 3 Global Step: 56550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:08:02,725-Speed 5164.44 samples/sec Loss 8.7895 Epoch: 3 Global Step: 56600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:08:12,569-Speed 5201.62 samples/sec Loss 8.7321 Epoch: 3 Global 
Step: 56650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:08:22,476-Speed 5168.20 samples/sec Loss 8.7321 Epoch: 3 Global Step: 56700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:08:32,497-Speed 5109.62 samples/sec Loss 8.7175 Epoch: 3 Global Step: 56750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:08:42,492-Speed 5123.18 samples/sec Loss 8.7790 Epoch: 3 Global Step: 56800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:08:52,503-Speed 5114.51 samples/sec Loss 8.7506 Epoch: 3 Global Step: 56850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:09:02,412-Speed 5167.25 samples/sec Loss 8.7815 Epoch: 3 Global Step: 56900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:09:12,214-Speed 5224.09 samples/sec Loss 8.7363 Epoch: 3 Global Step: 56950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:09:22,145-Speed 5155.60 samples/sec Loss 8.7691 Epoch: 3 Global Step: 57000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:09:32,223-Speed 5080.98 samples/sec Loss 8.8121 Epoch: 3 Global Step: 57050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:09:42,277-Speed 5092.97 samples/sec Loss 8.7517 Epoch: 3 Global Step: 57100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:09:52,925-Speed 4808.58 samples/sec Loss 8.7836 Epoch: 3 Global Step: 57150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:10:02,739-Speed 5217.39 samples/sec Loss 8.7753 Epoch: 3 Global Step: 57200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:10:12,623-Speed 5180.25 samples/sec Loss 8.7289 Epoch: 3 Global Step: 57250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:10:22,413-Speed 5230.32 samples/sec Loss 8.7127 Epoch: 3 Global Step: 57300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:10:32,246-Speed 5207.07 samples/sec Loss 8.7987 Epoch: 3 Global Step: 57350 
Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:10:42,010-Speed 5244.16 samples/sec Loss 8.7476 Epoch: 3 Global Step: 57400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:10:52,648-Speed 4813.01 samples/sec Loss 8.7171 Epoch: 3 Global Step: 57450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:11:02,762-Speed 5062.93 samples/sec Loss 8.7300 Epoch: 3 Global Step: 57500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:11:12,916-Speed 5042.63 samples/sec Loss 8.7037 Epoch: 3 Global Step: 57550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:11:23,573-Speed 4804.66 samples/sec Loss 8.7994 Epoch: 3 Global Step: 57600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:11:33,578-Speed 5117.46 samples/sec Loss 8.7275 Epoch: 3 Global Step: 57650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:11:43,696-Speed 5060.88 samples/sec Loss 8.7296 Epoch: 3 Global Step: 57700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:11:54,317-Speed 4820.85 samples/sec Loss 8.7962 Epoch: 3 Global Step: 57750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:12:04,989-Speed 4797.69 samples/sec Loss 8.7601 Epoch: 3 Global Step: 57800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:12:15,787-Speed 4742.08 samples/sec Loss 8.7488 Epoch: 3 Global Step: 57850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:12:26,440-Speed 4806.33 samples/sec Loss 8.7240 Epoch: 3 Global Step: 57900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:12:36,516-Speed 5081.31 samples/sec Loss 8.7095 Epoch: 3 Global Step: 57950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:12:46,532-Speed 5112.45 samples/sec Loss 8.7141 Epoch: 3 Global Step: 58000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:13:03,576-[lfw][58000]XNorm: 24.798009 Training: 2021-03-19 
00:13:03,576-[lfw][58000]Accuracy-Flip: 0.99283+-0.00334 Training: 2021-03-19 00:13:03,576-[lfw][58000]Accuracy-Highest: 0.99433 Training: 2021-03-19 00:13:22,381-[cfp_fp][58000]XNorm: 19.659535 Training: 2021-03-19 00:13:22,382-[cfp_fp][58000]Accuracy-Flip: 0.88086+-0.01705 Training: 2021-03-19 00:13:22,382-[cfp_fp][58000]Accuracy-Highest: 0.89986 Training: 2021-03-19 00:13:38,603-[agedb_30][58000]XNorm: 23.390468 Training: 2021-03-19 00:13:38,603-[agedb_30][58000]Accuracy-Flip: 0.93717+-0.01569 Training: 2021-03-19 00:13:38,603-[agedb_30][58000]Accuracy-Highest: 0.93817 Training: 2021-03-19 00:13:48,200-Speed 830.26 samples/sec Loss 8.7354 Epoch: 3 Global Step: 58050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:13:58,134-Speed 5154.14 samples/sec Loss 8.8557 Epoch: 3 Global Step: 58100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:14:08,010-Speed 5184.63 samples/sec Loss 8.7294 Epoch: 3 Global Step: 58150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:14:18,214-Speed 5018.01 samples/sec Loss 8.7321 Epoch: 3 Global Step: 58200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:14:28,130-Speed 5164.04 samples/sec Loss 8.7157 Epoch: 3 Global Step: 58250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:14:38,298-Speed 5035.43 samples/sec Loss 8.7251 Epoch: 3 Global Step: 58300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:14:48,450-Speed 5043.72 samples/sec Loss 8.7929 Epoch: 3 Global Step: 58350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:14:58,501-Speed 5094.39 samples/sec Loss 8.7270 Epoch: 3 Global Step: 58400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:15:08,503-Speed 5119.29 samples/sec Loss 8.8123 Epoch: 3 Global Step: 58450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:15:19,380-Speed 4707.42 samples/sec Loss 8.6849 Epoch: 3 Global Step: 58500 Fp16 Grad Scale: 16384 Required: 18 hours 
Training: 2021-03-19 00:15:29,395-Speed 5112.60 samples/sec Loss 8.7452 Epoch: 3 Global Step: 58550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:15:39,433-Speed 5100.94 samples/sec Loss 8.8007 Epoch: 3 Global Step: 58600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:15:49,244-Speed 5218.51 samples/sec Loss 8.7350 Epoch: 3 Global Step: 58650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:15:59,305-Speed 5089.49 samples/sec Loss 8.7786 Epoch: 3 Global Step: 58700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:16:09,247-Speed 5150.24 samples/sec Loss 8.7334 Epoch: 3 Global Step: 58750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:16:19,250-Speed 5118.95 samples/sec Loss 8.7479 Epoch: 3 Global Step: 58800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:16:29,340-Speed 5074.35 samples/sec Loss 8.7148 Epoch: 3 Global Step: 58850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:16:39,240-Speed 5172.09 samples/sec Loss 8.7526 Epoch: 3 Global Step: 58900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:16:49,622-Speed 4932.05 samples/sec Loss 8.7591 Epoch: 3 Global Step: 58950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:16:59,746-Speed 5057.66 samples/sec Loss 8.7575 Epoch: 3 Global Step: 59000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:17:09,720-Speed 5133.43 samples/sec Loss 8.7356 Epoch: 3 Global Step: 59050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:17:19,812-Speed 5073.96 samples/sec Loss 8.7883 Epoch: 3 Global Step: 59100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-19 00:17:29,996-Speed 5027.30 samples/sec Loss 8.7418 Epoch: 3 Global Step: 59150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:17:40,095-Speed 5070.60 samples/sec Loss 8.7735 Epoch: 3 Global Step: 59200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 
2021-03-19 00:17:50,331-Speed 5002.18 samples/sec Loss 8.7689 Epoch: 3 Global Step: 59250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:18:00,459-Speed 5055.34 samples/sec Loss 8.7938 Epoch: 3 Global Step: 59300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:18:10,222-Speed 5244.44 samples/sec Loss 8.7487 Epoch: 3 Global Step: 59350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:18:20,254-Speed 5104.33 samples/sec Loss 8.7845 Epoch: 3 Global Step: 59400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:18:30,173-Speed 5162.00 samples/sec Loss 8.8164 Epoch: 3 Global Step: 59450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:18:40,195-Speed 5109.72 samples/sec Loss 8.7649 Epoch: 3 Global Step: 59500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:18:49,826-Speed 5316.42 samples/sec Loss 8.7849 Epoch: 3 Global Step: 59550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:18:59,839-Speed 5113.26 samples/sec Loss 8.7609 Epoch: 3 Global Step: 59600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:19:09,998-Speed 5040.42 samples/sec Loss 8.7422 Epoch: 3 Global Step: 59650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:19:20,147-Speed 5045.11 samples/sec Loss 8.7319 Epoch: 3 Global Step: 59700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:19:30,113-Speed 5137.71 samples/sec Loss 8.7058 Epoch: 3 Global Step: 59750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:19:40,115-Speed 5119.19 samples/sec Loss 8.7739 Epoch: 3 Global Step: 59800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:19:50,373-Speed 4991.53 samples/sec Loss 8.7820 Epoch: 3 Global Step: 59850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:20:00,893-Speed 4867.28 samples/sec Loss 8.7335 Epoch: 3 Global Step: 59900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 
00:20:10,960-Speed 5085.83 samples/sec Loss 8.7903 Epoch: 3 Global Step: 59950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:20:20,897-Speed 5153.27 samples/sec Loss 8.7657 Epoch: 3 Global Step: 60000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:20:37,646-[lfw][60000]XNorm: 22.198546 Training: 2021-03-19 00:20:37,646-[lfw][60000]Accuracy-Flip: 0.99300+-0.00379 Training: 2021-03-19 00:20:37,646-[lfw][60000]Accuracy-Highest: 0.99433 Training: 2021-03-19 00:20:56,333-[cfp_fp][60000]XNorm: 17.826701 Training: 2021-03-19 00:20:56,333-[cfp_fp][60000]Accuracy-Flip: 0.89729+-0.01673 Training: 2021-03-19 00:20:56,333-[cfp_fp][60000]Accuracy-Highest: 0.89986 Training: 2021-03-19 00:21:12,435-[agedb_30][60000]XNorm: 20.978973 Training: 2021-03-19 00:21:12,435-[agedb_30][60000]Accuracy-Flip: 0.93700+-0.01335 Training: 2021-03-19 00:21:12,435-[agedb_30][60000]Accuracy-Highest: 0.93817 Training: 2021-03-19 00:21:22,082-Speed 836.81 samples/sec Loss 8.7439 Epoch: 3 Global Step: 60050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:21:32,042-Speed 5141.27 samples/sec Loss 8.7276 Epoch: 3 Global Step: 60100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:21:42,036-Speed 5123.14 samples/sec Loss 8.6904 Epoch: 3 Global Step: 60150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:21:51,943-Speed 5168.47 samples/sec Loss 8.7638 Epoch: 3 Global Step: 60200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:22:01,859-Speed 5163.70 samples/sec Loss 8.7547 Epoch: 3 Global Step: 60250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:22:11,707-Speed 5199.34 samples/sec Loss 8.7230 Epoch: 3 Global Step: 60300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:22:21,927-Speed 5010.22 samples/sec Loss 8.7296 Epoch: 3 Global Step: 60350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:22:31,976-Speed 5095.45 samples/sec Loss 8.7150 Epoch: 3 
Global Step: 60400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:22:42,719-Speed 4766.11 samples/sec Loss 8.7344 Epoch: 3 Global Step: 60450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:22:52,464-Speed 5254.47 samples/sec Loss 8.7518 Epoch: 3 Global Step: 60500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:23:02,514-Speed 5094.69 samples/sec Loss 8.7340 Epoch: 3 Global Step: 60550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:23:12,274-Speed 5245.91 samples/sec Loss 8.7894 Epoch: 3 Global Step: 60600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:23:23,145-Speed 4709.98 samples/sec Loss 8.7617 Epoch: 3 Global Step: 60650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:23:33,079-Speed 5154.41 samples/sec Loss 8.7196 Epoch: 3 Global Step: 60700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:23:42,931-Speed 5197.28 samples/sec Loss 8.7163 Epoch: 3 Global Step: 60750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:23:53,473-Speed 4857.33 samples/sec Loss 8.7168 Epoch: 3 Global Step: 60800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:24:03,484-Speed 5114.60 samples/sec Loss 8.7473 Epoch: 3 Global Step: 60850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:24:13,454-Speed 5135.77 samples/sec Loss 8.7410 Epoch: 3 Global Step: 60900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:24:23,231-Speed 5236.80 samples/sec Loss 8.7087 Epoch: 3 Global Step: 60950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:24:33,252-Speed 5109.65 samples/sec Loss 8.7938 Epoch: 3 Global Step: 61000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:24:44,932-Speed 4383.82 samples/sec Loss 8.7385 Epoch: 3 Global Step: 61050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:24:55,878-Speed 4677.89 samples/sec Loss 8.7746 Epoch: 3 Global 
Step: 61100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:25:06,400-Speed 4866.44 samples/sec Loss 8.7659 Epoch: 3 Global Step: 61150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:25:16,164-Speed 5243.61 samples/sec Loss 8.6963 Epoch: 3 Global Step: 61200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:25:26,010-Speed 5200.29 samples/sec Loss 8.7166 Epoch: 3 Global Step: 61250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:25:36,132-Speed 5058.71 samples/sec Loss 8.6833 Epoch: 3 Global Step: 61300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:25:46,002-Speed 5187.80 samples/sec Loss 8.7729 Epoch: 3 Global Step: 61350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:25:55,752-Speed 5251.66 samples/sec Loss 8.7606 Epoch: 3 Global Step: 61400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:26:05,654-Speed 5170.89 samples/sec Loss 8.7924 Epoch: 3 Global Step: 61450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:26:15,661-Speed 5117.09 samples/sec Loss 8.6870 Epoch: 3 Global Step: 61500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:26:25,652-Speed 5124.64 samples/sec Loss 8.7463 Epoch: 3 Global Step: 61550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:26:35,853-Speed 5019.63 samples/sec Loss 8.7270 Epoch: 3 Global Step: 61600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:26:45,835-Speed 5129.89 samples/sec Loss 8.7976 Epoch: 3 Global Step: 61650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:26:55,780-Speed 5148.58 samples/sec Loss 8.7749 Epoch: 3 Global Step: 61700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:27:06,620-Speed 4723.33 samples/sec Loss 8.7723 Epoch: 3 Global Step: 61750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:27:16,746-Speed 5056.78 samples/sec Loss 8.7714 Epoch: 3 Global Step: 61800 
Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:27:26,745-Speed 5120.84 samples/sec Loss 8.7364 Epoch: 3 Global Step: 61850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:27:36,793-Speed 5095.71 samples/sec Loss 8.7648 Epoch: 3 Global Step: 61900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:27:46,912-Speed 5059.90 samples/sec Loss 8.7695 Epoch: 3 Global Step: 61950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:27:56,954-Speed 5099.13 samples/sec Loss 8.6516 Epoch: 3 Global Step: 62000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:28:13,781-[lfw][62000]XNorm: 22.480359 Training: 2021-03-19 00:28:13,781-[lfw][62000]Accuracy-Flip: 0.99300+-0.00446 Training: 2021-03-19 00:28:13,781-[lfw][62000]Accuracy-Highest: 0.99433 Training: 2021-03-19 00:28:32,433-[cfp_fp][62000]XNorm: 17.771605 Training: 2021-03-19 00:28:32,434-[cfp_fp][62000]Accuracy-Flip: 0.89329+-0.01648 Training: 2021-03-19 00:28:32,434-[cfp_fp][62000]Accuracy-Highest: 0.89986 Training: 2021-03-19 00:28:48,664-[agedb_30][62000]XNorm: 20.910689 Training: 2021-03-19 00:28:48,665-[agedb_30][62000]Accuracy-Flip: 0.93600+-0.01459 Training: 2021-03-19 00:28:48,665-[agedb_30][62000]Accuracy-Highest: 0.93817 Training: 2021-03-19 00:28:58,452-Speed 832.55 samples/sec Loss 8.6775 Epoch: 3 Global Step: 62050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:29:08,565-Speed 5063.23 samples/sec Loss 8.7352 Epoch: 3 Global Step: 62100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:29:18,491-Speed 5158.29 samples/sec Loss 8.7279 Epoch: 3 Global Step: 62150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:29:28,376-Speed 5179.61 samples/sec Loss 8.7101 Epoch: 3 Global Step: 62200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:29:38,230-Speed 5196.63 samples/sec Loss 8.7472 Epoch: 3 Global Step: 62250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 
2021-03-19 00:29:48,193-Speed 5139.17 samples/sec Loss 8.8058 Epoch: 3 Global Step: 62300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:29:58,082-Speed 5177.66 samples/sec Loss 8.7936 Epoch: 3 Global Step: 62350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:30:07,980-Speed 5173.24 samples/sec Loss 8.8014 Epoch: 3 Global Step: 62400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:30:18,052-Speed 5083.67 samples/sec Loss 8.7198 Epoch: 3 Global Step: 62450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:30:28,088-Speed 5101.85 samples/sec Loss 8.8056 Epoch: 3 Global Step: 62500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:30:38,237-Speed 5045.02 samples/sec Loss 8.7631 Epoch: 3 Global Step: 62550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:30:48,541-Speed 4969.63 samples/sec Loss 8.6721 Epoch: 3 Global Step: 62600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:30:58,675-Speed 5052.41 samples/sec Loss 8.7037 Epoch: 3 Global Step: 62650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:31:08,528-Speed 5196.67 samples/sec Loss 8.7769 Epoch: 3 Global Step: 62700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:31:18,478-Speed 5145.97 samples/sec Loss 8.7593 Epoch: 3 Global Step: 62750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:31:28,197-Speed 5268.56 samples/sec Loss 8.7589 Epoch: 3 Global Step: 62800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:31:38,055-Speed 5193.76 samples/sec Loss 8.7746 Epoch: 3 Global Step: 62850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:31:48,003-Speed 5147.40 samples/sec Loss 8.6990 Epoch: 3 Global Step: 62900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:31:57,990-Speed 5126.65 samples/sec Loss 8.6829 Epoch: 3 Global Step: 62950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 
00:32:08,035-Speed 5097.60 samples/sec Loss 8.7000 Epoch: 3 Global Step: 63000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:32:17,786-Speed 5251.07 samples/sec Loss 8.6934 Epoch: 3 Global Step: 63050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:32:27,918-Speed 5053.54 samples/sec Loss 8.7787 Epoch: 3 Global Step: 63100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:32:37,803-Speed 5179.80 samples/sec Loss 8.7506 Epoch: 3 Global Step: 63150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:32:47,941-Speed 5050.69 samples/sec Loss 8.7196 Epoch: 3 Global Step: 63200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:32:57,888-Speed 5147.37 samples/sec Loss 8.7093 Epoch: 3 Global Step: 63250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:33:08,007-Speed 5060.28 samples/sec Loss 8.7203 Epoch: 3 Global Step: 63300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:33:17,955-Speed 5146.71 samples/sec Loss 8.7809 Epoch: 3 Global Step: 63350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:33:27,910-Speed 5143.88 samples/sec Loss 8.7440 Epoch: 3 Global Step: 63400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:33:37,949-Speed 5100.34 samples/sec Loss 8.7580 Epoch: 3 Global Step: 63450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:33:47,753-Speed 5223.07 samples/sec Loss 8.7043 Epoch: 3 Global Step: 63500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:33:57,656-Speed 5170.20 samples/sec Loss 8.7565 Epoch: 3 Global Step: 63550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:34:07,719-Speed 5088.31 samples/sec Loss 8.8093 Epoch: 3 Global Step: 63600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:34:18,362-Speed 4810.87 samples/sec Loss 8.7965 Epoch: 3 Global Step: 63650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 
00:34:28,243-Speed 5182.14 samples/sec Loss 8.7596 Epoch: 3 Global Step: 63700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:34:38,226-Speed 5129.19 samples/sec Loss 8.7248 Epoch: 3 Global Step: 63750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:34:48,062-Speed 5206.05 samples/sec Loss 8.7253 Epoch: 3 Global Step: 63800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:34:58,651-Speed 4835.37 samples/sec Loss 8.7105 Epoch: 3 Global Step: 63850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:35:08,967-Speed 4964.10 samples/sec Loss 8.7307 Epoch: 3 Global Step: 63900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:35:18,973-Speed 5117.25 samples/sec Loss 8.6984 Epoch: 3 Global Step: 63950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:35:28,998-Speed 5107.87 samples/sec Loss 8.6882 Epoch: 3 Global Step: 64000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:35:45,928-[lfw][64000]XNorm: 23.928340 Training: 2021-03-19 00:35:45,929-[lfw][64000]Accuracy-Flip: 0.99183+-0.00320 Training: 2021-03-19 00:35:45,929-[lfw][64000]Accuracy-Highest: 0.99433 Training: 2021-03-19 00:36:04,577-[cfp_fp][64000]XNorm: 19.674411 Training: 2021-03-19 00:36:04,578-[cfp_fp][64000]Accuracy-Flip: 0.90186+-0.01273 Training: 2021-03-19 00:36:04,578-[cfp_fp][64000]Accuracy-Highest: 0.90186 Training: 2021-03-19 00:36:20,708-[agedb_30][64000]XNorm: 22.942308 Training: 2021-03-19 00:36:20,708-[agedb_30][64000]Accuracy-Flip: 0.93367+-0.01513 Training: 2021-03-19 00:36:20,708-[agedb_30][64000]Accuracy-Highest: 0.93817 Training: 2021-03-19 00:36:31,213-Speed 822.96 samples/sec Loss 8.7203 Epoch: 3 Global Step: 64050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:36:41,444-Speed 5004.39 samples/sec Loss 8.6898 Epoch: 3 Global Step: 64100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:36:51,326-Speed 5181.70 samples/sec Loss 8.7217 Epoch: 3 
Global Step: 64150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:37:02,914-Speed 4418.55 samples/sec Loss 8.6576 Epoch: 3 Global Step: 64200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:37:13,595-Speed 4793.81 samples/sec Loss 8.6897 Epoch: 3 Global Step: 64250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:37:23,589-Speed 5123.12 samples/sec Loss 8.7664 Epoch: 3 Global Step: 64300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:37:33,543-Speed 5144.11 samples/sec Loss 8.6942 Epoch: 3 Global Step: 64350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:37:44,325-Speed 4748.76 samples/sec Loss 8.7220 Epoch: 3 Global Step: 64400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:37:54,251-Speed 5158.89 samples/sec Loss 8.7158 Epoch: 3 Global Step: 64450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:38:04,211-Speed 5141.20 samples/sec Loss 8.7700 Epoch: 3 Global Step: 64500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:38:14,058-Speed 5199.62 samples/sec Loss 8.7561 Epoch: 3 Global Step: 64550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:38:23,958-Speed 5171.99 samples/sec Loss 8.7488 Epoch: 3 Global Step: 64600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:38:33,981-Speed 5108.88 samples/sec Loss 8.7183 Epoch: 3 Global Step: 64650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:38:43,914-Speed 5154.72 samples/sec Loss 8.7649 Epoch: 3 Global Step: 64700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:38:53,807-Speed 5175.46 samples/sec Loss 8.8017 Epoch: 3 Global Step: 64750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:39:03,844-Speed 5101.88 samples/sec Loss 8.7676 Epoch: 3 Global Step: 64800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:39:13,873-Speed 5105.07 samples/sec Loss 8.7847 Epoch: 3 Global 
Step: 64850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:39:24,383-Speed 4871.80 samples/sec Loss 8.7056 Epoch: 3 Global Step: 64900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:39:34,516-Speed 5053.08 samples/sec Loss 8.7122 Epoch: 3 Global Step: 64950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:39:44,734-Speed 5011.11 samples/sec Loss 8.7567 Epoch: 3 Global Step: 65000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:39:54,559-Speed 5211.68 samples/sec Loss 8.7723 Epoch: 3 Global Step: 65050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:40:04,422-Speed 5191.82 samples/sec Loss 8.6810 Epoch: 3 Global Step: 65100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:40:14,200-Speed 5236.25 samples/sec Loss 8.7047 Epoch: 3 Global Step: 65150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:40:24,159-Speed 5141.51 samples/sec Loss 8.7135 Epoch: 3 Global Step: 65200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:40:34,069-Speed 5166.99 samples/sec Loss 8.7269 Epoch: 3 Global Step: 65250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:40:43,979-Speed 5166.68 samples/sec Loss 8.7583 Epoch: 3 Global Step: 65300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:40:54,052-Speed 5083.29 samples/sec Loss 8.7492 Epoch: 3 Global Step: 65350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:41:03,975-Speed 5160.08 samples/sec Loss 8.7049 Epoch: 3 Global Step: 65400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:41:14,069-Speed 5072.79 samples/sec Loss 8.6878 Epoch: 3 Global Step: 65450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:41:24,307-Speed 5001.38 samples/sec Loss 8.7449 Epoch: 3 Global Step: 65500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:41:34,171-Speed 5190.76 samples/sec Loss 8.6685 Epoch: 3 Global Step: 65550 
Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:41:44,157-Speed 5127.54 samples/sec Loss 8.7135 Epoch: 3 Global Step: 65600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:41:54,264-Speed 5066.50 samples/sec Loss 8.7298 Epoch: 3 Global Step: 65650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:42:04,335-Speed 5084.21 samples/sec Loss 8.8153 Epoch: 3 Global Step: 65700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:42:14,458-Speed 5057.88 samples/sec Loss 8.7854 Epoch: 3 Global Step: 65750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:42:24,494-Speed 5102.14 samples/sec Loss 8.6737 Epoch: 3 Global Step: 65800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:42:34,584-Speed 5074.68 samples/sec Loss 8.7148 Epoch: 3 Global Step: 65850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:42:44,578-Speed 5123.54 samples/sec Loss 8.7461 Epoch: 3 Global Step: 65900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:42:54,549-Speed 5134.99 samples/sec Loss 8.6901 Epoch: 3 Global Step: 65950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:43:04,659-Speed 5064.52 samples/sec Loss 8.7015 Epoch: 3 Global Step: 66000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:43:21,496-[lfw][66000]XNorm: 23.068485 Training: 2021-03-19 00:43:21,496-[lfw][66000]Accuracy-Flip: 0.99417+-0.00375 Training: 2021-03-19 00:43:21,497-[lfw][66000]Accuracy-Highest: 0.99433 Training: 2021-03-19 00:43:40,243-[cfp_fp][66000]XNorm: 18.803877 Training: 2021-03-19 00:43:40,243-[cfp_fp][66000]Accuracy-Flip: 0.89629+-0.01459 Training: 2021-03-19 00:43:40,243-[cfp_fp][66000]Accuracy-Highest: 0.90186 Training: 2021-03-19 00:43:56,328-[agedb_30][66000]XNorm: 21.907708 Training: 2021-03-19 00:43:56,328-[agedb_30][66000]Accuracy-Flip: 0.94400+-0.01289 Training: 2021-03-19 00:43:56,328-[agedb_30][66000]Accuracy-Highest: 0.94400 Training: 
2021-03-19 00:44:06,176-Speed 832.30 samples/sec Loss 8.7391 Epoch: 3 Global Step: 66050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:44:16,385-Speed 5015.94 samples/sec Loss 8.7755 Epoch: 3 Global Step: 66100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:44:26,381-Speed 5122.35 samples/sec Loss 8.7298 Epoch: 3 Global Step: 66150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:44:36,507-Speed 5056.59 samples/sec Loss 8.6860 Epoch: 3 Global Step: 66200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:44:46,618-Speed 5064.32 samples/sec Loss 8.7014 Epoch: 3 Global Step: 66250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:44:56,801-Speed 5028.01 samples/sec Loss 8.6555 Epoch: 3 Global Step: 66300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:45:06,776-Speed 5133.33 samples/sec Loss 8.7230 Epoch: 3 Global Step: 66350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:45:16,841-Speed 5087.17 samples/sec Loss 8.7084 Epoch: 3 Global Step: 66400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:45:26,835-Speed 5123.25 samples/sec Loss 8.7559 Epoch: 3 Global Step: 66450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:45:36,633-Speed 5225.80 samples/sec Loss 8.7293 Epoch: 3 Global Step: 66500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:45:46,944-Speed 4966.05 samples/sec Loss 8.7001 Epoch: 3 Global Step: 66550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:45:56,897-Speed 5144.75 samples/sec Loss 8.7498 Epoch: 3 Global Step: 66600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:46:06,856-Speed 5141.00 samples/sec Loss 8.7320 Epoch: 3 Global Step: 66650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:46:16,799-Speed 5149.65 samples/sec Loss 8.6702 Epoch: 3 Global Step: 66700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 
00:46:26,495-Speed 5280.87 samples/sec Loss 8.7314 Epoch: 3 Global Step: 66750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:46:49,335-Speed 2241.78 samples/sec Loss 8.1758 Epoch: 4 Global Step: 66800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:47:00,353-Speed 4647.30 samples/sec Loss 7.9539 Epoch: 4 Global Step: 66850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:47:10,731-Speed 4933.60 samples/sec Loss 7.9824 Epoch: 4 Global Step: 66900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:47:20,965-Speed 5003.44 samples/sec Loss 7.9587 Epoch: 4 Global Step: 66950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:47:31,325-Speed 4942.41 samples/sec Loss 8.0211 Epoch: 4 Global Step: 67000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:47:42,315-Speed 4659.45 samples/sec Loss 8.0056 Epoch: 4 Global Step: 67050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:47:52,454-Speed 5050.29 samples/sec Loss 8.0194 Epoch: 4 Global Step: 67100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:48:02,867-Speed 4917.37 samples/sec Loss 8.0652 Epoch: 4 Global Step: 67150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:48:12,925-Speed 5090.91 samples/sec Loss 8.0868 Epoch: 4 Global Step: 67200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:48:23,207-Speed 4979.68 samples/sec Loss 8.1398 Epoch: 4 Global Step: 67250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:48:34,064-Speed 4716.33 samples/sec Loss 8.1101 Epoch: 4 Global Step: 67300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:48:45,243-Speed 4580.55 samples/sec Loss 8.1237 Epoch: 4 Global Step: 67350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:48:55,319-Speed 5081.65 samples/sec Loss 8.1236 Epoch: 4 Global Step: 67400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 
00:49:05,223-Speed 5170.04 samples/sec Loss 8.1567 Epoch: 4 Global Step: 67450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:49:16,188-Speed 4669.40 samples/sec Loss 8.1607 Epoch: 4 Global Step: 67500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:49:26,228-Speed 5099.85 samples/sec Loss 8.1595 Epoch: 4 Global Step: 67550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:49:36,993-Speed 4756.39 samples/sec Loss 8.1694 Epoch: 4 Global Step: 67600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:49:46,812-Speed 5215.22 samples/sec Loss 8.2416 Epoch: 4 Global Step: 67650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:49:56,862-Speed 5094.40 samples/sec Loss 8.1800 Epoch: 4 Global Step: 67700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:50:07,945-Speed 4619.88 samples/sec Loss 8.2042 Epoch: 4 Global Step: 67750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:50:18,214-Speed 4986.36 samples/sec Loss 8.2412 Epoch: 4 Global Step: 67800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:50:28,145-Speed 5156.07 samples/sec Loss 8.2260 Epoch: 4 Global Step: 67850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:50:38,250-Speed 5066.92 samples/sec Loss 8.2517 Epoch: 4 Global Step: 67900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:50:48,275-Speed 5107.88 samples/sec Loss 8.2111 Epoch: 4 Global Step: 67950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:50:58,605-Speed 4956.47 samples/sec Loss 8.2953 Epoch: 4 Global Step: 68000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:51:15,285-[lfw][68000]XNorm: 23.093012 Training: 2021-03-19 00:51:15,286-[lfw][68000]Accuracy-Flip: 0.99367+-0.00427 Training: 2021-03-19 00:51:15,286-[lfw][68000]Accuracy-Highest: 0.99433 Training: 2021-03-19 00:51:34,003-[cfp_fp][68000]XNorm: 18.180060 Training: 2021-03-19 
00:51:34,003-[cfp_fp][68000]Accuracy-Flip: 0.90257+-0.01676 Training: 2021-03-19 00:51:34,003-[cfp_fp][68000]Accuracy-Highest: 0.90257 Training: 2021-03-19 00:51:50,249-[agedb_30][68000]XNorm: 21.745266 Training: 2021-03-19 00:51:50,249-[agedb_30][68000]Accuracy-Flip: 0.93167+-0.01602 Training: 2021-03-19 00:51:50,249-[agedb_30][68000]Accuracy-Highest: 0.94400 Training: 2021-03-19 00:52:00,058-Speed 833.17 samples/sec Loss 8.3454 Epoch: 4 Global Step: 68050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:52:09,997-Speed 5151.47 samples/sec Loss 8.3500 Epoch: 4 Global Step: 68100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:52:21,142-Speed 4594.22 samples/sec Loss 8.3300 Epoch: 4 Global Step: 68150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:52:31,133-Speed 5124.99 samples/sec Loss 8.3489 Epoch: 4 Global Step: 68200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:52:41,139-Speed 5116.95 samples/sec Loss 8.3268 Epoch: 4 Global Step: 68250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:52:51,068-Speed 5156.91 samples/sec Loss 8.3230 Epoch: 4 Global Step: 68300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:53:01,090-Speed 5109.14 samples/sec Loss 8.3256 Epoch: 4 Global Step: 68350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:53:11,069-Speed 5131.10 samples/sec Loss 8.3767 Epoch: 4 Global Step: 68400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:53:20,920-Speed 5197.65 samples/sec Loss 8.3376 Epoch: 4 Global Step: 68450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:53:30,906-Speed 5127.36 samples/sec Loss 8.3793 Epoch: 4 Global Step: 68500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:53:41,203-Speed 4972.45 samples/sec Loss 8.3592 Epoch: 4 Global Step: 68550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:53:51,515-Speed 4965.42 samples/sec Loss 8.3601 
Epoch: 4 Global Step: 68600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:54:01,636-Speed 5059.43 samples/sec Loss 8.3754 Epoch: 4 Global Step: 68650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:54:11,924-Speed 4977.03 samples/sec Loss 8.4601 Epoch: 4 Global Step: 68700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:54:21,881-Speed 5142.54 samples/sec Loss 8.3440 Epoch: 4 Global Step: 68750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:54:31,854-Speed 5133.83 samples/sec Loss 8.4061 Epoch: 4 Global Step: 68800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:54:41,740-Speed 5179.82 samples/sec Loss 8.4180 Epoch: 4 Global Step: 68850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:54:51,717-Speed 5132.11 samples/sec Loss 8.3755 Epoch: 4 Global Step: 68900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:55:01,536-Speed 5214.18 samples/sec Loss 8.4753 Epoch: 4 Global Step: 68950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:55:11,811-Speed 4983.62 samples/sec Loss 8.4722 Epoch: 4 Global Step: 69000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:55:21,794-Speed 5128.83 samples/sec Loss 8.4163 Epoch: 4 Global Step: 69050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:55:31,701-Speed 5168.57 samples/sec Loss 8.4507 Epoch: 4 Global Step: 69100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:55:41,521-Speed 5214.01 samples/sec Loss 8.4812 Epoch: 4 Global Step: 69150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:55:51,592-Speed 5084.30 samples/sec Loss 8.4207 Epoch: 4 Global Step: 69200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:56:01,381-Speed 5230.43 samples/sec Loss 8.4215 Epoch: 4 Global Step: 69250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:56:11,486-Speed 5067.22 samples/sec Loss 8.4764 Epoch: 4 
Global Step: 69300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:56:21,583-Speed 5071.40 samples/sec Loss 8.4819 Epoch: 4 Global Step: 69350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:56:31,611-Speed 5106.10 samples/sec Loss 8.4746 Epoch: 4 Global Step: 69400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:56:41,609-Speed 5121.26 samples/sec Loss 8.4530 Epoch: 4 Global Step: 69450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:56:51,790-Speed 5029.22 samples/sec Loss 8.4720 Epoch: 4 Global Step: 69500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:57:01,935-Speed 5047.04 samples/sec Loss 8.4893 Epoch: 4 Global Step: 69550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:57:11,943-Speed 5116.17 samples/sec Loss 8.4761 Epoch: 4 Global Step: 69600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:57:22,004-Speed 5089.18 samples/sec Loss 8.5510 Epoch: 4 Global Step: 69650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:57:31,996-Speed 5124.38 samples/sec Loss 8.4939 Epoch: 4 Global Step: 69700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:57:42,225-Speed 5005.95 samples/sec Loss 8.5645 Epoch: 4 Global Step: 69750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:57:52,124-Speed 5172.61 samples/sec Loss 8.5855 Epoch: 4 Global Step: 69800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:58:02,315-Speed 5024.22 samples/sec Loss 8.4823 Epoch: 4 Global Step: 69850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:58:12,475-Speed 5039.73 samples/sec Loss 8.5392 Epoch: 4 Global Step: 69900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:58:22,265-Speed 5230.09 samples/sec Loss 8.5297 Epoch: 4 Global Step: 69950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:58:32,363-Speed 5070.47 samples/sec Loss 8.4595 Epoch: 4 Global 
Step: 70000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:58:48,928-[lfw][70000]XNorm: 24.463990 Training: 2021-03-19 00:58:48,928-[lfw][70000]Accuracy-Flip: 0.99217+-0.00350 Training: 2021-03-19 00:58:48,928-[lfw][70000]Accuracy-Highest: 0.99433 Training: 2021-03-19 00:59:07,530-[cfp_fp][70000]XNorm: 19.802856 Training: 2021-03-19 00:59:07,530-[cfp_fp][70000]Accuracy-Flip: 0.89129+-0.01293 Training: 2021-03-19 00:59:07,530-[cfp_fp][70000]Accuracy-Highest: 0.90257 Training: 2021-03-19 00:59:23,594-[agedb_30][70000]XNorm: 23.164073 Training: 2021-03-19 00:59:23,595-[agedb_30][70000]Accuracy-Flip: 0.93683+-0.00935 Training: 2021-03-19 00:59:23,595-[agedb_30][70000]Accuracy-Highest: 0.94400 Training: 2021-03-19 00:59:33,385-Speed 839.06 samples/sec Loss 8.5431 Epoch: 4 Global Step: 70050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:59:44,154-Speed 4754.58 samples/sec Loss 8.5451 Epoch: 4 Global Step: 70100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 00:59:54,045-Speed 5176.32 samples/sec Loss 8.5274 Epoch: 4 Global Step: 70150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:00:04,053-Speed 5116.35 samples/sec Loss 8.5488 Epoch: 4 Global Step: 70200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:00:13,908-Speed 5195.97 samples/sec Loss 8.6433 Epoch: 4 Global Step: 70250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:00:24,651-Speed 4766.25 samples/sec Loss 8.4935 Epoch: 4 Global Step: 70300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:00:34,799-Speed 5045.42 samples/sec Loss 8.5418 Epoch: 4 Global Step: 70350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:00:44,923-Speed 5057.37 samples/sec Loss 8.5347 Epoch: 4 Global Step: 70400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:00:54,852-Speed 5157.24 samples/sec Loss 8.5319 Epoch: 4 Global Step: 70450 Fp16 Grad Scale: 16384 Required: 17 hours 
Training: 2021-03-19 01:01:04,765-Speed 5165.27 samples/sec Loss 8.5198 Epoch: 4 Global Step: 70500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:01:15,398-Speed 4815.43 samples/sec Loss 8.5169 Epoch: 4 Global Step: 70550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:01:25,504-Speed 5066.41 samples/sec Loss 8.5604 Epoch: 4 Global Step: 70600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:01:36,461-Speed 4672.89 samples/sec Loss 8.5739 Epoch: 4 Global Step: 70650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:01:46,924-Speed 4893.60 samples/sec Loss 8.5591 Epoch: 4 Global Step: 70700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:01:56,782-Speed 5194.14 samples/sec Loss 8.5932 Epoch: 4 Global Step: 70750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:02:06,742-Speed 5141.36 samples/sec Loss 8.6030 Epoch: 4 Global Step: 70800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:02:16,771-Speed 5105.25 samples/sec Loss 8.5993 Epoch: 4 Global Step: 70850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:02:27,602-Speed 4727.46 samples/sec Loss 8.5769 Epoch: 4 Global Step: 70900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:02:37,571-Speed 5136.23 samples/sec Loss 8.6016 Epoch: 4 Global Step: 70950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:02:48,216-Speed 4810.22 samples/sec Loss 8.5638 Epoch: 4 Global Step: 71000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:02:58,199-Speed 5129.02 samples/sec Loss 8.5515 Epoch: 4 Global Step: 71050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:03:08,099-Speed 5171.81 samples/sec Loss 8.5615 Epoch: 4 Global Step: 71100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:03:18,146-Speed 5096.48 samples/sec Loss 8.5394 Epoch: 4 Global Step: 71150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 
2021-03-19 01:03:28,226-Speed 5079.84 samples/sec Loss 8.5491 Epoch: 4 Global Step: 71200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:03:38,218-Speed 5124.24 samples/sec Loss 8.5440 Epoch: 4 Global Step: 71250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:03:48,322-Speed 5067.31 samples/sec Loss 8.5534 Epoch: 4 Global Step: 71300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:03:58,852-Speed 4862.56 samples/sec Loss 8.5346 Epoch: 4 Global Step: 71350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:04:08,694-Speed 5202.50 samples/sec Loss 8.5591 Epoch: 4 Global Step: 71400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:04:18,500-Speed 5221.50 samples/sec Loss 8.5542 Epoch: 4 Global Step: 71450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:04:28,632-Speed 5053.70 samples/sec Loss 8.6050 Epoch: 4 Global Step: 71500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:04:38,596-Speed 5138.80 samples/sec Loss 8.5757 Epoch: 4 Global Step: 71550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:04:48,824-Speed 5005.89 samples/sec Loss 8.5586 Epoch: 4 Global Step: 71600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:04:58,931-Speed 5066.53 samples/sec Loss 8.6199 Epoch: 4 Global Step: 71650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:05:08,979-Speed 5095.82 samples/sec Loss 8.6449 Epoch: 4 Global Step: 71700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:05:19,065-Speed 5076.44 samples/sec Loss 8.5788 Epoch: 4 Global Step: 71750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:05:29,044-Speed 5130.85 samples/sec Loss 8.5687 Epoch: 4 Global Step: 71800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:05:38,996-Speed 5144.96 samples/sec Loss 8.5925 Epoch: 4 Global Step: 71850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 
01:05:49,197-Speed 5019.64 samples/sec Loss 8.6169 Epoch: 4 Global Step: 71900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:05:59,316-Speed 5060.02 samples/sec Loss 8.6645 Epoch: 4 Global Step: 71950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:06:09,219-Speed 5170.58 samples/sec Loss 8.6027 Epoch: 4 Global Step: 72000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:06:25,976-[lfw][72000]XNorm: 22.884707 Training: 2021-03-19 01:06:25,977-[lfw][72000]Accuracy-Flip: 0.99267+-0.00467 Training: 2021-03-19 01:06:25,977-[lfw][72000]Accuracy-Highest: 0.99433 Training: 2021-03-19 01:06:44,589-[cfp_fp][72000]XNorm: 18.313401 Training: 2021-03-19 01:06:44,590-[cfp_fp][72000]Accuracy-Flip: 0.89871+-0.01758 Training: 2021-03-19 01:06:44,590-[cfp_fp][72000]Accuracy-Highest: 0.90257 Training: 2021-03-19 01:07:00,664-[agedb_30][72000]XNorm: 21.436386 Training: 2021-03-19 01:07:00,664-[agedb_30][72000]Accuracy-Flip: 0.93700+-0.01388 Training: 2021-03-19 01:07:00,664-[agedb_30][72000]Accuracy-Highest: 0.94400 Training: 2021-03-19 01:07:10,613-Speed 833.97 samples/sec Loss 8.6541 Epoch: 4 Global Step: 72050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:07:20,401-Speed 5231.42 samples/sec Loss 8.6685 Epoch: 4 Global Step: 72100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:07:30,353-Speed 5144.74 samples/sec Loss 8.6288 Epoch: 4 Global Step: 72150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:07:40,376-Speed 5108.78 samples/sec Loss 8.6682 Epoch: 4 Global Step: 72200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:07:50,409-Speed 5103.57 samples/sec Loss 8.5842 Epoch: 4 Global Step: 72250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:08:00,584-Speed 5032.18 samples/sec Loss 8.6283 Epoch: 4 Global Step: 72300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:08:10,417-Speed 5207.69 samples/sec Loss 8.6090 Epoch: 4 
Global Step: 72350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:08:20,442-Speed 5107.33 samples/sec Loss 8.6409 Epoch: 4 Global Step: 72400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:08:30,273-Speed 5208.51 samples/sec Loss 8.5972 Epoch: 4 Global Step: 72450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:08:40,494-Speed 5009.49 samples/sec Loss 8.6285 Epoch: 4 Global Step: 72500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:08:50,369-Speed 5185.05 samples/sec Loss 8.5447 Epoch: 4 Global Step: 72550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:09:00,354-Speed 5128.16 samples/sec Loss 8.6427 Epoch: 4 Global Step: 72600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:09:10,432-Speed 5080.41 samples/sec Loss 8.6457 Epoch: 4 Global Step: 72650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:09:20,507-Speed 5082.31 samples/sec Loss 8.6022 Epoch: 4 Global Step: 72700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:09:30,545-Speed 5101.02 samples/sec Loss 8.6535 Epoch: 4 Global Step: 72750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:09:40,523-Speed 5131.61 samples/sec Loss 8.5865 Epoch: 4 Global Step: 72800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:09:50,373-Speed 5198.01 samples/sec Loss 8.6656 Epoch: 4 Global Step: 72850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:10:00,422-Speed 5095.13 samples/sec Loss 8.6249 Epoch: 4 Global Step: 72900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:10:10,488-Speed 5087.02 samples/sec Loss 8.6446 Epoch: 4 Global Step: 72950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:10:20,460-Speed 5134.54 samples/sec Loss 8.6728 Epoch: 4 Global Step: 73000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:10:30,539-Speed 5080.08 samples/sec Loss 8.5672 Epoch: 4 Global 
Step: 73050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:10:40,633-Speed 5072.63 samples/sec Loss 8.5741 Epoch: 4 Global Step: 73100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:10:50,387-Speed 5249.24 samples/sec Loss 8.6162 Epoch: 4 Global Step: 73150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:11:00,447-Speed 5090.15 samples/sec Loss 8.6804 Epoch: 4 Global Step: 73200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:11:10,501-Speed 5092.57 samples/sec Loss 8.6625 Epoch: 4 Global Step: 73250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:11:21,441-Speed 4680.03 samples/sec Loss 8.5976 Epoch: 4 Global Step: 73300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:11:31,689-Speed 4996.45 samples/sec Loss 8.6791 Epoch: 4 Global Step: 73350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:11:41,856-Speed 5036.24 samples/sec Loss 8.6149 Epoch: 4 Global Step: 73400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:11:51,802-Speed 5148.36 samples/sec Loss 8.6126 Epoch: 4 Global Step: 73450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:12:01,659-Speed 5194.64 samples/sec Loss 8.6791 Epoch: 4 Global Step: 73500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:12:12,841-Speed 4579.18 samples/sec Loss 8.6506 Epoch: 4 Global Step: 73550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:12:23,004-Speed 5038.24 samples/sec Loss 8.6379 Epoch: 4 Global Step: 73600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:12:33,041-Speed 5101.21 samples/sec Loss 8.6981 Epoch: 4 Global Step: 73650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:12:43,137-Speed 5071.97 samples/sec Loss 8.6208 Epoch: 4 Global Step: 73700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:12:53,183-Speed 5097.12 samples/sec Loss 8.6137 Epoch: 4 Global Step: 73750 
Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:13:03,210-Speed 5106.13 samples/sec Loss 8.6388 Epoch: 4 Global Step: 73800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:13:13,303-Speed 5073.31 samples/sec Loss 8.6357 Epoch: 4 Global Step: 73850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:13:24,799-Speed 4453.93 samples/sec Loss 8.6491 Epoch: 4 Global Step: 73900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:13:34,735-Speed 5153.37 samples/sec Loss 8.6482 Epoch: 4 Global Step: 73950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:13:45,426-Speed 4789.08 samples/sec Loss 8.6025 Epoch: 4 Global Step: 74000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:14:02,186-[lfw][74000]XNorm: 23.872396 Training: 2021-03-19 01:14:02,186-[lfw][74000]Accuracy-Flip: 0.99217+-0.00597 Training: 2021-03-19 01:14:02,186-[lfw][74000]Accuracy-Highest: 0.99433 Training: 2021-03-19 01:14:20,988-[cfp_fp][74000]XNorm: 19.535463 Training: 2021-03-19 01:14:20,989-[cfp_fp][74000]Accuracy-Flip: 0.89543+-0.01774 Training: 2021-03-19 01:14:20,989-[cfp_fp][74000]Accuracy-Highest: 0.90257 Training: 2021-03-19 01:14:37,064-[agedb_30][74000]XNorm: 22.457263 Training: 2021-03-19 01:14:37,065-[agedb_30][74000]Accuracy-Flip: 0.93583+-0.01225 Training: 2021-03-19 01:14:37,065-[agedb_30][74000]Accuracy-Highest: 0.94400 Training: 2021-03-19 01:14:47,018-Speed 831.29 samples/sec Loss 8.6273 Epoch: 4 Global Step: 74050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:14:56,888-Speed 5187.77 samples/sec Loss 8.6592 Epoch: 4 Global Step: 74100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:15:06,895-Speed 5116.90 samples/sec Loss 8.6909 Epoch: 4 Global Step: 74150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:15:17,526-Speed 4816.22 samples/sec Loss 8.5983 Epoch: 4 Global Step: 74200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 
2021-03-19 01:15:28,410-Speed 4704.42 samples/sec Loss 8.6990 Epoch: 4 Global Step: 74250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:15:38,377-Speed 5137.49 samples/sec Loss 8.6400 Epoch: 4 Global Step: 74300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:15:48,404-Speed 5106.49 samples/sec Loss 8.6586 Epoch: 4 Global Step: 74350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:15:58,564-Speed 5040.27 samples/sec Loss 8.6780 Epoch: 4 Global Step: 74400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:16:08,533-Speed 5136.24 samples/sec Loss 8.6400 Epoch: 4 Global Step: 74450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:16:18,636-Speed 5068.02 samples/sec Loss 8.7309 Epoch: 4 Global Step: 74500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:16:28,549-Speed 5165.05 samples/sec Loss 8.6749 Epoch: 4 Global Step: 74550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:16:38,515-Speed 5137.89 samples/sec Loss 8.6430 Epoch: 4 Global Step: 74600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:16:49,311-Speed 4742.53 samples/sec Loss 8.6507 Epoch: 4 Global Step: 74650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:16:59,332-Speed 5109.81 samples/sec Loss 8.6417 Epoch: 4 Global Step: 74700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:17:09,277-Speed 5148.72 samples/sec Loss 8.5974 Epoch: 4 Global Step: 74750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:17:19,298-Speed 5109.45 samples/sec Loss 8.6493 Epoch: 4 Global Step: 74800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-19 01:17:29,271-Speed 5134.20 samples/sec Loss 8.6634 Epoch: 4 Global Step: 74850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:17:39,184-Speed 5165.27 samples/sec Loss 8.6949 Epoch: 4 Global Step: 74900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 
01:17:49,074-Speed 5176.79 samples/sec Loss 8.6692 Epoch: 4 Global Step: 74950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:17:59,177-Speed 5068.57 samples/sec Loss 8.6134 Epoch: 4 Global Step: 75000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:18:09,258-Speed 5078.84 samples/sec Loss 8.7031 Epoch: 4 Global Step: 75050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:18:19,188-Speed 5156.61 samples/sec Loss 8.6377 Epoch: 4 Global Step: 75100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:18:28,959-Speed 5240.39 samples/sec Loss 8.6961 Epoch: 4 Global Step: 75150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:18:38,911-Speed 5144.83 samples/sec Loss 8.6845 Epoch: 4 Global Step: 75200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:18:48,815-Speed 5169.86 samples/sec Loss 8.6384 Epoch: 4 Global Step: 75250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:18:58,621-Speed 5221.98 samples/sec Loss 8.6712 Epoch: 4 Global Step: 75300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:19:08,508-Speed 5178.50 samples/sec Loss 8.6894 Epoch: 4 Global Step: 75350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:19:18,742-Speed 5003.29 samples/sec Loss 8.6256 Epoch: 4 Global Step: 75400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:19:28,723-Speed 5130.41 samples/sec Loss 8.7111 Epoch: 4 Global Step: 75450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:19:38,862-Speed 5049.78 samples/sec Loss 8.6721 Epoch: 4 Global Step: 75500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:19:48,740-Speed 5183.97 samples/sec Loss 8.6558 Epoch: 4 Global Step: 75550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:19:58,757-Speed 5111.35 samples/sec Loss 8.6690 Epoch: 4 Global Step: 75600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 
01:20:08,687-Speed 5156.54 samples/sec Loss 8.6189 Epoch: 4 Global Step: 75650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:20:18,830-Speed 5048.58 samples/sec Loss 8.6864 Epoch: 4 Global Step: 75700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:20:28,617-Speed 5231.63 samples/sec Loss 8.6449 Epoch: 4 Global Step: 75750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:20:38,700-Speed 5077.99 samples/sec Loss 8.6692 Epoch: 4 Global Step: 75800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:20:48,572-Speed 5186.70 samples/sec Loss 8.6289 Epoch: 4 Global Step: 75850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:20:58,369-Speed 5226.28 samples/sec Loss 8.6673 Epoch: 4 Global Step: 75900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:21:08,206-Speed 5205.15 samples/sec Loss 8.6167 Epoch: 4 Global Step: 75950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:21:18,026-Speed 5214.54 samples/sec Loss 8.6631 Epoch: 4 Global Step: 76000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:21:34,783-[lfw][76000]XNorm: 23.728547 Training: 2021-03-19 01:21:34,783-[lfw][76000]Accuracy-Flip: 0.99150+-0.00519 Training: 2021-03-19 01:21:34,783-[lfw][76000]Accuracy-Highest: 0.99433 Training: 2021-03-19 01:21:53,514-[cfp_fp][76000]XNorm: 18.401055 Training: 2021-03-19 01:21:53,514-[cfp_fp][76000]Accuracy-Flip: 0.87457+-0.01303 Training: 2021-03-19 01:21:53,514-[cfp_fp][76000]Accuracy-Highest: 0.90257 Training: 2021-03-19 01:22:09,645-[agedb_30][76000]XNorm: 22.232738 Training: 2021-03-19 01:22:09,645-[agedb_30][76000]Accuracy-Flip: 0.93967+-0.01238 Training: 2021-03-19 01:22:09,645-[agedb_30][76000]Accuracy-Highest: 0.94400 Training: 2021-03-19 01:22:19,642-Speed 830.95 samples/sec Loss 8.6711 Epoch: 4 Global Step: 76050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:22:29,746-Speed 5067.90 samples/sec Loss 8.6506 Epoch: 4 
Global Step: 76100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:22:39,771-Speed 5107.18 samples/sec Loss 8.6692 Epoch: 4 Global Step: 76150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:22:49,538-Speed 5242.66 samples/sec Loss 8.6860 Epoch: 4 Global Step: 76200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:22:59,475-Speed 5152.85 samples/sec Loss 8.6618 Epoch: 4 Global Step: 76250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:23:09,455-Speed 5130.67 samples/sec Loss 8.6298 Epoch: 4 Global Step: 76300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:23:19,417-Speed 5139.84 samples/sec Loss 8.5591 Epoch: 4 Global Step: 76350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:23:29,363-Speed 5147.96 samples/sec Loss 8.6200 Epoch: 4 Global Step: 76400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:23:39,308-Speed 5148.68 samples/sec Loss 8.5989 Epoch: 4 Global Step: 76450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:23:49,438-Speed 5054.40 samples/sec Loss 8.7464 Epoch: 4 Global Step: 76500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:24:00,045-Speed 4827.48 samples/sec Loss 8.6893 Epoch: 4 Global Step: 76550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:24:09,921-Speed 5184.65 samples/sec Loss 8.6202 Epoch: 4 Global Step: 76600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:24:19,787-Speed 5190.14 samples/sec Loss 8.6557 Epoch: 4 Global Step: 76650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:24:30,055-Speed 4986.67 samples/sec Loss 8.6347 Epoch: 4 Global Step: 76700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:24:39,977-Speed 5160.64 samples/sec Loss 8.6397 Epoch: 4 Global Step: 76750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:24:50,882-Speed 4695.50 samples/sec Loss 8.6375 Epoch: 4 Global 
Step: 76800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:25:00,791-Speed 5167.48 samples/sec Loss 8.6875 Epoch: 4 Global Step: 76850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:25:10,899-Speed 5065.40 samples/sec Loss 8.6778 Epoch: 4 Global Step: 76900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:25:21,017-Speed 5060.67 samples/sec Loss 8.6430 Epoch: 4 Global Step: 76950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:25:30,946-Speed 5157.19 samples/sec Loss 8.7157 Epoch: 4 Global Step: 77000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:25:40,951-Speed 5117.83 samples/sec Loss 8.7014 Epoch: 4 Global Step: 77050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:25:50,928-Speed 5131.93 samples/sec Loss 8.6958 Epoch: 4 Global Step: 77100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:26:01,543-Speed 4823.62 samples/sec Loss 8.6487 Epoch: 4 Global Step: 77150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:26:12,191-Speed 4808.94 samples/sec Loss 8.6742 Epoch: 4 Global Step: 77200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:26:22,225-Speed 5103.06 samples/sec Loss 8.7044 Epoch: 4 Global Step: 77250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:26:32,159-Speed 5154.74 samples/sec Loss 8.6733 Epoch: 4 Global Step: 77300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:26:42,772-Speed 4824.52 samples/sec Loss 8.6868 Epoch: 4 Global Step: 77350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:26:52,643-Speed 5187.04 samples/sec Loss 8.5906 Epoch: 4 Global Step: 77400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:27:03,427-Speed 4748.18 samples/sec Loss 8.6716 Epoch: 4 Global Step: 77450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:27:13,332-Speed 5169.58 samples/sec Loss 8.7491 Epoch: 4 Global Step: 77500 
Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:27:23,498-Speed 5036.59 samples/sec Loss 8.7549 Epoch: 4 Global Step: 77550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:27:33,913-Speed 4916.47 samples/sec Loss 8.6493 Epoch: 4 Global Step: 77600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:27:43,879-Speed 5137.69 samples/sec Loss 8.6593 Epoch: 4 Global Step: 77650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:27:53,730-Speed 5197.58 samples/sec Loss 8.6344 Epoch: 4 Global Step: 77700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:28:03,895-Speed 5037.19 samples/sec Loss 8.6207 Epoch: 4 Global Step: 77750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:28:13,934-Speed 5100.45 samples/sec Loss 8.6825 Epoch: 4 Global Step: 77800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:28:23,980-Speed 5096.93 samples/sec Loss 8.6457 Epoch: 4 Global Step: 77850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:28:34,644-Speed 4801.61 samples/sec Loss 8.5773 Epoch: 4 Global Step: 77900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:28:44,699-Speed 5092.54 samples/sec Loss 8.6556 Epoch: 4 Global Step: 77950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:28:54,589-Speed 5176.81 samples/sec Loss 8.6619 Epoch: 4 Global Step: 78000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:29:11,309-[lfw][78000]XNorm: 23.907864 Training: 2021-03-19 01:29:11,309-[lfw][78000]Accuracy-Flip: 0.99500+-0.00316 Training: 2021-03-19 01:29:11,309-[lfw][78000]Accuracy-Highest: 0.99500 Training: 2021-03-19 01:29:29,981-[cfp_fp][78000]XNorm: 19.685779 Training: 2021-03-19 01:29:29,981-[cfp_fp][78000]Accuracy-Flip: 0.89400+-0.01434 Training: 2021-03-19 01:29:29,981-[cfp_fp][78000]Accuracy-Highest: 0.90257 Training: 2021-03-19 01:29:46,134-[agedb_30][78000]XNorm: 22.079918 Training: 2021-03-19 
01:29:46,135-[agedb_30][78000]Accuracy-Flip: 0.93850+-0.01158 Training: 2021-03-19 01:29:46,135-[agedb_30][78000]Accuracy-Highest: 0.94400 Training: 2021-03-19 01:29:55,938-Speed 834.58 samples/sec Loss 8.6880 Epoch: 4 Global Step: 78050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:30:05,973-Speed 5102.64 samples/sec Loss 8.5904 Epoch: 4 Global Step: 78100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:30:15,967-Speed 5123.79 samples/sec Loss 8.8004 Epoch: 4 Global Step: 78150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:30:26,023-Speed 5091.74 samples/sec Loss 8.7313 Epoch: 4 Global Step: 78200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:30:35,905-Speed 5181.61 samples/sec Loss 8.6471 Epoch: 4 Global Step: 78250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:30:46,085-Speed 5029.36 samples/sec Loss 8.6675 Epoch: 4 Global Step: 78300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:30:56,177-Speed 5073.79 samples/sec Loss 8.6505 Epoch: 4 Global Step: 78350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:31:06,108-Speed 5156.10 samples/sec Loss 8.6699 Epoch: 4 Global Step: 78400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:31:16,374-Speed 4987.55 samples/sec Loss 8.5759 Epoch: 4 Global Step: 78450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:31:26,187-Speed 5217.94 samples/sec Loss 8.6466 Epoch: 4 Global Step: 78500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:31:36,163-Speed 5132.58 samples/sec Loss 8.6629 Epoch: 4 Global Step: 78550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:31:46,198-Speed 5102.15 samples/sec Loss 8.6795 Epoch: 4 Global Step: 78600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:31:56,118-Speed 5161.80 samples/sec Loss 8.6686 Epoch: 4 Global Step: 78650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 
2021-03-19 01:32:06,314-Speed 5021.98 samples/sec Loss 8.6436 Epoch: 4 Global Step: 78700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:32:16,399-Speed 5077.30 samples/sec Loss 8.6493 Epoch: 4 Global Step: 78750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:32:26,621-Speed 5008.81 samples/sec Loss 8.6471 Epoch: 4 Global Step: 78800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:32:36,573-Speed 5145.23 samples/sec Loss 8.6722 Epoch: 4 Global Step: 78850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:32:46,681-Speed 5065.86 samples/sec Loss 8.6568 Epoch: 4 Global Step: 78900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:32:56,754-Speed 5082.91 samples/sec Loss 8.6560 Epoch: 4 Global Step: 78950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:33:06,922-Speed 5035.57 samples/sec Loss 8.6951 Epoch: 4 Global Step: 79000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:33:17,069-Speed 5046.08 samples/sec Loss 8.6699 Epoch: 4 Global Step: 79050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:33:26,918-Speed 5198.76 samples/sec Loss 8.7276 Epoch: 4 Global Step: 79100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:33:36,940-Speed 5109.06 samples/sec Loss 8.6583 Epoch: 4 Global Step: 79150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:33:46,956-Speed 5112.49 samples/sec Loss 8.6042 Epoch: 4 Global Step: 79200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:33:56,986-Speed 5104.79 samples/sec Loss 8.6167 Epoch: 4 Global Step: 79250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:34:06,997-Speed 5114.80 samples/sec Loss 8.6725 Epoch: 4 Global Step: 79300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:34:17,108-Speed 5064.03 samples/sec Loss 8.6785 Epoch: 4 Global Step: 79350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 
01:34:26,962-Speed 5196.34 samples/sec Loss 8.6561 Epoch: 4 Global Step: 79400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:34:36,912-Speed 5146.05 samples/sec Loss 8.7233 Epoch: 4 Global Step: 79450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:34:46,692-Speed 5235.43 samples/sec Loss 8.6945 Epoch: 4 Global Step: 79500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:34:56,979-Speed 4977.54 samples/sec Loss 8.6515 Epoch: 4 Global Step: 79550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:35:07,050-Speed 5084.43 samples/sec Loss 8.6241 Epoch: 4 Global Step: 79600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:35:16,879-Speed 5209.24 samples/sec Loss 8.6961 Epoch: 4 Global Step: 79650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:35:26,883-Speed 5118.43 samples/sec Loss 8.6843 Epoch: 4 Global Step: 79700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:35:37,186-Speed 4969.54 samples/sec Loss 8.7394 Epoch: 4 Global Step: 79750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:35:48,191-Speed 4652.79 samples/sec Loss 8.6490 Epoch: 4 Global Step: 79800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:35:58,051-Speed 5192.97 samples/sec Loss 8.6203 Epoch: 4 Global Step: 79850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:36:08,088-Speed 5101.50 samples/sec Loss 8.6467 Epoch: 4 Global Step: 79900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:36:17,963-Speed 5185.09 samples/sec Loss 8.7124 Epoch: 4 Global Step: 79950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:36:27,872-Speed 5167.12 samples/sec Loss 8.6559 Epoch: 4 Global Step: 80000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:36:44,666-[lfw][80000]XNorm: 23.441047 Training: 2021-03-19 01:36:44,666-[lfw][80000]Accuracy-Flip: 0.99233+-0.00423 Training: 2021-03-19 
01:36:44,666-[lfw][80000]Accuracy-Highest: 0.99500 Training: 2021-03-19 01:37:03,366-[cfp_fp][80000]XNorm: 19.004575 Training: 2021-03-19 01:37:03,367-[cfp_fp][80000]Accuracy-Flip: 0.89443+-0.01446 Training: 2021-03-19 01:37:03,367-[cfp_fp][80000]Accuracy-Highest: 0.90257 Training: 2021-03-19 01:37:19,530-[agedb_30][80000]XNorm: 21.995006 Training: 2021-03-19 01:37:19,530-[agedb_30][80000]Accuracy-Flip: 0.93450+-0.01256 Training: 2021-03-19 01:37:19,530-[agedb_30][80000]Accuracy-Highest: 0.94400 Training: 2021-03-19 01:37:29,381-Speed 832.41 samples/sec Loss 8.6762 Epoch: 4 Global Step: 80050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:37:40,257-Speed 4708.06 samples/sec Loss 8.6867 Epoch: 4 Global Step: 80100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:37:50,379-Speed 5058.44 samples/sec Loss 8.5855 Epoch: 4 Global Step: 80150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:38:00,388-Speed 5115.74 samples/sec Loss 8.6447 Epoch: 4 Global Step: 80200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:38:10,322-Speed 5154.11 samples/sec Loss 8.6429 Epoch: 4 Global Step: 80250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:38:20,442-Speed 5060.54 samples/sec Loss 8.6104 Epoch: 4 Global Step: 80300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:38:30,702-Speed 4990.72 samples/sec Loss 8.6340 Epoch: 4 Global Step: 80350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:38:41,491-Speed 4745.65 samples/sec Loss 8.6244 Epoch: 4 Global Step: 80400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:38:51,685-Speed 5022.93 samples/sec Loss 8.6484 Epoch: 4 Global Step: 80450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:39:02,401-Speed 4778.35 samples/sec Loss 8.6821 Epoch: 4 Global Step: 80500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:39:13,155-Speed 4761.09 samples/sec Loss 8.6315 Epoch: 
4 Global Step: 80550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:39:23,231-Speed 5081.74 samples/sec Loss 8.6722 Epoch: 4 Global Step: 80600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:39:33,222-Speed 5124.84 samples/sec Loss 8.5603 Epoch: 4 Global Step: 80650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:39:43,124-Speed 5170.66 samples/sec Loss 8.5934 Epoch: 4 Global Step: 80700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:39:53,820-Speed 4787.31 samples/sec Loss 8.6703 Epoch: 4 Global Step: 80750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:40:03,998-Speed 5030.66 samples/sec Loss 8.6488 Epoch: 4 Global Step: 80800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:40:13,991-Speed 5123.93 samples/sec Loss 8.6585 Epoch: 4 Global Step: 80850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:40:24,824-Speed 4726.69 samples/sec Loss 8.6775 Epoch: 4 Global Step: 80900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:40:34,761-Speed 5152.74 samples/sec Loss 8.6223 Epoch: 4 Global Step: 80950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:40:44,587-Speed 5211.16 samples/sec Loss 8.6647 Epoch: 4 Global Step: 81000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:40:54,543-Speed 5142.90 samples/sec Loss 8.6417 Epoch: 4 Global Step: 81050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:41:04,399-Speed 5194.78 samples/sec Loss 8.6429 Epoch: 4 Global Step: 81100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:41:14,527-Speed 5055.91 samples/sec Loss 8.6149 Epoch: 4 Global Step: 81150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:41:25,227-Speed 4785.13 samples/sec Loss 8.6916 Epoch: 4 Global Step: 81200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:41:35,119-Speed 5176.08 samples/sec Loss 8.6473 Epoch: 4 Global 
Step: 81250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:41:45,266-Speed 5046.42 samples/sec Loss 8.6319 Epoch: 4 Global Step: 81300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:41:55,564-Speed 4971.84 samples/sec Loss 8.6396 Epoch: 4 Global Step: 81350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:42:05,545-Speed 5130.22 samples/sec Loss 8.6200 Epoch: 4 Global Step: 81400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:42:15,544-Speed 5120.81 samples/sec Loss 8.6391 Epoch: 4 Global Step: 81450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:42:25,470-Speed 5158.81 samples/sec Loss 8.6411 Epoch: 4 Global Step: 81500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:42:35,417-Speed 5147.68 samples/sec Loss 8.6661 Epoch: 4 Global Step: 81550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:42:45,516-Speed 5069.71 samples/sec Loss 8.6231 Epoch: 4 Global Step: 81600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:42:55,690-Speed 5032.60 samples/sec Loss 8.6532 Epoch: 4 Global Step: 81650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:43:05,475-Speed 5232.97 samples/sec Loss 8.7006 Epoch: 4 Global Step: 81700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:43:15,293-Speed 5215.29 samples/sec Loss 8.6482 Epoch: 4 Global Step: 81750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:43:25,174-Speed 5181.83 samples/sec Loss 8.6249 Epoch: 4 Global Step: 81800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:43:34,949-Speed 5238.03 samples/sec Loss 8.6302 Epoch: 4 Global Step: 81850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:43:44,865-Speed 5163.70 samples/sec Loss 8.6202 Epoch: 4 Global Step: 81900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:43:54,675-Speed 5219.89 samples/sec Loss 8.7151 Epoch: 4 Global Step: 81950 
Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:44:04,646-Speed 5135.31 samples/sec Loss 8.6299 Epoch: 4 Global Step: 82000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:44:21,446-[lfw][82000]XNorm: 23.039953 Training: 2021-03-19 01:44:21,446-[lfw][82000]Accuracy-Flip: 0.99317+-0.00404 Training: 2021-03-19 01:44:21,446-[lfw][82000]Accuracy-Highest: 0.99500 Training: 2021-03-19 01:44:40,437-[cfp_fp][82000]XNorm: 18.651231 Training: 2021-03-19 01:44:40,437-[cfp_fp][82000]Accuracy-Flip: 0.90429+-0.01519 Training: 2021-03-19 01:44:40,438-[cfp_fp][82000]Accuracy-Highest: 0.90429 Training: 2021-03-19 01:44:56,575-[agedb_30][82000]XNorm: 21.803951 Training: 2021-03-19 01:44:56,575-[agedb_30][82000]Accuracy-Flip: 0.94033+-0.01215 Training: 2021-03-19 01:44:56,576-[agedb_30][82000]Accuracy-Highest: 0.94400 Training: 2021-03-19 01:45:06,348-Speed 829.80 samples/sec Loss 8.5874 Epoch: 4 Global Step: 82050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:45:16,323-Speed 5133.34 samples/sec Loss 8.7165 Epoch: 4 Global Step: 82100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:45:26,202-Speed 5182.82 samples/sec Loss 8.6822 Epoch: 4 Global Step: 82150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:45:36,257-Speed 5092.10 samples/sec Loss 8.6733 Epoch: 4 Global Step: 82200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:45:46,154-Speed 5173.72 samples/sec Loss 8.5951 Epoch: 4 Global Step: 82250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:45:56,124-Speed 5135.82 samples/sec Loss 8.6485 Epoch: 4 Global Step: 82300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:46:05,958-Speed 5206.35 samples/sec Loss 8.6172 Epoch: 4 Global Step: 82350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:46:15,796-Speed 5204.88 samples/sec Loss 8.5745 Epoch: 4 Global Step: 82400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 
2021-03-19 01:46:25,558-Speed 5245.07 samples/sec Loss 8.6502 Epoch: 4 Global Step: 82450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:46:35,498-Speed 5151.29 samples/sec Loss 8.7217 Epoch: 4 Global Step: 82500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:46:45,365-Speed 5189.55 samples/sec Loss 8.6686 Epoch: 4 Global Step: 82550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:46:55,373-Speed 5115.87 samples/sec Loss 8.6481 Epoch: 4 Global Step: 82600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:47:05,211-Speed 5204.47 samples/sec Loss 8.6484 Epoch: 4 Global Step: 82650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:47:15,033-Speed 5213.14 samples/sec Loss 8.5704 Epoch: 4 Global Step: 82700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:47:25,036-Speed 5118.93 samples/sec Loss 8.6614 Epoch: 4 Global Step: 82750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:47:35,058-Speed 5109.01 samples/sec Loss 8.6768 Epoch: 4 Global Step: 82800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:47:45,048-Speed 5125.33 samples/sec Loss 8.6817 Epoch: 4 Global Step: 82850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:47:55,001-Speed 5144.60 samples/sec Loss 8.6215 Epoch: 4 Global Step: 82900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:48:04,937-Speed 5153.13 samples/sec Loss 8.6385 Epoch: 4 Global Step: 82950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:48:14,881-Speed 5149.35 samples/sec Loss 8.7321 Epoch: 4 Global Step: 83000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:48:25,828-Speed 4677.21 samples/sec Loss 8.6161 Epoch: 4 Global Step: 83050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:48:35,894-Speed 5086.98 samples/sec Loss 8.7063 Epoch: 4 Global Step: 83100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 
01:48:45,708-Speed 5217.51 samples/sec Loss 8.6466 Epoch: 4 Global Step: 83150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:48:55,600-Speed 5176.30 samples/sec Loss 8.6559 Epoch: 4 Global Step: 83200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:49:05,603-Speed 5118.42 samples/sec Loss 8.6594 Epoch: 4 Global Step: 83250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:49:15,564-Speed 5140.56 samples/sec Loss 8.6149 Epoch: 4 Global Step: 83300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:49:25,361-Speed 5226.25 samples/sec Loss 8.5893 Epoch: 4 Global Step: 83350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:49:35,945-Speed 4837.90 samples/sec Loss 8.6258 Epoch: 4 Global Step: 83400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:49:46,135-Speed 5024.83 samples/sec Loss 8.6379 Epoch: 4 Global Step: 83450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:50:08,973-Speed 2241.91 samples/sec Loss 7.9631 Epoch: 5 Global Step: 83500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:50:19,192-Speed 5010.63 samples/sec Loss 7.8377 Epoch: 5 Global Step: 83550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:50:29,458-Speed 4987.94 samples/sec Loss 7.8409 Epoch: 5 Global Step: 83600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:50:39,744-Speed 4978.23 samples/sec Loss 7.9289 Epoch: 5 Global Step: 83650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:50:49,806-Speed 5088.71 samples/sec Loss 7.8864 Epoch: 5 Global Step: 83700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:51:01,145-Speed 4515.77 samples/sec Loss 7.8814 Epoch: 5 Global Step: 83750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:51:13,281-Speed 4218.82 samples/sec Loss 7.9819 Epoch: 5 Global Step: 83800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 
01:51:23,427-Speed 5046.77 samples/sec Loss 7.9808 Epoch: 5 Global Step: 83850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:51:33,334-Speed 5168.51 samples/sec Loss 7.9956 Epoch: 5 Global Step: 83900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:51:43,264-Speed 5156.14 samples/sec Loss 8.0273 Epoch: 5 Global Step: 83950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:51:53,148-Speed 5180.20 samples/sec Loss 8.0223 Epoch: 5 Global Step: 84000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:52:09,870-[lfw][84000]XNorm: 23.316071 Training: 2021-03-19 01:52:09,870-[lfw][84000]Accuracy-Flip: 0.99283+-0.00358 Training: 2021-03-19 01:52:09,870-[lfw][84000]Accuracy-Highest: 0.99500 Training: 2021-03-19 01:52:28,536-[cfp_fp][84000]XNorm: 19.177757 Training: 2021-03-19 01:52:28,536-[cfp_fp][84000]Accuracy-Flip: 0.89257+-0.01701 Training: 2021-03-19 01:52:28,536-[cfp_fp][84000]Accuracy-Highest: 0.90429 Training: 2021-03-19 01:52:44,592-[agedb_30][84000]XNorm: 22.321544 Training: 2021-03-19 01:52:44,592-[agedb_30][84000]Accuracy-Flip: 0.93817+-0.01055 Training: 2021-03-19 01:52:44,592-[agedb_30][84000]Accuracy-Highest: 0.94400 Training: 2021-03-19 01:52:55,117-Speed 826.23 samples/sec Loss 8.1096 Epoch: 5 Global Step: 84050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:53:05,303-Speed 5026.79 samples/sec Loss 8.1089 Epoch: 5 Global Step: 84100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:53:15,907-Speed 4828.53 samples/sec Loss 8.1224 Epoch: 5 Global Step: 84150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:53:25,871-Speed 5139.25 samples/sec Loss 8.1217 Epoch: 5 Global Step: 84200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:53:35,739-Speed 5188.90 samples/sec Loss 8.1029 Epoch: 5 Global Step: 84250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:53:45,756-Speed 5111.63 samples/sec Loss 8.1565 Epoch: 5 
Global Step: 84300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:53:55,664-Speed 5168.00 samples/sec Loss 8.1400 Epoch: 5 Global Step: 84350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:54:05,623-Speed 5141.22 samples/sec Loss 8.1645 Epoch: 5 Global Step: 84400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:54:16,403-Speed 4749.70 samples/sec Loss 8.1633 Epoch: 5 Global Step: 84450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:54:26,620-Speed 5011.36 samples/sec Loss 8.1655 Epoch: 5 Global Step: 84500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:54:36,591-Speed 5135.26 samples/sec Loss 8.1866 Epoch: 5 Global Step: 84550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:54:46,572-Speed 5130.23 samples/sec Loss 8.2527 Epoch: 5 Global Step: 84600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:54:56,535-Speed 5139.18 samples/sec Loss 8.2182 Epoch: 5 Global Step: 84650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:55:06,290-Speed 5248.64 samples/sec Loss 8.1546 Epoch: 5 Global Step: 84700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:55:16,290-Speed 5120.55 samples/sec Loss 8.2666 Epoch: 5 Global Step: 84750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:55:26,102-Speed 5218.40 samples/sec Loss 8.2555 Epoch: 5 Global Step: 84800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:55:35,975-Speed 5185.92 samples/sec Loss 8.2320 Epoch: 5 Global Step: 84850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:55:45,904-Speed 5157.01 samples/sec Loss 8.2499 Epoch: 5 Global Step: 84900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:55:56,047-Speed 5048.04 samples/sec Loss 8.2422 Epoch: 5 Global Step: 84950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:56:06,251-Speed 5018.16 samples/sec Loss 8.3460 Epoch: 5 Global 
Step: 85000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:56:16,111-Speed 5192.54 samples/sec Loss 8.2883 Epoch: 5 Global Step: 85050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:56:26,188-Speed 5081.38 samples/sec Loss 8.3103 Epoch: 5 Global Step: 85100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:56:36,105-Speed 5162.94 samples/sec Loss 8.3882 Epoch: 5 Global Step: 85150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:56:46,265-Speed 5039.68 samples/sec Loss 8.3114 Epoch: 5 Global Step: 85200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:56:56,224-Speed 5141.37 samples/sec Loss 8.2826 Epoch: 5 Global Step: 85250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:57:05,875-Speed 5305.67 samples/sec Loss 8.3512 Epoch: 5 Global Step: 85300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:57:15,822-Speed 5147.44 samples/sec Loss 8.3392 Epoch: 5 Global Step: 85350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:57:25,842-Speed 5110.18 samples/sec Loss 8.3597 Epoch: 5 Global Step: 85400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:57:35,990-Speed 5045.56 samples/sec Loss 8.3308 Epoch: 5 Global Step: 85450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:57:46,203-Speed 5013.55 samples/sec Loss 8.3026 Epoch: 5 Global Step: 85500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:57:56,263-Speed 5090.03 samples/sec Loss 8.3369 Epoch: 5 Global Step: 85550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:58:06,407-Speed 5047.58 samples/sec Loss 8.3502 Epoch: 5 Global Step: 85600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:58:16,229-Speed 5213.01 samples/sec Loss 8.3680 Epoch: 5 Global Step: 85650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:58:26,064-Speed 5206.25 samples/sec Loss 8.3517 Epoch: 5 Global Step: 85700 
Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:58:35,975-Speed 5166.29 samples/sec Loss 8.3921 Epoch: 5 Global Step: 85750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:58:45,934-Speed 5141.29 samples/sec Loss 8.3502 Epoch: 5 Global Step: 85800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:58:55,758-Speed 5212.07 samples/sec Loss 8.4047 Epoch: 5 Global Step: 85850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:59:05,749-Speed 5124.99 samples/sec Loss 8.3262 Epoch: 5 Global Step: 85900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:59:15,764-Speed 5112.70 samples/sec Loss 8.3531 Epoch: 5 Global Step: 85950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:59:25,682-Speed 5162.49 samples/sec Loss 8.3375 Epoch: 5 Global Step: 86000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 01:59:42,489-[lfw][86000]XNorm: 23.793640 Training: 2021-03-19 01:59:42,490-[lfw][86000]Accuracy-Flip: 0.99167+-0.00465 Training: 2021-03-19 01:59:42,490-[lfw][86000]Accuracy-Highest: 0.99500 Training: 2021-03-19 02:00:01,101-[cfp_fp][86000]XNorm: 18.600172 Training: 2021-03-19 02:00:01,101-[cfp_fp][86000]Accuracy-Flip: 0.88129+-0.01669 Training: 2021-03-19 02:00:01,101-[cfp_fp][86000]Accuracy-Highest: 0.90429 Training: 2021-03-19 02:00:17,320-[agedb_30][86000]XNorm: 21.586241 Training: 2021-03-19 02:00:17,320-[agedb_30][86000]Accuracy-Flip: 0.92850+-0.01357 Training: 2021-03-19 02:00:17,321-[agedb_30][86000]Accuracy-Highest: 0.94400 Training: 2021-03-19 02:00:27,193-Speed 832.39 samples/sec Loss 8.4432 Epoch: 5 Global Step: 86050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:00:36,914-Speed 5267.24 samples/sec Loss 8.3726 Epoch: 5 Global Step: 86100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:00:46,788-Speed 5185.62 samples/sec Loss 8.3442 Epoch: 5 Global Step: 86150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 
2021-03-19 02:00:56,725-Speed 5152.73 samples/sec Loss 8.3988 Epoch: 5 Global Step: 86200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:01:06,394-Speed 5295.76 samples/sec Loss 8.4320 Epoch: 5 Global Step: 86250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:01:16,480-Speed 5076.27 samples/sec Loss 8.3940 Epoch: 5 Global Step: 86300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:01:27,385-Speed 4695.43 samples/sec Loss 8.4774 Epoch: 5 Global Step: 86350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:01:37,184-Speed 5225.29 samples/sec Loss 8.4015 Epoch: 5 Global Step: 86400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:01:47,071-Speed 5179.02 samples/sec Loss 8.3815 Epoch: 5 Global Step: 86450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:01:56,928-Speed 5194.71 samples/sec Loss 8.4560 Epoch: 5 Global Step: 86500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:02:07,262-Speed 4954.76 samples/sec Loss 8.3949 Epoch: 5 Global Step: 86550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:02:17,426-Speed 5037.61 samples/sec Loss 8.3793 Epoch: 5 Global Step: 86600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:02:27,810-Speed 4931.30 samples/sec Loss 8.4419 Epoch: 5 Global Step: 86650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:02:38,478-Speed 4799.39 samples/sec Loss 8.4332 Epoch: 5 Global Step: 86700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:02:48,527-Speed 5095.58 samples/sec Loss 8.3545 Epoch: 5 Global Step: 86750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:02:58,484-Speed 5142.74 samples/sec Loss 8.4537 Epoch: 5 Global Step: 86800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:03:08,283-Speed 5225.29 samples/sec Loss 8.5116 Epoch: 5 Global Step: 86850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 
02:03:18,177-Speed 5175.11 samples/sec Loss 8.4187 Epoch: 5 Global Step: 86900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:03:27,933-Speed 5248.62 samples/sec Loss 8.4384 Epoch: 5 Global Step: 86950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:03:37,553-Speed 5322.52 samples/sec Loss 8.5398 Epoch: 5 Global Step: 87000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:03:48,134-Speed 4839.44 samples/sec Loss 8.4745 Epoch: 5 Global Step: 87050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:03:59,182-Speed 4634.49 samples/sec Loss 8.5026 Epoch: 5 Global Step: 87100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:04:09,948-Speed 4756.24 samples/sec Loss 8.4480 Epoch: 5 Global Step: 87150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:04:19,794-Speed 5200.26 samples/sec Loss 8.5211 Epoch: 5 Global Step: 87200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:04:29,931-Speed 5050.79 samples/sec Loss 8.4431 Epoch: 5 Global Step: 87250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:04:39,706-Speed 5238.52 samples/sec Loss 8.4989 Epoch: 5 Global Step: 87300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:04:49,616-Speed 5166.59 samples/sec Loss 8.4563 Epoch: 5 Global Step: 87350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:05:00,367-Speed 4762.78 samples/sec Loss 8.4664 Epoch: 5 Global Step: 87400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:05:10,398-Speed 5104.46 samples/sec Loss 8.5650 Epoch: 5 Global Step: 87450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:05:21,158-Speed 4758.57 samples/sec Loss 8.4659 Epoch: 5 Global Step: 87500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:05:31,123-Speed 5138.00 samples/sec Loss 8.4678 Epoch: 5 Global Step: 87550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 
02:05:41,059-Speed 5153.39 samples/sec Loss 8.4280 Epoch: 5 Global Step: 87600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:05:50,793-Speed 5259.88 samples/sec Loss 8.4767 Epoch: 5 Global Step: 87650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:06:00,511-Speed 5269.50 samples/sec Loss 8.4750 Epoch: 5 Global Step: 87700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:06:11,101-Speed 4834.96 samples/sec Loss 8.5201 Epoch: 5 Global Step: 87750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:06:21,075-Speed 5133.32 samples/sec Loss 8.5858 Epoch: 5 Global Step: 87800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:06:31,120-Speed 5097.65 samples/sec Loss 8.5093 Epoch: 5 Global Step: 87850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:06:40,967-Speed 5199.92 samples/sec Loss 8.6007 Epoch: 5 Global Step: 87900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:06:51,003-Speed 5101.45 samples/sec Loss 8.5004 Epoch: 5 Global Step: 87950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:07:00,811-Speed 5220.89 samples/sec Loss 8.5256 Epoch: 5 Global Step: 88000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:07:17,495-[lfw][88000]XNorm: 23.077036 Training: 2021-03-19 02:07:17,495-[lfw][88000]Accuracy-Flip: 0.99250+-0.00359 Training: 2021-03-19 02:07:17,496-[lfw][88000]Accuracy-Highest: 0.99500 Training: 2021-03-19 02:07:36,002-[cfp_fp][88000]XNorm: 18.454289 Training: 2021-03-19 02:07:36,003-[cfp_fp][88000]Accuracy-Flip: 0.90257+-0.01697 Training: 2021-03-19 02:07:36,003-[cfp_fp][88000]Accuracy-Highest: 0.90429 Training: 2021-03-19 02:07:52,080-[agedb_30][88000]XNorm: 21.107947 Training: 2021-03-19 02:07:52,080-[agedb_30][88000]Accuracy-Flip: 0.93633+-0.01318 Training: 2021-03-19 02:07:52,080-[agedb_30][88000]Accuracy-Highest: 0.94400 Training: 2021-03-19 02:08:01,701-Speed 840.87 samples/sec Loss 8.5494 Epoch: 5 
Global Step: 88050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:08:11,788-Speed 5075.72 samples/sec Loss 8.5971 Epoch: 5 Global Step: 88100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:08:21,835-Speed 5096.47 samples/sec Loss 8.5734 Epoch: 5 Global Step: 88150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:08:31,814-Speed 5131.03 samples/sec Loss 8.5563 Epoch: 5 Global Step: 88200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:08:41,815-Speed 5119.94 samples/sec Loss 8.4940 Epoch: 5 Global Step: 88250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:08:51,764-Speed 5146.34 samples/sec Loss 8.5167 Epoch: 5 Global Step: 88300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:09:01,838-Speed 5082.44 samples/sec Loss 8.5519 Epoch: 5 Global Step: 88350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:09:11,671-Speed 5207.28 samples/sec Loss 8.4520 Epoch: 5 Global Step: 88400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:09:21,606-Speed 5154.08 samples/sec Loss 8.5344 Epoch: 5 Global Step: 88450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:09:31,466-Speed 5193.12 samples/sec Loss 8.5558 Epoch: 5 Global Step: 88500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:09:41,298-Speed 5207.75 samples/sec Loss 8.5684 Epoch: 5 Global Step: 88550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:09:51,225-Speed 5157.77 samples/sec Loss 8.5251 Epoch: 5 Global Step: 88600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:10:01,124-Speed 5173.11 samples/sec Loss 8.6311 Epoch: 5 Global Step: 88650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:10:10,783-Speed 5300.73 samples/sec Loss 8.5849 Epoch: 5 Global Step: 88700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:10:20,552-Speed 5241.45 samples/sec Loss 8.5095 Epoch: 5 Global 
Step: 88750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:10:30,493-Speed 5150.86 samples/sec Loss 8.5456 Epoch: 5 Global Step: 88800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:10:40,353-Speed 5192.75 samples/sec Loss 8.5214 Epoch: 5 Global Step: 88850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:10:50,236-Speed 5181.05 samples/sec Loss 8.4995 Epoch: 5 Global Step: 88900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:11:00,268-Speed 5103.87 samples/sec Loss 8.6401 Epoch: 5 Global Step: 88950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:11:10,255-Speed 5127.35 samples/sec Loss 8.5319 Epoch: 5 Global Step: 89000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:11:20,242-Speed 5127.01 samples/sec Loss 8.5558 Epoch: 5 Global Step: 89050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:11:29,988-Speed 5253.92 samples/sec Loss 8.5684 Epoch: 5 Global Step: 89100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:11:39,783-Speed 5227.20 samples/sec Loss 8.5469 Epoch: 5 Global Step: 89150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:11:49,646-Speed 5191.75 samples/sec Loss 8.5351 Epoch: 5 Global Step: 89200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:11:59,515-Speed 5188.41 samples/sec Loss 8.5597 Epoch: 5 Global Step: 89250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:12:09,467-Speed 5144.64 samples/sec Loss 8.5729 Epoch: 5 Global Step: 89300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:12:19,380-Speed 5165.47 samples/sec Loss 8.5884 Epoch: 5 Global Step: 89350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:12:29,271-Speed 5176.47 samples/sec Loss 8.5536 Epoch: 5 Global Step: 89400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:12:39,145-Speed 5185.80 samples/sec Loss 8.5778 Epoch: 5 Global Step: 89450 
Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:12:49,927-Speed 4748.87 samples/sec Loss 8.6000 Epoch: 5 Global Step: 89500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:13:00,012-Speed 5077.33 samples/sec Loss 8.5578 Epoch: 5 Global Step: 89550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:13:09,921-Speed 5167.21 samples/sec Loss 8.6251 Epoch: 5 Global Step: 89600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:13:19,875-Speed 5143.79 samples/sec Loss 8.5770 Epoch: 5 Global Step: 89650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:13:30,050-Speed 5032.47 samples/sec Loss 8.5277 Epoch: 5 Global Step: 89700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:13:40,016-Speed 5137.43 samples/sec Loss 8.6044 Epoch: 5 Global Step: 89750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:13:50,072-Speed 5092.09 samples/sec Loss 8.5110 Epoch: 5 Global Step: 89800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:13:59,835-Speed 5244.71 samples/sec Loss 8.5698 Epoch: 5 Global Step: 89850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:14:10,192-Speed 4943.61 samples/sec Loss 8.6023 Epoch: 5 Global Step: 89900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:14:20,193-Speed 5119.95 samples/sec Loss 8.5630 Epoch: 5 Global Step: 89950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:14:30,777-Speed 4837.59 samples/sec Loss 8.4461 Epoch: 5 Global Step: 90000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:14:46,852-[lfw][90000]XNorm: 25.830081 Training: 2021-03-19 02:14:46,852-[lfw][90000]Accuracy-Flip: 0.99367+-0.00348 Training: 2021-03-19 02:14:46,852-[lfw][90000]Accuracy-Highest: 0.99500 Training: 2021-03-19 02:15:05,588-[cfp_fp][90000]XNorm: 20.559939 Training: 2021-03-19 02:15:05,589-[cfp_fp][90000]Accuracy-Flip: 0.89214+-0.01960 Training: 2021-03-19 
02:15:05,589-[cfp_fp][90000]Accuracy-Highest: 0.90429 Training: 2021-03-19 02:15:21,798-[agedb_30][90000]XNorm: 24.055389 Training: 2021-03-19 02:15:21,799-[agedb_30][90000]Accuracy-Flip: 0.93533+-0.01445 Training: 2021-03-19 02:15:21,799-[agedb_30][90000]Accuracy-Highest: 0.94400 Training: 2021-03-19 02:15:31,558-Speed 842.38 samples/sec Loss 8.6072 Epoch: 5 Global Step: 90050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:15:41,507-Speed 5146.50 samples/sec Loss 8.5954 Epoch: 5 Global Step: 90100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:15:51,382-Speed 5185.30 samples/sec Loss 8.4839 Epoch: 5 Global Step: 90150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:16:01,280-Speed 5173.09 samples/sec Loss 8.5365 Epoch: 5 Global Step: 90200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:16:12,223-Speed 4678.92 samples/sec Loss 8.6031 Epoch: 5 Global Step: 90250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:16:22,156-Speed 5154.94 samples/sec Loss 8.6380 Epoch: 5 Global Step: 90300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:16:32,119-Speed 5139.30 samples/sec Loss 8.5702 Epoch: 5 Global Step: 90350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-19 02:16:42,212-Speed 5073.33 samples/sec Loss 8.5890 Epoch: 5 Global Step: 90400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:16:52,935-Speed 4775.12 samples/sec Loss 8.6085 Epoch: 5 Global Step: 90450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:17:03,723-Speed 4746.18 samples/sec Loss 8.6094 Epoch: 5 Global Step: 90500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:17:13,764-Speed 5099.51 samples/sec Loss 8.6172 Epoch: 5 Global Step: 90550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:17:24,025-Speed 4989.99 samples/sec Loss 8.4710 Epoch: 5 Global Step: 90600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 
2021-03-19 02:17:34,877-Speed 4718.07 samples/sec Loss 8.5574 Epoch: 5 Global Step: 90650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:17:44,858-Speed 5130.44 samples/sec Loss 8.5829 Epoch: 5 Global Step: 90700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:17:55,006-Speed 5045.67 samples/sec Loss 8.5684 Epoch: 5 Global Step: 90750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:18:05,649-Speed 4810.76 samples/sec Loss 8.5717 Epoch: 5 Global Step: 90800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:18:15,415-Speed 5243.03 samples/sec Loss 8.4654 Epoch: 5 Global Step: 90850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:18:25,227-Speed 5218.74 samples/sec Loss 8.5681 Epoch: 5 Global Step: 90900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:18:35,343-Speed 5061.61 samples/sec Loss 8.5253 Epoch: 5 Global Step: 90950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:18:45,524-Speed 5028.97 samples/sec Loss 8.5731 Epoch: 5 Global Step: 91000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:18:56,061-Speed 4859.58 samples/sec Loss 8.5684 Epoch: 5 Global Step: 91050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:19:05,994-Speed 5154.45 samples/sec Loss 8.5698 Epoch: 5 Global Step: 91100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:19:15,901-Speed 5168.46 samples/sec Loss 8.5698 Epoch: 5 Global Step: 91150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:19:25,955-Speed 5092.59 samples/sec Loss 8.5809 Epoch: 5 Global Step: 91200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:19:36,287-Speed 4955.85 samples/sec Loss 8.5199 Epoch: 5 Global Step: 91250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:19:46,031-Speed 5255.14 samples/sec Loss 8.5819 Epoch: 5 Global Step: 91300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 
02:19:55,798-Speed 5242.27 samples/sec Loss 8.6030 Epoch: 5 Global Step: 91350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:20:05,689-Speed 5176.57 samples/sec Loss 8.5871 Epoch: 5 Global Step: 91400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:20:15,556-Speed 5189.32 samples/sec Loss 8.5980 Epoch: 5 Global Step: 91450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:20:25,525-Speed 5136.72 samples/sec Loss 8.6650 Epoch: 5 Global Step: 91500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:20:35,458-Speed 5154.37 samples/sec Loss 8.6420 Epoch: 5 Global Step: 91550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:20:45,143-Speed 5286.90 samples/sec Loss 8.5835 Epoch: 5 Global Step: 91600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:20:55,016-Speed 5185.96 samples/sec Loss 8.6234 Epoch: 5 Global Step: 91650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:21:04,938-Speed 5160.92 samples/sec Loss 8.6118 Epoch: 5 Global Step: 91700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:21:14,954-Speed 5112.01 samples/sec Loss 8.6284 Epoch: 5 Global Step: 91750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:21:24,931-Speed 5131.91 samples/sec Loss 8.5124 Epoch: 5 Global Step: 91800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:21:34,939-Speed 5116.22 samples/sec Loss 8.5656 Epoch: 5 Global Step: 91850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:21:44,852-Speed 5165.68 samples/sec Loss 8.5330 Epoch: 5 Global Step: 91900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:21:54,782-Speed 5156.15 samples/sec Loss 8.5862 Epoch: 5 Global Step: 91950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:22:04,750-Speed 5136.86 samples/sec Loss 8.6050 Epoch: 5 Global Step: 92000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 
02:22:21,442-[lfw][92000]XNorm: 23.548056 Training: 2021-03-19 02:22:21,442-[lfw][92000]Accuracy-Flip: 0.99217+-0.00533 Training: 2021-03-19 02:22:21,442-[lfw][92000]Accuracy-Highest: 0.99500 Training: 2021-03-19 02:22:40,060-[cfp_fp][92000]XNorm: 18.716073 Training: 2021-03-19 02:22:40,060-[cfp_fp][92000]Accuracy-Flip: 0.89486+-0.01611 Training: 2021-03-19 02:22:40,060-[cfp_fp][92000]Accuracy-Highest: 0.90429 Training: 2021-03-19 02:22:56,109-[agedb_30][92000]XNorm: 22.157592 Training: 2021-03-19 02:22:56,109-[agedb_30][92000]Accuracy-Flip: 0.94417+-0.01243 Training: 2021-03-19 02:22:56,109-[agedb_30][92000]Accuracy-Highest: 0.94417 Training: 2021-03-19 02:23:05,916-Speed 837.07 samples/sec Loss 8.6221 Epoch: 5 Global Step: 92050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:23:15,826-Speed 5166.99 samples/sec Loss 8.6255 Epoch: 5 Global Step: 92100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:23:25,727-Speed 5171.51 samples/sec Loss 8.5633 Epoch: 5 Global Step: 92150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:23:35,684-Speed 5142.26 samples/sec Loss 8.5981 Epoch: 5 Global Step: 92200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:23:45,796-Speed 5063.53 samples/sec Loss 8.5890 Epoch: 5 Global Step: 92250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:23:55,907-Speed 5064.24 samples/sec Loss 8.5512 Epoch: 5 Global Step: 92300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:24:06,005-Speed 5070.63 samples/sec Loss 8.5950 Epoch: 5 Global Step: 92350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:24:16,221-Speed 5012.02 samples/sec Loss 8.5971 Epoch: 5 Global Step: 92400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:24:26,191-Speed 5135.41 samples/sec Loss 8.6224 Epoch: 5 Global Step: 92450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:24:36,280-Speed 5075.59 samples/sec Loss 8.6315 Epoch: 5 
Global Step: 92500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:24:46,348-Speed 5085.61 samples/sec Loss 8.5467 Epoch: 5 Global Step: 92550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:24:56,662-Speed 4964.18 samples/sec Loss 8.5608 Epoch: 5 Global Step: 92600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:25:06,854-Speed 5024.15 samples/sec Loss 8.5634 Epoch: 5 Global Step: 92650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:25:16,720-Speed 5189.73 samples/sec Loss 8.5552 Epoch: 5 Global Step: 92700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:25:26,673-Speed 5144.77 samples/sec Loss 8.5688 Epoch: 5 Global Step: 92750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:25:37,520-Speed 4720.62 samples/sec Loss 8.6292 Epoch: 5 Global Step: 92800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:25:47,404-Speed 5180.20 samples/sec Loss 8.6269 Epoch: 5 Global Step: 92850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:25:57,043-Speed 5312.51 samples/sec Loss 8.5307 Epoch: 5 Global Step: 92900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:26:07,008-Speed 5138.07 samples/sec Loss 8.5778 Epoch: 5 Global Step: 92950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:26:16,737-Speed 5262.96 samples/sec Loss 8.6040 Epoch: 5 Global Step: 93000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:26:26,970-Speed 5003.62 samples/sec Loss 8.5954 Epoch: 5 Global Step: 93050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:26:37,009-Speed 5100.36 samples/sec Loss 8.5053 Epoch: 5 Global Step: 93100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:26:46,761-Speed 5250.51 samples/sec Loss 8.6262 Epoch: 5 Global Step: 93150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:26:56,749-Speed 5126.41 samples/sec Loss 8.6325 Epoch: 5 Global 
Step: 93200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:27:06,782-Speed 5103.36 samples/sec Loss 8.5928 Epoch: 5 Global Step: 93250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:27:17,575-Speed 4744.45 samples/sec Loss 8.6004 Epoch: 5 Global Step: 93300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:27:27,345-Speed 5240.78 samples/sec Loss 8.6050 Epoch: 5 Global Step: 93350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:27:37,226-Speed 5182.35 samples/sec Loss 8.6462 Epoch: 5 Global Step: 93400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:27:47,201-Speed 5132.75 samples/sec Loss 8.6288 Epoch: 5 Global Step: 93450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:27:57,063-Speed 5192.19 samples/sec Loss 8.5937 Epoch: 5 Global Step: 93500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:28:07,053-Speed 5125.08 samples/sec Loss 8.5756 Epoch: 5 Global Step: 93550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:28:17,644-Speed 4834.77 samples/sec Loss 8.6097 Epoch: 5 Global Step: 93600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:28:27,518-Speed 5185.80 samples/sec Loss 8.6475 Epoch: 5 Global Step: 93650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:28:37,297-Speed 5235.78 samples/sec Loss 8.5138 Epoch: 5 Global Step: 93700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:28:47,391-Speed 5072.68 samples/sec Loss 8.6010 Epoch: 5 Global Step: 93750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:28:58,377-Speed 4660.57 samples/sec Loss 8.6549 Epoch: 5 Global Step: 93800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:29:09,324-Speed 4677.50 samples/sec Loss 8.6053 Epoch: 5 Global Step: 93850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:29:19,336-Speed 5114.01 samples/sec Loss 8.6544 Epoch: 5 Global Step: 93900 
Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:29:29,509-Speed 5033.61 samples/sec Loss 8.6624 Epoch: 5 Global Step: 93950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:29:40,322-Speed 4735.21 samples/sec Loss 8.6314 Epoch: 5 Global Step: 94000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:29:56,881-[lfw][94000]XNorm: 23.027583 Training: 2021-03-19 02:29:56,882-[lfw][94000]Accuracy-Flip: 0.99233+-0.00517 Training: 2021-03-19 02:29:56,882-[lfw][94000]Accuracy-Highest: 0.99500 Training: 2021-03-19 02:30:15,576-[cfp_fp][94000]XNorm: 18.744997 Training: 2021-03-19 02:30:15,577-[cfp_fp][94000]Accuracy-Flip: 0.89286+-0.01601 Training: 2021-03-19 02:30:15,577-[cfp_fp][94000]Accuracy-Highest: 0.90429 Training: 2021-03-19 02:30:31,732-[agedb_30][94000]XNorm: 21.935573 Training: 2021-03-19 02:30:31,732-[agedb_30][94000]Accuracy-Flip: 0.93950+-0.01362 Training: 2021-03-19 02:30:31,732-[agedb_30][94000]Accuracy-Highest: 0.94417 Training: 2021-03-19 02:30:41,574-Speed 835.89 samples/sec Loss 8.5890 Epoch: 5 Global Step: 94050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:30:51,556-Speed 5129.80 samples/sec Loss 8.6000 Epoch: 5 Global Step: 94100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:31:02,235-Speed 4794.76 samples/sec Loss 8.5862 Epoch: 5 Global Step: 94150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:31:12,039-Speed 5222.62 samples/sec Loss 8.6897 Epoch: 5 Global Step: 94200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:31:21,965-Speed 5158.36 samples/sec Loss 8.5881 Epoch: 5 Global Step: 94250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:31:31,901-Speed 5153.51 samples/sec Loss 8.5729 Epoch: 5 Global Step: 94300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:31:42,653-Speed 4761.96 samples/sec Loss 8.6183 Epoch: 5 Global Step: 94350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 
2021-03-19 02:31:52,388-Speed 5259.43 samples/sec Loss 8.6383 Epoch: 5 Global Step: 94400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:32:02,054-Speed 5297.68 samples/sec Loss 8.6448 Epoch: 5 Global Step: 94450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:32:12,241-Speed 5026.42 samples/sec Loss 8.5605 Epoch: 5 Global Step: 94500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:32:22,355-Speed 5062.28 samples/sec Loss 8.6118 Epoch: 5 Global Step: 94550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:32:32,481-Speed 5056.66 samples/sec Loss 8.5970 Epoch: 5 Global Step: 94600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:32:42,525-Speed 5097.82 samples/sec Loss 8.5948 Epoch: 5 Global Step: 94650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:32:52,419-Speed 5175.25 samples/sec Loss 8.6439 Epoch: 5 Global Step: 94700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:33:02,364-Speed 5148.63 samples/sec Loss 8.5524 Epoch: 5 Global Step: 94750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:33:12,593-Speed 5005.67 samples/sec Loss 8.5752 Epoch: 5 Global Step: 94800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:33:22,266-Speed 5293.40 samples/sec Loss 8.5882 Epoch: 5 Global Step: 94850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:33:32,090-Speed 5211.99 samples/sec Loss 8.5438 Epoch: 5 Global Step: 94900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:33:42,098-Speed 5116.29 samples/sec Loss 8.5979 Epoch: 5 Global Step: 94950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:33:51,892-Speed 5227.78 samples/sec Loss 8.5461 Epoch: 5 Global Step: 95000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:34:01,801-Speed 5167.65 samples/sec Loss 8.5847 Epoch: 5 Global Step: 95050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 
02:34:11,790-Speed 5126.07 samples/sec Loss 8.6594 Epoch: 5 Global Step: 95100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:34:21,788-Speed 5121.31 samples/sec Loss 8.6203 Epoch: 5 Global Step: 95150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:34:31,388-Speed 5333.42 samples/sec Loss 8.5505 Epoch: 5 Global Step: 95200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:34:41,222-Speed 5206.95 samples/sec Loss 8.6024 Epoch: 5 Global Step: 95250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:34:51,250-Speed 5105.83 samples/sec Loss 8.5885 Epoch: 5 Global Step: 95300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:35:00,895-Speed 5309.30 samples/sec Loss 8.6030 Epoch: 5 Global Step: 95350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:35:11,134-Speed 5000.57 samples/sec Loss 8.5996 Epoch: 5 Global Step: 95400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:35:20,976-Speed 5202.42 samples/sec Loss 8.5592 Epoch: 5 Global Step: 95450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:35:30,990-Speed 5113.04 samples/sec Loss 8.6299 Epoch: 5 Global Step: 95500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:35:40,944-Speed 5144.41 samples/sec Loss 8.5834 Epoch: 5 Global Step: 95550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:35:50,954-Speed 5115.08 samples/sec Loss 8.5594 Epoch: 5 Global Step: 95600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:36:00,887-Speed 5154.57 samples/sec Loss 8.6800 Epoch: 5 Global Step: 95650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:36:10,954-Speed 5086.14 samples/sec Loss 8.6222 Epoch: 5 Global Step: 95700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:36:20,921-Speed 5137.38 samples/sec Loss 8.5701 Epoch: 5 Global Step: 95750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 
02:36:31,059-Speed 5050.43 samples/sec Loss 8.5707 Epoch: 5 Global Step: 95800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:36:41,032-Speed 5134.16 samples/sec Loss 8.5266 Epoch: 5 Global Step: 95850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:36:50,977-Speed 5148.73 samples/sec Loss 8.6431 Epoch: 5 Global Step: 95900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:37:01,054-Speed 5081.34 samples/sec Loss 8.6314 Epoch: 5 Global Step: 95950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:37:10,987-Speed 5155.04 samples/sec Loss 8.5471 Epoch: 5 Global Step: 96000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:37:27,770-[lfw][96000]XNorm: 22.430425 Training: 2021-03-19 02:37:27,770-[lfw][96000]Accuracy-Flip: 0.99250+-0.00430 Training: 2021-03-19 02:37:27,770-[lfw][96000]Accuracy-Highest: 0.99500 Training: 2021-03-19 02:37:46,478-[cfp_fp][96000]XNorm: 18.198265 Training: 2021-03-19 02:37:46,478-[cfp_fp][96000]Accuracy-Flip: 0.89571+-0.01785 Training: 2021-03-19 02:37:46,478-[cfp_fp][96000]Accuracy-Highest: 0.90429 Training: 2021-03-19 02:38:02,663-[agedb_30][96000]XNorm: 21.573034 Training: 2021-03-19 02:38:02,663-[agedb_30][96000]Accuracy-Flip: 0.92950+-0.01274 Training: 2021-03-19 02:38:02,663-[agedb_30][96000]Accuracy-Highest: 0.94417 Training: 2021-03-19 02:38:12,225-Speed 836.08 samples/sec Loss 8.5951 Epoch: 5 Global Step: 96050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:38:22,834-Speed 4826.52 samples/sec Loss 8.6600 Epoch: 5 Global Step: 96100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:38:32,812-Speed 5131.44 samples/sec Loss 8.6172 Epoch: 5 Global Step: 96150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:38:42,678-Speed 5189.86 samples/sec Loss 8.5959 Epoch: 5 Global Step: 96200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:38:52,405-Speed 5264.05 samples/sec Loss 8.6148 Epoch: 5 
Global Step: 96250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:39:02,491-Speed 5076.71 samples/sec Loss 8.5855 Epoch: 5 Global Step: 96300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:39:12,369-Speed 5184.01 samples/sec Loss 8.5880 Epoch: 5 Global Step: 96350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:39:22,503-Speed 5052.53 samples/sec Loss 8.5908 Epoch: 5 Global Step: 96400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:39:32,479-Speed 5132.48 samples/sec Loss 8.5878 Epoch: 5 Global Step: 96450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:39:42,468-Speed 5125.85 samples/sec Loss 8.5810 Epoch: 5 Global Step: 96500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:39:53,233-Speed 4756.30 samples/sec Loss 8.6462 Epoch: 5 Global Step: 96550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:40:03,098-Speed 5190.30 samples/sec Loss 8.6335 Epoch: 5 Global Step: 96600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:40:13,321-Speed 5008.69 samples/sec Loss 8.6061 Epoch: 5 Global Step: 96650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:40:23,110-Speed 5230.76 samples/sec Loss 8.6361 Epoch: 5 Global Step: 96700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:40:33,106-Speed 5122.32 samples/sec Loss 8.5622 Epoch: 5 Global Step: 96750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:40:43,229-Speed 5057.97 samples/sec Loss 8.6243 Epoch: 5 Global Step: 96800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:40:53,006-Speed 5237.39 samples/sec Loss 8.5774 Epoch: 5 Global Step: 96850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:41:03,634-Speed 4817.96 samples/sec Loss 8.6585 Epoch: 5 Global Step: 96900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:41:13,592-Speed 5141.99 samples/sec Loss 8.5686 Epoch: 5 Global 
Step: 96950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:41:23,788-Speed 5021.69 samples/sec Loss 8.5973 Epoch: 5 Global Step: 97000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:41:33,493-Speed 5275.88 samples/sec Loss 8.6300 Epoch: 5 Global Step: 97050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:41:43,375-Speed 5181.58 samples/sec Loss 8.5633 Epoch: 5 Global Step: 97100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:41:53,912-Speed 4859.18 samples/sec Loss 8.6612 Epoch: 5 Global Step: 97150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:42:04,746-Speed 4726.39 samples/sec Loss 8.6743 Epoch: 5 Global Step: 97200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:42:14,705-Speed 5140.98 samples/sec Loss 8.5903 Epoch: 5 Global Step: 97250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:42:25,402-Speed 4786.65 samples/sec Loss 8.5654 Epoch: 5 Global Step: 97300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:42:36,154-Speed 4762.43 samples/sec Loss 8.5628 Epoch: 5 Global Step: 97350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:42:46,164-Speed 5115.20 samples/sec Loss 8.5384 Epoch: 5 Global Step: 97400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:42:56,196-Speed 5103.68 samples/sec Loss 8.5638 Epoch: 5 Global Step: 97450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:43:06,005-Speed 5220.02 samples/sec Loss 8.5831 Epoch: 5 Global Step: 97500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:43:15,980-Speed 5133.36 samples/sec Loss 8.5742 Epoch: 5 Global Step: 97550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:43:25,913-Speed 5154.52 samples/sec Loss 8.6366 Epoch: 5 Global Step: 97600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:43:35,617-Speed 5276.45 samples/sec Loss 8.5961 Epoch: 5 Global Step: 97650 
Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:43:46,214-Speed 4832.11 samples/sec Loss 8.5379 Epoch: 5 Global Step: 97700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:43:56,037-Speed 5212.39 samples/sec Loss 8.6071 Epoch: 5 Global Step: 97750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:44:05,928-Speed 5176.51 samples/sec Loss 8.6612 Epoch: 5 Global Step: 97800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:44:15,822-Speed 5175.07 samples/sec Loss 8.6196 Epoch: 5 Global Step: 97850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:44:25,529-Speed 5274.76 samples/sec Loss 8.6024 Epoch: 5 Global Step: 97900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:44:35,634-Speed 5067.45 samples/sec Loss 8.5683 Epoch: 5 Global Step: 97950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:44:45,746-Speed 5063.55 samples/sec Loss 8.6207 Epoch: 5 Global Step: 98000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:45:02,486-[lfw][98000]XNorm: 23.822098 Training: 2021-03-19 02:45:02,486-[lfw][98000]Accuracy-Flip: 0.99433+-0.00318 Training: 2021-03-19 02:45:02,486-[lfw][98000]Accuracy-Highest: 0.99500 Training: 2021-03-19 02:45:21,177-[cfp_fp][98000]XNorm: 19.071110 Training: 2021-03-19 02:45:21,178-[cfp_fp][98000]Accuracy-Flip: 0.89086+-0.01316 Training: 2021-03-19 02:45:21,178-[cfp_fp][98000]Accuracy-Highest: 0.90429 Training: 2021-03-19 02:45:37,327-[agedb_30][98000]XNorm: 22.349647 Training: 2021-03-19 02:45:37,328-[agedb_30][98000]Accuracy-Flip: 0.93750+-0.01510 Training: 2021-03-19 02:45:37,328-[agedb_30][98000]Accuracy-Highest: 0.94417 Training: 2021-03-19 02:45:47,013-Speed 835.69 samples/sec Loss 8.5584 Epoch: 5 Global Step: 98050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:45:56,895-Speed 5181.07 samples/sec Loss 8.5418 Epoch: 5 Global Step: 98100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 
2021-03-19 02:46:06,953-Speed 5091.13 samples/sec Loss 8.5588 Epoch: 5 Global Step: 98150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:46:16,881-Speed 5157.30 samples/sec Loss 8.5699 Epoch: 5 Global Step: 98200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:46:26,669-Speed 5231.21 samples/sec Loss 8.5676 Epoch: 5 Global Step: 98250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:46:36,842-Speed 5033.05 samples/sec Loss 8.6086 Epoch: 5 Global Step: 98300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:46:46,748-Speed 5168.97 samples/sec Loss 8.5270 Epoch: 5 Global Step: 98350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:46:56,842-Speed 5072.90 samples/sec Loss 8.5669 Epoch: 5 Global Step: 98400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:47:06,717-Speed 5184.91 samples/sec Loss 8.5786 Epoch: 5 Global Step: 98450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:47:16,545-Speed 5209.94 samples/sec Loss 8.5876 Epoch: 5 Global Step: 98500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:47:26,728-Speed 5028.42 samples/sec Loss 8.6009 Epoch: 5 Global Step: 98550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:47:36,607-Speed 5183.07 samples/sec Loss 8.6231 Epoch: 5 Global Step: 98600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:47:46,860-Speed 4993.85 samples/sec Loss 8.5497 Epoch: 5 Global Step: 98650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:47:56,630-Speed 5240.96 samples/sec Loss 8.6603 Epoch: 5 Global Step: 98700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:48:06,629-Speed 5121.08 samples/sec Loss 8.5575 Epoch: 5 Global Step: 98750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:48:16,880-Speed 4994.84 samples/sec Loss 8.5895 Epoch: 5 Global Step: 98800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 
02:48:26,800-Speed 5161.48 samples/sec Loss 8.6195 Epoch: 5 Global Step: 98850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:48:36,601-Speed 5224.60 samples/sec Loss 8.6046 Epoch: 5 Global Step: 98900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:48:46,566-Speed 5137.93 samples/sec Loss 8.6026 Epoch: 5 Global Step: 98950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:48:56,360-Speed 5228.17 samples/sec Loss 8.6433 Epoch: 5 Global Step: 99000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:49:06,256-Speed 5174.08 samples/sec Loss 8.6454 Epoch: 5 Global Step: 99050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:49:16,133-Speed 5184.00 samples/sec Loss 8.6318 Epoch: 5 Global Step: 99100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:49:26,134-Speed 5119.63 samples/sec Loss 8.5621 Epoch: 5 Global Step: 99150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:49:36,122-Speed 5126.60 samples/sec Loss 8.5871 Epoch: 5 Global Step: 99200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:49:45,965-Speed 5202.12 samples/sec Loss 8.5781 Epoch: 5 Global Step: 99250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:49:55,894-Speed 5156.86 samples/sec Loss 8.6120 Epoch: 5 Global Step: 99300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:50:05,883-Speed 5125.81 samples/sec Loss 8.5817 Epoch: 5 Global Step: 99350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:50:16,612-Speed 4772.72 samples/sec Loss 8.6042 Epoch: 5 Global Step: 99400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:50:26,674-Speed 5088.48 samples/sec Loss 8.5831 Epoch: 5 Global Step: 99450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:50:36,672-Speed 5121.59 samples/sec Loss 8.5803 Epoch: 5 Global Step: 99500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 
02:50:46,857-Speed 5026.95 samples/sec Loss 8.6055 Epoch: 5 Global Step: 99550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:50:56,734-Speed 5183.89 samples/sec Loss 8.5658 Epoch: 5 Global Step: 99600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:51:06,519-Speed 5233.18 samples/sec Loss 8.5797 Epoch: 5 Global Step: 99650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:51:16,434-Speed 5163.97 samples/sec Loss 8.6331 Epoch: 5 Global Step: 99700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:51:26,199-Speed 5243.88 samples/sec Loss 8.5989 Epoch: 5 Global Step: 99750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:51:36,157-Speed 5141.57 samples/sec Loss 8.5580 Epoch: 5 Global Step: 99800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:51:47,008-Speed 4719.04 samples/sec Loss 8.6408 Epoch: 5 Global Step: 99850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:51:56,795-Speed 5231.71 samples/sec Loss 8.6435 Epoch: 5 Global Step: 99900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:52:06,764-Speed 5136.51 samples/sec Loss 8.5377 Epoch: 5 Global Step: 99950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:52:16,562-Speed 5225.72 samples/sec Loss 8.5759 Epoch: 5 Global Step: 100000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:52:33,126-[lfw][100000]XNorm: 21.687841 Training: 2021-03-19 02:52:33,126-[lfw][100000]Accuracy-Flip: 0.99300+-0.00407 Training: 2021-03-19 02:52:33,126-[lfw][100000]Accuracy-Highest: 0.99500 Training: 2021-03-19 02:52:51,836-[cfp_fp][100000]XNorm: 17.418651 Training: 2021-03-19 02:52:51,836-[cfp_fp][100000]Accuracy-Flip: 0.88729+-0.01867 Training: 2021-03-19 02:52:51,836-[cfp_fp][100000]Accuracy-Highest: 0.90429 Training: 2021-03-19 02:53:08,019-[agedb_30][100000]XNorm: 20.384687 Training: 2021-03-19 02:53:08,019-[agedb_30][100000]Accuracy-Flip: 0.93933+-0.01041 
Training: 2021-03-19 02:53:08,019-[agedb_30][100000]Accuracy-Highest: 0.94417 Training: 2021-03-19 02:53:18,044-Speed 832.77 samples/sec Loss 8.5693 Epoch: 5 Global Step: 100050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:53:27,885-Speed 5203.20 samples/sec Loss 8.6138 Epoch: 5 Global Step: 100100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:53:50,693-Speed 2244.90 samples/sec Loss 8.5506 Epoch: 6 Global Step: 100150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:54:01,845-Speed 4591.34 samples/sec Loss 7.7904 Epoch: 6 Global Step: 100200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:54:11,784-Speed 5152.00 samples/sec Loss 7.8011 Epoch: 6 Global Step: 100250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:54:21,712-Speed 5157.58 samples/sec Loss 7.8014 Epoch: 6 Global Step: 100300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:54:31,677-Speed 5138.00 samples/sec Loss 7.8777 Epoch: 6 Global Step: 100350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:54:42,854-Speed 4581.37 samples/sec Loss 7.8750 Epoch: 6 Global Step: 100400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:54:52,973-Speed 5060.19 samples/sec Loss 7.8979 Epoch: 6 Global Step: 100450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:55:03,699-Speed 4773.56 samples/sec Loss 7.8872 Epoch: 6 Global Step: 100500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:55:13,878-Speed 5030.38 samples/sec Loss 7.9981 Epoch: 6 Global Step: 100550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:55:23,748-Speed 5187.53 samples/sec Loss 7.9940 Epoch: 6 Global Step: 100600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:55:34,592-Speed 4721.97 samples/sec Loss 8.0264 Epoch: 6 Global Step: 100650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:55:45,539-Speed 4677.29 samples/sec 
Loss 8.0393 Epoch: 6 Global Step: 100700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:55:55,439-Speed 5172.08 samples/sec Loss 8.0108 Epoch: 6 Global Step: 100750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:56:05,380-Speed 5150.70 samples/sec Loss 8.0411 Epoch: 6 Global Step: 100800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:56:15,365-Speed 5128.05 samples/sec Loss 8.0574 Epoch: 6 Global Step: 100850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:56:25,520-Speed 5041.79 samples/sec Loss 8.1011 Epoch: 6 Global Step: 100900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:56:35,336-Speed 5216.71 samples/sec Loss 8.0453 Epoch: 6 Global Step: 100950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:56:46,021-Speed 4792.21 samples/sec Loss 8.0832 Epoch: 6 Global Step: 101000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:56:55,987-Speed 5137.69 samples/sec Loss 8.0858 Epoch: 6 Global Step: 101050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:57:05,947-Speed 5140.65 samples/sec Loss 8.0575 Epoch: 6 Global Step: 101100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:57:15,986-Speed 5100.72 samples/sec Loss 8.1482 Epoch: 6 Global Step: 101150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:57:26,013-Speed 5106.56 samples/sec Loss 8.1216 Epoch: 6 Global Step: 101200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:57:35,852-Speed 5203.92 samples/sec Loss 8.1525 Epoch: 6 Global Step: 101250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:57:45,861-Speed 5115.57 samples/sec Loss 8.1637 Epoch: 6 Global Step: 101300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:57:55,692-Speed 5208.74 samples/sec Loss 8.1209 Epoch: 6 Global Step: 101350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:58:05,582-Speed 5176.78 
samples/sec Loss 8.2013 Epoch: 6 Global Step: 101400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:58:15,457-Speed 5185.57 samples/sec Loss 8.1589 Epoch: 6 Global Step: 101450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:58:25,526-Speed 5084.92 samples/sec Loss 8.1639 Epoch: 6 Global Step: 101500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:58:35,306-Speed 5235.97 samples/sec Loss 8.1429 Epoch: 6 Global Step: 101550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:58:45,184-Speed 5183.51 samples/sec Loss 8.2266 Epoch: 6 Global Step: 101600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:58:55,093-Speed 5167.12 samples/sec Loss 8.1745 Epoch: 6 Global Step: 101650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:59:05,028-Speed 5153.92 samples/sec Loss 8.1718 Epoch: 6 Global Step: 101700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:59:15,123-Speed 5072.18 samples/sec Loss 8.2418 Epoch: 6 Global Step: 101750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:59:25,090-Speed 5136.86 samples/sec Loss 8.2356 Epoch: 6 Global Step: 101800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:59:34,947-Speed 5195.01 samples/sec Loss 8.2289 Epoch: 6 Global Step: 101850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:59:44,929-Speed 5129.39 samples/sec Loss 8.2322 Epoch: 6 Global Step: 101900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 02:59:54,913-Speed 5128.63 samples/sec Loss 8.2713 Epoch: 6 Global Step: 101950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:00:04,730-Speed 5215.52 samples/sec Loss 8.2567 Epoch: 6 Global Step: 102000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:00:21,354-[lfw][102000]XNorm: 21.938176 Training: 2021-03-19 03:00:21,355-[lfw][102000]Accuracy-Flip: 0.99367+-0.00386 Training: 2021-03-19 
03:00:21,355-[lfw][102000]Accuracy-Highest: 0.99500 Training: 2021-03-19 03:00:40,032-[cfp_fp][102000]XNorm: 18.084172 Training: 2021-03-19 03:00:40,032-[cfp_fp][102000]Accuracy-Flip: 0.89371+-0.01560 Training: 2021-03-19 03:00:40,032-[cfp_fp][102000]Accuracy-Highest: 0.90429 Training: 2021-03-19 03:00:56,143-[agedb_30][102000]XNorm: 20.741779 Training: 2021-03-19 03:00:56,143-[agedb_30][102000]Accuracy-Flip: 0.93033+-0.01021 Training: 2021-03-19 03:00:56,143-[agedb_30][102000]Accuracy-Highest: 0.94417 Training: 2021-03-19 03:01:05,606-Speed 841.06 samples/sec Loss 8.3024 Epoch: 6 Global Step: 102050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:01:15,435-Speed 5209.16 samples/sec Loss 8.2873 Epoch: 6 Global Step: 102100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:01:25,617-Speed 5028.72 samples/sec Loss 8.2772 Epoch: 6 Global Step: 102150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:01:35,521-Speed 5169.72 samples/sec Loss 8.3018 Epoch: 6 Global Step: 102200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:01:45,183-Speed 5299.50 samples/sec Loss 8.2525 Epoch: 6 Global Step: 102250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:01:55,160-Speed 5132.20 samples/sec Loss 8.2624 Epoch: 6 Global Step: 102300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:02:05,176-Speed 5111.99 samples/sec Loss 8.3104 Epoch: 6 Global Step: 102350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:02:15,382-Speed 5017.08 samples/sec Loss 8.3199 Epoch: 6 Global Step: 102400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:02:25,471-Speed 5074.94 samples/sec Loss 8.3076 Epoch: 6 Global Step: 102450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:02:35,476-Speed 5117.87 samples/sec Loss 8.3112 Epoch: 6 Global Step: 102500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:02:45,378-Speed 5171.22 samples/sec 
Loss 8.3495 Epoch: 6 Global Step: 102550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:02:55,279-Speed 5171.47 samples/sec Loss 8.3396 Epoch: 6 Global Step: 102600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:03:05,150-Speed 5187.39 samples/sec Loss 8.3190 Epoch: 6 Global Step: 102650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:03:15,909-Speed 4758.98 samples/sec Loss 8.3478 Epoch: 6 Global Step: 102700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:03:25,654-Speed 5254.13 samples/sec Loss 8.3019 Epoch: 6 Global Step: 102750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:03:35,456-Speed 5224.03 samples/sec Loss 8.3359 Epoch: 6 Global Step: 102800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:03:45,223-Speed 5242.28 samples/sec Loss 8.3479 Epoch: 6 Global Step: 102850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:03:55,185-Speed 5140.02 samples/sec Loss 8.3478 Epoch: 6 Global Step: 102900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:04:05,157-Speed 5134.78 samples/sec Loss 8.3393 Epoch: 6 Global Step: 102950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:04:15,107-Speed 5146.03 samples/sec Loss 8.3521 Epoch: 6 Global Step: 103000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:04:25,099-Speed 5124.43 samples/sec Loss 8.4152 Epoch: 6 Global Step: 103050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:04:35,122-Speed 5108.68 samples/sec Loss 8.3503 Epoch: 6 Global Step: 103100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:04:45,134-Speed 5113.81 samples/sec Loss 8.3735 Epoch: 6 Global Step: 103150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:04:55,811-Speed 4795.78 samples/sec Loss 8.3848 Epoch: 6 Global Step: 103200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:05:05,678-Speed 5189.34 
samples/sec Loss 8.3593 Epoch: 6 Global Step: 103250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:05:15,439-Speed 5245.43 samples/sec Loss 8.3767 Epoch: 6 Global Step: 103300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:05:25,268-Speed 5209.22 samples/sec Loss 8.4005 Epoch: 6 Global Step: 103350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:05:35,261-Speed 5123.96 samples/sec Loss 8.4035 Epoch: 6 Global Step: 103400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:05:45,134-Speed 5186.08 samples/sec Loss 8.4044 Epoch: 6 Global Step: 103450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:05:55,128-Speed 5123.52 samples/sec Loss 8.3327 Epoch: 6 Global Step: 103500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:06:05,756-Speed 4817.77 samples/sec Loss 8.3670 Epoch: 6 Global Step: 103550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:06:15,598-Speed 5202.56 samples/sec Loss 8.4751 Epoch: 6 Global Step: 103600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:06:26,080-Speed 4884.74 samples/sec Loss 8.3659 Epoch: 6 Global Step: 103650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:06:35,858-Speed 5236.54 samples/sec Loss 8.4426 Epoch: 6 Global Step: 103700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:06:45,649-Speed 5229.75 samples/sec Loss 8.4466 Epoch: 6 Global Step: 103750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:06:55,489-Speed 5203.44 samples/sec Loss 8.4549 Epoch: 6 Global Step: 103800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:07:06,410-Speed 4688.68 samples/sec Loss 8.4235 Epoch: 6 Global Step: 103850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:07:16,515-Speed 5067.02 samples/sec Loss 8.4785 Epoch: 6 Global Step: 103900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:07:28,044-Speed 
4441.07 samples/sec Loss 8.4106 Epoch: 6 Global Step: 103950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:07:37,948-Speed 5170.05 samples/sec Loss 8.4560 Epoch: 6 Global Step: 104000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:07:54,574-[lfw][104000]XNorm: 24.149462 Training: 2021-03-19 03:07:54,575-[lfw][104000]Accuracy-Flip: 0.99333+-0.00373 Training: 2021-03-19 03:07:54,575-[lfw][104000]Accuracy-Highest: 0.99500 Training: 2021-03-19 03:08:13,183-[cfp_fp][104000]XNorm: 20.249008 Training: 2021-03-19 03:08:13,183-[cfp_fp][104000]Accuracy-Flip: 0.88986+-0.01365 Training: 2021-03-19 03:08:13,183-[cfp_fp][104000]Accuracy-Highest: 0.90429 Training: 2021-03-19 03:08:29,349-[agedb_30][104000]XNorm: 23.194708 Training: 2021-03-19 03:08:29,349-[agedb_30][104000]Accuracy-Flip: 0.93383+-0.01310 Training: 2021-03-19 03:08:29,349-[agedb_30][104000]Accuracy-Highest: 0.94417 Training: 2021-03-19 03:08:39,248-Speed 835.25 samples/sec Loss 8.4248 Epoch: 6 Global Step: 104050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:08:49,032-Speed 5233.14 samples/sec Loss 8.4211 Epoch: 6 Global Step: 104100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:08:58,968-Speed 5153.53 samples/sec Loss 8.4724 Epoch: 6 Global Step: 104150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:09:08,941-Speed 5133.82 samples/sec Loss 8.3860 Epoch: 6 Global Step: 104200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:09:18,923-Speed 5129.60 samples/sec Loss 8.4754 Epoch: 6 Global Step: 104250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:09:29,726-Speed 4739.76 samples/sec Loss 8.4290 Epoch: 6 Global Step: 104300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:09:39,519-Speed 5228.82 samples/sec Loss 8.4424 Epoch: 6 Global Step: 104350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:09:49,609-Speed 5074.42 samples/sec Loss 8.5051 Epoch: 6 
Global Step: 104400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:09:59,528-Speed 5162.19 samples/sec Loss 8.5161 Epoch: 6 Global Step: 104450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:10:09,415-Speed 5178.65 samples/sec Loss 8.4377 Epoch: 6 Global Step: 104500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:10:19,398-Speed 5129.04 samples/sec Loss 8.4380 Epoch: 6 Global Step: 104550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:10:29,726-Speed 4957.41 samples/sec Loss 8.4462 Epoch: 6 Global Step: 104600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:10:39,741-Speed 5112.96 samples/sec Loss 8.4652 Epoch: 6 Global Step: 104650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:10:49,440-Speed 5279.22 samples/sec Loss 8.4768 Epoch: 6 Global Step: 104700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:10:59,379-Speed 5151.88 samples/sec Loss 8.4732 Epoch: 6 Global Step: 104750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:11:09,224-Speed 5200.87 samples/sec Loss 8.4925 Epoch: 6 Global Step: 104800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:11:19,176-Speed 5144.77 samples/sec Loss 8.4380 Epoch: 6 Global Step: 104850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:11:29,256-Speed 5079.48 samples/sec Loss 8.4680 Epoch: 6 Global Step: 104900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:11:39,144-Speed 5178.53 samples/sec Loss 8.5072 Epoch: 6 Global Step: 104950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:11:49,084-Speed 5151.23 samples/sec Loss 8.4895 Epoch: 6 Global Step: 105000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:11:59,007-Speed 5160.47 samples/sec Loss 8.4956 Epoch: 6 Global Step: 105050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:12:08,821-Speed 5217.10 samples/sec Loss 8.5119 Epoch: 
6 Global Step: 105100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:12:18,686-Speed 5190.45 samples/sec Loss 8.5409 Epoch: 6 Global Step: 105150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:12:28,686-Speed 5120.35 samples/sec Loss 8.5374 Epoch: 6 Global Step: 105200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:12:38,517-Speed 5208.39 samples/sec Loss 8.5002 Epoch: 6 Global Step: 105250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:12:48,585-Speed 5085.63 samples/sec Loss 8.5624 Epoch: 6 Global Step: 105300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:12:58,383-Speed 5225.85 samples/sec Loss 8.4805 Epoch: 6 Global Step: 105350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:13:08,598-Speed 5012.42 samples/sec Loss 8.4931 Epoch: 6 Global Step: 105400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:13:18,811-Speed 5013.67 samples/sec Loss 8.4895 Epoch: 6 Global Step: 105450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:13:29,020-Speed 5015.48 samples/sec Loss 8.4845 Epoch: 6 Global Step: 105500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-19 03:13:38,919-Speed 5172.30 samples/sec Loss 8.5065 Epoch: 6 Global Step: 105550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:13:48,734-Speed 5216.89 samples/sec Loss 8.4414 Epoch: 6 Global Step: 105600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:13:58,633-Speed 5173.00 samples/sec Loss 8.4290 Epoch: 6 Global Step: 105650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:14:08,673-Speed 5099.48 samples/sec Loss 8.5191 Epoch: 6 Global Step: 105700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:14:18,585-Speed 5166.48 samples/sec Loss 8.4854 Epoch: 6 Global Step: 105750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:14:28,382-Speed 5226.22 samples/sec Loss 8.5307 
Epoch: 6 Global Step: 105800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:14:38,077-Speed 5281.52 samples/sec Loss 8.5284 Epoch: 6 Global Step: 105850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:14:47,995-Speed 5162.54 samples/sec Loss 8.4843 Epoch: 6 Global Step: 105900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:14:57,835-Speed 5203.36 samples/sec Loss 8.5063 Epoch: 6 Global Step: 105950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:15:08,318-Speed 4884.59 samples/sec Loss 8.5674 Epoch: 6 Global Step: 106000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:15:25,118-[lfw][106000]XNorm: 21.864164 Training: 2021-03-19 03:15:25,118-[lfw][106000]Accuracy-Flip: 0.99267+-0.00410 Training: 2021-03-19 03:15:25,118-[lfw][106000]Accuracy-Highest: 0.99500 Training: 2021-03-19 03:15:43,822-[cfp_fp][106000]XNorm: 17.516592 Training: 2021-03-19 03:15:43,823-[cfp_fp][106000]Accuracy-Flip: 0.89529+-0.01496 Training: 2021-03-19 03:15:43,823-[cfp_fp][106000]Accuracy-Highest: 0.90429 Training: 2021-03-19 03:15:59,986-[agedb_30][106000]XNorm: 20.972128 Training: 2021-03-19 03:15:59,987-[agedb_30][106000]Accuracy-Flip: 0.93517+-0.01473 Training: 2021-03-19 03:15:59,987-[agedb_30][106000]Accuracy-Highest: 0.94417 Training: 2021-03-19 03:16:09,554-Speed 836.12 samples/sec Loss 8.5161 Epoch: 6 Global Step: 106050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:16:19,510-Speed 5142.80 samples/sec Loss 8.5092 Epoch: 6 Global Step: 106100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:16:29,698-Speed 5026.35 samples/sec Loss 8.5325 Epoch: 6 Global Step: 106150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:16:39,487-Speed 5230.57 samples/sec Loss 8.5797 Epoch: 6 Global Step: 106200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:16:49,459-Speed 5134.77 samples/sec Loss 8.5568 Epoch: 6 Global Step: 106250 Fp16 Grad 
Scale: 16384 Required: 14 hours Training: 2021-03-19 03:16:59,457-Speed 5121.25 samples/sec Loss 8.5750 Epoch: 6 Global Step: 106300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:17:09,238-Speed 5234.93 samples/sec Loss 8.5362 Epoch: 6 Global Step: 106350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:17:19,080-Speed 5202.59 samples/sec Loss 8.5074 Epoch: 6 Global Step: 106400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:17:28,954-Speed 5185.81 samples/sec Loss 8.5441 Epoch: 6 Global Step: 106450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:17:38,814-Speed 5192.83 samples/sec Loss 8.5152 Epoch: 6 Global Step: 106500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:17:48,989-Speed 5032.20 samples/sec Loss 8.5465 Epoch: 6 Global Step: 106550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:17:59,485-Speed 4878.16 samples/sec Loss 8.5023 Epoch: 6 Global Step: 106600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:18:09,490-Speed 5117.76 samples/sec Loss 8.5752 Epoch: 6 Global Step: 106650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:18:19,292-Speed 5223.84 samples/sec Loss 8.5009 Epoch: 6 Global Step: 106700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:18:29,123-Speed 5208.23 samples/sec Loss 8.5653 Epoch: 6 Global Step: 106750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:18:39,009-Speed 5179.52 samples/sec Loss 8.5610 Epoch: 6 Global Step: 106800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:18:49,683-Speed 4796.78 samples/sec Loss 8.5476 Epoch: 6 Global Step: 106850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:19:00,213-Speed 4862.56 samples/sec Loss 8.4832 Epoch: 6 Global Step: 106900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:19:10,273-Speed 5090.41 samples/sec Loss 8.4960 Epoch: 6 Global Step: 106950 Fp16 
Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:19:20,072-Speed 5225.20 samples/sec Loss 8.5226 Epoch: 6 Global Step: 107000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:19:29,799-Speed 5264.12 samples/sec Loss 8.5376 Epoch: 6 Global Step: 107050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:19:39,644-Speed 5200.89 samples/sec Loss 8.5091 Epoch: 6 Global Step: 107100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:19:49,793-Speed 5045.38 samples/sec Loss 8.5598 Epoch: 6 Global Step: 107150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:20:00,543-Speed 4762.83 samples/sec Loss 8.5523 Epoch: 6 Global Step: 107200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:20:11,169-Speed 4818.94 samples/sec Loss 8.5341 Epoch: 6 Global Step: 107250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:20:21,777-Speed 4826.83 samples/sec Loss 8.5076 Epoch: 6 Global Step: 107300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:20:31,823-Speed 5096.43 samples/sec Loss 8.6006 Epoch: 6 Global Step: 107350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:20:41,657-Speed 5206.87 samples/sec Loss 8.5792 Epoch: 6 Global Step: 107400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:20:51,519-Speed 5192.14 samples/sec Loss 8.5601 Epoch: 6 Global Step: 107450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:21:01,616-Speed 5070.81 samples/sec Loss 8.5597 Epoch: 6 Global Step: 107500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:21:11,476-Speed 5193.02 samples/sec Loss 8.5320 Epoch: 6 Global Step: 107550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:21:21,994-Speed 4868.33 samples/sec Loss 8.5925 Epoch: 6 Global Step: 107600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:21:32,130-Speed 5051.54 samples/sec Loss 8.5256 Epoch: 6 Global Step: 107650 
Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:21:42,167-Speed 5101.35 samples/sec Loss 8.5458 Epoch: 6 Global Step: 107700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:21:52,168-Speed 5119.86 samples/sec Loss 8.5445 Epoch: 6 Global Step: 107750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:22:02,090-Speed 5160.78 samples/sec Loss 8.5205 Epoch: 6 Global Step: 107800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:22:11,974-Speed 5180.34 samples/sec Loss 8.5513 Epoch: 6 Global Step: 107850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:22:22,038-Speed 5087.77 samples/sec Loss 8.5364 Epoch: 6 Global Step: 107900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:22:31,961-Speed 5160.38 samples/sec Loss 8.5447 Epoch: 6 Global Step: 107950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:22:42,010-Speed 5095.33 samples/sec Loss 8.4600 Epoch: 6 Global Step: 108000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:22:58,497-[lfw][108000]XNorm: 24.079795 Training: 2021-03-19 03:22:58,497-[lfw][108000]Accuracy-Flip: 0.99150+-0.00353 Training: 2021-03-19 03:22:58,497-[lfw][108000]Accuracy-Highest: 0.99500 Training: 2021-03-19 03:23:17,095-[cfp_fp][108000]XNorm: 18.784436 Training: 2021-03-19 03:23:17,095-[cfp_fp][108000]Accuracy-Flip: 0.88871+-0.01278 Training: 2021-03-19 03:23:17,096-[cfp_fp][108000]Accuracy-Highest: 0.90429 Training: 2021-03-19 03:23:33,271-[agedb_30][108000]XNorm: 22.577853 Training: 2021-03-19 03:23:33,272-[agedb_30][108000]Accuracy-Flip: 0.93100+-0.01133 Training: 2021-03-19 03:23:33,273-[agedb_30][108000]Accuracy-Highest: 0.94417 Training: 2021-03-19 03:23:42,976-Speed 839.82 samples/sec Loss 8.5376 Epoch: 6 Global Step: 108050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:23:52,881-Speed 5169.19 samples/sec Loss 8.5501 Epoch: 6 Global Step: 108100 Fp16 Grad Scale: 16384 Required: 14 hours 
Training: 2021-03-19 03:24:02,844-Speed 5139.47 samples/sec Loss 8.5989 Epoch: 6 Global Step: 108150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:24:13,018-Speed 5032.49 samples/sec Loss 8.4907 Epoch: 6 Global Step: 108200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:24:22,820-Speed 5223.66 samples/sec Loss 8.5621 Epoch: 6 Global Step: 108250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:24:32,780-Speed 5141.10 samples/sec Loss 8.5348 Epoch: 6 Global Step: 108300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:24:42,878-Speed 5070.77 samples/sec Loss 8.5407 Epoch: 6 Global Step: 108350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:24:53,067-Speed 5025.06 samples/sec Loss 8.5757 Epoch: 6 Global Step: 108400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:25:02,993-Speed 5158.59 samples/sec Loss 8.5890 Epoch: 6 Global Step: 108450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:25:12,827-Speed 5206.80 samples/sec Loss 8.5470 Epoch: 6 Global Step: 108500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:25:22,665-Speed 5204.84 samples/sec Loss 8.5380 Epoch: 6 Global Step: 108550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:25:32,621-Speed 5142.97 samples/sec Loss 8.5307 Epoch: 6 Global Step: 108600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:25:42,512-Speed 5176.76 samples/sec Loss 8.4842 Epoch: 6 Global Step: 108650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:25:52,484-Speed 5134.58 samples/sec Loss 8.5139 Epoch: 6 Global Step: 108700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:26:02,337-Speed 5196.59 samples/sec Loss 8.5137 Epoch: 6 Global Step: 108750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:26:12,231-Speed 5175.08 samples/sec Loss 8.5295 Epoch: 6 Global Step: 108800 Fp16 Grad Scale: 16384 Required: 14 
hours Training: 2021-03-19 03:26:22,060-Speed 5209.25 samples/sec Loss 8.5253 Epoch: 6 Global Step: 108850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:26:32,120-Speed 5089.93 samples/sec Loss 8.5489 Epoch: 6 Global Step: 108900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:26:42,089-Speed 5136.13 samples/sec Loss 8.5978 Epoch: 6 Global Step: 108950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:26:51,850-Speed 5246.02 samples/sec Loss 8.5557 Epoch: 6 Global Step: 109000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:27:01,656-Speed 5221.36 samples/sec Loss 8.6127 Epoch: 6 Global Step: 109050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:27:11,548-Speed 5176.35 samples/sec Loss 8.5151 Epoch: 6 Global Step: 109100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:27:21,481-Speed 5154.86 samples/sec Loss 8.5315 Epoch: 6 Global Step: 109150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:27:31,262-Speed 5234.71 samples/sec Loss 8.5810 Epoch: 6 Global Step: 109200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:27:42,045-Speed 4748.83 samples/sec Loss 8.5953 Epoch: 6 Global Step: 109250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:27:51,815-Speed 5240.73 samples/sec Loss 8.6008 Epoch: 6 Global Step: 109300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:28:01,986-Speed 5034.44 samples/sec Loss 8.5707 Epoch: 6 Global Step: 109350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:28:11,878-Speed 5176.18 samples/sec Loss 8.6218 Epoch: 6 Global Step: 109400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:28:21,597-Speed 5268.48 samples/sec Loss 8.5166 Epoch: 6 Global Step: 109450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:28:31,434-Speed 5205.29 samples/sec Loss 8.5336 Epoch: 6 Global Step: 109500 Fp16 Grad Scale: 16384 Required: 
14 hours Training: 2021-03-19 03:28:41,310-Speed 5184.64 samples/sec Loss 8.5702 Epoch: 6 Global Step: 109550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:28:51,123-Speed 5217.85 samples/sec Loss 8.5014 Epoch: 6 Global Step: 109600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:29:00,946-Speed 5212.58 samples/sec Loss 8.5665 Epoch: 6 Global Step: 109650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:29:10,860-Speed 5164.61 samples/sec Loss 8.5539 Epoch: 6 Global Step: 109700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:29:20,647-Speed 5231.86 samples/sec Loss 8.5412 Epoch: 6 Global Step: 109750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:29:30,512-Speed 5190.07 samples/sec Loss 8.5866 Epoch: 6 Global Step: 109800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:29:40,467-Speed 5143.53 samples/sec Loss 8.4974 Epoch: 6 Global Step: 109850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:29:50,300-Speed 5207.23 samples/sec Loss 8.4982 Epoch: 6 Global Step: 109900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:30:00,802-Speed 4875.51 samples/sec Loss 8.5018 Epoch: 6 Global Step: 109950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:30:10,612-Speed 5219.27 samples/sec Loss 8.5918 Epoch: 6 Global Step: 110000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:30:27,155-[lfw][110000]XNorm: 24.443688 Training: 2021-03-19 03:30:27,155-[lfw][110000]Accuracy-Flip: 0.99150+-0.00524 Training: 2021-03-19 03:30:27,155-[lfw][110000]Accuracy-Highest: 0.99500 Training: 2021-03-19 03:30:45,796-[cfp_fp][110000]XNorm: 19.729288 Training: 2021-03-19 03:30:45,796-[cfp_fp][110000]Accuracy-Flip: 0.89971+-0.01873 Training: 2021-03-19 03:30:45,796-[cfp_fp][110000]Accuracy-Highest: 0.90429 Training: 2021-03-19 03:31:01,856-[agedb_30][110000]XNorm: 23.653037 Training: 2021-03-19 
03:31:01,857-[agedb_30][110000]Accuracy-Flip: 0.93717+-0.01157 Training: 2021-03-19 03:31:01,857-[agedb_30][110000]Accuracy-Highest: 0.94417 Training: 2021-03-19 03:31:11,534-Speed 840.44 samples/sec Loss 8.5872 Epoch: 6 Global Step: 110050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:31:21,347-Speed 5217.59 samples/sec Loss 8.5869 Epoch: 6 Global Step: 110100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:31:31,935-Speed 4835.98 samples/sec Loss 8.5756 Epoch: 6 Global Step: 110150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:31:42,637-Speed 4784.76 samples/sec Loss 8.4817 Epoch: 6 Global Step: 110200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:31:52,505-Speed 5188.85 samples/sec Loss 8.5747 Epoch: 6 Global Step: 110250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:32:02,478-Speed 5133.92 samples/sec Loss 8.5218 Epoch: 6 Global Step: 110300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:32:12,592-Speed 5062.93 samples/sec Loss 8.5485 Epoch: 6 Global Step: 110350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:32:22,463-Speed 5187.34 samples/sec Loss 8.5515 Epoch: 6 Global Step: 110400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:32:32,279-Speed 5216.13 samples/sec Loss 8.6125 Epoch: 6 Global Step: 110450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:32:42,802-Speed 4866.13 samples/sec Loss 8.4909 Epoch: 6 Global Step: 110500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:32:53,517-Speed 4778.57 samples/sec Loss 8.5589 Epoch: 6 Global Step: 110550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:33:03,315-Speed 5225.66 samples/sec Loss 8.5751 Epoch: 6 Global Step: 110600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:33:14,112-Speed 4742.32 samples/sec Loss 8.5828 Epoch: 6 Global Step: 110650 Fp16 Grad Scale: 16384 Required: 14 hours 
Training: 2021-03-19 03:33:24,084-Speed 5134.91 samples/sec Loss 8.5218 Epoch: 6 Global Step: 110700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:33:33,951-Speed 5189.48 samples/sec Loss 8.5067 Epoch: 6 Global Step: 110750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:33:43,773-Speed 5212.80 samples/sec Loss 8.5540 Epoch: 6 Global Step: 110800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:33:53,578-Speed 5222.62 samples/sec Loss 8.5488 Epoch: 6 Global Step: 110850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:34:04,328-Speed 4763.01 samples/sec Loss 8.5858 Epoch: 6 Global Step: 110900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:34:13,911-Speed 5342.71 samples/sec Loss 8.5577 Epoch: 6 Global Step: 110950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:34:24,025-Speed 5062.90 samples/sec Loss 8.5666 Epoch: 6 Global Step: 111000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:34:33,960-Speed 5153.20 samples/sec Loss 8.5693 Epoch: 6 Global Step: 111050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:34:43,877-Speed 5163.65 samples/sec Loss 8.5956 Epoch: 6 Global Step: 111100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:34:53,963-Speed 5076.23 samples/sec Loss 8.5419 Epoch: 6 Global Step: 111150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:35:03,993-Speed 5105.08 samples/sec Loss 8.5187 Epoch: 6 Global Step: 111200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:35:13,994-Speed 5119.99 samples/sec Loss 8.5603 Epoch: 6 Global Step: 111250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:35:23,898-Speed 5170.13 samples/sec Loss 8.5472 Epoch: 6 Global Step: 111300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:35:33,948-Speed 5094.37 samples/sec Loss 8.6164 Epoch: 6 Global Step: 111350 Fp16 Grad Scale: 16384 Required: 14 
hours Training: 2021-03-19 03:35:43,913-Speed 5138.37 samples/sec Loss 8.6551 Epoch: 6 Global Step: 111400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:35:53,928-Speed 5112.78 samples/sec Loss 8.5549 Epoch: 6 Global Step: 111450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:36:03,770-Speed 5202.73 samples/sec Loss 8.5453 Epoch: 6 Global Step: 111500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:36:13,840-Speed 5084.47 samples/sec Loss 8.5885 Epoch: 6 Global Step: 111550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:36:23,925-Speed 5077.12 samples/sec Loss 8.5140 Epoch: 6 Global Step: 111600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:36:33,903-Speed 5132.06 samples/sec Loss 8.5361 Epoch: 6 Global Step: 111650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:36:44,074-Speed 5034.03 samples/sec Loss 8.5555 Epoch: 6 Global Step: 111700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:36:53,916-Speed 5202.34 samples/sec Loss 8.4945 Epoch: 6 Global Step: 111750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:37:04,065-Speed 5044.96 samples/sec Loss 8.5512 Epoch: 6 Global Step: 111800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:37:13,836-Speed 5240.41 samples/sec Loss 8.5620 Epoch: 6 Global Step: 111850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:37:23,785-Speed 5147.02 samples/sec Loss 8.5839 Epoch: 6 Global Step: 111900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:37:33,868-Speed 5077.73 samples/sec Loss 8.5787 Epoch: 6 Global Step: 111950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:37:43,880-Speed 5114.60 samples/sec Loss 8.5700 Epoch: 6 Global Step: 112000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:38:00,725-[lfw][112000]XNorm: 24.838683 Training: 2021-03-19 03:38:00,725-[lfw][112000]Accuracy-Flip: 
0.99400+-0.00389 Training: 2021-03-19 03:38:00,725-[lfw][112000]Accuracy-Highest: 0.99500 Training: 2021-03-19 03:38:19,540-[cfp_fp][112000]XNorm: 19.675595 Training: 2021-03-19 03:38:19,540-[cfp_fp][112000]Accuracy-Flip: 0.90557+-0.01642 Training: 2021-03-19 03:38:19,545-[cfp_fp][112000]Accuracy-Highest: 0.90557 Training: 2021-03-19 03:38:35,693-[agedb_30][112000]XNorm: 23.475145 Training: 2021-03-19 03:38:35,693-[agedb_30][112000]Accuracy-Flip: 0.93233+-0.01323 Training: 2021-03-19 03:38:35,693-[agedb_30][112000]Accuracy-Highest: 0.94417 Training: 2021-03-19 03:38:45,487-Speed 831.08 samples/sec Loss 8.5632 Epoch: 6 Global Step: 112050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:38:55,613-Speed 5056.60 samples/sec Loss 8.5669 Epoch: 6 Global Step: 112100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:39:05,510-Speed 5173.57 samples/sec Loss 8.5472 Epoch: 6 Global Step: 112150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:39:15,468-Speed 5141.95 samples/sec Loss 8.5385 Epoch: 6 Global Step: 112200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:39:25,326-Speed 5193.76 samples/sec Loss 8.5750 Epoch: 6 Global Step: 112250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:39:35,394-Speed 5085.87 samples/sec Loss 8.5815 Epoch: 6 Global Step: 112300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:39:45,249-Speed 5195.71 samples/sec Loss 8.5461 Epoch: 6 Global Step: 112350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:39:55,034-Speed 5232.98 samples/sec Loss 8.5218 Epoch: 6 Global Step: 112400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:40:04,963-Speed 5156.44 samples/sec Loss 8.5567 Epoch: 6 Global Step: 112450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:40:14,765-Speed 5223.93 samples/sec Loss 8.6162 Epoch: 6 Global Step: 112500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 
03:40:25,417-Speed 4806.98 samples/sec Loss 8.5402 Epoch: 6 Global Step: 112550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:40:35,170-Speed 5249.46 samples/sec Loss 8.5267 Epoch: 6 Global Step: 112600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:40:45,253-Speed 5078.27 samples/sec Loss 8.5450 Epoch: 6 Global Step: 112650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:40:55,132-Speed 5183.25 samples/sec Loss 8.5203 Epoch: 6 Global Step: 112700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:41:05,108-Speed 5132.95 samples/sec Loss 8.5957 Epoch: 6 Global Step: 112750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:41:15,137-Speed 5105.40 samples/sec Loss 8.6040 Epoch: 6 Global Step: 112800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:41:24,945-Speed 5220.66 samples/sec Loss 8.6038 Epoch: 6 Global Step: 112850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:41:34,958-Speed 5113.45 samples/sec Loss 8.5257 Epoch: 6 Global Step: 112900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:41:44,823-Speed 5190.53 samples/sec Loss 8.5570 Epoch: 6 Global Step: 112950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:41:54,735-Speed 5166.23 samples/sec Loss 8.5339 Epoch: 6 Global Step: 113000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:42:04,629-Speed 5174.88 samples/sec Loss 8.5954 Epoch: 6 Global Step: 113050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:42:14,630-Speed 5119.53 samples/sec Loss 8.6118 Epoch: 6 Global Step: 113100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:42:24,437-Speed 5221.24 samples/sec Loss 8.6092 Epoch: 6 Global Step: 113150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:42:34,419-Speed 5129.26 samples/sec Loss 8.5101 Epoch: 6 Global Step: 113200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 
2021-03-19 03:42:44,505-Speed 5076.74 samples/sec Loss 8.6192 Epoch: 6 Global Step: 113250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:42:55,251-Speed 4765.11 samples/sec Loss 8.6235 Epoch: 6 Global Step: 113300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:43:05,130-Speed 5182.79 samples/sec Loss 8.5686 Epoch: 6 Global Step: 113350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:43:14,886-Speed 5248.57 samples/sec Loss 8.5412 Epoch: 6 Global Step: 113400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:43:24,801-Speed 5163.96 samples/sec Loss 8.5999 Epoch: 6 Global Step: 113450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:43:36,246-Speed 4474.03 samples/sec Loss 8.5072 Epoch: 6 Global Step: 113500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:43:46,201-Speed 5143.43 samples/sec Loss 8.5295 Epoch: 6 Global Step: 113550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:43:55,958-Speed 5247.63 samples/sec Loss 8.4940 Epoch: 6 Global Step: 113600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:44:05,809-Speed 5197.87 samples/sec Loss 8.5212 Epoch: 6 Global Step: 113650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:44:15,588-Speed 5236.27 samples/sec Loss 8.5136 Epoch: 6 Global Step: 113700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:44:25,320-Speed 5261.41 samples/sec Loss 8.5553 Epoch: 6 Global Step: 113750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:44:36,143-Speed 4730.84 samples/sec Loss 8.5866 Epoch: 6 Global Step: 113800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:44:47,164-Speed 4645.95 samples/sec Loss 8.5693 Epoch: 6 Global Step: 113850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:44:56,848-Speed 5287.82 samples/sec Loss 8.5885 Epoch: 6 Global Step: 113900 Fp16 Grad Scale: 16384 Required: 14 hours 
Training: 2021-03-19 03:45:07,759-Speed 4692.70 samples/sec Loss 8.5573 Epoch: 6 Global Step: 113950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:45:17,824-Speed 5086.97 samples/sec Loss 8.5199 Epoch: 6 Global Step: 114000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:45:34,492-[lfw][114000]XNorm: 22.726747 Training: 2021-03-19 03:45:34,492-[lfw][114000]Accuracy-Flip: 0.99433+-0.00396 Training: 2021-03-19 03:45:34,492-[lfw][114000]Accuracy-Highest: 0.99500 Training: 2021-03-19 03:45:53,221-[cfp_fp][114000]XNorm: 18.187607 Training: 2021-03-19 03:45:53,221-[cfp_fp][114000]Accuracy-Flip: 0.89443+-0.01196 Training: 2021-03-19 03:45:53,221-[cfp_fp][114000]Accuracy-Highest: 0.90557 Training: 2021-03-19 03:46:09,461-[agedb_30][114000]XNorm: 21.235293 Training: 2021-03-19 03:46:09,461-[agedb_30][114000]Accuracy-Flip: 0.92700+-0.01472 Training: 2021-03-19 03:46:09,461-[agedb_30][114000]Accuracy-Highest: 0.94417 Training: 2021-03-19 03:46:19,393-Speed 831.60 samples/sec Loss 8.5996 Epoch: 6 Global Step: 114050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:46:29,592-Speed 5020.25 samples/sec Loss 8.6136 Epoch: 6 Global Step: 114100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:46:39,604-Speed 5113.90 samples/sec Loss 8.5764 Epoch: 6 Global Step: 114150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:46:50,088-Speed 4884.07 samples/sec Loss 8.5091 Epoch: 6 Global Step: 114200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:47:00,159-Speed 5084.27 samples/sec Loss 8.5528 Epoch: 6 Global Step: 114250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:47:10,443-Speed 4978.85 samples/sec Loss 8.5200 Epoch: 6 Global Step: 114300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:47:20,375-Speed 5155.43 samples/sec Loss 8.5619 Epoch: 6 Global Step: 114350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:47:30,310-Speed 
5153.61 samples/sec Loss 8.5408 Epoch: 6 Global Step: 114400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:47:40,336-Speed 5106.90 samples/sec Loss 8.5836 Epoch: 6 Global Step: 114450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:47:50,500-Speed 5037.90 samples/sec Loss 8.5321 Epoch: 6 Global Step: 114500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:48:00,461-Speed 5140.32 samples/sec Loss 8.5247 Epoch: 6 Global Step: 114550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:48:10,291-Speed 5208.86 samples/sec Loss 8.4248 Epoch: 6 Global Step: 114600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:48:20,232-Speed 5150.66 samples/sec Loss 8.5706 Epoch: 6 Global Step: 114650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:48:29,997-Speed 5243.42 samples/sec Loss 8.5377 Epoch: 6 Global Step: 114700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:48:39,774-Speed 5237.15 samples/sec Loss 8.5977 Epoch: 6 Global Step: 114750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:48:49,645-Speed 5187.23 samples/sec Loss 8.5585 Epoch: 6 Global Step: 114800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:48:59,548-Speed 5170.56 samples/sec Loss 8.5499 Epoch: 6 Global Step: 114850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:49:09,358-Speed 5219.41 samples/sec Loss 8.5774 Epoch: 6 Global Step: 114900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:49:19,219-Speed 5192.49 samples/sec Loss 8.5701 Epoch: 6 Global Step: 114950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:49:29,032-Speed 5217.52 samples/sec Loss 8.5683 Epoch: 6 Global Step: 115000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:49:39,076-Speed 5097.97 samples/sec Loss 8.5454 Epoch: 6 Global Step: 115050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 
03:49:48,983-Speed 5168.52 samples/sec Loss 8.5591 Epoch: 6 Global Step: 115100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:49:59,136-Speed 5043.25 samples/sec Loss 8.6225 Epoch: 6 Global Step: 115150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:50:08,945-Speed 5220.04 samples/sec Loss 8.5529 Epoch: 6 Global Step: 115200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:50:18,914-Speed 5136.16 samples/sec Loss 8.5794 Epoch: 6 Global Step: 115250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:50:28,802-Speed 5178.32 samples/sec Loss 8.5696 Epoch: 6 Global Step: 115300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:50:38,602-Speed 5224.71 samples/sec Loss 8.5471 Epoch: 6 Global Step: 115350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:50:48,620-Speed 5111.08 samples/sec Loss 8.5031 Epoch: 6 Global Step: 115400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:50:58,570-Speed 5145.93 samples/sec Loss 8.5546 Epoch: 6 Global Step: 115450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:51:08,537-Speed 5137.60 samples/sec Loss 8.5124 Epoch: 6 Global Step: 115500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:51:18,230-Speed 5282.57 samples/sec Loss 8.5466 Epoch: 6 Global Step: 115550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:51:28,020-Speed 5229.96 samples/sec Loss 8.4855 Epoch: 6 Global Step: 115600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:51:37,899-Speed 5183.08 samples/sec Loss 8.6171 Epoch: 6 Global Step: 115650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:51:47,835-Speed 5152.89 samples/sec Loss 8.5159 Epoch: 6 Global Step: 115700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:51:57,881-Speed 5097.22 samples/sec Loss 8.5413 Epoch: 6 Global Step: 115750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 
2021-03-19 03:52:07,903-Speed 5108.90 samples/sec Loss 8.6106 Epoch: 6 Global Step: 115800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:52:17,894-Speed 5125.11 samples/sec Loss 8.5502 Epoch: 6 Global Step: 115850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:52:27,731-Speed 5205.15 samples/sec Loss 8.5683 Epoch: 6 Global Step: 115900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:52:38,487-Speed 4760.32 samples/sec Loss 8.5688 Epoch: 6 Global Step: 115950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:52:48,514-Speed 5106.18 samples/sec Loss 8.5480 Epoch: 6 Global Step: 116000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:53:04,813-[lfw][116000]XNorm: 22.522535 Training: 2021-03-19 03:53:04,813-[lfw][116000]Accuracy-Flip: 0.99283+-0.00422 Training: 2021-03-19 03:53:04,814-[lfw][116000]Accuracy-Highest: 0.99500 Training: 2021-03-19 03:53:23,569-[cfp_fp][116000]XNorm: 18.337479 Training: 2021-03-19 03:53:23,569-[cfp_fp][116000]Accuracy-Flip: 0.89571+-0.01322 Training: 2021-03-19 03:53:23,569-[cfp_fp][116000]Accuracy-Highest: 0.90557 Training: 2021-03-19 03:53:39,818-[agedb_30][116000]XNorm: 21.445198 Training: 2021-03-19 03:53:39,818-[agedb_30][116000]Accuracy-Flip: 0.93767+-0.01289 Training: 2021-03-19 03:53:39,818-[agedb_30][116000]Accuracy-Highest: 0.94417 Training: 2021-03-19 03:53:49,837-Speed 834.95 samples/sec Loss 8.5942 Epoch: 6 Global Step: 116050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:53:59,626-Speed 5230.22 samples/sec Loss 8.5263 Epoch: 6 Global Step: 116100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:54:09,638-Speed 5113.98 samples/sec Loss 8.5310 Epoch: 6 Global Step: 116150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:54:19,657-Speed 5110.80 samples/sec Loss 8.5499 Epoch: 6 Global Step: 116200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:54:29,588-Speed 5155.71 
samples/sec Loss 8.5012 Epoch: 6 Global Step: 116250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:54:39,492-Speed 5170.35 samples/sec Loss 8.5278 Epoch: 6 Global Step: 116300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:54:49,541-Speed 5095.50 samples/sec Loss 8.4712 Epoch: 6 Global Step: 116350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:54:59,361-Speed 5213.98 samples/sec Loss 8.5976 Epoch: 6 Global Step: 116400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:55:09,295-Speed 5154.34 samples/sec Loss 8.5955 Epoch: 6 Global Step: 116450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:55:19,148-Speed 5196.52 samples/sec Loss 8.5582 Epoch: 6 Global Step: 116500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:55:29,914-Speed 4756.06 samples/sec Loss 8.6087 Epoch: 6 Global Step: 116550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:55:39,804-Speed 5177.25 samples/sec Loss 8.5737 Epoch: 6 Global Step: 116600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:55:49,652-Speed 5199.06 samples/sec Loss 8.5709 Epoch: 6 Global Step: 116650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:55:59,580-Speed 5157.51 samples/sec Loss 8.5657 Epoch: 6 Global Step: 116700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:56:09,336-Speed 5248.78 samples/sec Loss 8.5063 Epoch: 6 Global Step: 116750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:56:19,975-Speed 4812.60 samples/sec Loss 8.5865 Epoch: 6 Global Step: 116800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:56:42,683-Speed 2254.84 samples/sec Loss 8.2906 Epoch: 7 Global Step: 116850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:56:53,392-Speed 4781.15 samples/sec Loss 6.8947 Epoch: 7 Global Step: 116900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:57:04,354-Speed 
4670.93 samples/sec Loss 6.5425 Epoch: 7 Global Step: 116950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:57:15,001-Speed 4809.62 samples/sec Loss 6.3053 Epoch: 7 Global Step: 117000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:57:26,486-Speed 4458.15 samples/sec Loss 6.2848 Epoch: 7 Global Step: 117050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:57:37,227-Speed 4766.93 samples/sec Loss 6.1220 Epoch: 7 Global Step: 117100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:57:47,593-Speed 4939.67 samples/sec Loss 6.0732 Epoch: 7 Global Step: 117150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:57:58,189-Speed 4832.44 samples/sec Loss 6.0399 Epoch: 7 Global Step: 117200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:58:10,354-Speed 4208.90 samples/sec Loss 5.9411 Epoch: 7 Global Step: 117250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:58:20,856-Speed 4875.34 samples/sec Loss 5.9078 Epoch: 7 Global Step: 117300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:58:31,436-Speed 4839.79 samples/sec Loss 5.8698 Epoch: 7 Global Step: 117350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:58:42,034-Speed 4831.17 samples/sec Loss 5.8802 Epoch: 7 Global Step: 117400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:58:52,772-Speed 4768.62 samples/sec Loss 5.7828 Epoch: 7 Global Step: 117450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:59:03,988-Speed 4565.14 samples/sec Loss 5.7245 Epoch: 7 Global Step: 117500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:59:14,552-Speed 4847.10 samples/sec Loss 5.7505 Epoch: 7 Global Step: 117550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:59:25,207-Speed 4805.41 samples/sec Loss 5.6864 Epoch: 7 Global Step: 117600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 
03:59:35,911-Speed 4783.38 samples/sec Loss 5.6237 Epoch: 7 Global Step: 117650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:59:46,278-Speed 4939.09 samples/sec Loss 5.6632 Epoch: 7 Global Step: 117700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 03:59:56,768-Speed 4881.22 samples/sec Loss 5.6491 Epoch: 7 Global Step: 117750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:00:07,212-Speed 4902.49 samples/sec Loss 5.5574 Epoch: 7 Global Step: 117800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:00:17,927-Speed 4778.73 samples/sec Loss 5.6104 Epoch: 7 Global Step: 117850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:00:28,420-Speed 4879.58 samples/sec Loss 5.5106 Epoch: 7 Global Step: 117900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:00:38,967-Speed 4854.59 samples/sec Loss 5.5519 Epoch: 7 Global Step: 117950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:00:49,296-Speed 4957.15 samples/sec Loss 5.4408 Epoch: 7 Global Step: 118000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:01:05,713-[lfw][118000]XNorm: 23.314289 Training: 2021-03-19 04:01:05,714-[lfw][118000]Accuracy-Flip: 0.99617+-0.00259 Training: 2021-03-19 04:01:05,714-[lfw][118000]Accuracy-Highest: 0.99617 Training: 2021-03-19 04:01:24,272-[cfp_fp][118000]XNorm: 18.634486 Training: 2021-03-19 04:01:24,273-[cfp_fp][118000]Accuracy-Flip: 0.94629+-0.01306 Training: 2021-03-19 04:01:24,273-[cfp_fp][118000]Accuracy-Highest: 0.94629 Training: 2021-03-19 04:01:40,398-[agedb_30][118000]XNorm: 22.235548 Training: 2021-03-19 04:01:40,399-[agedb_30][118000]Accuracy-Flip: 0.95950+-0.00949 Training: 2021-03-19 04:01:40,399-[agedb_30][118000]Accuracy-Highest: 0.95950 Training: 2021-03-19 04:01:50,743-Speed 833.24 samples/sec Loss 5.4454 Epoch: 7 Global Step: 118050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:02:01,209-Speed 4892.20 samples/sec 
Loss 5.4622 Epoch: 7 Global Step: 118100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:02:11,608-Speed 4923.82 samples/sec Loss 5.4222 Epoch: 7 Global Step: 118150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:02:22,093-Speed 4883.50 samples/sec Loss 5.3895 Epoch: 7 Global Step: 118200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:02:32,508-Speed 4916.70 samples/sec Loss 5.3860 Epoch: 7 Global Step: 118250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:02:43,164-Speed 4804.81 samples/sec Loss 5.3707 Epoch: 7 Global Step: 118300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:02:53,776-Speed 4825.38 samples/sec Loss 5.3861 Epoch: 7 Global Step: 118350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:03:04,259-Speed 4884.40 samples/sec Loss 5.3291 Epoch: 7 Global Step: 118400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:03:14,863-Speed 4828.39 samples/sec Loss 5.2954 Epoch: 7 Global Step: 118450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:03:25,471-Speed 4826.79 samples/sec Loss 5.2765 Epoch: 7 Global Step: 118500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:03:36,002-Speed 4862.32 samples/sec Loss 5.2779 Epoch: 7 Global Step: 118550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:03:46,401-Speed 4923.65 samples/sec Loss 5.2910 Epoch: 7 Global Step: 118600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:03:56,884-Speed 4884.17 samples/sec Loss 5.2345 Epoch: 7 Global Step: 118650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:04:07,477-Speed 4833.68 samples/sec Loss 5.2012 Epoch: 7 Global Step: 118700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:04:18,121-Speed 4810.38 samples/sec Loss 5.1876 Epoch: 7 Global Step: 118750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:04:28,373-Speed 4994.46 
samples/sec Loss 5.2032 Epoch: 7 Global Step: 118800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:04:38,995-Speed 4820.79 samples/sec Loss 5.1649 Epoch: 7 Global Step: 118850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:04:49,250-Speed 4992.87 samples/sec Loss 5.2173 Epoch: 7 Global Step: 118900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:04:59,418-Speed 5035.69 samples/sec Loss 5.1552 Epoch: 7 Global Step: 118950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:05:09,651-Speed 5003.93 samples/sec Loss 5.1642 Epoch: 7 Global Step: 119000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:05:19,944-Speed 4974.40 samples/sec Loss 5.1285 Epoch: 7 Global Step: 119050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:05:30,321-Speed 4934.60 samples/sec Loss 5.1514 Epoch: 7 Global Step: 119100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:05:40,408-Speed 5076.14 samples/sec Loss 5.0945 Epoch: 7 Global Step: 119150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:05:50,829-Speed 4913.21 samples/sec Loss 5.1154 Epoch: 7 Global Step: 119200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:06:00,937-Speed 5065.65 samples/sec Loss 5.1196 Epoch: 7 Global Step: 119250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:06:12,480-Speed 4435.56 samples/sec Loss 5.0964 Epoch: 7 Global Step: 119300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:06:23,005-Speed 4865.06 samples/sec Loss 5.0583 Epoch: 7 Global Step: 119350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:06:33,365-Speed 4942.24 samples/sec Loss 5.0413 Epoch: 7 Global Step: 119400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:06:43,636-Speed 4985.53 samples/sec Loss 5.0666 Epoch: 7 Global Step: 119450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:06:53,859-Speed 
5008.46 samples/sec Loss 5.0478 Epoch: 7 Global Step: 119500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:07:04,391-Speed 4861.56 samples/sec Loss 5.0385 Epoch: 7 Global Step: 119550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:07:14,611-Speed 5009.86 samples/sec Loss 5.0099 Epoch: 7 Global Step: 119600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:07:25,269-Speed 4804.17 samples/sec Loss 4.9547 Epoch: 7 Global Step: 119650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:07:35,541-Speed 4985.12 samples/sec Loss 5.0093 Epoch: 7 Global Step: 119700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:07:46,144-Speed 4829.17 samples/sec Loss 4.9502 Epoch: 7 Global Step: 119750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:07:56,608-Speed 4893.46 samples/sec Loss 5.0180 Epoch: 7 Global Step: 119800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:08:07,031-Speed 4912.12 samples/sec Loss 4.9624 Epoch: 7 Global Step: 119850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:08:18,151-Speed 4605.04 samples/sec Loss 4.9403 Epoch: 7 Global Step: 119900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:08:28,609-Speed 4896.05 samples/sec Loss 4.9584 Epoch: 7 Global Step: 119950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:08:38,885-Speed 4982.44 samples/sec Loss 4.9120 Epoch: 7 Global Step: 120000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:08:55,048-[lfw][120000]XNorm: 23.347043 Training: 2021-03-19 04:08:55,048-[lfw][120000]Accuracy-Flip: 0.99533+-0.00287 Training: 2021-03-19 04:08:55,048-[lfw][120000]Accuracy-Highest: 0.99617 Training: 2021-03-19 04:09:13,636-[cfp_fp][120000]XNorm: 18.695131 Training: 2021-03-19 04:09:13,636-[cfp_fp][120000]Accuracy-Flip: 0.95114+-0.01321 Training: 2021-03-19 04:09:13,636-[cfp_fp][120000]Accuracy-Highest: 0.95114 Training: 2021-03-19 
04:09:29,703-[agedb_30][120000]XNorm: 22.454093 Training: 2021-03-19 04:09:29,703-[agedb_30][120000]Accuracy-Flip: 0.96317+-0.00914 Training: 2021-03-19 04:09:29,703-[agedb_30][120000]Accuracy-Highest: 0.96317 Training: 2021-03-19 04:09:40,037-Speed 837.27 samples/sec Loss 4.9173 Epoch: 7 Global Step: 120050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:09:50,941-Speed 4695.63 samples/sec Loss 4.8982 Epoch: 7 Global Step: 120100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:10:02,291-Speed 4511.13 samples/sec Loss 4.8986 Epoch: 7 Global Step: 120150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:10:12,785-Speed 4879.24 samples/sec Loss 4.8926 Epoch: 7 Global Step: 120200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:10:23,374-Speed 4835.40 samples/sec Loss 4.8822 Epoch: 7 Global Step: 120250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:10:33,718-Speed 4950.14 samples/sec Loss 4.8574 Epoch: 7 Global Step: 120300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:10:44,044-Speed 4959.01 samples/sec Loss 4.9046 Epoch: 7 Global Step: 120350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:10:54,382-Speed 4952.66 samples/sec Loss 4.9144 Epoch: 7 Global Step: 120400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:11:05,594-Speed 4567.04 samples/sec Loss 4.8753 Epoch: 7 Global Step: 120450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:11:16,024-Speed 4909.25 samples/sec Loss 4.8452 Epoch: 7 Global Step: 120500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:11:26,996-Speed 4666.64 samples/sec Loss 4.8256 Epoch: 7 Global Step: 120550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:11:37,599-Speed 4829.00 samples/sec Loss 4.8725 Epoch: 7 Global Step: 120600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:11:48,643-Speed 4636.31 samples/sec Loss 4.8833 
Epoch: 7 Global Step: 120650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:11:59,181-Speed 4858.99 samples/sec Loss 4.8161 Epoch: 7 Global Step: 120700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:12:09,628-Speed 4901.31 samples/sec Loss 4.7715 Epoch: 7 Global Step: 120750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:12:20,437-Speed 4736.80 samples/sec Loss 4.8379 Epoch: 7 Global Step: 120800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:12:31,215-Speed 4750.70 samples/sec Loss 4.7774 Epoch: 7 Global Step: 120850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:12:41,687-Speed 4889.37 samples/sec Loss 4.8261 Epoch: 7 Global Step: 120900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:12:52,068-Speed 4932.41 samples/sec Loss 4.7228 Epoch: 7 Global Step: 120950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:13:02,287-Speed 5010.85 samples/sec Loss 4.7327 Epoch: 7 Global Step: 121000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:13:12,907-Speed 4821.18 samples/sec Loss 4.7712 Epoch: 7 Global Step: 121050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:13:23,169-Speed 4989.58 samples/sec Loss 4.7745 Epoch: 7 Global Step: 121100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:13:33,535-Speed 4939.32 samples/sec Loss 4.7710 Epoch: 7 Global Step: 121150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:13:43,667-Speed 5053.97 samples/sec Loss 4.7696 Epoch: 7 Global Step: 121200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:13:54,045-Speed 4933.77 samples/sec Loss 4.7664 Epoch: 7 Global Step: 121250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:14:04,200-Speed 5041.98 samples/sec Loss 4.7173 Epoch: 7 Global Step: 121300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:14:14,488-Speed 4977.06 samples/sec Loss 
4.7694 Epoch: 7 Global Step: 121350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:14:25,015-Speed 4864.04 samples/sec Loss 4.7751 Epoch: 7 Global Step: 121400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-19 04:14:35,329-Speed 4964.56 samples/sec Loss 4.7133 Epoch: 7 Global Step: 121450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:14:46,004-Speed 4796.62 samples/sec Loss 4.7214 Epoch: 7 Global Step: 121500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:14:56,298-Speed 4973.85 samples/sec Loss 4.7404 Epoch: 7 Global Step: 121550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:15:06,516-Speed 5010.81 samples/sec Loss 4.7025 Epoch: 7 Global Step: 121600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:15:16,825-Speed 4966.75 samples/sec Loss 4.7316 Epoch: 7 Global Step: 121650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:15:27,157-Speed 4955.87 samples/sec Loss 4.7039 Epoch: 7 Global Step: 121700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:15:37,542-Speed 4930.42 samples/sec Loss 4.6677 Epoch: 7 Global Step: 121750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:15:47,860-Speed 4962.76 samples/sec Loss 4.7042 Epoch: 7 Global Step: 121800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:15:58,029-Speed 5035.07 samples/sec Loss 4.6825 Epoch: 7 Global Step: 121850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:16:08,249-Speed 5010.31 samples/sec Loss 4.7250 Epoch: 7 Global Step: 121900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:16:18,403-Speed 5042.60 samples/sec Loss 4.7018 Epoch: 7 Global Step: 121950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:16:28,651-Speed 4996.70 samples/sec Loss 4.6619 Epoch: 7 Global Step: 122000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:16:45,025-[lfw][122000]XNorm: 
23.601068 Training: 2021-03-19 04:16:45,025-[lfw][122000]Accuracy-Flip: 0.99600+-0.00271 Training: 2021-03-19 04:16:45,025-[lfw][122000]Accuracy-Highest: 0.99617 Training: 2021-03-19 04:17:03,615-[cfp_fp][122000]XNorm: 18.921952 Training: 2021-03-19 04:17:03,616-[cfp_fp][122000]Accuracy-Flip: 0.95157+-0.01369 Training: 2021-03-19 04:17:03,616-[cfp_fp][122000]Accuracy-Highest: 0.95157 Training: 2021-03-19 04:17:19,763-[agedb_30][122000]XNorm: 22.604324 Training: 2021-03-19 04:17:19,763-[agedb_30][122000]Accuracy-Flip: 0.96117+-0.00978 Training: 2021-03-19 04:17:19,763-[agedb_30][122000]Accuracy-Highest: 0.96317 Training: 2021-03-19 04:17:29,831-Speed 836.88 samples/sec Loss 4.6930 Epoch: 7 Global Step: 122050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:17:40,029-Speed 5020.73 samples/sec Loss 4.6738 Epoch: 7 Global Step: 122100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:17:50,268-Speed 5000.90 samples/sec Loss 4.6914 Epoch: 7 Global Step: 122150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:18:00,595-Speed 4958.36 samples/sec Loss 4.6213 Epoch: 7 Global Step: 122200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:18:10,757-Speed 5038.94 samples/sec Loss 4.7063 Epoch: 7 Global Step: 122250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:18:20,998-Speed 4999.73 samples/sec Loss 4.6422 Epoch: 7 Global Step: 122300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:18:31,423-Speed 4911.65 samples/sec Loss 4.6839 Epoch: 7 Global Step: 122350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:18:41,902-Speed 4886.21 samples/sec Loss 4.6717 Epoch: 7 Global Step: 122400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:18:52,780-Speed 4706.86 samples/sec Loss 4.6688 Epoch: 7 Global Step: 122450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:19:03,262-Speed 4885.15 samples/sec Loss 4.6332 Epoch: 7 Global Step: 
122500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:19:13,760-Speed 4877.55 samples/sec Loss 4.6329 Epoch: 7 Global Step: 122550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:19:24,396-Speed 4813.77 samples/sec Loss 4.6167 Epoch: 7 Global Step: 122600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:19:34,857-Speed 4895.01 samples/sec Loss 4.6614 Epoch: 7 Global Step: 122650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:19:46,225-Speed 4503.78 samples/sec Loss 4.6301 Epoch: 7 Global Step: 122700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:19:56,527-Speed 4970.26 samples/sec Loss 4.6237 Epoch: 7 Global Step: 122750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:20:06,911-Speed 4930.80 samples/sec Loss 4.5981 Epoch: 7 Global Step: 122800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:20:17,180-Speed 4986.09 samples/sec Loss 4.5924 Epoch: 7 Global Step: 122850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:20:27,689-Speed 4872.61 samples/sec Loss 4.6255 Epoch: 7 Global Step: 122900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:20:37,943-Speed 4993.24 samples/sec Loss 4.5690 Epoch: 7 Global Step: 122950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:20:47,849-Speed 5168.96 samples/sec Loss 4.5702 Epoch: 7 Global Step: 123000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:20:57,850-Speed 5119.85 samples/sec Loss 4.6137 Epoch: 7 Global Step: 123050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:21:08,032-Speed 5028.44 samples/sec Loss 4.6235 Epoch: 7 Global Step: 123100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:21:18,142-Speed 5064.96 samples/sec Loss 4.5972 Epoch: 7 Global Step: 123150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:21:28,065-Speed 5159.76 samples/sec Loss 4.5664 Epoch: 7 Global 
Step: 123200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:21:39,303-Speed 4556.51 samples/sec Loss 4.5802 Epoch: 7 Global Step: 123250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:21:49,599-Speed 4972.89 samples/sec Loss 4.5450 Epoch: 7 Global Step: 123300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:21:59,707-Speed 5065.48 samples/sec Loss 4.6138 Epoch: 7 Global Step: 123350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:22:09,670-Speed 5139.63 samples/sec Loss 4.5834 Epoch: 7 Global Step: 123400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:22:20,846-Speed 4581.11 samples/sec Loss 4.5955 Epoch: 7 Global Step: 123450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:22:31,717-Speed 4710.28 samples/sec Loss 4.5669 Epoch: 7 Global Step: 123500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:22:41,858-Speed 5049.02 samples/sec Loss 4.5317 Epoch: 7 Global Step: 123550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:22:52,214-Speed 4944.29 samples/sec Loss 4.5597 Epoch: 7 Global Step: 123600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:23:02,329-Speed 5062.25 samples/sec Loss 4.5718 Epoch: 7 Global Step: 123650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:23:12,622-Speed 4974.44 samples/sec Loss 4.5889 Epoch: 7 Global Step: 123700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:23:23,048-Speed 4910.93 samples/sec Loss 4.5857 Epoch: 7 Global Step: 123750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:23:34,012-Speed 4670.00 samples/sec Loss 4.5518 Epoch: 7 Global Step: 123800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:23:43,984-Speed 5134.71 samples/sec Loss 4.5581 Epoch: 7 Global Step: 123850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:23:55,225-Speed 4555.32 samples/sec Loss 4.5217 Epoch: 7 
Global Step: 123900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:24:05,498-Speed 4984.29 samples/sec Loss 4.5089 Epoch: 7 Global Step: 123950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:24:16,289-Speed 4744.77 samples/sec Loss 4.5546 Epoch: 7 Global Step: 124000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:24:32,552-[lfw][124000]XNorm: 23.323382 Training: 2021-03-19 04:24:32,552-[lfw][124000]Accuracy-Flip: 0.99617+-0.00279 Training: 2021-03-19 04:24:32,552-[lfw][124000]Accuracy-Highest: 0.99617 Training: 2021-03-19 04:24:51,160-[cfp_fp][124000]XNorm: 18.854224 Training: 2021-03-19 04:24:51,160-[cfp_fp][124000]Accuracy-Flip: 0.95271+-0.01208 Training: 2021-03-19 04:24:51,160-[cfp_fp][124000]Accuracy-Highest: 0.95271 Training: 2021-03-19 04:25:07,322-[agedb_30][124000]XNorm: 22.251054 Training: 2021-03-19 04:25:07,322-[agedb_30][124000]Accuracy-Flip: 0.96350+-0.00871 Training: 2021-03-19 04:25:07,322-[agedb_30][124000]Accuracy-Highest: 0.96350 Training: 2021-03-19 04:25:17,356-Speed 838.44 samples/sec Loss 4.5559 Epoch: 7 Global Step: 124050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:25:27,492-Speed 5051.67 samples/sec Loss 4.4941 Epoch: 7 Global Step: 124100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:25:38,626-Speed 4598.89 samples/sec Loss 4.5360 Epoch: 7 Global Step: 124150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:25:48,836-Speed 5014.52 samples/sec Loss 4.5354 Epoch: 7 Global Step: 124200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:25:59,066-Speed 5005.51 samples/sec Loss 4.5014 Epoch: 7 Global Step: 124250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:26:09,478-Speed 4917.55 samples/sec Loss 4.5542 Epoch: 7 Global Step: 124300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:26:19,761-Speed 4979.22 samples/sec Loss 4.5283 Epoch: 7 Global Step: 124350 Fp16 Grad Scale: 
16384 Required: 13 hours Training: 2021-03-19 04:26:30,069-Speed 4967.43 samples/sec Loss 4.5363 Epoch: 7 Global Step: 124400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:26:40,199-Speed 5054.18 samples/sec Loss 4.5287 Epoch: 7 Global Step: 124450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:26:50,420-Speed 5009.90 samples/sec Loss 4.4790 Epoch: 7 Global Step: 124500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:27:00,919-Speed 4876.95 samples/sec Loss 4.4781 Epoch: 7 Global Step: 124550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:27:11,119-Speed 5019.76 samples/sec Loss 4.4575 Epoch: 7 Global Step: 124600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:27:21,744-Speed 4819.10 samples/sec Loss 4.5571 Epoch: 7 Global Step: 124650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:27:32,105-Speed 4942.03 samples/sec Loss 4.5017 Epoch: 7 Global Step: 124700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:27:42,321-Speed 5012.37 samples/sec Loss 4.4955 Epoch: 7 Global Step: 124750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:27:52,864-Speed 4856.34 samples/sec Loss 4.4610 Epoch: 7 Global Step: 124800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:28:03,244-Speed 4932.89 samples/sec Loss 4.4870 Epoch: 7 Global Step: 124850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:28:13,665-Speed 4913.54 samples/sec Loss 4.4821 Epoch: 7 Global Step: 124900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:28:23,810-Speed 5046.93 samples/sec Loss 4.4492 Epoch: 7 Global Step: 124950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:28:34,102-Speed 4974.95 samples/sec Loss 4.4674 Epoch: 7 Global Step: 125000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:28:44,484-Speed 4932.26 samples/sec Loss 4.5316 Epoch: 7 Global Step: 125050 Fp16 Grad 
Scale: 16384 Required: 13 hours Training: 2021-03-19 04:28:54,692-Speed 5015.93 samples/sec Loss 4.4969 Epoch: 7 Global Step: 125100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:29:05,510-Speed 4732.93 samples/sec Loss 4.5084 Epoch: 7 Global Step: 125150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:29:15,812-Speed 4970.49 samples/sec Loss 4.4267 Epoch: 7 Global Step: 125200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:29:26,254-Speed 4903.65 samples/sec Loss 4.4644 Epoch: 7 Global Step: 125250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:29:36,311-Speed 5090.95 samples/sec Loss 4.4895 Epoch: 7 Global Step: 125300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:29:46,395-Speed 5077.63 samples/sec Loss 4.4898 Epoch: 7 Global Step: 125350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:29:56,485-Speed 5074.75 samples/sec Loss 4.4196 Epoch: 7 Global Step: 125400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:30:07,084-Speed 4831.09 samples/sec Loss 4.4345 Epoch: 7 Global Step: 125450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:30:17,088-Speed 5118.05 samples/sec Loss 4.4440 Epoch: 7 Global Step: 125500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:30:27,515-Speed 4910.80 samples/sec Loss 4.4487 Epoch: 7 Global Step: 125550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:30:37,710-Speed 5022.10 samples/sec Loss 4.4620 Epoch: 7 Global Step: 125600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:30:47,904-Speed 5022.87 samples/sec Loss 4.4693 Epoch: 7 Global Step: 125650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:30:58,200-Speed 4972.82 samples/sec Loss 4.4362 Epoch: 7 Global Step: 125700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:31:08,516-Speed 4963.73 samples/sec Loss 4.4478 Epoch: 7 Global Step: 125750 Fp16 
Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:31:18,787-Speed 4985.41 samples/sec Loss 4.4104 Epoch: 7 Global Step: 125800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:31:28,953-Speed 5036.62 samples/sec Loss 4.4195 Epoch: 7 Global Step: 125850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:31:39,097-Speed 5047.39 samples/sec Loss 4.4746 Epoch: 7 Global Step: 125900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:31:49,469-Speed 4936.87 samples/sec Loss 4.4947 Epoch: 7 Global Step: 125950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:31:59,853-Speed 4930.77 samples/sec Loss 4.4830 Epoch: 7 Global Step: 126000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:32:16,398-[lfw][126000]XNorm: 22.953516 Training: 2021-03-19 04:32:16,398-[lfw][126000]Accuracy-Flip: 0.99567+-0.00281 Training: 2021-03-19 04:32:16,398-[lfw][126000]Accuracy-Highest: 0.99617 Training: 2021-03-19 04:32:34,967-[cfp_fp][126000]XNorm: 18.596493 Training: 2021-03-19 04:32:34,968-[cfp_fp][126000]Accuracy-Flip: 0.95129+-0.01694 Training: 2021-03-19 04:32:34,968-[cfp_fp][126000]Accuracy-Highest: 0.95271 Training: 2021-03-19 04:32:51,116-[agedb_30][126000]XNorm: 22.174797 Training: 2021-03-19 04:32:51,116-[agedb_30][126000]Accuracy-Flip: 0.96233+-0.00857 Training: 2021-03-19 04:32:51,116-[agedb_30][126000]Accuracy-Highest: 0.96350 Training: 2021-03-19 04:33:02,377-Speed 818.89 samples/sec Loss 4.4172 Epoch: 7 Global Step: 126050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:33:12,766-Speed 4928.56 samples/sec Loss 4.4289 Epoch: 7 Global Step: 126100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:33:23,126-Speed 4942.17 samples/sec Loss 4.4167 Epoch: 7 Global Step: 126150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:33:33,170-Speed 5098.17 samples/sec Loss 4.4530 Epoch: 7 Global Step: 126200 Fp16 Grad Scale: 16384 Required: 13 hours 
Training: 2021-03-19 04:33:43,455-Speed 4978.02 samples/sec Loss 4.4428 Epoch: 7 Global Step: 126250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:33:53,809-Speed 4945.19 samples/sec Loss 4.4489 Epoch: 7 Global Step: 126300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:34:04,047-Speed 5001.54 samples/sec Loss 4.4065 Epoch: 7 Global Step: 126350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:34:14,151-Speed 5067.82 samples/sec Loss 4.4458 Epoch: 7 Global Step: 126400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:34:24,269-Speed 5060.54 samples/sec Loss 4.3828 Epoch: 7 Global Step: 126450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:34:34,523-Speed 4993.33 samples/sec Loss 4.4313 Epoch: 7 Global Step: 126500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:34:44,783-Speed 4990.39 samples/sec Loss 4.3678 Epoch: 7 Global Step: 126550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:34:55,778-Speed 4656.83 samples/sec Loss 4.4347 Epoch: 7 Global Step: 126600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:35:06,890-Speed 4607.85 samples/sec Loss 4.4219 Epoch: 7 Global Step: 126650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:35:17,782-Speed 4700.87 samples/sec Loss 4.3845 Epoch: 7 Global Step: 126700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:35:27,921-Speed 5050.58 samples/sec Loss 4.4111 Epoch: 7 Global Step: 126750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:35:38,497-Speed 4841.00 samples/sec Loss 4.3945 Epoch: 7 Global Step: 126800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:35:48,791-Speed 4974.26 samples/sec Loss 4.4373 Epoch: 7 Global Step: 126850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:35:59,391-Speed 4830.39 samples/sec Loss 4.3641 Epoch: 7 Global Step: 126900 Fp16 Grad Scale: 16384 Required: 13 
hours Training: 2021-03-19 04:36:09,665-Speed 4983.79 samples/sec Loss 4.3520 Epoch: 7 Global Step: 126950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:36:19,918-Speed 4994.17 samples/sec Loss 4.3815 Epoch: 7 Global Step: 127000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:36:30,179-Speed 4990.29 samples/sec Loss 4.3837 Epoch: 7 Global Step: 127050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:36:40,746-Speed 4845.59 samples/sec Loss 4.3915 Epoch: 7 Global Step: 127100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:36:51,833-Speed 4618.19 samples/sec Loss 4.3689 Epoch: 7 Global Step: 127150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:37:02,236-Speed 4921.96 samples/sec Loss 4.4273 Epoch: 7 Global Step: 127200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:37:13,871-Speed 4400.61 samples/sec Loss 4.3623 Epoch: 7 Global Step: 127250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:37:24,204-Speed 4955.20 samples/sec Loss 4.3579 Epoch: 7 Global Step: 127300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:37:35,480-Speed 4540.99 samples/sec Loss 4.4189 Epoch: 7 Global Step: 127350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:37:45,666-Speed 5026.79 samples/sec Loss 4.3538 Epoch: 7 Global Step: 127400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:37:56,541-Speed 4708.40 samples/sec Loss 4.3247 Epoch: 7 Global Step: 127450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:38:06,789-Speed 4996.30 samples/sec Loss 4.3714 Epoch: 7 Global Step: 127500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:38:17,470-Speed 4793.53 samples/sec Loss 4.3198 Epoch: 7 Global Step: 127550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:38:27,475-Speed 5118.09 samples/sec Loss 4.3614 Epoch: 7 Global Step: 127600 Fp16 Grad Scale: 16384 Required: 
13 hours Training: 2021-03-19 04:38:37,742-Speed 4986.86 samples/sec Loss 4.3844 Epoch: 7 Global Step: 127650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:38:48,436-Speed 4788.06 samples/sec Loss 4.3577 Epoch: 7 Global Step: 127700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:38:59,048-Speed 4824.89 samples/sec Loss 4.3693 Epoch: 7 Global Step: 127750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:39:09,288-Speed 5000.31 samples/sec Loss 4.3601 Epoch: 7 Global Step: 127800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:39:19,789-Speed 4875.84 samples/sec Loss 4.3466 Epoch: 7 Global Step: 127850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:39:30,134-Speed 4949.37 samples/sec Loss 4.3789 Epoch: 7 Global Step: 127900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:39:40,353-Speed 5010.96 samples/sec Loss 4.3586 Epoch: 7 Global Step: 127950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:39:50,556-Speed 5018.45 samples/sec Loss 4.3181 Epoch: 7 Global Step: 128000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:40:06,862-[lfw][128000]XNorm: 23.839526 Training: 2021-03-19 04:40:06,863-[lfw][128000]Accuracy-Flip: 0.99600+-0.00281 Training: 2021-03-19 04:40:06,863-[lfw][128000]Accuracy-Highest: 0.99617 Training: 2021-03-19 04:40:25,518-[cfp_fp][128000]XNorm: 19.437062 Training: 2021-03-19 04:40:25,519-[cfp_fp][128000]Accuracy-Flip: 0.95171+-0.01097 Training: 2021-03-19 04:40:25,519-[cfp_fp][128000]Accuracy-Highest: 0.95271 Training: 2021-03-19 04:40:41,678-[agedb_30][128000]XNorm: 22.862870 Training: 2021-03-19 04:40:41,678-[agedb_30][128000]Accuracy-Flip: 0.96583+-0.00779 Training: 2021-03-19 04:40:41,679-[agedb_30][128000]Accuracy-Highest: 0.96583 Training: 2021-03-19 04:40:51,767-Speed 836.46 samples/sec Loss 4.3178 Epoch: 7 Global Step: 128050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 
04:41:02,126-Speed 4942.87 samples/sec Loss 4.3054 Epoch: 7 Global Step: 128100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:41:12,495-Speed 4937.89 samples/sec Loss 4.3201 Epoch: 7 Global Step: 128150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:41:22,976-Speed 4885.44 samples/sec Loss 4.3085 Epoch: 7 Global Step: 128200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:41:33,262-Speed 4978.13 samples/sec Loss 4.4016 Epoch: 7 Global Step: 128250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:41:43,701-Speed 4905.02 samples/sec Loss 4.3403 Epoch: 7 Global Step: 128300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:41:54,099-Speed 4924.06 samples/sec Loss 4.3255 Epoch: 7 Global Step: 128350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:42:04,363-Speed 4988.77 samples/sec Loss 4.3602 Epoch: 7 Global Step: 128400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:42:14,809-Speed 4901.26 samples/sec Loss 4.3076 Epoch: 7 Global Step: 128450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:42:25,079-Speed 4985.87 samples/sec Loss 4.3182 Epoch: 7 Global Step: 128500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:42:35,459-Speed 4932.73 samples/sec Loss 4.3217 Epoch: 7 Global Step: 128550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:42:46,102-Speed 4810.75 samples/sec Loss 4.3437 Epoch: 7 Global Step: 128600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:42:56,467-Speed 4940.37 samples/sec Loss 4.2771 Epoch: 7 Global Step: 128650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:43:07,017-Speed 4853.06 samples/sec Loss 4.3255 Epoch: 7 Global Step: 128700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:43:17,262-Speed 4998.26 samples/sec Loss 4.3094 Epoch: 7 Global Step: 128750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 
2021-03-19 04:43:27,855-Speed 4833.49 samples/sec Loss 4.2760 Epoch: 7 Global Step: 128800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:43:38,196-Speed 4951.30 samples/sec Loss 4.2879 Epoch: 7 Global Step: 128850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:43:48,370-Speed 5033.20 samples/sec Loss 4.3252 Epoch: 7 Global Step: 128900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:43:58,615-Speed 4997.98 samples/sec Loss 4.3280 Epoch: 7 Global Step: 128950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:44:09,288-Speed 4797.08 samples/sec Loss 4.2708 Epoch: 7 Global Step: 129000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:44:19,720-Speed 4908.11 samples/sec Loss 4.2949 Epoch: 7 Global Step: 129050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:44:30,167-Speed 4901.57 samples/sec Loss 4.3056 Epoch: 7 Global Step: 129100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:44:40,614-Speed 4901.20 samples/sec Loss 4.3067 Epoch: 7 Global Step: 129150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:44:50,794-Speed 5029.56 samples/sec Loss 4.2892 Epoch: 7 Global Step: 129200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:45:01,037-Speed 4999.03 samples/sec Loss 4.3254 Epoch: 7 Global Step: 129250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:45:11,327-Speed 4975.74 samples/sec Loss 4.2922 Epoch: 7 Global Step: 129300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:45:21,781-Speed 4898.29 samples/sec Loss 4.2777 Epoch: 7 Global Step: 129350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:45:33,281-Speed 4452.45 samples/sec Loss 4.2911 Epoch: 7 Global Step: 129400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:45:43,771-Speed 4880.97 samples/sec Loss 4.2786 Epoch: 7 Global Step: 129450 Fp16 Grad Scale: 16384 Required: 13 hours 
Training: 2021-03-19 04:45:54,277-Speed 4873.90 samples/sec Loss 4.2857 Epoch: 7 Global Step: 129500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:46:04,857-Speed 4839.42 samples/sec Loss 4.2629 Epoch: 7 Global Step: 129550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:46:14,919-Speed 5088.53 samples/sec Loss 4.2541 Epoch: 7 Global Step: 129600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:46:25,113-Speed 5023.22 samples/sec Loss 4.2646 Epoch: 7 Global Step: 129650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:46:35,537-Speed 4912.19 samples/sec Loss 4.2875 Epoch: 7 Global Step: 129700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:46:45,942-Speed 4920.69 samples/sec Loss 4.3007 Epoch: 7 Global Step: 129750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:46:56,217-Speed 4983.31 samples/sec Loss 4.2721 Epoch: 7 Global Step: 129800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:47:06,668-Speed 4899.56 samples/sec Loss 4.2563 Epoch: 7 Global Step: 129850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:47:17,145-Speed 4887.31 samples/sec Loss 4.3044 Epoch: 7 Global Step: 129900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:47:28,873-Speed 4365.88 samples/sec Loss 4.2417 Epoch: 7 Global Step: 129950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:47:40,031-Speed 4588.61 samples/sec Loss 4.2532 Epoch: 7 Global Step: 130000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:47:56,151-[lfw][130000]XNorm: 23.189701 Training: 2021-03-19 04:47:56,151-[lfw][130000]Accuracy-Flip: 0.99667+-0.00224 Training: 2021-03-19 04:47:56,151-[lfw][130000]Accuracy-Highest: 0.99667 Training: 2021-03-19 04:48:14,793-[cfp_fp][130000]XNorm: 19.097146 Training: 2021-03-19 04:48:14,793-[cfp_fp][130000]Accuracy-Flip: 0.95600+-0.00975 Training: 2021-03-19 
04:48:14,793-[cfp_fp][130000]Accuracy-Highest: 0.95600 Training: 2021-03-19 04:48:30,931-[agedb_30][130000]XNorm: 22.370117 Training: 2021-03-19 04:48:30,931-[agedb_30][130000]Accuracy-Flip: 0.96350+-0.00858 Training: 2021-03-19 04:48:30,931-[agedb_30][130000]Accuracy-Highest: 0.96583 Training: 2021-03-19 04:48:41,234-Speed 836.57 samples/sec Loss 4.2199 Epoch: 7 Global Step: 130050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:48:51,641-Speed 4920.17 samples/sec Loss 4.2481 Epoch: 7 Global Step: 130100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:49:01,927-Speed 4978.01 samples/sec Loss 4.2387 Epoch: 7 Global Step: 130150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:49:12,526-Speed 4830.43 samples/sec Loss 4.2234 Epoch: 7 Global Step: 130200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:49:23,051-Speed 4865.27 samples/sec Loss 4.2401 Epoch: 7 Global Step: 130250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:49:33,271-Speed 5009.82 samples/sec Loss 4.2273 Epoch: 7 Global Step: 130300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:49:43,633-Speed 4941.40 samples/sec Loss 4.2184 Epoch: 7 Global Step: 130350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:49:53,934-Speed 4970.89 samples/sec Loss 4.1922 Epoch: 7 Global Step: 130400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:50:04,392-Speed 4896.05 samples/sec Loss 4.2115 Epoch: 7 Global Step: 130450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:50:15,721-Speed 4519.29 samples/sec Loss 4.2630 Epoch: 7 Global Step: 130500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:50:26,903-Speed 4579.15 samples/sec Loss 4.1946 Epoch: 7 Global Step: 130550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:50:37,433-Speed 4862.50 samples/sec Loss 4.2855 Epoch: 7 Global Step: 130600 Fp16 Grad Scale: 16384 Required: 13 
hours Training: 2021-03-19 04:50:48,516-Speed 4619.95 samples/sec Loss 4.2097 Epoch: 7 Global Step: 130650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:50:59,463-Speed 4677.60 samples/sec Loss 4.2367 Epoch: 7 Global Step: 130700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:51:09,811-Speed 4947.79 samples/sec Loss 4.2129 Epoch: 7 Global Step: 130750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:51:20,299-Speed 4882.06 samples/sec Loss 4.2189 Epoch: 7 Global Step: 130800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:51:30,530-Speed 5004.61 samples/sec Loss 4.2123 Epoch: 7 Global Step: 130850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:51:41,095-Speed 4846.40 samples/sec Loss 4.1980 Epoch: 7 Global Step: 130900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:51:51,380-Speed 4978.41 samples/sec Loss 4.1824 Epoch: 7 Global Step: 130950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:52:01,275-Speed 5174.94 samples/sec Loss 4.2313 Epoch: 7 Global Step: 131000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:52:11,566-Speed 4975.21 samples/sec Loss 4.1912 Epoch: 7 Global Step: 131050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:52:21,720-Speed 5042.90 samples/sec Loss 4.2120 Epoch: 7 Global Step: 131100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:52:32,053-Speed 4955.14 samples/sec Loss 4.1939 Epoch: 7 Global Step: 131150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:52:42,466-Speed 4917.03 samples/sec Loss 4.2223 Epoch: 7 Global Step: 131200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:52:52,706-Speed 5000.50 samples/sec Loss 4.1583 Epoch: 7 Global Step: 131250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:53:03,035-Speed 4956.88 samples/sec Loss 4.2323 Epoch: 7 Global Step: 131300 Fp16 Grad Scale: 16384 Required: 
13 hours Training: 2021-03-19 04:53:13,576-Speed 4857.75 samples/sec Loss 4.2042 Epoch: 7 Global Step: 131350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:53:24,162-Speed 4836.55 samples/sec Loss 4.1937 Epoch: 7 Global Step: 131400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:53:35,067-Speed 4695.62 samples/sec Loss 4.2029 Epoch: 7 Global Step: 131450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:53:45,228-Speed 5038.84 samples/sec Loss 4.1860 Epoch: 7 Global Step: 131500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:53:55,744-Speed 4869.23 samples/sec Loss 4.2119 Epoch: 7 Global Step: 131550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:54:06,191-Speed 4901.18 samples/sec Loss 4.2422 Epoch: 7 Global Step: 131600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:54:16,472-Speed 4980.22 samples/sec Loss 4.2186 Epoch: 7 Global Step: 131650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:54:26,779-Speed 4968.03 samples/sec Loss 4.2074 Epoch: 7 Global Step: 131700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:54:37,095-Speed 4963.45 samples/sec Loss 4.2031 Epoch: 7 Global Step: 131750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:54:47,372-Speed 4982.40 samples/sec Loss 4.1940 Epoch: 7 Global Step: 131800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:54:57,532-Speed 5039.53 samples/sec Loss 4.1745 Epoch: 7 Global Step: 131850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:55:07,791-Speed 4991.07 samples/sec Loss 4.1637 Epoch: 7 Global Step: 131900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:55:18,213-Speed 4912.63 samples/sec Loss 4.2127 Epoch: 7 Global Step: 131950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:55:28,596-Speed 4931.59 samples/sec Loss 4.1854 Epoch: 7 Global Step: 132000 Fp16 Grad Scale: 16384 
Required: 13 hours Training: 2021-03-19 04:55:45,008-[lfw][132000]XNorm: 23.851645 Training: 2021-03-19 04:55:45,009-[lfw][132000]Accuracy-Flip: 0.99550+-0.00236 Training: 2021-03-19 04:55:45,009-[lfw][132000]Accuracy-Highest: 0.99667 Training: 2021-03-19 04:56:03,598-[cfp_fp][132000]XNorm: 19.419094 Training: 2021-03-19 04:56:03,598-[cfp_fp][132000]Accuracy-Flip: 0.95400+-0.01473 Training: 2021-03-19 04:56:03,598-[cfp_fp][132000]Accuracy-Highest: 0.95600 Training: 2021-03-19 04:56:19,665-[agedb_30][132000]XNorm: 23.207402 Training: 2021-03-19 04:56:19,665-[agedb_30][132000]Accuracy-Flip: 0.96250+-0.01055 Training: 2021-03-19 04:56:19,665-[agedb_30][132000]Accuracy-Highest: 0.96583 Training: 2021-03-19 04:56:29,774-Speed 836.91 samples/sec Loss 4.1723 Epoch: 7 Global Step: 132050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:56:40,034-Speed 4990.26 samples/sec Loss 4.2219 Epoch: 7 Global Step: 132100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:56:50,276-Speed 4999.45 samples/sec Loss 4.1707 Epoch: 7 Global Step: 132150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:57:00,685-Speed 4919.12 samples/sec Loss 4.2032 Epoch: 7 Global Step: 132200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:57:11,005-Speed 4961.71 samples/sec Loss 4.1333 Epoch: 7 Global Step: 132250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:57:21,122-Speed 5060.91 samples/sec Loss 4.1799 Epoch: 7 Global Step: 132300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:57:31,398-Speed 4983.13 samples/sec Loss 4.1608 Epoch: 7 Global Step: 132350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:57:41,855-Speed 4896.35 samples/sec Loss 4.1751 Epoch: 7 Global Step: 132400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:57:52,311-Speed 4897.18 samples/sec Loss 4.1704 Epoch: 7 Global Step: 132450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 
04:58:02,326-Speed 5112.46 samples/sec Loss 4.1409 Epoch: 7 Global Step: 132500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:58:12,621-Speed 4973.63 samples/sec Loss 4.1436 Epoch: 7 Global Step: 132550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:58:23,044-Speed 4912.46 samples/sec Loss 4.1834 Epoch: 7 Global Step: 132600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:58:33,219-Speed 5032.37 samples/sec Loss 4.1574 Epoch: 7 Global Step: 132650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:58:43,498-Speed 4980.88 samples/sec Loss 4.1589 Epoch: 7 Global Step: 132700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:58:53,921-Speed 4912.69 samples/sec Loss 4.1764 Epoch: 7 Global Step: 132750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:59:05,353-Speed 4478.66 samples/sec Loss 4.1341 Epoch: 7 Global Step: 132800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:59:15,993-Speed 4812.47 samples/sec Loss 4.1391 Epoch: 7 Global Step: 132850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:59:26,412-Speed 4914.26 samples/sec Loss 4.1213 Epoch: 7 Global Step: 132900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:59:36,772-Speed 4942.67 samples/sec Loss 4.1627 Epoch: 7 Global Step: 132950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:59:47,606-Speed 4725.80 samples/sec Loss 4.1599 Epoch: 7 Global Step: 133000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 04:59:57,700-Speed 5072.61 samples/sec Loss 4.1196 Epoch: 7 Global Step: 133050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:00:07,990-Speed 4976.01 samples/sec Loss 4.1365 Epoch: 7 Global Step: 133100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:00:18,366-Speed 4934.64 samples/sec Loss 4.1966 Epoch: 7 Global Step: 133150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 
2021-03-19 05:00:28,579-Speed 5013.57 samples/sec Loss 4.1094 Epoch: 7 Global Step: 133200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:00:38,799-Speed 5010.20 samples/sec Loss 4.1362 Epoch: 7 Global Step: 133250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:00:50,638-Speed 4324.77 samples/sec Loss 4.1181 Epoch: 7 Global Step: 133300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:01:01,812-Speed 4582.55 samples/sec Loss 4.1270 Epoch: 7 Global Step: 133350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:01:12,251-Speed 4905.05 samples/sec Loss 4.1163 Epoch: 7 Global Step: 133400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:01:22,739-Speed 4881.61 samples/sec Loss 4.1230 Epoch: 7 Global Step: 133450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:01:33,043-Speed 4969.46 samples/sec Loss 4.0897 Epoch: 7 Global Step: 133500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:01:56,248-Speed 2206.44 samples/sec Loss 3.9603 Epoch: 8 Global Step: 133550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:02:06,816-Speed 4845.20 samples/sec Loss 3.7098 Epoch: 8 Global Step: 133600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:02:17,376-Speed 4848.77 samples/sec Loss 3.7137 Epoch: 8 Global Step: 133650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:02:27,683-Speed 4968.30 samples/sec Loss 3.7269 Epoch: 8 Global Step: 133700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:02:37,944-Speed 4989.68 samples/sec Loss 3.6990 Epoch: 8 Global Step: 133750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:02:48,073-Speed 5055.36 samples/sec Loss 3.6897 Epoch: 8 Global Step: 133800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:02:58,265-Speed 5023.97 samples/sec Loss 3.7106 Epoch: 8 Global Step: 133850 Fp16 Grad Scale: 16384 Required: 13 hours 
Training: 2021-03-19 05:03:09,614-Speed 4511.45 samples/sec Loss 3.7042 Epoch: 8 Global Step: 133900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:03:19,662-Speed 5096.31 samples/sec Loss 3.7312 Epoch: 8 Global Step: 133950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:03:31,002-Speed 4515.12 samples/sec Loss 3.7017 Epoch: 8 Global Step: 134000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:03:47,514-[lfw][134000]XNorm: 25.058713 Training: 2021-03-19 05:03:47,514-[lfw][134000]Accuracy-Flip: 0.99567+-0.00291 Training: 2021-03-19 05:03:47,514-[lfw][134000]Accuracy-Highest: 0.99667 Training: 2021-03-19 05:04:06,114-[cfp_fp][134000]XNorm: 20.283156 Training: 2021-03-19 05:04:06,115-[cfp_fp][134000]Accuracy-Flip: 0.95000+-0.01244 Training: 2021-03-19 05:04:06,115-[cfp_fp][134000]Accuracy-Highest: 0.95600 Training: 2021-03-19 05:04:22,254-[agedb_30][134000]XNorm: 23.796853 Training: 2021-03-19 05:04:22,255-[agedb_30][134000]Accuracy-Flip: 0.96567+-0.00917 Training: 2021-03-19 05:04:22,255-[agedb_30][134000]Accuracy-Highest: 0.96583 Training: 2021-03-19 05:04:32,839-Speed 827.99 samples/sec Loss 3.6882 Epoch: 8 Global Step: 134050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:04:43,172-Speed 4955.08 samples/sec Loss 3.7062 Epoch: 8 Global Step: 134100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:04:53,224-Speed 5093.57 samples/sec Loss 3.7419 Epoch: 8 Global Step: 134150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:05:03,043-Speed 5215.02 samples/sec Loss 3.7334 Epoch: 8 Global Step: 134200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:05:13,485-Speed 4903.30 samples/sec Loss 3.7117 Epoch: 8 Global Step: 134250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:05:23,813-Speed 4957.89 samples/sec Loss 3.7091 Epoch: 8 Global Step: 134300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:05:33,934-Speed 
5059.48 samples/sec Loss 3.7448 Epoch: 8 Global Step: 134350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:05:44,158-Speed 5007.88 samples/sec Loss 3.6924 Epoch: 8 Global Step: 134400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:05:54,271-Speed 5063.52 samples/sec Loss 3.7063 Epoch: 8 Global Step: 134450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:06:04,338-Speed 5086.06 samples/sec Loss 3.7427 Epoch: 8 Global Step: 134500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:06:14,179-Speed 5202.75 samples/sec Loss 3.7391 Epoch: 8 Global Step: 134550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:06:24,092-Speed 5165.21 samples/sec Loss 3.7402 Epoch: 8 Global Step: 134600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:06:34,076-Speed 5128.78 samples/sec Loss 3.7486 Epoch: 8 Global Step: 134650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:06:44,164-Speed 5075.77 samples/sec Loss 3.7342 Epoch: 8 Global Step: 134700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:06:54,395-Speed 5004.38 samples/sec Loss 3.7158 Epoch: 8 Global Step: 134750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:07:04,412-Speed 5111.61 samples/sec Loss 3.7655 Epoch: 8 Global Step: 134800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:07:14,278-Speed 5190.06 samples/sec Loss 3.7713 Epoch: 8 Global Step: 134850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:07:24,543-Speed 4988.03 samples/sec Loss 3.7340 Epoch: 8 Global Step: 134900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:07:34,621-Speed 5081.04 samples/sec Loss 3.7426 Epoch: 8 Global Step: 134950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:07:44,549-Speed 5157.32 samples/sec Loss 3.7361 Epoch: 8 Global Step: 135000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 
05:07:54,563-Speed 5113.33 samples/sec Loss 3.7223 Epoch: 8 Global Step: 135050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:08:04,514-Speed 5145.10 samples/sec Loss 3.7178 Epoch: 8 Global Step: 135100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:08:14,447-Speed 5154.97 samples/sec Loss 3.7455 Epoch: 8 Global Step: 135150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:08:24,430-Speed 5129.18 samples/sec Loss 3.7841 Epoch: 8 Global Step: 135200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:08:34,381-Speed 5145.52 samples/sec Loss 3.7656 Epoch: 8 Global Step: 135250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:08:44,349-Speed 5136.55 samples/sec Loss 3.7867 Epoch: 8 Global Step: 135300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:08:54,166-Speed 5216.02 samples/sec Loss 3.7655 Epoch: 8 Global Step: 135350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:09:04,074-Speed 5167.63 samples/sec Loss 3.7403 Epoch: 8 Global Step: 135400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:09:13,917-Speed 5201.98 samples/sec Loss 3.7382 Epoch: 8 Global Step: 135450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:09:24,009-Speed 5073.66 samples/sec Loss 3.7642 Epoch: 8 Global Step: 135500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:09:34,216-Speed 5016.44 samples/sec Loss 3.7455 Epoch: 8 Global Step: 135550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:09:44,306-Speed 5074.70 samples/sec Loss 3.7671 Epoch: 8 Global Step: 135600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:09:54,460-Speed 5042.62 samples/sec Loss 3.7937 Epoch: 8 Global Step: 135650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:10:04,590-Speed 5054.86 samples/sec Loss 3.7845 Epoch: 8 Global Step: 135700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 
2021-03-19 05:10:14,521-Speed 5155.65 samples/sec Loss 3.7597 Epoch: 8 Global Step: 135750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:10:24,603-Speed 5078.74 samples/sec Loss 3.7323 Epoch: 8 Global Step: 135800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:10:34,694-Speed 5073.87 samples/sec Loss 3.7635 Epoch: 8 Global Step: 135850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:10:44,448-Speed 5249.21 samples/sec Loss 3.7458 Epoch: 8 Global Step: 135900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:10:54,313-Speed 5190.79 samples/sec Loss 3.8081 Epoch: 8 Global Step: 135950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:11:04,336-Speed 5108.42 samples/sec Loss 3.7563 Epoch: 8 Global Step: 136000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:11:21,089-[lfw][136000]XNorm: 24.072042 Training: 2021-03-19 05:11:21,090-[lfw][136000]Accuracy-Flip: 0.99533+-0.00277 Training: 2021-03-19 05:11:21,090-[lfw][136000]Accuracy-Highest: 0.99667 Training: 2021-03-19 05:11:39,806-[cfp_fp][136000]XNorm: 19.705943 Training: 2021-03-19 05:11:39,807-[cfp_fp][136000]Accuracy-Flip: 0.95614+-0.01302 Training: 2021-03-19 05:11:39,807-[cfp_fp][136000]Accuracy-Highest: 0.95614 Training: 2021-03-19 05:11:55,976-[agedb_30][136000]XNorm: 23.025966 Training: 2021-03-19 05:11:55,976-[agedb_30][136000]Accuracy-Flip: 0.96117+-0.01000 Training: 2021-03-19 05:11:55,976-[agedb_30][136000]Accuracy-Highest: 0.96583 Training: 2021-03-19 05:12:05,937-Speed 831.15 samples/sec Loss 3.7551 Epoch: 8 Global Step: 136050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:12:16,828-Speed 4701.37 samples/sec Loss 3.7538 Epoch: 8 Global Step: 136100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:12:26,968-Speed 5049.72 samples/sec Loss 3.7651 Epoch: 8 Global Step: 136150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:12:37,085-Speed 5061.15 
samples/sec Loss 3.7188 Epoch: 8 Global Step: 136200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:12:47,242-Speed 5041.37 samples/sec Loss 3.7596 Epoch: 8 Global Step: 136250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:12:57,387-Speed 5046.84 samples/sec Loss 3.7516 Epoch: 8 Global Step: 136300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:13:07,275-Speed 5178.71 samples/sec Loss 3.7254 Epoch: 8 Global Step: 136350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:13:17,270-Speed 5122.69 samples/sec Loss 3.7070 Epoch: 8 Global Step: 136400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:13:27,316-Speed 5096.85 samples/sec Loss 3.7571 Epoch: 8 Global Step: 136450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:13:37,434-Speed 5060.58 samples/sec Loss 3.7603 Epoch: 8 Global Step: 136500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:13:47,953-Speed 4867.87 samples/sec Loss 3.7396 Epoch: 8 Global Step: 136550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:13:58,712-Speed 4758.78 samples/sec Loss 3.7529 Epoch: 8 Global Step: 136600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:14:08,717-Speed 5117.66 samples/sec Loss 3.7957 Epoch: 8 Global Step: 136650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:14:20,267-Speed 4433.17 samples/sec Loss 3.7895 Epoch: 8 Global Step: 136700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:14:30,276-Speed 5115.69 samples/sec Loss 3.7597 Epoch: 8 Global Step: 136750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:14:40,324-Speed 5095.90 samples/sec Loss 3.7440 Epoch: 8 Global Step: 136800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:14:50,541-Speed 5011.46 samples/sec Loss 3.7942 Epoch: 8 Global Step: 136850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:15:00,634-Speed 
5073.35 samples/sec Loss 3.7782 Epoch: 8 Global Step: 136900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:15:10,758-Speed 5057.49 samples/sec Loss 3.8204 Epoch: 8 Global Step: 136950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:15:20,923-Speed 5037.21 samples/sec Loss 3.7817 Epoch: 8 Global Step: 137000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:15:31,078-Speed 5041.84 samples/sec Loss 3.8085 Epoch: 8 Global Step: 137050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:15:40,907-Speed 5209.44 samples/sec Loss 3.7593 Epoch: 8 Global Step: 137100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:15:51,230-Speed 4960.19 samples/sec Loss 3.8267 Epoch: 8 Global Step: 137150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:16:01,570-Speed 4951.93 samples/sec Loss 3.7651 Epoch: 8 Global Step: 137200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:16:13,417-Speed 4322.10 samples/sec Loss 3.7979 Epoch: 8 Global Step: 137250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:16:23,299-Speed 5181.66 samples/sec Loss 3.7680 Epoch: 8 Global Step: 137300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:16:35,321-Speed 4259.03 samples/sec Loss 3.7778 Epoch: 8 Global Step: 137350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:16:45,552-Speed 5004.42 samples/sec Loss 3.7831 Epoch: 8 Global Step: 137400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:16:55,624-Speed 5083.95 samples/sec Loss 3.7532 Epoch: 8 Global Step: 137450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:17:05,762-Speed 5050.49 samples/sec Loss 3.7602 Epoch: 8 Global Step: 137500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:17:16,043-Speed 4980.16 samples/sec Loss 3.7820 Epoch: 8 Global Step: 137550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 
05:17:26,086-Speed 5098.54 samples/sec Loss 3.7848 Epoch: 8 Global Step: 137600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-19 05:17:36,359-Speed 4984.38 samples/sec Loss 3.7448 Epoch: 8 Global Step: 137650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:17:46,187-Speed 5209.71 samples/sec Loss 3.7836 Epoch: 8 Global Step: 137700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:17:56,177-Speed 5125.31 samples/sec Loss 3.7606 Epoch: 8 Global Step: 137750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:18:06,230-Speed 5093.41 samples/sec Loss 3.7485 Epoch: 8 Global Step: 137800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:18:16,450-Speed 5010.63 samples/sec Loss 3.7747 Epoch: 8 Global Step: 137850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:18:26,577-Speed 5055.81 samples/sec Loss 3.7838 Epoch: 8 Global Step: 137900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:18:36,349-Speed 5240.27 samples/sec Loss 3.8010 Epoch: 8 Global Step: 137950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:18:46,618-Speed 4986.16 samples/sec Loss 3.8063 Epoch: 8 Global Step: 138000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:19:03,479-[lfw][138000]XNorm: 23.530103 Training: 2021-03-19 05:19:03,480-[lfw][138000]Accuracy-Flip: 0.99633+-0.00277 Training: 2021-03-19 05:19:03,480-[lfw][138000]Accuracy-Highest: 0.99667 Training: 2021-03-19 05:19:22,096-[cfp_fp][138000]XNorm: 18.919709 Training: 2021-03-19 05:19:22,097-[cfp_fp][138000]Accuracy-Flip: 0.95286+-0.01183 Training: 2021-03-19 05:19:22,097-[cfp_fp][138000]Accuracy-Highest: 0.95614 Training: 2021-03-19 05:19:38,287-[agedb_30][138000]XNorm: 22.587512 Training: 2021-03-19 05:19:38,288-[agedb_30][138000]Accuracy-Flip: 0.96817+-0.01047 Training: 2021-03-19 05:19:38,288-[agedb_30][138000]Accuracy-Highest: 0.96817 Training: 2021-03-19 05:19:48,282-Speed 830.31 samples/sec 
Loss 3.8003 Epoch: 8 Global Step: 138050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:19:58,363-Speed 5078.96 samples/sec Loss 3.7533 Epoch: 8 Global Step: 138100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:20:08,251-Speed 5178.32 samples/sec Loss 3.7648 Epoch: 8 Global Step: 138150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:20:18,367-Speed 5061.67 samples/sec Loss 3.7559 Epoch: 8 Global Step: 138200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:20:28,417-Speed 5094.56 samples/sec Loss 3.7984 Epoch: 8 Global Step: 138250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:20:38,238-Speed 5213.79 samples/sec Loss 3.7727 Epoch: 8 Global Step: 138300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:20:48,135-Speed 5173.44 samples/sec Loss 3.8189 Epoch: 8 Global Step: 138350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:20:58,182-Speed 5096.29 samples/sec Loss 3.7908 Epoch: 8 Global Step: 138400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:21:08,285-Speed 5068.38 samples/sec Loss 3.7615 Epoch: 8 Global Step: 138450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:21:18,331-Speed 5096.65 samples/sec Loss 3.7295 Epoch: 8 Global Step: 138500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:21:28,476-Speed 5047.32 samples/sec Loss 3.8040 Epoch: 8 Global Step: 138550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:21:38,650-Speed 5032.89 samples/sec Loss 3.7910 Epoch: 8 Global Step: 138600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:21:48,711-Speed 5088.96 samples/sec Loss 3.7283 Epoch: 8 Global Step: 138650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:21:58,468-Speed 5247.80 samples/sec Loss 3.7887 Epoch: 8 Global Step: 138700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:22:08,467-Speed 5120.92 
samples/sec Loss 3.7881 Epoch: 8 Global Step: 138750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:22:18,525-Speed 5090.74 samples/sec Loss 3.7733 Epoch: 8 Global Step: 138800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:22:28,590-Speed 5087.14 samples/sec Loss 3.7869 Epoch: 8 Global Step: 138850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:22:38,604-Speed 5113.38 samples/sec Loss 3.7840 Epoch: 8 Global Step: 138900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:22:48,725-Speed 5059.17 samples/sec Loss 3.7347 Epoch: 8 Global Step: 138950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:22:58,643-Speed 5162.28 samples/sec Loss 3.7626 Epoch: 8 Global Step: 139000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:23:08,640-Speed 5122.16 samples/sec Loss 3.7823 Epoch: 8 Global Step: 139050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:23:18,852-Speed 5013.80 samples/sec Loss 3.7440 Epoch: 8 Global Step: 139100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:23:28,935-Speed 5078.41 samples/sec Loss 3.7461 Epoch: 8 Global Step: 139150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:23:38,950-Speed 5112.83 samples/sec Loss 3.7271 Epoch: 8 Global Step: 139200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:23:48,840-Speed 5177.19 samples/sec Loss 3.7579 Epoch: 8 Global Step: 139250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:23:58,858-Speed 5111.04 samples/sec Loss 3.7914 Epoch: 8 Global Step: 139300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:24:08,621-Speed 5244.84 samples/sec Loss 3.7469 Epoch: 8 Global Step: 139350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:24:19,480-Speed 4714.92 samples/sec Loss 3.7380 Epoch: 8 Global Step: 139400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:24:29,578-Speed 
5070.74 samples/sec Loss 3.7575 Epoch: 8 Global Step: 139450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:24:39,693-Speed 5062.11 samples/sec Loss 3.8210 Epoch: 8 Global Step: 139500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:24:49,744-Speed 5093.94 samples/sec Loss 3.7829 Epoch: 8 Global Step: 139550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:24:59,865-Speed 5059.02 samples/sec Loss 3.7927 Epoch: 8 Global Step: 139600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:25:10,120-Speed 4993.03 samples/sec Loss 3.7470 Epoch: 8 Global Step: 139650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:25:19,989-Speed 5188.55 samples/sec Loss 3.7526 Epoch: 8 Global Step: 139700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:25:30,093-Speed 5067.49 samples/sec Loss 3.7608 Epoch: 8 Global Step: 139750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:25:39,957-Speed 5190.75 samples/sec Loss 3.7609 Epoch: 8 Global Step: 139800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:25:50,173-Speed 5012.26 samples/sec Loss 3.7616 Epoch: 8 Global Step: 139850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:26:00,507-Speed 4955.01 samples/sec Loss 3.7619 Epoch: 8 Global Step: 139900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:26:12,067-Speed 4429.24 samples/sec Loss 3.7612 Epoch: 8 Global Step: 139950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:26:22,080-Speed 5114.09 samples/sec Loss 3.7851 Epoch: 8 Global Step: 140000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:26:38,634-[lfw][140000]XNorm: 23.494740 Training: 2021-03-19 05:26:38,634-[lfw][140000]Accuracy-Flip: 0.99650+-0.00345 Training: 2021-03-19 05:26:38,634-[lfw][140000]Accuracy-Highest: 0.99667 Training: 2021-03-19 05:26:57,311-[cfp_fp][140000]XNorm: 19.100480 Training: 2021-03-19 
05:26:57,312-[cfp_fp][140000]Accuracy-Flip: 0.95543+-0.00910 Training: 2021-03-19 05:26:57,312-[cfp_fp][140000]Accuracy-Highest: 0.95614 Training: 2021-03-19 05:27:13,456-[agedb_30][140000]XNorm: 22.448771 Training: 2021-03-19 05:27:13,456-[agedb_30][140000]Accuracy-Flip: 0.96600+-0.00870 Training: 2021-03-19 05:27:13,456-[agedb_30][140000]Accuracy-Highest: 0.96817 Training: 2021-03-19 05:27:24,228-Speed 823.85 samples/sec Loss 3.7902 Epoch: 8 Global Step: 140050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:27:34,102-Speed 5185.57 samples/sec Loss 3.7667 Epoch: 8 Global Step: 140100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:27:44,133-Speed 5104.53 samples/sec Loss 3.7956 Epoch: 8 Global Step: 140150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:27:53,995-Speed 5191.85 samples/sec Loss 3.7930 Epoch: 8 Global Step: 140200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:28:04,081-Speed 5076.38 samples/sec Loss 3.8315 Epoch: 8 Global Step: 140250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:28:14,039-Speed 5142.09 samples/sec Loss 3.7979 Epoch: 8 Global Step: 140300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:28:24,324-Speed 4978.68 samples/sec Loss 3.8094 Epoch: 8 Global Step: 140350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:28:34,333-Speed 5115.22 samples/sec Loss 3.7835 Epoch: 8 Global Step: 140400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:28:44,326-Speed 5124.18 samples/sec Loss 3.8190 Epoch: 8 Global Step: 140450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:28:54,268-Speed 5150.18 samples/sec Loss 3.7857 Epoch: 8 Global Step: 140500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:29:04,518-Speed 4995.18 samples/sec Loss 3.7850 Epoch: 8 Global Step: 140550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:29:15,581-Speed 4628.44 samples/sec 
Loss 3.7872 Epoch: 8 Global Step: 140600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:29:26,590-Speed 4651.01 samples/sec Loss 3.8192 Epoch: 8 Global Step: 140650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:29:37,903-Speed 4525.96 samples/sec Loss 3.7503 Epoch: 8 Global Step: 140700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:29:47,951-Speed 5095.73 samples/sec Loss 3.7824 Epoch: 8 Global Step: 140750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:29:58,252-Speed 4970.63 samples/sec Loss 3.8024 Epoch: 8 Global Step: 140800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:30:08,390-Speed 5050.82 samples/sec Loss 3.7619 Epoch: 8 Global Step: 140850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:30:18,361-Speed 5135.13 samples/sec Loss 3.8003 Epoch: 8 Global Step: 140900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:30:28,250-Speed 5177.82 samples/sec Loss 3.7370 Epoch: 8 Global Step: 140950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:30:38,258-Speed 5116.22 samples/sec Loss 3.7710 Epoch: 8 Global Step: 141000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:30:48,279-Speed 5109.65 samples/sec Loss 3.8049 Epoch: 8 Global Step: 141050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:30:58,355-Speed 5081.62 samples/sec Loss 3.7274 Epoch: 8 Global Step: 141100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:31:08,438-Speed 5078.18 samples/sec Loss 3.7878 Epoch: 8 Global Step: 141150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:31:18,460-Speed 5109.02 samples/sec Loss 3.7886 Epoch: 8 Global Step: 141200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:31:28,207-Speed 5253.02 samples/sec Loss 3.7538 Epoch: 8 Global Step: 141250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:31:38,330-Speed 5058.04 
samples/sec Loss 3.7807 Epoch: 8 Global Step: 141300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:31:48,268-Speed 5151.99 samples/sec Loss 3.7607 Epoch: 8 Global Step: 141350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:31:58,671-Speed 4922.11 samples/sec Loss 3.7831 Epoch: 8 Global Step: 141400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:32:08,646-Speed 5133.39 samples/sec Loss 3.7415 Epoch: 8 Global Step: 141450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:32:18,806-Speed 5039.39 samples/sec Loss 3.7452 Epoch: 8 Global Step: 141500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:32:28,859-Speed 5093.48 samples/sec Loss 3.8087 Epoch: 8 Global Step: 141550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:32:38,876-Speed 5111.75 samples/sec Loss 3.8219 Epoch: 8 Global Step: 141600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:32:48,796-Speed 5161.44 samples/sec Loss 3.7609 Epoch: 8 Global Step: 141650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:32:58,969-Speed 5033.63 samples/sec Loss 3.7455 Epoch: 8 Global Step: 141700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:33:09,037-Speed 5085.51 samples/sec Loss 3.7636 Epoch: 8 Global Step: 141750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:33:19,002-Speed 5138.24 samples/sec Loss 3.7695 Epoch: 8 Global Step: 141800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:33:28,799-Speed 5226.42 samples/sec Loss 3.7760 Epoch: 8 Global Step: 141850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:33:39,206-Speed 4920.05 samples/sec Loss 3.7831 Epoch: 8 Global Step: 141900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:33:49,197-Speed 5125.13 samples/sec Loss 3.7980 Epoch: 8 Global Step: 141950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:33:59,481-Speed 
4978.63 samples/sec Loss 3.7721 Epoch: 8 Global Step: 142000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:34:15,990-[lfw][142000]XNorm: 24.463874 Training: 2021-03-19 05:34:15,990-[lfw][142000]Accuracy-Flip: 0.99567+-0.00291 Training: 2021-03-19 05:34:15,990-[lfw][142000]Accuracy-Highest: 0.99667 Training: 2021-03-19 05:34:34,551-[cfp_fp][142000]XNorm: 19.927287 Training: 2021-03-19 05:34:34,551-[cfp_fp][142000]Accuracy-Flip: 0.95586+-0.01147 Training: 2021-03-19 05:34:34,551-[cfp_fp][142000]Accuracy-Highest: 0.95614 Training: 2021-03-19 05:34:50,572-[agedb_30][142000]XNorm: 23.399011 Training: 2021-03-19 05:34:50,573-[agedb_30][142000]Accuracy-Flip: 0.96467+-0.00924 Training: 2021-03-19 05:34:50,573-[agedb_30][142000]Accuracy-Highest: 0.96817 Training: 2021-03-19 05:35:00,917-Speed 833.39 samples/sec Loss 3.7804 Epoch: 8 Global Step: 142050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:35:11,094-Speed 5031.33 samples/sec Loss 3.7944 Epoch: 8 Global Step: 142100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:35:21,175-Speed 5078.89 samples/sec Loss 3.7849 Epoch: 8 Global Step: 142150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:35:31,295-Speed 5059.91 samples/sec Loss 3.7680 Epoch: 8 Global Step: 142200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:35:41,374-Speed 5080.23 samples/sec Loss 3.7680 Epoch: 8 Global Step: 142250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:35:51,415-Speed 5098.90 samples/sec Loss 3.7764 Epoch: 8 Global Step: 142300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:36:01,343-Speed 5157.68 samples/sec Loss 3.7586 Epoch: 8 Global Step: 142350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:36:11,311-Speed 5137.13 samples/sec Loss 3.7405 Epoch: 8 Global Step: 142400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:36:21,341-Speed 5104.76 samples/sec Loss 3.7701 Epoch: 8 
Global Step: 142450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:36:31,535-Speed 5022.95 samples/sec Loss 3.7957 Epoch: 8 Global Step: 142500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:36:41,630-Speed 5071.88 samples/sec Loss 3.7636 Epoch: 8 Global Step: 142550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:36:51,993-Speed 4941.02 samples/sec Loss 3.7671 Epoch: 8 Global Step: 142600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:37:01,860-Speed 5189.15 samples/sec Loss 3.7509 Epoch: 8 Global Step: 142650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:37:12,812-Speed 4675.16 samples/sec Loss 3.7970 Epoch: 8 Global Step: 142700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:37:22,899-Speed 5076.55 samples/sec Loss 3.7524 Epoch: 8 Global Step: 142750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:37:32,862-Speed 5139.16 samples/sec Loss 3.7368 Epoch: 8 Global Step: 142800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:37:42,962-Speed 5069.65 samples/sec Loss 3.7959 Epoch: 8 Global Step: 142850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:37:53,089-Speed 5056.24 samples/sec Loss 3.7843 Epoch: 8 Global Step: 142900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:38:03,303-Speed 5013.05 samples/sec Loss 3.7902 Epoch: 8 Global Step: 142950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:38:13,453-Speed 5044.81 samples/sec Loss 3.7350 Epoch: 8 Global Step: 143000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:38:23,304-Speed 5197.25 samples/sec Loss 3.7465 Epoch: 8 Global Step: 143050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:38:33,388-Speed 5078.13 samples/sec Loss 3.7763 Epoch: 8 Global Step: 143100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:38:43,298-Speed 5166.44 samples/sec Loss 3.7635 Epoch: 
8 Global Step: 143150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:38:53,379-Speed 5079.25 samples/sec Loss 3.7745 Epoch: 8 Global Step: 143200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:39:04,850-Speed 4463.76 samples/sec Loss 3.8056 Epoch: 8 Global Step: 143250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:39:14,876-Speed 5107.14 samples/sec Loss 3.7538 Epoch: 8 Global Step: 143300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:39:25,015-Speed 5050.03 samples/sec Loss 3.7920 Epoch: 8 Global Step: 143350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:39:34,824-Speed 5219.93 samples/sec Loss 3.7614 Epoch: 8 Global Step: 143400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:39:45,494-Speed 4798.54 samples/sec Loss 3.7795 Epoch: 8 Global Step: 143450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:39:55,716-Speed 5009.19 samples/sec Loss 3.7657 Epoch: 8 Global Step: 143500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:40:05,626-Speed 5166.79 samples/sec Loss 3.7850 Epoch: 8 Global Step: 143550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:40:15,583-Speed 5142.48 samples/sec Loss 3.7626 Epoch: 8 Global Step: 143600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:40:25,403-Speed 5214.21 samples/sec Loss 3.7804 Epoch: 8 Global Step: 143650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:40:35,388-Speed 5128.00 samples/sec Loss 3.7969 Epoch: 8 Global Step: 143700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:40:45,191-Speed 5223.47 samples/sec Loss 3.7947 Epoch: 8 Global Step: 143750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:40:55,350-Speed 5040.04 samples/sec Loss 3.7717 Epoch: 8 Global Step: 143800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:41:05,473-Speed 5057.93 samples/sec Loss 3.7208 
Epoch: 8 Global Step: 143850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:41:15,709-Speed 5002.29 samples/sec Loss 3.7431 Epoch: 8 Global Step: 143900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:41:27,075-Speed 4504.54 samples/sec Loss 3.7307 Epoch: 8 Global Step: 143950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:41:37,906-Speed 4727.85 samples/sec Loss 3.7682 Epoch: 8 Global Step: 144000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:41:54,186-[lfw][144000]XNorm: 23.921304 Training: 2021-03-19 05:41:54,186-[lfw][144000]Accuracy-Flip: 0.99600+-0.00271 Training: 2021-03-19 05:41:54,186-[lfw][144000]Accuracy-Highest: 0.99667 Training: 2021-03-19 05:42:12,771-[cfp_fp][144000]XNorm: 19.748895 Training: 2021-03-19 05:42:12,771-[cfp_fp][144000]Accuracy-Flip: 0.95486+-0.01238 Training: 2021-03-19 05:42:12,771-[cfp_fp][144000]Accuracy-Highest: 0.95614 Training: 2021-03-19 05:42:28,929-[agedb_30][144000]XNorm: 23.192120 Training: 2021-03-19 05:42:28,929-[agedb_30][144000]Accuracy-Flip: 0.96467+-0.00942 Training: 2021-03-19 05:42:28,929-[agedb_30][144000]Accuracy-Highest: 0.96817 Training: 2021-03-19 05:42:39,524-Speed 830.93 samples/sec Loss 3.7617 Epoch: 8 Global Step: 144050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:42:49,624-Speed 5069.25 samples/sec Loss 3.7604 Epoch: 8 Global Step: 144100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:42:59,695-Speed 5084.36 samples/sec Loss 3.7988 Epoch: 8 Global Step: 144150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:43:09,751-Speed 5091.67 samples/sec Loss 3.8099 Epoch: 8 Global Step: 144200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:43:19,620-Speed 5188.16 samples/sec Loss 3.7435 Epoch: 8 Global Step: 144250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:43:29,501-Speed 5181.85 samples/sec Loss 3.7563 Epoch: 8 Global Step: 144300 Fp16 Grad 
Scale: 16384 Required: 12 hours Training: 2021-03-19 05:43:39,295-Speed 5228.17 samples/sec Loss 3.7839 Epoch: 8 Global Step: 144350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:43:49,353-Speed 5090.63 samples/sec Loss 3.7762 Epoch: 8 Global Step: 144400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:43:59,557-Speed 5018.25 samples/sec Loss 3.7617 Epoch: 8 Global Step: 144450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:44:09,656-Speed 5069.70 samples/sec Loss 3.8025 Epoch: 8 Global Step: 144500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:44:19,748-Speed 5073.98 samples/sec Loss 3.7914 Epoch: 8 Global Step: 144550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:44:29,809-Speed 5088.86 samples/sec Loss 3.7536 Epoch: 8 Global Step: 144600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:44:39,889-Speed 5079.57 samples/sec Loss 3.7841 Epoch: 8 Global Step: 144650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:44:49,973-Speed 5077.65 samples/sec Loss 3.7676 Epoch: 8 Global Step: 144700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:44:59,958-Speed 5128.43 samples/sec Loss 3.7519 Epoch: 8 Global Step: 144750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:45:09,930-Speed 5134.44 samples/sec Loss 3.7803 Epoch: 8 Global Step: 144800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:45:20,199-Speed 4986.53 samples/sec Loss 3.7785 Epoch: 8 Global Step: 144850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:45:30,449-Speed 4995.28 samples/sec Loss 3.7874 Epoch: 8 Global Step: 144900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:45:40,563-Speed 5062.43 samples/sec Loss 3.7756 Epoch: 8 Global Step: 144950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:45:50,657-Speed 5072.51 samples/sec Loss 3.7487 Epoch: 8 Global Step: 145000 Fp16 
Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:46:00,687-Speed 5105.03 samples/sec Loss 3.7668 Epoch: 8 Global Step: 145050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:46:10,642-Speed 5143.54 samples/sec Loss 3.7389 Epoch: 8 Global Step: 145100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:46:20,641-Speed 5120.76 samples/sec Loss 3.7846 Epoch: 8 Global Step: 145150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:46:30,808-Speed 5036.24 samples/sec Loss 3.7287 Epoch: 8 Global Step: 145200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:46:40,985-Speed 5031.06 samples/sec Loss 3.8009 Epoch: 8 Global Step: 145250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:46:51,196-Speed 5014.67 samples/sec Loss 3.7379 Epoch: 8 Global Step: 145300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:47:01,424-Speed 5006.17 samples/sec Loss 3.7535 Epoch: 8 Global Step: 145350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:47:11,362-Speed 5151.81 samples/sec Loss 3.7553 Epoch: 8 Global Step: 145400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:47:21,686-Speed 4959.99 samples/sec Loss 3.7623 Epoch: 8 Global Step: 145450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:47:31,609-Speed 5159.71 samples/sec Loss 3.7418 Epoch: 8 Global Step: 145500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:47:41,640-Speed 5104.89 samples/sec Loss 3.7768 Epoch: 8 Global Step: 145550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:47:51,504-Speed 5190.61 samples/sec Loss 3.7673 Epoch: 8 Global Step: 145600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:48:01,425-Speed 5161.14 samples/sec Loss 3.7643 Epoch: 8 Global Step: 145650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:48:11,606-Speed 5029.34 samples/sec Loss 3.7880 Epoch: 8 Global Step: 145700 
Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:48:21,833-Speed 5006.82 samples/sec Loss 3.7391 Epoch: 8 Global Step: 145750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:48:31,834-Speed 5119.78 samples/sec Loss 3.7594 Epoch: 8 Global Step: 145800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:48:41,768-Speed 5154.59 samples/sec Loss 3.7541 Epoch: 8 Global Step: 145850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:48:51,868-Speed 5069.75 samples/sec Loss 3.7949 Epoch: 8 Global Step: 145900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:49:01,925-Speed 5090.98 samples/sec Loss 3.7831 Epoch: 8 Global Step: 145950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:49:12,012-Speed 5076.24 samples/sec Loss 3.7833 Epoch: 8 Global Step: 146000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:49:28,769-[lfw][146000]XNorm: 24.151877 Training: 2021-03-19 05:49:28,769-[lfw][146000]Accuracy-Flip: 0.99617+-0.00289 Training: 2021-03-19 05:49:28,769-[lfw][146000]Accuracy-Highest: 0.99667 Training: 2021-03-19 05:49:47,399-[cfp_fp][146000]XNorm: 19.939693 Training: 2021-03-19 05:49:47,400-[cfp_fp][146000]Accuracy-Flip: 0.95329+-0.01486 Training: 2021-03-19 05:49:47,400-[cfp_fp][146000]Accuracy-Highest: 0.95614 Training: 2021-03-19 05:50:03,556-[agedb_30][146000]XNorm: 23.181193 Training: 2021-03-19 05:50:03,557-[agedb_30][146000]Accuracy-Flip: 0.96867+-0.00632 Training: 2021-03-19 05:50:03,557-[agedb_30][146000]Accuracy-Highest: 0.96867 Training: 2021-03-19 05:50:13,383-Speed 834.28 samples/sec Loss 3.8306 Epoch: 8 Global Step: 146050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:50:24,494-Speed 4608.45 samples/sec Loss 3.7591 Epoch: 8 Global Step: 146100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:50:34,657-Speed 5037.85 samples/sec Loss 3.7742 Epoch: 8 Global Step: 146150 Fp16 Grad Scale: 16384 Required: 12 hours 
Training: 2021-03-19 05:50:44,846-Speed 5025.49 samples/sec Loss 3.7862 Epoch: 8 Global Step: 146200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:50:54,961-Speed 5061.88 samples/sec Loss 3.7889 Epoch: 8 Global Step: 146250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:51:04,979-Speed 5111.36 samples/sec Loss 3.7965 Epoch: 8 Global Step: 146300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:51:15,029-Speed 5094.65 samples/sec Loss 3.7389 Epoch: 8 Global Step: 146350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:51:24,950-Speed 5161.50 samples/sec Loss 3.7380 Epoch: 8 Global Step: 146400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:51:35,140-Speed 5024.89 samples/sec Loss 3.8045 Epoch: 8 Global Step: 146450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:51:45,153-Speed 5113.43 samples/sec Loss 3.7549 Epoch: 8 Global Step: 146500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:51:55,161-Speed 5116.07 samples/sec Loss 3.7637 Epoch: 8 Global Step: 146550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:52:05,884-Speed 4775.05 samples/sec Loss 3.7929 Epoch: 8 Global Step: 146600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:52:15,797-Speed 5165.27 samples/sec Loss 3.7676 Epoch: 8 Global Step: 146650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:52:26,594-Speed 4742.32 samples/sec Loss 3.7859 Epoch: 8 Global Step: 146700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:52:37,535-Speed 4679.62 samples/sec Loss 3.7800 Epoch: 8 Global Step: 146750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:52:47,669-Speed 5052.86 samples/sec Loss 3.7488 Epoch: 8 Global Step: 146800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:52:57,788-Speed 5059.63 samples/sec Loss 3.7379 Epoch: 8 Global Step: 146850 Fp16 Grad Scale: 16384 Required: 12 
hours Training: 2021-03-19 05:53:07,821-Speed 5103.52 samples/sec Loss 3.7828 Epoch: 8 Global Step: 146900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:53:17,826-Speed 5117.69 samples/sec Loss 3.7695 Epoch: 8 Global Step: 146950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:53:27,722-Speed 5174.46 samples/sec Loss 3.7668 Epoch: 8 Global Step: 147000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:53:37,784-Speed 5088.71 samples/sec Loss 3.7688 Epoch: 8 Global Step: 147050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:53:47,689-Speed 5169.29 samples/sec Loss 3.7649 Epoch: 8 Global Step: 147100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:53:57,788-Speed 5070.05 samples/sec Loss 3.7835 Epoch: 8 Global Step: 147150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:54:07,879-Speed 5074.16 samples/sec Loss 3.7627 Epoch: 8 Global Step: 147200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:54:18,525-Speed 4809.85 samples/sec Loss 3.7960 Epoch: 8 Global Step: 147250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:54:28,633-Speed 5065.66 samples/sec Loss 3.7892 Epoch: 8 Global Step: 147300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:54:40,211-Speed 4422.29 samples/sec Loss 3.7801 Epoch: 8 Global Step: 147350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:54:50,863-Speed 4806.96 samples/sec Loss 3.7317 Epoch: 8 Global Step: 147400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:55:00,941-Speed 5080.82 samples/sec Loss 3.7937 Epoch: 8 Global Step: 147450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:55:11,217-Speed 4982.95 samples/sec Loss 3.7785 Epoch: 8 Global Step: 147500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:55:21,255-Speed 5100.93 samples/sec Loss 3.7746 Epoch: 8 Global Step: 147550 Fp16 Grad Scale: 16384 Required: 
12 hours Training: 2021-03-19 05:55:31,330-Speed 5081.97 samples/sec Loss 3.7797 Epoch: 8 Global Step: 147600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:55:41,399-Speed 5085.26 samples/sec Loss 3.7824 Epoch: 8 Global Step: 147650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:55:51,359-Speed 5140.68 samples/sec Loss 3.7359 Epoch: 8 Global Step: 147700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:56:01,372-Speed 5113.93 samples/sec Loss 3.7501 Epoch: 8 Global Step: 147750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:56:11,561-Speed 5024.86 samples/sec Loss 3.7593 Epoch: 8 Global Step: 147800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:56:21,689-Speed 5055.55 samples/sec Loss 3.7417 Epoch: 8 Global Step: 147850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:56:31,692-Speed 5118.95 samples/sec Loss 3.7666 Epoch: 8 Global Step: 147900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:56:41,700-Speed 5116.40 samples/sec Loss 3.7575 Epoch: 8 Global Step: 147950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:56:51,919-Speed 5010.36 samples/sec Loss 3.7499 Epoch: 8 Global Step: 148000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:57:08,820-[lfw][148000]XNorm: 24.002107 Training: 2021-03-19 05:57:08,821-[lfw][148000]Accuracy-Flip: 0.99617+-0.00236 Training: 2021-03-19 05:57:08,821-[lfw][148000]Accuracy-Highest: 0.99667 Training: 2021-03-19 05:57:27,619-[cfp_fp][148000]XNorm: 19.490763 Training: 2021-03-19 05:57:27,619-[cfp_fp][148000]Accuracy-Flip: 0.95371+-0.01069 Training: 2021-03-19 05:57:27,619-[cfp_fp][148000]Accuracy-Highest: 0.95614 Training: 2021-03-19 05:57:43,865-[agedb_30][148000]XNorm: 22.833934 Training: 2021-03-19 05:57:43,865-[agedb_30][148000]Accuracy-Flip: 0.96533+-0.01130 Training: 2021-03-19 05:57:43,866-[agedb_30][148000]Accuracy-Highest: 0.96867 Training: 2021-03-19 
05:57:53,796-Speed 827.46 samples/sec Loss 3.7427 Epoch: 8 Global Step: 148050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:58:03,927-Speed 5053.94 samples/sec Loss 3.7661 Epoch: 8 Global Step: 148100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:58:13,619-Speed 5283.09 samples/sec Loss 3.7413 Epoch: 8 Global Step: 148150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:58:23,700-Speed 5078.95 samples/sec Loss 3.7752 Epoch: 8 Global Step: 148200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:58:33,770-Speed 5085.16 samples/sec Loss 3.7595 Epoch: 8 Global Step: 148250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:58:43,811-Speed 5099.38 samples/sec Loss 3.7773 Epoch: 8 Global Step: 148300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:58:53,894-Speed 5077.74 samples/sec Loss 3.7567 Epoch: 8 Global Step: 148350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:59:04,221-Speed 4958.51 samples/sec Loss 3.7445 Epoch: 8 Global Step: 148400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:59:14,181-Speed 5140.81 samples/sec Loss 3.7650 Epoch: 8 Global Step: 148450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:59:24,449-Speed 4986.94 samples/sec Loss 3.7622 Epoch: 8 Global Step: 148500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:59:34,556-Speed 5065.72 samples/sec Loss 3.7843 Epoch: 8 Global Step: 148550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:59:44,727-Speed 5034.44 samples/sec Loss 3.7529 Epoch: 8 Global Step: 148600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 05:59:54,891-Speed 5037.61 samples/sec Loss 3.7655 Epoch: 8 Global Step: 148650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:00:04,996-Speed 5067.22 samples/sec Loss 3.7574 Epoch: 8 Global Step: 148700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 
2021-03-19 06:00:14,948-Speed 5144.80 samples/sec Loss 3.7254 Epoch: 8 Global Step: 148750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:00:24,975-Speed 5106.78 samples/sec Loss 3.7519 Epoch: 8 Global Step: 148800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:00:34,870-Speed 5174.60 samples/sec Loss 3.8076 Epoch: 8 Global Step: 148850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:00:44,963-Speed 5072.89 samples/sec Loss 3.7421 Epoch: 8 Global Step: 148900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:00:55,024-Speed 5089.48 samples/sec Loss 3.7533 Epoch: 8 Global Step: 148950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:01:04,985-Speed 5140.19 samples/sec Loss 3.7297 Epoch: 8 Global Step: 149000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:01:15,090-Speed 5066.96 samples/sec Loss 3.7475 Epoch: 8 Global Step: 149050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:01:25,068-Speed 5131.95 samples/sec Loss 3.7780 Epoch: 8 Global Step: 149100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:01:35,097-Speed 5105.37 samples/sec Loss 3.7857 Epoch: 8 Global Step: 149150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:01:45,005-Speed 5167.90 samples/sec Loss 3.7465 Epoch: 8 Global Step: 149200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:01:54,896-Speed 5176.51 samples/sec Loss 3.7338 Epoch: 8 Global Step: 149250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:02:05,023-Speed 5056.26 samples/sec Loss 3.7387 Epoch: 8 Global Step: 149300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:02:15,077-Speed 5093.08 samples/sec Loss 3.7704 Epoch: 8 Global Step: 149350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:02:24,991-Speed 5164.58 samples/sec Loss 3.7577 Epoch: 8 Global Step: 149400 Fp16 Grad Scale: 16384 Required: 12 hours 
Training: 2021-03-19 06:02:35,096-Speed 5067.12 samples/sec Loss 3.7636 Epoch: 8 Global Step: 149450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:02:46,319-Speed 4562.45 samples/sec Loss 3.7380 Epoch: 8 Global Step: 149500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:02:56,492-Speed 5033.09 samples/sec Loss 3.7607 Epoch: 8 Global Step: 149550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:03:06,406-Speed 5164.96 samples/sec Loss 3.7169 Epoch: 8 Global Step: 149600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:03:16,470-Speed 5087.57 samples/sec Loss 3.7215 Epoch: 8 Global Step: 149650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:03:26,285-Speed 5216.98 samples/sec Loss 3.7499 Epoch: 8 Global Step: 149700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:03:36,313-Speed 5106.42 samples/sec Loss 3.7631 Epoch: 8 Global Step: 149750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:03:46,217-Speed 5169.72 samples/sec Loss 3.7434 Epoch: 8 Global Step: 149800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:03:56,274-Speed 5091.28 samples/sec Loss 3.8005 Epoch: 8 Global Step: 149850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:04:07,066-Speed 4744.65 samples/sec Loss 3.7756 Epoch: 8 Global Step: 149900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:04:17,363-Speed 4972.64 samples/sec Loss 3.7737 Epoch: 8 Global Step: 149950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:04:27,407-Speed 5098.18 samples/sec Loss 3.7428 Epoch: 8 Global Step: 150000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:04:44,261-[lfw][150000]XNorm: 23.100119 Training: 2021-03-19 06:04:44,261-[lfw][150000]Accuracy-Flip: 0.99583+-0.00261 Training: 2021-03-19 06:04:44,261-[lfw][150000]Accuracy-Highest: 0.99667 Training: 2021-03-19 06:05:02,953-[cfp_fp][150000]XNorm: 18.879323 
Training: 2021-03-19 06:05:02,953-[cfp_fp][150000]Accuracy-Flip: 0.95329+-0.00935 Training: 2021-03-19 06:05:02,953-[cfp_fp][150000]Accuracy-Highest: 0.95614 Training: 2021-03-19 06:05:19,045-[agedb_30][150000]XNorm: 22.150058 Training: 2021-03-19 06:05:19,046-[agedb_30][150000]Accuracy-Flip: 0.96683+-0.00769 Training: 2021-03-19 06:05:19,046-[agedb_30][150000]Accuracy-Highest: 0.96867 Training: 2021-03-19 06:05:29,585-Speed 823.45 samples/sec Loss 3.7613 Epoch: 8 Global Step: 150050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:05:40,260-Speed 4796.57 samples/sec Loss 3.7811 Epoch: 8 Global Step: 150100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:05:50,067-Speed 5221.27 samples/sec Loss 3.7241 Epoch: 8 Global Step: 150150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:06:00,212-Speed 5046.75 samples/sec Loss 3.7629 Epoch: 8 Global Step: 150200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:06:22,860-Speed 2260.80 samples/sec Loss 3.5137 Epoch: 9 Global Step: 150250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:06:33,198-Speed 4952.87 samples/sec Loss 3.3585 Epoch: 9 Global Step: 150300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:06:43,464-Speed 4988.04 samples/sec Loss 3.3576 Epoch: 9 Global Step: 150350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:06:53,854-Speed 4928.17 samples/sec Loss 3.3323 Epoch: 9 Global Step: 150400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:07:04,202-Speed 4948.38 samples/sec Loss 3.3550 Epoch: 9 Global Step: 150450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:07:14,270-Speed 5085.48 samples/sec Loss 3.3503 Epoch: 9 Global Step: 150500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:07:24,149-Speed 5183.20 samples/sec Loss 3.3751 Epoch: 9 Global Step: 150550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 
06:07:34,254-Speed 5067.23 samples/sec Loss 3.3516 Epoch: 9 Global Step: 150600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:07:45,352-Speed 4613.42 samples/sec Loss 3.4034 Epoch: 9 Global Step: 150650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:07:56,382-Speed 4642.37 samples/sec Loss 3.3513 Epoch: 9 Global Step: 150700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:08:07,633-Speed 4550.84 samples/sec Loss 3.3941 Epoch: 9 Global Step: 150750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:08:17,747-Speed 5062.47 samples/sec Loss 3.3864 Epoch: 9 Global Step: 150800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:08:27,626-Speed 5183.29 samples/sec Loss 3.4039 Epoch: 9 Global Step: 150850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:08:37,528-Speed 5170.81 samples/sec Loss 3.4383 Epoch: 9 Global Step: 150900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:08:47,543-Speed 5112.43 samples/sec Loss 3.3965 Epoch: 9 Global Step: 150950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:08:57,397-Speed 5196.21 samples/sec Loss 3.3838 Epoch: 9 Global Step: 151000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:09:07,256-Speed 5193.31 samples/sec Loss 3.3934 Epoch: 9 Global Step: 151050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:09:17,410-Speed 5042.56 samples/sec Loss 3.4301 Epoch: 9 Global Step: 151100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:09:27,398-Speed 5126.89 samples/sec Loss 3.3799 Epoch: 9 Global Step: 151150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:09:37,456-Speed 5091.03 samples/sec Loss 3.4192 Epoch: 9 Global Step: 151200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:09:47,602-Speed 5046.66 samples/sec Loss 3.4100 Epoch: 9 Global Step: 151250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 
2021-03-19 06:09:57,479-Speed 5183.72 samples/sec Loss 3.4219 Epoch: 9 Global Step: 151300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:10:07,608-Speed 5055.29 samples/sec Loss 3.3985 Epoch: 9 Global Step: 151350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:10:17,477-Speed 5188.66 samples/sec Loss 3.4127 Epoch: 9 Global Step: 151400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:10:27,208-Speed 5261.68 samples/sec Loss 3.3825 Epoch: 9 Global Step: 151450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:10:37,126-Speed 5162.46 samples/sec Loss 3.3883 Epoch: 9 Global Step: 151500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:10:46,941-Speed 5217.15 samples/sec Loss 3.4088 Epoch: 9 Global Step: 151550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:10:56,912-Speed 5135.03 samples/sec Loss 3.3887 Epoch: 9 Global Step: 151600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:11:06,958-Speed 5097.06 samples/sec Loss 3.4284 Epoch: 9 Global Step: 151650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:11:17,129-Speed 5033.77 samples/sec Loss 3.4139 Epoch: 9 Global Step: 151700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:11:26,997-Speed 5189.18 samples/sec Loss 3.4118 Epoch: 9 Global Step: 151750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:11:36,934-Speed 5152.98 samples/sec Loss 3.4212 Epoch: 9 Global Step: 151800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:11:46,985-Speed 5094.46 samples/sec Loss 3.4487 Epoch: 9 Global Step: 151850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:11:57,142-Speed 5040.95 samples/sec Loss 3.4028 Epoch: 9 Global Step: 151900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:12:06,821-Speed 5290.17 samples/sec Loss 3.4355 Epoch: 9 Global Step: 151950 Fp16 Grad Scale: 16384 Required: 12 hours 
Training: 2021-03-19 06:12:16,819-Speed 5121.21 samples/sec Loss 3.4324 Epoch: 9 Global Step: 152000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:12:33,596-[lfw][152000]XNorm: 23.986284 Training: 2021-03-19 06:12:33,596-[lfw][152000]Accuracy-Flip: 0.99683+-0.00283 Training: 2021-03-19 06:12:33,596-[lfw][152000]Accuracy-Highest: 0.99683 Training: 2021-03-19 06:12:52,239-[cfp_fp][152000]XNorm: 19.744008 Training: 2021-03-19 06:12:52,239-[cfp_fp][152000]Accuracy-Flip: 0.95886+-0.00968 Training: 2021-03-19 06:12:52,241-[cfp_fp][152000]Accuracy-Highest: 0.95886 Training: 2021-03-19 06:13:08,610-[agedb_30][152000]XNorm: 22.818813 Training: 2021-03-19 06:13:08,611-[agedb_30][152000]Accuracy-Flip: 0.96383+-0.00727 Training: 2021-03-19 06:13:08,611-[agedb_30][152000]Accuracy-Highest: 0.96867 Training: 2021-03-19 06:13:18,533-Speed 829.65 samples/sec Loss 3.4102 Epoch: 9 Global Step: 152050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:13:28,459-Speed 5158.48 samples/sec Loss 3.4814 Epoch: 9 Global Step: 152100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:13:38,545-Speed 5076.59 samples/sec Loss 3.4360 Epoch: 9 Global Step: 152150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:13:48,433-Speed 5178.21 samples/sec Loss 3.4667 Epoch: 9 Global Step: 152200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:13:58,562-Speed 5054.94 samples/sec Loss 3.4594 Epoch: 9 Global Step: 152250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:14:08,487-Speed 5159.13 samples/sec Loss 3.4707 Epoch: 9 Global Step: 152300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:14:18,097-Speed 5327.94 samples/sec Loss 3.4373 Epoch: 9 Global Step: 152350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:14:28,171-Speed 5082.95 samples/sec Loss 3.4887 Epoch: 9 Global Step: 152400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:14:38,073-Speed 
5170.58 samples/sec Loss 3.4607 Epoch: 9 Global Step: 152450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:14:47,953-Speed 5182.83 samples/sec Loss 3.4898 Epoch: 9 Global Step: 152500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:14:57,890-Speed 5152.54 samples/sec Loss 3.4378 Epoch: 9 Global Step: 152550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:15:07,856-Speed 5137.81 samples/sec Loss 3.4638 Epoch: 9 Global Step: 152600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:15:17,803-Speed 5147.76 samples/sec Loss 3.4626 Epoch: 9 Global Step: 152650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:15:27,870-Speed 5086.23 samples/sec Loss 3.4593 Epoch: 9 Global Step: 152700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:15:37,766-Speed 5173.76 samples/sec Loss 3.4858 Epoch: 9 Global Step: 152750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:15:48,639-Speed 4709.26 samples/sec Loss 3.5150 Epoch: 9 Global Step: 152800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:15:58,924-Speed 4978.44 samples/sec Loss 3.4560 Epoch: 9 Global Step: 152850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:16:08,969-Speed 5097.15 samples/sec Loss 3.4556 Epoch: 9 Global Step: 152900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:16:18,900-Speed 5156.32 samples/sec Loss 3.4496 Epoch: 9 Global Step: 152950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:16:28,886-Speed 5127.17 samples/sec Loss 3.4656 Epoch: 9 Global Step: 153000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:16:38,830-Speed 5149.46 samples/sec Loss 3.5153 Epoch: 9 Global Step: 153050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:16:48,976-Speed 5046.24 samples/sec Loss 3.4863 Epoch: 9 Global Step: 153100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 
06:16:58,942-Speed 5137.85 samples/sec Loss 3.5102 Epoch: 9 Global Step: 153150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:17:09,707-Speed 4756.59 samples/sec Loss 3.4973 Epoch: 9 Global Step: 153200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:17:19,857-Speed 5044.45 samples/sec Loss 3.5309 Epoch: 9 Global Step: 153250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:17:29,658-Speed 5224.81 samples/sec Loss 3.4803 Epoch: 9 Global Step: 153300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-19 06:17:40,474-Speed 4733.59 samples/sec Loss 3.5044 Epoch: 9 Global Step: 153350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:17:50,390-Speed 5163.70 samples/sec Loss 3.4881 Epoch: 9 Global Step: 153400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:18:00,939-Speed 4854.18 samples/sec Loss 3.5085 Epoch: 9 Global Step: 153450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:18:11,216-Speed 4982.10 samples/sec Loss 3.5235 Epoch: 9 Global Step: 153500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:18:21,247-Speed 5104.47 samples/sec Loss 3.5063 Epoch: 9 Global Step: 153550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:18:31,248-Speed 5120.13 samples/sec Loss 3.4802 Epoch: 9 Global Step: 153600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:18:41,059-Speed 5218.97 samples/sec Loss 3.5144 Epoch: 9 Global Step: 153650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:18:51,108-Speed 5095.35 samples/sec Loss 3.5086 Epoch: 9 Global Step: 153700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:19:01,112-Speed 5118.18 samples/sec Loss 3.5104 Epoch: 9 Global Step: 153750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:19:11,084-Speed 5134.62 samples/sec Loss 3.4908 Epoch: 9 Global Step: 153800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 
2021-03-19 06:19:21,063-Speed 5130.72 samples/sec Loss 3.4970 Epoch: 9 Global Step: 153850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:19:31,159-Speed 5071.54 samples/sec Loss 3.5095 Epoch: 9 Global Step: 153900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:19:41,082-Speed 5160.31 samples/sec Loss 3.5248 Epoch: 9 Global Step: 153950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:19:51,192-Speed 5064.51 samples/sec Loss 3.4905 Epoch: 9 Global Step: 154000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:20:07,820-[lfw][154000]XNorm: 23.241708 Training: 2021-03-19 06:20:07,820-[lfw][154000]Accuracy-Flip: 0.99450+-0.00224 Training: 2021-03-19 06:20:07,820-[lfw][154000]Accuracy-Highest: 0.99683 Training: 2021-03-19 06:20:26,428-[cfp_fp][154000]XNorm: 19.388142 Training: 2021-03-19 06:20:26,429-[cfp_fp][154000]Accuracy-Flip: 0.95243+-0.01434 Training: 2021-03-19 06:20:26,429-[cfp_fp][154000]Accuracy-Highest: 0.95886 Training: 2021-03-19 06:20:42,579-[agedb_30][154000]XNorm: 22.665595 Training: 2021-03-19 06:20:42,579-[agedb_30][154000]Accuracy-Flip: 0.96283+-0.01188 Training: 2021-03-19 06:20:42,579-[agedb_30][154000]Accuracy-Highest: 0.96867 Training: 2021-03-19 06:20:54,131-Speed 813.50 samples/sec Loss 3.5634 Epoch: 9 Global Step: 154050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:21:04,892-Speed 4758.26 samples/sec Loss 3.5116 Epoch: 9 Global Step: 154100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:21:14,806-Speed 5164.65 samples/sec Loss 3.5179 Epoch: 9 Global Step: 154150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:21:25,519-Speed 4779.61 samples/sec Loss 3.4885 Epoch: 9 Global Step: 154200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:21:35,546-Speed 5106.57 samples/sec Loss 3.5145 Epoch: 9 Global Step: 154250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:21:45,483-Speed 5152.67 
samples/sec Loss 3.5016 Epoch: 9 Global Step: 154300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:21:55,692-Speed 5015.21 samples/sec Loss 3.5157 Epoch: 9 Global Step: 154350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:22:05,565-Speed 5186.17 samples/sec Loss 3.5047 Epoch: 9 Global Step: 154400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:22:15,639-Speed 5082.91 samples/sec Loss 3.5023 Epoch: 9 Global Step: 154450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:22:25,429-Speed 5230.34 samples/sec Loss 3.5056 Epoch: 9 Global Step: 154500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:22:35,543-Speed 5062.57 samples/sec Loss 3.5325 Epoch: 9 Global Step: 154550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:22:45,816-Speed 4983.85 samples/sec Loss 3.5324 Epoch: 9 Global Step: 154600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:22:55,957-Speed 5049.45 samples/sec Loss 3.5138 Epoch: 9 Global Step: 154650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:23:06,109-Speed 5043.54 samples/sec Loss 3.5543 Epoch: 9 Global Step: 154700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:23:16,034-Speed 5159.47 samples/sec Loss 3.5258 Epoch: 9 Global Step: 154750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:23:26,015-Speed 5129.71 samples/sec Loss 3.5489 Epoch: 9 Global Step: 154800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:23:35,937-Speed 5160.68 samples/sec Loss 3.5598 Epoch: 9 Global Step: 154850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:23:45,984-Speed 5096.72 samples/sec Loss 3.5394 Epoch: 9 Global Step: 154900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:23:56,107-Speed 5057.68 samples/sec Loss 3.5513 Epoch: 9 Global Step: 154950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:24:06,131-Speed 
5108.38 samples/sec Loss 3.5071 Epoch: 9 Global Step: 155000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:24:16,063-Speed 5155.29 samples/sec Loss 3.5433 Epoch: 9 Global Step: 155050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:24:26,073-Speed 5115.03 samples/sec Loss 3.5281 Epoch: 9 Global Step: 155100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:24:35,937-Speed 5191.15 samples/sec Loss 3.5727 Epoch: 9 Global Step: 155150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:24:45,859-Speed 5160.22 samples/sec Loss 3.5094 Epoch: 9 Global Step: 155200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:24:55,815-Speed 5143.00 samples/sec Loss 3.5370 Epoch: 9 Global Step: 155250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:25:05,725-Speed 5167.07 samples/sec Loss 3.5613 Epoch: 9 Global Step: 155300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:25:15,711-Speed 5127.57 samples/sec Loss 3.5528 Epoch: 9 Global Step: 155350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:25:25,658-Speed 5147.17 samples/sec Loss 3.5483 Epoch: 9 Global Step: 155400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:25:35,766-Speed 5065.85 samples/sec Loss 3.5782 Epoch: 9 Global Step: 155450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:25:46,176-Speed 4918.78 samples/sec Loss 3.5582 Epoch: 9 Global Step: 155500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:25:56,029-Speed 5196.46 samples/sec Loss 3.5205 Epoch: 9 Global Step: 155550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:26:06,060-Speed 5104.61 samples/sec Loss 3.5278 Epoch: 9 Global Step: 155600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:26:16,189-Speed 5054.73 samples/sec Loss 3.5665 Epoch: 9 Global Step: 155650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 
06:26:25,999-Speed 5219.90 samples/sec Loss 3.5608 Epoch: 9 Global Step: 155700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:26:36,093-Speed 5072.16 samples/sec Loss 3.5351 Epoch: 9 Global Step: 155750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:26:46,138-Speed 5097.75 samples/sec Loss 3.5493 Epoch: 9 Global Step: 155800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:26:56,202-Speed 5087.39 samples/sec Loss 3.5565 Epoch: 9 Global Step: 155850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:27:06,172-Speed 5136.15 samples/sec Loss 3.5323 Epoch: 9 Global Step: 155900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:27:16,115-Speed 5149.55 samples/sec Loss 3.5556 Epoch: 9 Global Step: 155950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:27:25,941-Speed 5210.93 samples/sec Loss 3.5390 Epoch: 9 Global Step: 156000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:27:42,721-[lfw][156000]XNorm: 25.080852 Training: 2021-03-19 06:27:42,722-[lfw][156000]Accuracy-Flip: 0.99650+-0.00241 Training: 2021-03-19 06:27:42,722-[lfw][156000]Accuracy-Highest: 0.99683 Training: 2021-03-19 06:28:01,424-[cfp_fp][156000]XNorm: 20.465663 Training: 2021-03-19 06:28:01,425-[cfp_fp][156000]Accuracy-Flip: 0.95971+-0.01273 Training: 2021-03-19 06:28:01,425-[cfp_fp][156000]Accuracy-Highest: 0.95971 Training: 2021-03-19 06:28:17,695-[agedb_30][156000]XNorm: 24.106106 Training: 2021-03-19 06:28:17,695-[agedb_30][156000]Accuracy-Flip: 0.96633+-0.01043 Training: 2021-03-19 06:28:17,695-[agedb_30][156000]Accuracy-Highest: 0.96867 Training: 2021-03-19 06:28:27,447-Speed 832.45 samples/sec Loss 3.5671 Epoch: 9 Global Step: 156050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:28:38,552-Speed 4610.78 samples/sec Loss 3.5485 Epoch: 9 Global Step: 156100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:28:48,557-Speed 5117.73 samples/sec 
Loss 3.5769 Epoch: 9 Global Step: 156150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:28:58,713-Speed 5041.70 samples/sec Loss 3.5621 Epoch: 9 Global Step: 156200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:29:08,625-Speed 5165.75 samples/sec Loss 3.5798 Epoch: 9 Global Step: 156250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:29:18,485-Speed 5193.33 samples/sec Loss 3.5587 Epoch: 9 Global Step: 156300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:29:28,513-Speed 5105.63 samples/sec Loss 3.6074 Epoch: 9 Global Step: 156350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:29:38,467-Speed 5144.50 samples/sec Loss 3.5680 Epoch: 9 Global Step: 156400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:29:48,416-Speed 5146.58 samples/sec Loss 3.5351 Epoch: 9 Global Step: 156450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:29:58,748-Speed 4955.69 samples/sec Loss 3.5358 Epoch: 9 Global Step: 156500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:30:08,740-Speed 5124.10 samples/sec Loss 3.5775 Epoch: 9 Global Step: 156550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:30:19,003-Speed 4989.02 samples/sec Loss 3.5849 Epoch: 9 Global Step: 156600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:30:29,678-Speed 4796.53 samples/sec Loss 3.5553 Epoch: 9 Global Step: 156650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:30:39,716-Speed 5101.11 samples/sec Loss 3.6085 Epoch: 9 Global Step: 156700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:30:50,418-Speed 4784.36 samples/sec Loss 3.5783 Epoch: 9 Global Step: 156750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:31:01,287-Speed 4711.11 samples/sec Loss 3.5797 Epoch: 9 Global Step: 156800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:31:11,501-Speed 5012.72 
samples/sec Loss 3.5882 Epoch: 9 Global Step: 156850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:31:21,447-Speed 5148.18 samples/sec Loss 3.5831 Epoch: 9 Global Step: 156900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:31:31,465-Speed 5111.25 samples/sec Loss 3.5528 Epoch: 9 Global Step: 156950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:31:41,547-Speed 5078.68 samples/sec Loss 3.5757 Epoch: 9 Global Step: 157000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:31:51,647-Speed 5069.45 samples/sec Loss 3.5418 Epoch: 9 Global Step: 157050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:32:01,707-Speed 5089.78 samples/sec Loss 3.6066 Epoch: 9 Global Step: 157100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:32:11,519-Speed 5218.29 samples/sec Loss 3.6180 Epoch: 9 Global Step: 157150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:32:21,622-Speed 5067.96 samples/sec Loss 3.5886 Epoch: 9 Global Step: 157200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:32:31,705-Speed 5078.35 samples/sec Loss 3.5658 Epoch: 9 Global Step: 157250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:32:41,573-Speed 5188.83 samples/sec Loss 3.5832 Epoch: 9 Global Step: 157300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:32:53,307-Speed 4363.70 samples/sec Loss 3.5825 Epoch: 9 Global Step: 157350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:33:04,345-Speed 4638.54 samples/sec Loss 3.5543 Epoch: 9 Global Step: 157400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:33:14,094-Speed 5252.04 samples/sec Loss 3.6042 Epoch: 9 Global Step: 157450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:33:24,164-Speed 5084.66 samples/sec Loss 3.5866 Epoch: 9 Global Step: 157500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:33:34,854-Speed 
4790.04 samples/sec Loss 3.5579 Epoch: 9 Global Step: 157550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:33:44,950-Speed 5071.68 samples/sec Loss 3.6222 Epoch: 9 Global Step: 157600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:33:55,034-Speed 5077.37 samples/sec Loss 3.6021 Epoch: 9 Global Step: 157650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:34:05,019-Speed 5128.44 samples/sec Loss 3.5792 Epoch: 9 Global Step: 157700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:34:14,978-Speed 5141.41 samples/sec Loss 3.5742 Epoch: 9 Global Step: 157750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:34:25,173-Speed 5021.89 samples/sec Loss 3.6265 Epoch: 9 Global Step: 157800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:34:34,982-Speed 5220.21 samples/sec Loss 3.5960 Epoch: 9 Global Step: 157850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:34:45,207-Speed 5007.36 samples/sec Loss 3.6229 Epoch: 9 Global Step: 157900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:34:55,170-Speed 5139.57 samples/sec Loss 3.5839 Epoch: 9 Global Step: 157950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:35:05,097-Speed 5157.61 samples/sec Loss 3.5659 Epoch: 9 Global Step: 158000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:35:21,849-[lfw][158000]XNorm: 22.942832 Training: 2021-03-19 06:35:21,849-[lfw][158000]Accuracy-Flip: 0.99667+-0.00298 Training: 2021-03-19 06:35:21,849-[lfw][158000]Accuracy-Highest: 0.99683 Training: 2021-03-19 06:35:40,533-[cfp_fp][158000]XNorm: 18.841374 Training: 2021-03-19 06:35:40,533-[cfp_fp][158000]Accuracy-Flip: 0.95229+-0.01145 Training: 2021-03-19 06:35:40,533-[cfp_fp][158000]Accuracy-Highest: 0.95971 Training: 2021-03-19 06:35:56,765-[agedb_30][158000]XNorm: 22.182208 Training: 2021-03-19 06:35:56,766-[agedb_30][158000]Accuracy-Flip: 0.96567+-0.00907 Training: 
2021-03-19 06:35:56,766-[agedb_30][158000]Accuracy-Highest: 0.96867 Training: 2021-03-19 06:36:06,573-Speed 832.87 samples/sec Loss 3.5922 Epoch: 9 Global Step: 158050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:36:16,650-Speed 5080.98 samples/sec Loss 3.5819 Epoch: 9 Global Step: 158100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:36:26,762-Speed 5063.35 samples/sec Loss 3.5894 Epoch: 9 Global Step: 158150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:36:36,757-Speed 5122.95 samples/sec Loss 3.6233 Epoch: 9 Global Step: 158200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:36:46,745-Speed 5126.69 samples/sec Loss 3.5862 Epoch: 9 Global Step: 158250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:36:56,751-Speed 5117.21 samples/sec Loss 3.6037 Epoch: 9 Global Step: 158300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:37:06,706-Speed 5143.41 samples/sec Loss 3.6132 Epoch: 9 Global Step: 158350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:37:16,889-Speed 5028.10 samples/sec Loss 3.5913 Epoch: 9 Global Step: 158400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:37:26,866-Speed 5132.37 samples/sec Loss 3.6199 Epoch: 9 Global Step: 158450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:37:37,196-Speed 4956.71 samples/sec Loss 3.6452 Epoch: 9 Global Step: 158500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:37:47,321-Speed 5057.18 samples/sec Loss 3.6129 Epoch: 9 Global Step: 158550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:37:57,172-Speed 5197.36 samples/sec Loss 3.5754 Epoch: 9 Global Step: 158600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:38:07,358-Speed 5027.07 samples/sec Loss 3.5604 Epoch: 9 Global Step: 158650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:38:17,518-Speed 5039.78 samples/sec Loss 3.6112 
Epoch: 9 Global Step: 158700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:38:27,647-Speed 5054.80 samples/sec Loss 3.6084 Epoch: 9 Global Step: 158750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:38:37,695-Speed 5096.15 samples/sec Loss 3.6052 Epoch: 9 Global Step: 158800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:38:47,703-Speed 5116.05 samples/sec Loss 3.5987 Epoch: 9 Global Step: 158850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:38:57,621-Speed 5162.59 samples/sec Loss 3.6390 Epoch: 9 Global Step: 158900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:39:07,606-Speed 5127.96 samples/sec Loss 3.5908 Epoch: 9 Global Step: 158950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:39:17,548-Speed 5150.50 samples/sec Loss 3.6137 Epoch: 9 Global Step: 159000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:39:27,611-Speed 5088.28 samples/sec Loss 3.6012 Epoch: 9 Global Step: 159050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:39:37,537-Speed 5158.12 samples/sec Loss 3.6194 Epoch: 9 Global Step: 159100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:39:47,662-Speed 5057.17 samples/sec Loss 3.6179 Epoch: 9 Global Step: 159150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:39:57,904-Speed 4999.71 samples/sec Loss 3.5799 Epoch: 9 Global Step: 159200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:40:07,888-Speed 5128.23 samples/sec Loss 3.6125 Epoch: 9 Global Step: 159250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:40:17,897-Speed 5115.65 samples/sec Loss 3.6310 Epoch: 9 Global Step: 159300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:40:27,829-Speed 5155.41 samples/sec Loss 3.6471 Epoch: 9 Global Step: 159350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:40:37,917-Speed 5075.70 samples/sec Loss 
3.6128 Epoch: 9 Global Step: 159400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:40:47,955-Speed 5100.65 samples/sec Loss 3.6001 Epoch: 9 Global Step: 159450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:40:58,971-Speed 4648.12 samples/sec Loss 3.5957 Epoch: 9 Global Step: 159500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:41:08,988-Speed 5111.93 samples/sec Loss 3.6162 Epoch: 9 Global Step: 159550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:41:18,908-Speed 5161.54 samples/sec Loss 3.6493 Epoch: 9 Global Step: 159600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:41:28,897-Speed 5125.52 samples/sec Loss 3.6254 Epoch: 9 Global Step: 159650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:41:38,861-Speed 5138.73 samples/sec Loss 3.6190 Epoch: 9 Global Step: 159700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:41:48,856-Speed 5122.81 samples/sec Loss 3.5969 Epoch: 9 Global Step: 159750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:41:58,895-Speed 5100.38 samples/sec Loss 3.6308 Epoch: 9 Global Step: 159800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:42:09,028-Speed 5053.42 samples/sec Loss 3.5952 Epoch: 9 Global Step: 159850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:42:19,387-Speed 4942.93 samples/sec Loss 3.6481 Epoch: 9 Global Step: 159900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:42:29,370-Speed 5128.87 samples/sec Loss 3.6268 Epoch: 9 Global Step: 159950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:42:40,102-Speed 4771.15 samples/sec Loss 3.5878 Epoch: 9 Global Step: 160000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:42:56,851-[lfw][160000]XNorm: 23.613803 Training: 2021-03-19 06:42:56,851-[lfw][160000]Accuracy-Flip: 0.99517+-0.00283 Training: 2021-03-19 
06:42:56,851-[lfw][160000]Accuracy-Highest: 0.99683 Training: 2021-03-19 06:43:15,525-[cfp_fp][160000]XNorm: 19.169997 Training: 2021-03-19 06:43:15,525-[cfp_fp][160000]Accuracy-Flip: 0.95714+-0.01303 Training: 2021-03-19 06:43:15,525-[cfp_fp][160000]Accuracy-Highest: 0.95971 Training: 2021-03-19 06:43:31,647-[agedb_30][160000]XNorm: 22.825963 Training: 2021-03-19 06:43:31,647-[agedb_30][160000]Accuracy-Flip: 0.96550+-0.01003 Training: 2021-03-19 06:43:31,647-[agedb_30][160000]Accuracy-Highest: 0.96867 Training: 2021-03-19 06:43:41,473-Speed 834.28 samples/sec Loss 3.6191 Epoch: 9 Global Step: 160050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:43:52,406-Speed 4683.46 samples/sec Loss 3.6739 Epoch: 9 Global Step: 160100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:44:03,091-Speed 4791.79 samples/sec Loss 3.5956 Epoch: 9 Global Step: 160150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:44:13,421-Speed 4956.87 samples/sec Loss 3.6190 Epoch: 9 Global Step: 160200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:44:23,495-Speed 5082.65 samples/sec Loss 3.6422 Epoch: 9 Global Step: 160250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:44:33,616-Speed 5059.47 samples/sec Loss 3.6267 Epoch: 9 Global Step: 160300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:44:43,752-Speed 5051.55 samples/sec Loss 3.6432 Epoch: 9 Global Step: 160350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:44:53,644-Speed 5176.14 samples/sec Loss 3.6199 Epoch: 9 Global Step: 160400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:45:03,579-Speed 5154.06 samples/sec Loss 3.6260 Epoch: 9 Global Step: 160450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:45:13,602-Speed 5108.76 samples/sec Loss 3.5930 Epoch: 9 Global Step: 160500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:45:23,775-Speed 5033.27 samples/sec 
Loss 3.6060 Epoch: 9 Global Step: 160550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:45:33,620-Speed 5200.88 samples/sec Loss 3.5933 Epoch: 9 Global Step: 160600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:45:43,638-Speed 5111.26 samples/sec Loss 3.6753 Epoch: 9 Global Step: 160650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:45:54,352-Speed 4778.84 samples/sec Loss 3.6361 Epoch: 9 Global Step: 160700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:46:05,156-Speed 4739.50 samples/sec Loss 3.6496 Epoch: 9 Global Step: 160750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:46:15,992-Speed 4725.42 samples/sec Loss 3.6276 Epoch: 9 Global Step: 160800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:46:25,997-Speed 5117.74 samples/sec Loss 3.5670 Epoch: 9 Global Step: 160850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:46:36,123-Speed 5056.36 samples/sec Loss 3.6479 Epoch: 9 Global Step: 160900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:46:46,914-Speed 4745.11 samples/sec Loss 3.6324 Epoch: 9 Global Step: 160950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:46:57,026-Speed 5063.56 samples/sec Loss 3.6114 Epoch: 9 Global Step: 161000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:47:07,132-Speed 5066.76 samples/sec Loss 3.6301 Epoch: 9 Global Step: 161050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:47:17,152-Speed 5109.87 samples/sec Loss 3.5979 Epoch: 9 Global Step: 161100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:47:27,035-Speed 5181.23 samples/sec Loss 3.6492 Epoch: 9 Global Step: 161150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:47:36,967-Speed 5155.22 samples/sec Loss 3.6615 Epoch: 9 Global Step: 161200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:47:47,301-Speed 4954.86 
samples/sec Loss 3.6097 Epoch: 9 Global Step: 161250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:47:57,448-Speed 5045.98 samples/sec Loss 3.6511 Epoch: 9 Global Step: 161300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:48:07,584-Speed 5051.39 samples/sec Loss 3.6656 Epoch: 9 Global Step: 161350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:48:17,642-Speed 5091.25 samples/sec Loss 3.6545 Epoch: 9 Global Step: 161400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:48:27,525-Speed 5180.72 samples/sec Loss 3.6203 Epoch: 9 Global Step: 161450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:48:37,904-Speed 4933.30 samples/sec Loss 3.6220 Epoch: 9 Global Step: 161500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:48:48,045-Speed 5048.92 samples/sec Loss 3.6444 Epoch: 9 Global Step: 161550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:48:58,180-Speed 5052.07 samples/sec Loss 3.6542 Epoch: 9 Global Step: 161600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:49:08,166-Speed 5127.71 samples/sec Loss 3.6537 Epoch: 9 Global Step: 161650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:49:18,071-Speed 5169.37 samples/sec Loss 3.6145 Epoch: 9 Global Step: 161700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:49:28,103-Speed 5103.74 samples/sec Loss 3.6492 Epoch: 9 Global Step: 161750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:49:37,989-Speed 5179.65 samples/sec Loss 3.6016 Epoch: 9 Global Step: 161800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:49:48,023-Speed 5103.13 samples/sec Loss 3.6798 Epoch: 9 Global Step: 161850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:49:57,958-Speed 5153.70 samples/sec Loss 3.6373 Epoch: 9 Global Step: 161900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:50:08,012-Speed 
5092.62 samples/sec Loss 3.6712 Epoch: 9 Global Step: 161950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:50:17,904-Speed 5176.18 samples/sec Loss 3.6703 Epoch: 9 Global Step: 162000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:50:34,826-[lfw][162000]XNorm: 21.999317 Training: 2021-03-19 06:50:34,826-[lfw][162000]Accuracy-Flip: 0.99567+-0.00271 Training: 2021-03-19 06:50:34,826-[lfw][162000]Accuracy-Highest: 0.99683 Training: 2021-03-19 06:50:53,509-[cfp_fp][162000]XNorm: 17.799416 Training: 2021-03-19 06:50:53,509-[cfp_fp][162000]Accuracy-Flip: 0.95643+-0.01060 Training: 2021-03-19 06:50:53,509-[cfp_fp][162000]Accuracy-Highest: 0.95971 Training: 2021-03-19 06:51:09,701-[agedb_30][162000]XNorm: 21.127390 Training: 2021-03-19 06:51:09,701-[agedb_30][162000]Accuracy-Flip: 0.96400+-0.00920 Training: 2021-03-19 06:51:09,701-[agedb_30][162000]Accuracy-Highest: 0.96867 Training: 2021-03-19 06:51:19,679-Speed 828.83 samples/sec Loss 3.6451 Epoch: 9 Global Step: 162050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:51:29,652-Speed 5133.99 samples/sec Loss 3.6688 Epoch: 9 Global Step: 162100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:51:39,769-Speed 5061.08 samples/sec Loss 3.6130 Epoch: 9 Global Step: 162150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:51:50,193-Speed 4912.01 samples/sec Loss 3.6423 Epoch: 9 Global Step: 162200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:52:00,116-Speed 5159.96 samples/sec Loss 3.6698 Epoch: 9 Global Step: 162250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:52:10,126-Speed 5115.46 samples/sec Loss 3.6374 Epoch: 9 Global Step: 162300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:52:20,146-Speed 5109.64 samples/sec Loss 3.6335 Epoch: 9 Global Step: 162350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:52:30,244-Speed 5070.96 samples/sec Loss 3.6306 Epoch: 9 
Global Step: 162400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:52:40,397-Speed 5042.90 samples/sec Loss 3.6434 Epoch: 9 Global Step: 162450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:52:50,274-Speed 5184.14 samples/sec Loss 3.6852 Epoch: 9 Global Step: 162500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:53:00,283-Speed 5115.38 samples/sec Loss 3.6515 Epoch: 9 Global Step: 162550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:53:10,301-Speed 5111.56 samples/sec Loss 3.6511 Epoch: 9 Global Step: 162600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:53:20,020-Speed 5268.06 samples/sec Loss 3.6394 Epoch: 9 Global Step: 162650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:53:30,161-Speed 5049.22 samples/sec Loss 3.7010 Epoch: 9 Global Step: 162700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:53:40,151-Speed 5125.43 samples/sec Loss 3.6893 Epoch: 9 Global Step: 162750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:53:50,172-Speed 5109.49 samples/sec Loss 3.6182 Epoch: 9 Global Step: 162800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:54:00,845-Speed 4797.65 samples/sec Loss 3.6786 Epoch: 9 Global Step: 162850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:54:10,847-Speed 5119.00 samples/sec Loss 3.6631 Epoch: 9 Global Step: 162900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:54:20,811-Speed 5138.84 samples/sec Loss 3.6533 Epoch: 9 Global Step: 162950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:54:30,720-Speed 5167.49 samples/sec Loss 3.6559 Epoch: 9 Global Step: 163000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:54:40,809-Speed 5075.14 samples/sec Loss 3.6585 Epoch: 9 Global Step: 163050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:54:50,720-Speed 5166.12 samples/sec Loss 3.6607 Epoch: 
9 Global Step: 163100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:55:00,803-Speed 5078.09 samples/sec Loss 3.6798 Epoch: 9 Global Step: 163150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:55:10,775-Speed 5134.43 samples/sec Loss 3.6515 Epoch: 9 Global Step: 163200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:55:20,597-Speed 5213.36 samples/sec Loss 3.6870 Epoch: 9 Global Step: 163250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:55:31,298-Speed 4784.95 samples/sec Loss 3.6427 Epoch: 9 Global Step: 163300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:55:41,248-Speed 5145.79 samples/sec Loss 3.6977 Epoch: 9 Global Step: 163350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:55:51,863-Speed 4823.69 samples/sec Loss 3.6432 Epoch: 9 Global Step: 163400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:56:01,734-Speed 5187.43 samples/sec Loss 3.6456 Epoch: 9 Global Step: 163450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:56:12,310-Speed 4841.28 samples/sec Loss 3.6459 Epoch: 9 Global Step: 163500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:56:22,233-Speed 5159.69 samples/sec Loss 3.6852 Epoch: 9 Global Step: 163550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:56:32,140-Speed 5168.73 samples/sec Loss 3.6262 Epoch: 9 Global Step: 163600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:56:42,376-Speed 5002.10 samples/sec Loss 3.6336 Epoch: 9 Global Step: 163650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:56:52,508-Speed 5053.57 samples/sec Loss 3.6863 Epoch: 9 Global Step: 163700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:57:02,696-Speed 5025.95 samples/sec Loss 3.6191 Epoch: 9 Global Step: 163750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:57:12,614-Speed 5162.82 samples/sec Loss 3.6409 
Epoch: 9 Global Step: 163800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:57:22,655-Speed 5099.64 samples/sec Loss 3.6526 Epoch: 9 Global Step: 163850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:57:32,646-Speed 5124.49 samples/sec Loss 3.6774 Epoch: 9 Global Step: 163900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:57:42,774-Speed 5055.62 samples/sec Loss 3.6371 Epoch: 9 Global Step: 163950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:57:52,881-Speed 5066.09 samples/sec Loss 3.6941 Epoch: 9 Global Step: 164000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:58:09,389-[lfw][164000]XNorm: 23.283027 Training: 2021-03-19 06:58:09,389-[lfw][164000]Accuracy-Flip: 0.99633+-0.00306 Training: 2021-03-19 06:58:09,389-[lfw][164000]Accuracy-Highest: 0.99683 Training: 2021-03-19 06:58:28,069-[cfp_fp][164000]XNorm: 18.908500 Training: 2021-03-19 06:58:28,069-[cfp_fp][164000]Accuracy-Flip: 0.95471+-0.01116 Training: 2021-03-19 06:58:28,069-[cfp_fp][164000]Accuracy-Highest: 0.95971 Training: 2021-03-19 06:58:44,106-[agedb_30][164000]XNorm: 22.273113 Training: 2021-03-19 06:58:44,107-[agedb_30][164000]Accuracy-Flip: 0.96600+-0.00841 Training: 2021-03-19 06:58:44,107-[agedb_30][164000]Accuracy-Highest: 0.96867 Training: 2021-03-19 06:58:54,854-Speed 826.17 samples/sec Loss 3.6898 Epoch: 9 Global Step: 164050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:59:06,264-Speed 4487.51 samples/sec Loss 3.6702 Epoch: 9 Global Step: 164100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:59:16,605-Speed 4951.73 samples/sec Loss 3.6350 Epoch: 9 Global Step: 164150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:59:26,888-Speed 4979.30 samples/sec Loss 3.6750 Epoch: 9 Global Step: 164200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:59:36,844-Speed 5143.08 samples/sec Loss 3.6716 Epoch: 9 Global Step: 164250 Fp16 Grad 
Scale: 16384 Required: 11 hours Training: 2021-03-19 06:59:47,491-Speed 4808.85 samples/sec Loss 3.6685 Epoch: 9 Global Step: 164300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 06:59:57,541-Speed 5094.96 samples/sec Loss 3.6871 Epoch: 9 Global Step: 164350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:00:07,767-Speed 5007.25 samples/sec Loss 3.6719 Epoch: 9 Global Step: 164400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:00:17,714-Speed 5147.52 samples/sec Loss 3.6993 Epoch: 9 Global Step: 164450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:00:27,616-Speed 5171.10 samples/sec Loss 3.6800 Epoch: 9 Global Step: 164500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:00:37,663-Speed 5096.16 samples/sec Loss 3.6586 Epoch: 9 Global Step: 164550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:00:47,816-Speed 5043.47 samples/sec Loss 3.6853 Epoch: 9 Global Step: 164600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:00:57,941-Speed 5056.94 samples/sec Loss 3.6405 Epoch: 9 Global Step: 164650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:01:08,019-Speed 5080.55 samples/sec Loss 3.6604 Epoch: 9 Global Step: 164700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:01:18,037-Speed 5110.96 samples/sec Loss 3.6610 Epoch: 9 Global Step: 164750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:01:28,034-Speed 5122.07 samples/sec Loss 3.6979 Epoch: 9 Global Step: 164800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:01:38,054-Speed 5109.86 samples/sec Loss 3.6704 Epoch: 9 Global Step: 164850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:01:48,026-Speed 5135.03 samples/sec Loss 3.6686 Epoch: 9 Global Step: 164900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:01:57,978-Speed 5144.58 samples/sec Loss 3.6527 Epoch: 9 Global Step: 164950 Fp16 
Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:02:08,090-Speed 5063.67 samples/sec Loss 3.6410 Epoch: 9 Global Step: 165000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:02:18,324-Speed 5003.42 samples/sec Loss 3.6543 Epoch: 9 Global Step: 165050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:02:28,488-Speed 5037.64 samples/sec Loss 3.7013 Epoch: 9 Global Step: 165100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:02:38,377-Speed 5177.71 samples/sec Loss 3.6604 Epoch: 9 Global Step: 165150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:02:48,298-Speed 5160.88 samples/sec Loss 3.6646 Epoch: 9 Global Step: 165200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:02:58,545-Speed 4997.02 samples/sec Loss 3.6963 Epoch: 9 Global Step: 165250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:03:08,587-Speed 5098.60 samples/sec Loss 3.6487 Epoch: 9 Global Step: 165300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:03:18,627-Speed 5099.76 samples/sec Loss 3.6784 Epoch: 9 Global Step: 165350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:03:29,039-Speed 4917.97 samples/sec Loss 3.6822 Epoch: 9 Global Step: 165400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:03:39,133-Speed 5072.38 samples/sec Loss 3.6659 Epoch: 9 Global Step: 165450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:03:49,171-Speed 5100.90 samples/sec Loss 3.6498 Epoch: 9 Global Step: 165500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:03:59,313-Speed 5048.74 samples/sec Loss 3.6687 Epoch: 9 Global Step: 165550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:04:09,208-Speed 5174.50 samples/sec Loss 3.6592 Epoch: 9 Global Step: 165600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:04:19,224-Speed 5111.96 samples/sec Loss 3.6987 Epoch: 9 Global Step: 165650 
Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:04:29,034-Speed 5219.76 samples/sec Loss 3.6596 Epoch: 9 Global Step: 165700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:04:38,982-Speed 5147.03 samples/sec Loss 3.6638 Epoch: 9 Global Step: 165750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:04:48,807-Speed 5211.56 samples/sec Loss 3.6728 Epoch: 9 Global Step: 165800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:04:58,578-Speed 5240.22 samples/sec Loss 3.6971 Epoch: 9 Global Step: 165850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:05:08,403-Speed 5211.15 samples/sec Loss 3.6932 Epoch: 9 Global Step: 165900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:05:18,664-Speed 4990.14 samples/sec Loss 3.7058 Epoch: 9 Global Step: 165950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:05:28,617-Speed 5144.45 samples/sec Loss 3.6796 Epoch: 9 Global Step: 166000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:05:45,433-[lfw][166000]XNorm: 23.447003 Training: 2021-03-19 07:05:45,433-[lfw][166000]Accuracy-Flip: 0.99567+-0.00309 Training: 2021-03-19 07:05:45,435-[lfw][166000]Accuracy-Highest: 0.99683 Training: 2021-03-19 07:06:04,241-[cfp_fp][166000]XNorm: 19.023471 Training: 2021-03-19 07:06:04,242-[cfp_fp][166000]Accuracy-Flip: 0.95329+-0.01040 Training: 2021-03-19 07:06:04,242-[cfp_fp][166000]Accuracy-Highest: 0.95971 Training: 2021-03-19 07:06:20,522-[agedb_30][166000]XNorm: 22.612187 Training: 2021-03-19 07:06:20,522-[agedb_30][166000]Accuracy-Flip: 0.96750+-0.00793 Training: 2021-03-19 07:06:20,523-[agedb_30][166000]Accuracy-Highest: 0.96867 Training: 2021-03-19 07:06:30,426-Speed 828.37 samples/sec Loss 3.6890 Epoch: 9 Global Step: 166050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:06:40,390-Speed 5138.87 samples/sec Loss 3.6473 Epoch: 9 Global Step: 166100 Fp16 Grad Scale: 16384 Required: 11 hours 
Training: 2021-03-19 07:06:50,162-Speed 5239.78 samples/sec Loss 3.6725 Epoch: 9 Global Step: 166150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:07:01,106-Speed 4678.49 samples/sec Loss 3.7134 Epoch: 9 Global Step: 166200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:07:11,228-Speed 5058.88 samples/sec Loss 3.7042 Epoch: 9 Global Step: 166250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:07:21,087-Speed 5193.72 samples/sec Loss 3.6949 Epoch: 9 Global Step: 166300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:07:31,306-Speed 5010.73 samples/sec Loss 3.6722 Epoch: 9 Global Step: 166350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:07:41,128-Speed 5212.74 samples/sec Loss 3.6807 Epoch: 9 Global Step: 166400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:07:51,184-Speed 5091.87 samples/sec Loss 3.7204 Epoch: 9 Global Step: 166450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:08:01,060-Speed 5184.87 samples/sec Loss 3.6758 Epoch: 9 Global Step: 166500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:08:11,168-Speed 5065.28 samples/sec Loss 3.6725 Epoch: 9 Global Step: 166550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:08:21,190-Speed 5108.93 samples/sec Loss 3.6874 Epoch: 9 Global Step: 166600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:08:32,045-Speed 4717.15 samples/sec Loss 3.6690 Epoch: 9 Global Step: 166650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:08:41,934-Speed 5177.51 samples/sec Loss 3.6605 Epoch: 9 Global Step: 166700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:08:52,782-Speed 4720.30 samples/sec Loss 3.6613 Epoch: 9 Global Step: 166750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:09:02,946-Speed 5037.76 samples/sec Loss 3.6892 Epoch: 9 Global Step: 166800 Fp16 Grad Scale: 16384 Required: 11 
hours Training: 2021-03-19 07:09:13,929-Speed 4661.83 samples/sec Loss 3.6325 Epoch: 9 Global Step: 166850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:09:24,488-Speed 4849.37 samples/sec Loss 3.6876 Epoch: 9 Global Step: 166900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:09:46,458-Speed 2330.45 samples/sec Loss 3.2981 Epoch: 10 Global Step: 166950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:09:56,476-Speed 5111.58 samples/sec Loss 3.2748 Epoch: 10 Global Step: 167000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:10:06,587-Speed 5064.12 samples/sec Loss 3.2372 Epoch: 10 Global Step: 167050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:10:16,791-Speed 5018.00 samples/sec Loss 3.2626 Epoch: 10 Global Step: 167100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:10:26,772-Speed 5130.09 samples/sec Loss 3.2659 Epoch: 10 Global Step: 167150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:10:36,738-Speed 5137.66 samples/sec Loss 3.2501 Epoch: 10 Global Step: 167200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:10:46,969-Speed 5004.73 samples/sec Loss 3.2492 Epoch: 10 Global Step: 167250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:10:57,112-Speed 5048.11 samples/sec Loss 3.2719 Epoch: 10 Global Step: 167300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:11:07,093-Speed 5130.23 samples/sec Loss 3.2766 Epoch: 10 Global Step: 167350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:11:17,155-Speed 5088.64 samples/sec Loss 3.2848 Epoch: 10 Global Step: 167400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:11:29,510-Speed 4144.18 samples/sec Loss 3.2960 Epoch: 10 Global Step: 167450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:11:39,406-Speed 5174.04 samples/sec Loss 3.2893 Epoch: 10 Global Step: 167500 Fp16 Grad Scale: 
16384 Required: 11 hours Training: 2021-03-19 07:11:49,022-Speed 5324.80 samples/sec Loss 3.3228 Epoch: 10 Global Step: 167550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:11:59,863-Speed 4723.37 samples/sec Loss 3.3096 Epoch: 10 Global Step: 167600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:12:10,100-Speed 5001.83 samples/sec Loss 3.2905 Epoch: 10 Global Step: 167650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:12:20,163-Speed 5088.09 samples/sec Loss 3.3407 Epoch: 10 Global Step: 167700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:12:29,996-Speed 5207.28 samples/sec Loss 3.2862 Epoch: 10 Global Step: 167750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:12:40,028-Speed 5104.08 samples/sec Loss 3.3257 Epoch: 10 Global Step: 167800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:12:50,344-Speed 4963.37 samples/sec Loss 3.3054 Epoch: 10 Global Step: 167850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:13:00,287-Speed 5149.28 samples/sec Loss 3.3287 Epoch: 10 Global Step: 167900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:13:10,254-Speed 5137.48 samples/sec Loss 3.3351 Epoch: 10 Global Step: 167950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:13:20,355-Speed 5068.83 samples/sec Loss 3.3172 Epoch: 10 Global Step: 168000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:13:37,265-[lfw][168000]XNorm: 24.310461 Training: 2021-03-19 07:13:37,265-[lfw][168000]Accuracy-Flip: 0.99600+-0.00271 Training: 2021-03-19 07:13:37,265-[lfw][168000]Accuracy-Highest: 0.99683 Training: 2021-03-19 07:13:55,909-[cfp_fp][168000]XNorm: 19.850797 Training: 2021-03-19 07:13:55,909-[cfp_fp][168000]Accuracy-Flip: 0.95343+-0.01292 Training: 2021-03-19 07:13:55,909-[cfp_fp][168000]Accuracy-Highest: 0.95971 Training: 2021-03-19 07:14:12,123-[agedb_30][168000]XNorm: 23.363582 Training: 2021-03-19 
07:14:12,123-[agedb_30][168000]Accuracy-Flip: 0.96467+-0.00759 Training: 2021-03-19 07:14:12,123-[agedb_30][168000]Accuracy-Highest: 0.96867 Training: 2021-03-19 07:14:21,982-Speed 830.81 samples/sec Loss 3.3284 Epoch: 10 Global Step: 168050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:14:32,173-Speed 5024.24 samples/sec Loss 3.3253 Epoch: 10 Global Step: 168100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:14:42,107-Speed 5154.11 samples/sec Loss 3.3118 Epoch: 10 Global Step: 168150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:14:51,964-Speed 5195.11 samples/sec Loss 3.3128 Epoch: 10 Global Step: 168200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:15:01,841-Speed 5183.95 samples/sec Loss 3.3582 Epoch: 10 Global Step: 168250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:15:11,859-Speed 5111.46 samples/sec Loss 3.3169 Epoch: 10 Global Step: 168300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:15:21,924-Speed 5087.05 samples/sec Loss 3.3331 Epoch: 10 Global Step: 168350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:15:31,951-Speed 5107.04 samples/sec Loss 3.3498 Epoch: 10 Global Step: 168400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:15:41,963-Speed 5114.30 samples/sec Loss 3.3602 Epoch: 10 Global Step: 168450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:15:52,072-Speed 5064.95 samples/sec Loss 3.3833 Epoch: 10 Global Step: 168500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:16:02,450-Speed 4934.09 samples/sec Loss 3.4171 Epoch: 10 Global Step: 168550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:16:12,524-Speed 5082.71 samples/sec Loss 3.3278 Epoch: 10 Global Step: 168600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:16:22,481-Speed 5142.59 samples/sec Loss 3.3868 Epoch: 10 Global Step: 168650 Fp16 Grad Scale: 16384 
Required: 11 hours Training: 2021-03-19 07:16:32,578-Speed 5071.12 samples/sec Loss 3.3582 Epoch: 10 Global Step: 168700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:16:42,597-Speed 5110.21 samples/sec Loss 3.3863 Epoch: 10 Global Step: 168750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:16:52,620-Speed 5108.41 samples/sec Loss 3.4189 Epoch: 10 Global Step: 168800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:17:02,802-Speed 5029.14 samples/sec Loss 3.3844 Epoch: 10 Global Step: 168850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:17:13,046-Speed 4998.02 samples/sec Loss 3.3631 Epoch: 10 Global Step: 168900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:17:23,029-Speed 5129.27 samples/sec Loss 3.3962 Epoch: 10 Global Step: 168950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:17:33,352-Speed 4960.25 samples/sec Loss 3.3877 Epoch: 10 Global Step: 169000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-19 07:17:43,475-Speed 5057.85 samples/sec Loss 3.3653 Epoch: 10 Global Step: 169050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:17:53,434-Speed 5141.37 samples/sec Loss 3.4044 Epoch: 10 Global Step: 169100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:18:03,410-Speed 5132.86 samples/sec Loss 3.4359 Epoch: 10 Global Step: 169150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:18:13,275-Speed 5190.15 samples/sec Loss 3.4103 Epoch: 10 Global Step: 169200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:18:23,394-Speed 5060.11 samples/sec Loss 3.4231 Epoch: 10 Global Step: 169250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:18:33,594-Speed 5019.71 samples/sec Loss 3.3735 Epoch: 10 Global Step: 169300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:18:43,442-Speed 5199.55 samples/sec Loss 3.4211 Epoch: 10 Global Step: 169350 Fp16 
Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:18:53,450-Speed 5116.29 samples/sec Loss 3.4456 Epoch: 10 Global Step: 169400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:19:03,276-Speed 5210.69 samples/sec Loss 3.3960 Epoch: 10 Global Step: 169450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:19:13,363-Speed 5076.07 samples/sec Loss 3.4289 Epoch: 10 Global Step: 169500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:19:23,332-Speed 5136.35 samples/sec Loss 3.4007 Epoch: 10 Global Step: 169550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:19:34,173-Speed 4723.21 samples/sec Loss 3.3952 Epoch: 10 Global Step: 169600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:19:44,133-Speed 5140.84 samples/sec Loss 3.4252 Epoch: 10 Global Step: 169650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:19:54,257-Speed 5057.79 samples/sec Loss 3.4128 Epoch: 10 Global Step: 169700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:20:04,400-Speed 5048.20 samples/sec Loss 3.4382 Epoch: 10 Global Step: 169750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:20:14,403-Speed 5118.73 samples/sec Loss 3.3897 Epoch: 10 Global Step: 169800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:20:24,400-Speed 5121.77 samples/sec Loss 3.4304 Epoch: 10 Global Step: 169850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:20:34,539-Speed 5050.03 samples/sec Loss 3.4666 Epoch: 10 Global Step: 169900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:20:44,421-Speed 5181.57 samples/sec Loss 3.4499 Epoch: 10 Global Step: 169950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:20:54,452-Speed 5104.32 samples/sec Loss 3.4356 Epoch: 10 Global Step: 170000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:21:11,297-[lfw][170000]XNorm: 24.222418 Training: 2021-03-19 
07:21:11,297-[lfw][170000]Accuracy-Flip: 0.99700+-0.00245 Training: 2021-03-19 07:21:11,297-[lfw][170000]Accuracy-Highest: 0.99700 Training: 2021-03-19 07:21:29,981-[cfp_fp][170000]XNorm: 19.557524 Training: 2021-03-19 07:21:29,982-[cfp_fp][170000]Accuracy-Flip: 0.95214+-0.01642 Training: 2021-03-19 07:21:29,983-[cfp_fp][170000]Accuracy-Highest: 0.95971 Training: 2021-03-19 07:21:46,216-[agedb_30][170000]XNorm: 23.217369 Training: 2021-03-19 07:21:46,216-[agedb_30][170000]Accuracy-Flip: 0.96683+-0.00835 Training: 2021-03-19 07:21:46,217-[agedb_30][170000]Accuracy-Highest: 0.96867 Training: 2021-03-19 07:21:57,095-Speed 817.35 samples/sec Loss 3.4611 Epoch: 10 Global Step: 170050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:22:07,926-Speed 4727.37 samples/sec Loss 3.4623 Epoch: 10 Global Step: 170100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:22:17,931-Speed 5117.57 samples/sec Loss 3.4396 Epoch: 10 Global Step: 170150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:22:27,861-Speed 5156.82 samples/sec Loss 3.4425 Epoch: 10 Global Step: 170200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:22:38,040-Speed 5029.93 samples/sec Loss 3.4071 Epoch: 10 Global Step: 170250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:22:48,898-Speed 4715.77 samples/sec Loss 3.4328 Epoch: 10 Global Step: 170300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:22:58,738-Speed 5203.56 samples/sec Loss 3.4560 Epoch: 10 Global Step: 170350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:23:08,674-Speed 5153.16 samples/sec Loss 3.4782 Epoch: 10 Global Step: 170400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:23:18,753-Speed 5080.50 samples/sec Loss 3.4401 Epoch: 10 Global Step: 170450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:23:28,654-Speed 5171.27 samples/sec Loss 3.4624 Epoch: 10 Global Step: 170500 Fp16 Grad 
Scale: 16384 Required: 10 hours Training: 2021-03-19 07:23:38,748-Speed 5072.54 samples/sec Loss 3.4697 Epoch: 10 Global Step: 170550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:23:48,843-Speed 5072.23 samples/sec Loss 3.4853 Epoch: 10 Global Step: 170600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:23:59,032-Speed 5025.21 samples/sec Loss 3.5112 Epoch: 10 Global Step: 170650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:24:09,209-Speed 5031.01 samples/sec Loss 3.5265 Epoch: 10 Global Step: 170700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:24:19,372-Speed 5038.24 samples/sec Loss 3.4766 Epoch: 10 Global Step: 170750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:24:30,374-Speed 4654.21 samples/sec Loss 3.4700 Epoch: 10 Global Step: 170800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:24:41,686-Speed 4526.10 samples/sec Loss 3.4417 Epoch: 10 Global Step: 170850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:24:51,778-Speed 5073.73 samples/sec Loss 3.4444 Epoch: 10 Global Step: 170900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:25:02,613-Speed 4725.39 samples/sec Loss 3.4925 Epoch: 10 Global Step: 170950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:25:12,613-Speed 5120.31 samples/sec Loss 3.4846 Epoch: 10 Global Step: 171000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:25:22,784-Speed 5034.57 samples/sec Loss 3.4808 Epoch: 10 Global Step: 171050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:25:32,739-Speed 5143.44 samples/sec Loss 3.4889 Epoch: 10 Global Step: 171100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:25:42,756-Speed 5111.29 samples/sec Loss 3.4966 Epoch: 10 Global Step: 171150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:25:52,810-Speed 5092.87 samples/sec Loss 3.4819 Epoch: 10 Global Step: 
171200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:26:02,822-Speed 5114.25 samples/sec Loss 3.5088 Epoch: 10 Global Step: 171250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:26:12,853-Speed 5104.59 samples/sec Loss 3.4794 Epoch: 10 Global Step: 171300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:26:22,859-Speed 5117.23 samples/sec Loss 3.4865 Epoch: 10 Global Step: 171350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:26:32,819-Speed 5140.67 samples/sec Loss 3.4640 Epoch: 10 Global Step: 171400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:26:42,895-Speed 5081.81 samples/sec Loss 3.4943 Epoch: 10 Global Step: 171450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:26:52,865-Speed 5135.80 samples/sec Loss 3.5042 Epoch: 10 Global Step: 171500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:27:02,876-Speed 5114.97 samples/sec Loss 3.4481 Epoch: 10 Global Step: 171550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:27:13,047-Speed 5033.87 samples/sec Loss 3.4947 Epoch: 10 Global Step: 171600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:27:23,069-Speed 5109.00 samples/sec Loss 3.4614 Epoch: 10 Global Step: 171650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:27:33,043-Speed 5133.90 samples/sec Loss 3.4992 Epoch: 10 Global Step: 171700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:27:43,111-Speed 5085.38 samples/sec Loss 3.4707 Epoch: 10 Global Step: 171750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:27:53,360-Speed 4996.10 samples/sec Loss 3.5122 Epoch: 10 Global Step: 171800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:28:03,326-Speed 5138.01 samples/sec Loss 3.5091 Epoch: 10 Global Step: 171850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:28:13,331-Speed 5117.33 samples/sec Loss 3.5241 Epoch: 
10 Global Step: 171900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:28:23,201-Speed 5188.09 samples/sec Loss 3.5398 Epoch: 10 Global Step: 171950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:28:33,302-Speed 5069.09 samples/sec Loss 3.5393 Epoch: 10 Global Step: 172000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:28:50,065-[lfw][172000]XNorm: 23.728506 Training: 2021-03-19 07:28:50,065-[lfw][172000]Accuracy-Flip: 0.99633+-0.00277 Training: 2021-03-19 07:28:50,065-[lfw][172000]Accuracy-Highest: 0.99700 Training: 2021-03-19 07:29:08,855-[cfp_fp][172000]XNorm: 19.432327 Training: 2021-03-19 07:29:08,855-[cfp_fp][172000]Accuracy-Flip: 0.95614+-0.01328 Training: 2021-03-19 07:29:08,855-[cfp_fp][172000]Accuracy-Highest: 0.95971 Training: 2021-03-19 07:29:25,067-[agedb_30][172000]XNorm: 22.943054 Training: 2021-03-19 07:29:25,067-[agedb_30][172000]Accuracy-Flip: 0.96600+-0.00708 Training: 2021-03-19 07:29:25,067-[agedb_30][172000]Accuracy-Highest: 0.96867 Training: 2021-03-19 07:29:34,926-Speed 830.85 samples/sec Loss 3.5532 Epoch: 10 Global Step: 172050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:29:44,846-Speed 5161.11 samples/sec Loss 3.4970 Epoch: 10 Global Step: 172100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:29:54,535-Speed 5284.72 samples/sec Loss 3.5310 Epoch: 10 Global Step: 172150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:30:04,345-Speed 5219.79 samples/sec Loss 3.4846 Epoch: 10 Global Step: 172200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:30:14,304-Speed 5141.42 samples/sec Loss 3.5171 Epoch: 10 Global Step: 172250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:30:24,154-Speed 5198.30 samples/sec Loss 3.5009 Epoch: 10 Global Step: 172300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:30:34,111-Speed 5142.29 samples/sec Loss 3.4921 Epoch: 10 Global Step: 172350 Fp16 Grad 
Scale: 16384 Required: 10 hours Training: 2021-03-19 07:30:44,059-Speed 5147.03 samples/sec Loss 3.5284 Epoch: 10 Global Step: 172400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:30:54,229-Speed 5034.45 samples/sec Loss 3.5028 Epoch: 10 Global Step: 172450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:31:04,384-Speed 5042.51 samples/sec Loss 3.5052 Epoch: 10 Global Step: 172500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:31:14,428-Speed 5097.98 samples/sec Loss 3.5125 Epoch: 10 Global Step: 172550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:31:24,448-Speed 5109.89 samples/sec Loss 3.4961 Epoch: 10 Global Step: 172600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:31:34,375-Speed 5158.03 samples/sec Loss 3.5294 Epoch: 10 Global Step: 172650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:31:44,329-Speed 5143.53 samples/sec Loss 3.5393 Epoch: 10 Global Step: 172700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:31:54,354-Speed 5107.98 samples/sec Loss 3.5411 Epoch: 10 Global Step: 172750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:32:04,557-Speed 5018.22 samples/sec Loss 3.5312 Epoch: 10 Global Step: 172800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:32:14,777-Speed 5009.99 samples/sec Loss 3.5396 Epoch: 10 Global Step: 172850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:32:25,006-Speed 5005.77 samples/sec Loss 3.5399 Epoch: 10 Global Step: 172900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:32:35,955-Speed 4676.35 samples/sec Loss 3.5679 Epoch: 10 Global Step: 172950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:32:45,914-Speed 5141.40 samples/sec Loss 3.5369 Epoch: 10 Global Step: 173000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:32:55,917-Speed 5118.74 samples/sec Loss 3.5258 Epoch: 10 Global Step: 
173050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:33:05,730-Speed 5218.28 samples/sec Loss 3.5353 Epoch: 10 Global Step: 173100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:33:15,819-Speed 5075.12 samples/sec Loss 3.5653 Epoch: 10 Global Step: 173150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:33:25,896-Speed 5081.00 samples/sec Loss 3.5433 Epoch: 10 Global Step: 173200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:33:35,807-Speed 5166.22 samples/sec Loss 3.5689 Epoch: 10 Global Step: 173250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:33:45,874-Speed 5086.19 samples/sec Loss 3.5194 Epoch: 10 Global Step: 173300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:33:55,856-Speed 5129.58 samples/sec Loss 3.5544 Epoch: 10 Global Step: 173350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:34:06,539-Speed 4792.82 samples/sec Loss 3.5738 Epoch: 10 Global Step: 173400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:34:16,707-Speed 5035.64 samples/sec Loss 3.5428 Epoch: 10 Global Step: 173450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:34:27,559-Speed 4718.19 samples/sec Loss 3.5492 Epoch: 10 Global Step: 173500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:34:37,807-Speed 4996.82 samples/sec Loss 3.5385 Epoch: 10 Global Step: 173550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:34:47,619-Speed 5218.52 samples/sec Loss 3.5608 Epoch: 10 Global Step: 173600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:34:58,338-Speed 4776.91 samples/sec Loss 3.5555 Epoch: 10 Global Step: 173650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:35:08,476-Speed 5050.72 samples/sec Loss 3.5227 Epoch: 10 Global Step: 173700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:35:18,566-Speed 5074.58 samples/sec Loss 3.5626 Epoch: 
10 Global Step: 173750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:35:28,453-Speed 5178.57 samples/sec Loss 3.5933 Epoch: 10 Global Step: 173800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:35:38,484-Speed 5104.43 samples/sec Loss 3.5820 Epoch: 10 Global Step: 173850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:35:48,536-Speed 5094.09 samples/sec Loss 3.5702 Epoch: 10 Global Step: 173900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:35:58,770-Speed 5003.22 samples/sec Loss 3.5888 Epoch: 10 Global Step: 173950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:36:08,847-Speed 5081.05 samples/sec Loss 3.5554 Epoch: 10 Global Step: 174000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:36:25,705-[lfw][174000]XNorm: 24.353053 Training: 2021-03-19 07:36:25,705-[lfw][174000]Accuracy-Flip: 0.99717+-0.00183 Training: 2021-03-19 07:36:25,705-[lfw][174000]Accuracy-Highest: 0.99717 Training: 2021-03-19 07:36:44,305-[cfp_fp][174000]XNorm: 19.870067 Training: 2021-03-19 07:36:44,305-[cfp_fp][174000]Accuracy-Flip: 0.95600+-0.00945 Training: 2021-03-19 07:36:44,306-[cfp_fp][174000]Accuracy-Highest: 0.95971 Training: 2021-03-19 07:37:00,413-[agedb_30][174000]XNorm: 23.470219 Training: 2021-03-19 07:37:00,414-[agedb_30][174000]Accuracy-Flip: 0.96717+-0.00833 Training: 2021-03-19 07:37:00,414-[agedb_30][174000]Accuracy-Highest: 0.96867 Training: 2021-03-19 07:37:10,493-Speed 830.56 samples/sec Loss 3.5690 Epoch: 10 Global Step: 174050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:37:21,440-Speed 4677.44 samples/sec Loss 3.5421 Epoch: 10 Global Step: 174100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:37:31,574-Speed 5052.83 samples/sec Loss 3.6119 Epoch: 10 Global Step: 174150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:37:42,234-Speed 4803.34 samples/sec Loss 3.5804 Epoch: 10 Global Step: 174200 Fp16 Grad 
Scale: 16384 Required: 10 hours Training: 2021-03-19 07:37:52,884-Speed 4807.78 samples/sec Loss 3.5978 Epoch: 10 Global Step: 174250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:38:03,010-Speed 5056.32 samples/sec Loss 3.5400 Epoch: 10 Global Step: 174300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:38:14,036-Speed 4643.85 samples/sec Loss 3.5670 Epoch: 10 Global Step: 174350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:38:24,128-Speed 5073.75 samples/sec Loss 3.5443 Epoch: 10 Global Step: 174400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:38:34,095-Speed 5137.20 samples/sec Loss 3.6052 Epoch: 10 Global Step: 174450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:38:44,103-Speed 5115.99 samples/sec Loss 3.5923 Epoch: 10 Global Step: 174500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:38:54,069-Speed 5137.67 samples/sec Loss 3.5960 Epoch: 10 Global Step: 174550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:39:04,081-Speed 5114.48 samples/sec Loss 3.5737 Epoch: 10 Global Step: 174600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:39:14,095-Speed 5113.04 samples/sec Loss 3.5835 Epoch: 10 Global Step: 174650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:39:24,348-Speed 4993.95 samples/sec Loss 3.5421 Epoch: 10 Global Step: 174700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:39:34,565-Speed 5011.56 samples/sec Loss 3.5946 Epoch: 10 Global Step: 174750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:39:44,658-Speed 5073.38 samples/sec Loss 3.5613 Epoch: 10 Global Step: 174800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:39:54,786-Speed 5055.56 samples/sec Loss 3.5622 Epoch: 10 Global Step: 174850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:40:04,921-Speed 5051.94 samples/sec Loss 3.5977 Epoch: 10 Global Step: 
174900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:40:14,927-Speed 5116.93 samples/sec Loss 3.5407 Epoch: 10 Global Step: 174950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:40:25,034-Speed 5066.18 samples/sec Loss 3.5506 Epoch: 10 Global Step: 175000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:40:35,201-Speed 5036.18 samples/sec Loss 3.5954 Epoch: 10 Global Step: 175050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:40:45,418-Speed 5011.63 samples/sec Loss 3.5846 Epoch: 10 Global Step: 175100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:40:55,284-Speed 5189.57 samples/sec Loss 3.5842 Epoch: 10 Global Step: 175150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:41:05,211-Speed 5158.21 samples/sec Loss 3.6028 Epoch: 10 Global Step: 175200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:41:15,299-Speed 5075.45 samples/sec Loss 3.5784 Epoch: 10 Global Step: 175250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:41:25,426-Speed 5055.88 samples/sec Loss 3.6071 Epoch: 10 Global Step: 175300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:41:35,575-Speed 5045.20 samples/sec Loss 3.5720 Epoch: 10 Global Step: 175350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:41:45,657-Speed 5078.83 samples/sec Loss 3.5981 Epoch: 10 Global Step: 175400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:41:55,638-Speed 5129.77 samples/sec Loss 3.6150 Epoch: 10 Global Step: 175450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:42:05,680-Speed 5099.29 samples/sec Loss 3.6187 Epoch: 10 Global Step: 175500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:42:15,526-Speed 5200.38 samples/sec Loss 3.5808 Epoch: 10 Global Step: 175550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:42:25,387-Speed 5192.71 samples/sec Loss 3.6313 Epoch: 
10 Global Step: 175600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:42:35,659-Speed 4984.56 samples/sec Loss 3.6036 Epoch: 10 Global Step: 175650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:42:45,693-Speed 5102.98 samples/sec Loss 3.5846 Epoch: 10 Global Step: 175700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:42:55,479-Speed 5232.23 samples/sec Loss 3.6012 Epoch: 10 Global Step: 175750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:43:05,437-Speed 5142.16 samples/sec Loss 3.6335 Epoch: 10 Global Step: 175800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:43:15,452-Speed 5112.32 samples/sec Loss 3.6053 Epoch: 10 Global Step: 175850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:43:25,477-Speed 5107.89 samples/sec Loss 3.6217 Epoch: 10 Global Step: 175900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:43:35,492-Speed 5112.16 samples/sec Loss 3.5698 Epoch: 10 Global Step: 175950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:43:45,675-Speed 5028.44 samples/sec Loss 3.5852 Epoch: 10 Global Step: 176000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:44:02,575-[lfw][176000]XNorm: 23.989427 Training: 2021-03-19 07:44:02,575-[lfw][176000]Accuracy-Flip: 0.99650+-0.00252 Training: 2021-03-19 07:44:02,575-[lfw][176000]Accuracy-Highest: 0.99717 Training: 2021-03-19 07:44:21,469-[cfp_fp][176000]XNorm: 19.622166 Training: 2021-03-19 07:44:21,470-[cfp_fp][176000]Accuracy-Flip: 0.95986+-0.01031 Training: 2021-03-19 07:44:21,470-[cfp_fp][176000]Accuracy-Highest: 0.95986 Training: 2021-03-19 07:44:37,635-[agedb_30][176000]XNorm: 23.053768 Training: 2021-03-19 07:44:37,635-[agedb_30][176000]Accuracy-Flip: 0.96600+-0.00672 Training: 2021-03-19 07:44:37,635-[agedb_30][176000]Accuracy-Highest: 0.96867 Training: 2021-03-19 07:44:47,603-Speed 826.77 samples/sec Loss 3.6010 Epoch: 10 Global Step: 176050 Fp16 Grad 
Scale: 16384 Required: 10 hours Training: 2021-03-19 07:44:57,589-Speed 5127.82 samples/sec Loss 3.5891 Epoch: 10 Global Step: 176100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:45:07,729-Speed 5049.24 samples/sec Loss 3.6161 Epoch: 10 Global Step: 176150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:45:17,715-Speed 5128.02 samples/sec Loss 3.6126 Epoch: 10 Global Step: 176200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:45:27,576-Speed 5192.15 samples/sec Loss 3.6626 Epoch: 10 Global Step: 176250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:45:38,530-Speed 4674.40 samples/sec Loss 3.5931 Epoch: 10 Global Step: 176300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:45:48,672-Speed 5048.93 samples/sec Loss 3.5861 Epoch: 10 Global Step: 176350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:45:58,737-Speed 5087.17 samples/sec Loss 3.6064 Epoch: 10 Global Step: 176400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:46:08,728-Speed 5124.66 samples/sec Loss 3.6368 Epoch: 10 Global Step: 176450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:46:18,915-Speed 5026.51 samples/sec Loss 3.6084 Epoch: 10 Global Step: 176500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:46:28,867-Speed 5145.25 samples/sec Loss 3.6232 Epoch: 10 Global Step: 176550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:46:38,963-Speed 5071.97 samples/sec Loss 3.6167 Epoch: 10 Global Step: 176600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:46:49,260-Speed 4972.17 samples/sec Loss 3.5999 Epoch: 10 Global Step: 176650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:46:59,223-Speed 5139.72 samples/sec Loss 3.5727 Epoch: 10 Global Step: 176700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:47:09,906-Speed 4792.79 samples/sec Loss 3.6326 Epoch: 10 Global Step: 
176750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:47:19,953-Speed 5096.24 samples/sec Loss 3.6657 Epoch: 10 Global Step: 176800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:47:29,882-Speed 5157.21 samples/sec Loss 3.6174 Epoch: 10 Global Step: 176850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:47:39,724-Speed 5202.20 samples/sec Loss 3.6271 Epoch: 10 Global Step: 176900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:47:50,868-Speed 4594.80 samples/sec Loss 3.6254 Epoch: 10 Global Step: 176950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:48:00,911-Speed 5097.95 samples/sec Loss 3.6229 Epoch: 10 Global Step: 177000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:48:11,608-Speed 4787.10 samples/sec Loss 3.6219 Epoch: 10 Global Step: 177050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:48:21,773-Speed 5037.02 samples/sec Loss 3.6474 Epoch: 10 Global Step: 177100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:48:31,892-Speed 5060.29 samples/sec Loss 3.6343 Epoch: 10 Global Step: 177150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:48:41,953-Speed 5089.22 samples/sec Loss 3.6299 Epoch: 10 Global Step: 177200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:48:51,990-Speed 5101.13 samples/sec Loss 3.6044 Epoch: 10 Global Step: 177250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:49:02,259-Speed 4986.21 samples/sec Loss 3.6210 Epoch: 10 Global Step: 177300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:49:12,201-Speed 5150.73 samples/sec Loss 3.6753 Epoch: 10 Global Step: 177350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:49:22,436-Speed 5002.52 samples/sec Loss 3.6616 Epoch: 10 Global Step: 177400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:49:32,672-Speed 5002.29 samples/sec Loss 3.6547 Epoch: 
10 Global Step: 177450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:49:43,524-Speed 4718.43 samples/sec Loss 3.6341 Epoch: 10 Global Step: 177500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:49:54,481-Speed 4672.82 samples/sec Loss 3.6680 Epoch: 10 Global Step: 177550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:50:05,157-Speed 4795.96 samples/sec Loss 3.6380 Epoch: 10 Global Step: 177600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:50:15,228-Speed 5084.50 samples/sec Loss 3.6555 Epoch: 10 Global Step: 177650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:50:25,966-Speed 4768.57 samples/sec Loss 3.6486 Epoch: 10 Global Step: 177700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:50:35,985-Speed 5110.17 samples/sec Loss 3.5912 Epoch: 10 Global Step: 177750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:50:46,171-Speed 5026.96 samples/sec Loss 3.6465 Epoch: 10 Global Step: 177800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:50:56,166-Speed 5122.99 samples/sec Loss 3.6421 Epoch: 10 Global Step: 177850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:51:06,222-Speed 5091.78 samples/sec Loss 3.6666 Epoch: 10 Global Step: 177900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:51:15,967-Speed 5253.99 samples/sec Loss 3.6516 Epoch: 10 Global Step: 177950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:51:26,004-Speed 5101.79 samples/sec Loss 3.6061 Epoch: 10 Global Step: 178000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:51:42,774-[lfw][178000]XNorm: 23.685917 Training: 2021-03-19 07:51:42,774-[lfw][178000]Accuracy-Flip: 0.99567+-0.00300 Training: 2021-03-19 07:51:42,774-[lfw][178000]Accuracy-Highest: 0.99717 Training: 2021-03-19 07:52:01,415-[cfp_fp][178000]XNorm: 19.198399 Training: 2021-03-19 07:52:01,416-[cfp_fp][178000]Accuracy-Flip: 
0.95786+-0.00897 Training: 2021-03-19 07:52:01,456-[cfp_fp][178000]Accuracy-Highest: 0.95986 Training: 2021-03-19 07:52:17,748-[agedb_30][178000]XNorm: 22.502892 Training: 2021-03-19 07:52:17,749-[agedb_30][178000]Accuracy-Flip: 0.96500+-0.00940 Training: 2021-03-19 07:52:17,749-[agedb_30][178000]Accuracy-Highest: 0.96867 Training: 2021-03-19 07:52:27,728-Speed 829.51 samples/sec Loss 3.6705 Epoch: 10 Global Step: 178050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:52:37,693-Speed 5138.37 samples/sec Loss 3.6712 Epoch: 10 Global Step: 178100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:52:47,849-Speed 5041.57 samples/sec Loss 3.6223 Epoch: 10 Global Step: 178150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:52:57,866-Speed 5112.09 samples/sec Loss 3.6319 Epoch: 10 Global Step: 178200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:53:07,973-Speed 5066.02 samples/sec Loss 3.6200 Epoch: 10 Global Step: 178250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:53:18,052-Speed 5080.30 samples/sec Loss 3.6756 Epoch: 10 Global Step: 178300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:53:28,069-Speed 5111.53 samples/sec Loss 3.6515 Epoch: 10 Global Step: 178350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:53:37,913-Speed 5201.44 samples/sec Loss 3.6453 Epoch: 10 Global Step: 178400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:53:47,890-Speed 5131.73 samples/sec Loss 3.6615 Epoch: 10 Global Step: 178450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:53:58,035-Speed 5047.64 samples/sec Loss 3.6425 Epoch: 10 Global Step: 178500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:54:08,279-Speed 4998.03 samples/sec Loss 3.6568 Epoch: 10 Global Step: 178550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:54:18,358-Speed 5080.51 samples/sec Loss 3.6448 Epoch: 10 Global 
Step: 178600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:54:28,195-Speed 5205.22 samples/sec Loss 3.7030 Epoch: 10 Global Step: 178650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:54:38,392-Speed 5021.25 samples/sec Loss 3.6330 Epoch: 10 Global Step: 178700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:54:48,222-Speed 5209.11 samples/sec Loss 3.6396 Epoch: 10 Global Step: 178750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:54:58,431-Speed 5015.67 samples/sec Loss 3.6586 Epoch: 10 Global Step: 178800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:55:08,381-Speed 5145.76 samples/sec Loss 3.6457 Epoch: 10 Global Step: 178850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:55:18,474-Speed 5073.66 samples/sec Loss 3.6723 Epoch: 10 Global Step: 178900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:55:28,671-Speed 5021.20 samples/sec Loss 3.6477 Epoch: 10 Global Step: 178950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:55:38,700-Speed 5105.53 samples/sec Loss 3.6453 Epoch: 10 Global Step: 179000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:55:48,688-Speed 5126.62 samples/sec Loss 3.6624 Epoch: 10 Global Step: 179050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:55:58,693-Speed 5117.34 samples/sec Loss 3.6366 Epoch: 10 Global Step: 179100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:56:08,592-Speed 5172.64 samples/sec Loss 3.6754 Epoch: 10 Global Step: 179150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:56:18,573-Speed 5129.85 samples/sec Loss 3.6968 Epoch: 10 Global Step: 179200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:56:28,743-Speed 5035.09 samples/sec Loss 3.6606 Epoch: 10 Global Step: 179250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:56:38,522-Speed 5235.64 samples/sec Loss 3.6354 
Epoch: 10 Global Step: 179300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:56:48,469-Speed 5147.67 samples/sec Loss 3.6563 Epoch: 10 Global Step: 179350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:56:58,388-Speed 5162.34 samples/sec Loss 3.6769 Epoch: 10 Global Step: 179400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:57:08,509-Speed 5059.25 samples/sec Loss 3.6892 Epoch: 10 Global Step: 179450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:57:18,475-Speed 5137.52 samples/sec Loss 3.6440 Epoch: 10 Global Step: 179500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:57:28,286-Speed 5219.06 samples/sec Loss 3.6755 Epoch: 10 Global Step: 179550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:57:38,446-Speed 5039.87 samples/sec Loss 3.6290 Epoch: 10 Global Step: 179600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:57:48,268-Speed 5213.17 samples/sec Loss 3.6781 Epoch: 10 Global Step: 179650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:57:59,197-Speed 4684.83 samples/sec Loss 3.6921 Epoch: 10 Global Step: 179700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:58:09,310-Speed 5063.02 samples/sec Loss 3.7081 Epoch: 10 Global Step: 179750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:58:19,386-Speed 5081.56 samples/sec Loss 3.6805 Epoch: 10 Global Step: 179800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:58:29,293-Speed 5168.67 samples/sec Loss 3.6888 Epoch: 10 Global Step: 179850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:58:39,384-Speed 5074.08 samples/sec Loss 3.6752 Epoch: 10 Global Step: 179900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:58:49,660-Speed 4982.69 samples/sec Loss 3.6814 Epoch: 10 Global Step: 179950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:58:59,711-Speed 5094.12 
samples/sec Loss 3.6907 Epoch: 10 Global Step: 180000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 07:59:16,560-[lfw][180000]XNorm: 22.823604 Training: 2021-03-19 07:59:16,560-[lfw][180000]Accuracy-Flip: 0.99633+-0.00296 Training: 2021-03-19 07:59:16,560-[lfw][180000]Accuracy-Highest: 0.99717 Training: 2021-03-19 07:59:35,382-[cfp_fp][180000]XNorm: 18.631017 Training: 2021-03-19 07:59:35,382-[cfp_fp][180000]Accuracy-Flip: 0.95471+-0.01239 Training: 2021-03-19 07:59:35,382-[cfp_fp][180000]Accuracy-Highest: 0.95986 Training: 2021-03-19 07:59:51,561-[agedb_30][180000]XNorm: 21.843396 Training: 2021-03-19 07:59:51,561-[agedb_30][180000]Accuracy-Flip: 0.96800+-0.00999 Training: 2021-03-19 07:59:51,561-[agedb_30][180000]Accuracy-Highest: 0.96867 Training: 2021-03-19 08:00:01,435-Speed 829.51 samples/sec Loss 3.7054 Epoch: 10 Global Step: 180050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:00:12,308-Speed 4709.48 samples/sec Loss 3.6859 Epoch: 10 Global Step: 180100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:00:22,267-Speed 5141.18 samples/sec Loss 3.6587 Epoch: 10 Global Step: 180150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:00:32,297-Speed 5105.03 samples/sec Loss 3.6942 Epoch: 10 Global Step: 180200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:00:42,356-Speed 5090.43 samples/sec Loss 3.6555 Epoch: 10 Global Step: 180250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:00:53,394-Speed 4638.79 samples/sec Loss 3.6578 Epoch: 10 Global Step: 180300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:01:03,116-Speed 5266.72 samples/sec Loss 3.6590 Epoch: 10 Global Step: 180350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:01:13,112-Speed 5122.71 samples/sec Loss 3.6592 Epoch: 10 Global Step: 180400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:01:23,147-Speed 5102.38 samples/sec Loss 3.6600 Epoch: 
10 Global Step: 180450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:01:33,787-Speed 4812.07 samples/sec Loss 3.6973 Epoch: 10 Global Step: 180500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:01:43,898-Speed 5064.13 samples/sec Loss 3.6796 Epoch: 10 Global Step: 180550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:01:53,856-Speed 5141.98 samples/sec Loss 3.7127 Epoch: 10 Global Step: 180600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:02:04,055-Speed 5020.45 samples/sec Loss 3.6981 Epoch: 10 Global Step: 180650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:02:14,184-Speed 5055.12 samples/sec Loss 3.6852 Epoch: 10 Global Step: 180700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:02:24,070-Speed 5179.43 samples/sec Loss 3.7044 Epoch: 10 Global Step: 180750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:02:34,348-Speed 4981.29 samples/sec Loss 3.6691 Epoch: 10 Global Step: 180800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:02:45,077-Speed 4772.64 samples/sec Loss 3.6509 Epoch: 10 Global Step: 180850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:02:55,863-Speed 4746.88 samples/sec Loss 3.6867 Epoch: 10 Global Step: 180900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:03:06,730-Speed 4711.79 samples/sec Loss 3.7124 Epoch: 10 Global Step: 180950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:03:16,652-Speed 5160.76 samples/sec Loss 3.7083 Epoch: 10 Global Step: 181000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:03:27,439-Speed 4746.81 samples/sec Loss 3.6685 Epoch: 10 Global Step: 181050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:03:37,376-Speed 5152.61 samples/sec Loss 3.6643 Epoch: 10 Global Step: 181100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:03:47,491-Speed 5062.07 samples/sec 
Loss 3.6539 Epoch: 10 Global Step: 181150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:03:57,560-Speed 5085.18 samples/sec Loss 3.6267 Epoch: 10 Global Step: 181200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:04:07,754-Speed 5023.35 samples/sec Loss 3.7244 Epoch: 10 Global Step: 181250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:04:17,664-Speed 5166.70 samples/sec Loss 3.6990 Epoch: 10 Global Step: 181300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:04:27,599-Speed 5153.71 samples/sec Loss 3.7138 Epoch: 10 Global Step: 181350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:04:38,091-Speed 4880.25 samples/sec Loss 3.6778 Epoch: 10 Global Step: 181400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:04:48,332-Speed 4999.80 samples/sec Loss 3.6655 Epoch: 10 Global Step: 181450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:04:58,158-Speed 5211.01 samples/sec Loss 3.6855 Epoch: 10 Global Step: 181500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:05:08,308-Speed 5044.10 samples/sec Loss 3.6617 Epoch: 10 Global Step: 181550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:05:18,569-Speed 4990.05 samples/sec Loss 3.7067 Epoch: 10 Global Step: 181600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:05:28,824-Speed 4993.35 samples/sec Loss 3.6269 Epoch: 10 Global Step: 181650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:05:38,883-Speed 5090.09 samples/sec Loss 3.7141 Epoch: 10 Global Step: 181700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:05:48,737-Speed 5196.43 samples/sec Loss 3.6848 Epoch: 10 Global Step: 181750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:05:58,722-Speed 5128.00 samples/sec Loss 3.6911 Epoch: 10 Global Step: 181800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:06:08,697-Speed 
5133.28 samples/sec Loss 3.6641 Epoch: 10 Global Step: 181850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:06:18,691-Speed 5122.86 samples/sec Loss 3.6871 Epoch: 10 Global Step: 181900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:06:28,697-Speed 5117.35 samples/sec Loss 3.6595 Epoch: 10 Global Step: 181950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:06:38,845-Speed 5045.58 samples/sec Loss 3.6919 Epoch: 10 Global Step: 182000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:06:55,549-[lfw][182000]XNorm: 22.382213 Training: 2021-03-19 08:06:55,549-[lfw][182000]Accuracy-Flip: 0.99633+-0.00340 Training: 2021-03-19 08:06:55,549-[lfw][182000]Accuracy-Highest: 0.99717 Training: 2021-03-19 08:07:14,116-[cfp_fp][182000]XNorm: 18.326283 Training: 2021-03-19 08:07:14,116-[cfp_fp][182000]Accuracy-Flip: 0.95914+-0.01216 Training: 2021-03-19 08:07:14,116-[cfp_fp][182000]Accuracy-Highest: 0.95986 Training: 2021-03-19 08:07:30,259-[agedb_30][182000]XNorm: 21.757106 Training: 2021-03-19 08:07:30,260-[agedb_30][182000]Accuracy-Flip: 0.96700+-0.00909 Training: 2021-03-19 08:07:30,260-[agedb_30][182000]Accuracy-Highest: 0.96867 Training: 2021-03-19 08:07:40,254-Speed 833.77 samples/sec Loss 3.6969 Epoch: 10 Global Step: 182050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:07:50,319-Speed 5087.40 samples/sec Loss 3.7136 Epoch: 10 Global Step: 182100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:08:00,084-Speed 5243.50 samples/sec Loss 3.7060 Epoch: 10 Global Step: 182150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:08:10,488-Speed 4920.97 samples/sec Loss 3.6873 Epoch: 10 Global Step: 182200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:08:20,651-Speed 5038.59 samples/sec Loss 3.6907 Epoch: 10 Global Step: 182250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:08:30,731-Speed 5079.30 samples/sec Loss 3.7343 
Epoch: 10 Global Step: 182300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:08:40,970-Speed 5001.01 samples/sec Loss 3.7316 Epoch: 10 Global Step: 182350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:08:51,052-Speed 5078.33 samples/sec Loss 3.7124 Epoch: 10 Global Step: 182400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:09:01,003-Speed 5145.78 samples/sec Loss 3.6832 Epoch: 10 Global Step: 182450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:09:11,005-Speed 5118.96 samples/sec Loss 3.6949 Epoch: 10 Global Step: 182500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:09:21,094-Speed 5075.36 samples/sec Loss 3.6548 Epoch: 10 Global Step: 182550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:09:31,016-Speed 5160.43 samples/sec Loss 3.6950 Epoch: 10 Global Step: 182600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:09:41,133-Speed 5061.12 samples/sec Loss 3.7240 Epoch: 10 Global Step: 182650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:09:51,160-Speed 5106.56 samples/sec Loss 3.6781 Epoch: 10 Global Step: 182700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:10:01,098-Speed 5152.47 samples/sec Loss 3.7325 Epoch: 10 Global Step: 182750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:10:11,216-Speed 5060.38 samples/sec Loss 3.6920 Epoch: 10 Global Step: 182800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:10:21,321-Speed 5067.14 samples/sec Loss 3.6876 Epoch: 10 Global Step: 182850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:10:31,469-Speed 5045.60 samples/sec Loss 3.7153 Epoch: 10 Global Step: 182900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:10:41,308-Speed 5204.16 samples/sec Loss 3.6814 Epoch: 10 Global Step: 182950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:10:51,474-Speed 5036.57 
samples/sec Loss 3.6719 Epoch: 10 Global Step: 183000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:11:01,756-Speed 4980.10 samples/sec Loss 3.6992 Epoch: 10 Global Step: 183050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:11:12,317-Speed 4848.10 samples/sec Loss 3.7048 Epoch: 10 Global Step: 183100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:11:22,460-Speed 5048.04 samples/sec Loss 3.7158 Epoch: 10 Global Step: 183150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:11:32,527-Speed 5086.36 samples/sec Loss 3.7159 Epoch: 10 Global Step: 183200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:11:42,610-Speed 5078.09 samples/sec Loss 3.7076 Epoch: 10 Global Step: 183250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:11:52,726-Speed 5061.61 samples/sec Loss 3.7133 Epoch: 10 Global Step: 183300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:12:02,666-Speed 5151.19 samples/sec Loss 3.6943 Epoch: 10 Global Step: 183350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:12:12,690-Speed 5108.09 samples/sec Loss 3.6975 Epoch: 10 Global Step: 183400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:12:23,673-Speed 4661.81 samples/sec Loss 3.7034 Epoch: 10 Global Step: 183450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:12:33,840-Speed 5036.37 samples/sec Loss 3.7103 Epoch: 10 Global Step: 183500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:12:43,794-Speed 5143.80 samples/sec Loss 3.7084 Epoch: 10 Global Step: 183550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:12:54,554-Speed 4758.75 samples/sec Loss 3.6562 Epoch: 10 Global Step: 183600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:13:16,868-Speed 2294.51 samples/sec Loss 3.0630 Epoch: 11 Global Step: 183650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 
08:13:28,672-Speed 4337.81 samples/sec Loss 2.9007 Epoch: 11 Global Step: 183700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:13:38,881-Speed 5015.71 samples/sec Loss 2.8764 Epoch: 11 Global Step: 183750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:13:49,080-Speed 5020.26 samples/sec Loss 2.8461 Epoch: 11 Global Step: 183800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:13:59,274-Speed 5023.16 samples/sec Loss 2.7856 Epoch: 11 Global Step: 183850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:14:10,204-Speed 4684.88 samples/sec Loss 2.8077 Epoch: 11 Global Step: 183900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:14:20,003-Speed 5225.41 samples/sec Loss 2.7654 Epoch: 11 Global Step: 183950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:14:29,868-Speed 5190.41 samples/sec Loss 2.7467 Epoch: 11 Global Step: 184000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:14:46,747-[lfw][184000]XNorm: 23.427960 Training: 2021-03-19 08:14:46,747-[lfw][184000]Accuracy-Flip: 0.99750+-0.00300 Training: 2021-03-19 08:14:46,747-[lfw][184000]Accuracy-Highest: 0.99750 Training: 2021-03-19 08:15:05,445-[cfp_fp][184000]XNorm: 19.279224 Training: 2021-03-19 08:15:05,446-[cfp_fp][184000]Accuracy-Flip: 0.96543+-0.01229 Training: 2021-03-19 08:15:05,446-[cfp_fp][184000]Accuracy-Highest: 0.96543 Training: 2021-03-19 08:15:21,577-[agedb_30][184000]XNorm: 22.525333 Training: 2021-03-19 08:15:21,577-[agedb_30][184000]Accuracy-Flip: 0.97017+-0.00917 Training: 2021-03-19 08:15:21,577-[agedb_30][184000]Accuracy-Highest: 0.97017 Training: 2021-03-19 08:15:31,607-Speed 829.30 samples/sec Loss 2.7641 Epoch: 11 Global Step: 184050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:15:41,592-Speed 5128.08 samples/sec Loss 2.7462 Epoch: 11 Global Step: 184100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:15:51,647-Speed 5092.19 
samples/sec Loss 2.7310 Epoch: 11 Global Step: 184150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:16:01,339-Speed 5283.19 samples/sec Loss 2.7160 Epoch: 11 Global Step: 184200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:16:13,294-Speed 4282.86 samples/sec Loss 2.7101 Epoch: 11 Global Step: 184250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:16:23,106-Speed 5218.27 samples/sec Loss 2.6862 Epoch: 11 Global Step: 184300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:16:33,918-Speed 4736.08 samples/sec Loss 2.6756 Epoch: 11 Global Step: 184350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:16:44,520-Speed 4829.46 samples/sec Loss 2.6944 Epoch: 11 Global Step: 184400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:16:54,673-Speed 5043.39 samples/sec Loss 2.6839 Epoch: 11 Global Step: 184450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:17:04,685-Speed 5114.02 samples/sec Loss 2.6856 Epoch: 11 Global Step: 184500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:17:14,851-Speed 5036.40 samples/sec Loss 2.6363 Epoch: 11 Global Step: 184550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:17:24,738-Speed 5179.04 samples/sec Loss 2.6650 Epoch: 11 Global Step: 184600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:17:34,547-Speed 5219.96 samples/sec Loss 2.6172 Epoch: 11 Global Step: 184650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:17:44,549-Speed 5119.45 samples/sec Loss 2.6095 Epoch: 11 Global Step: 184700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:17:54,446-Speed 5173.87 samples/sec Loss 2.6415 Epoch: 11 Global Step: 184750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-19 08:18:04,520-Speed 5082.46 samples/sec Loss 2.5936 Epoch: 11 Global Step: 184800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 
08:18:14,767-Speed 4997.31 samples/sec Loss 2.6236 Epoch: 11 Global Step: 184850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:18:24,592-Speed 5211.71 samples/sec Loss 2.6342 Epoch: 11 Global Step: 184900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:18:34,521-Speed 5156.72 samples/sec Loss 2.6341 Epoch: 11 Global Step: 184950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:18:44,642-Speed 5058.97 samples/sec Loss 2.6256 Epoch: 11 Global Step: 185000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:18:54,701-Speed 5090.11 samples/sec Loss 2.6036 Epoch: 11 Global Step: 185050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:19:04,436-Speed 5260.04 samples/sec Loss 2.6076 Epoch: 11 Global Step: 185100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:19:14,487-Speed 5094.32 samples/sec Loss 2.5910 Epoch: 11 Global Step: 185150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:19:24,232-Speed 5254.49 samples/sec Loss 2.5940 Epoch: 11 Global Step: 185200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:19:34,478-Speed 4997.28 samples/sec Loss 2.5942 Epoch: 11 Global Step: 185250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:19:44,533-Speed 5092.18 samples/sec Loss 2.5844 Epoch: 11 Global Step: 185300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:19:54,570-Speed 5101.66 samples/sec Loss 2.5898 Epoch: 11 Global Step: 185350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:20:04,475-Speed 5169.10 samples/sec Loss 2.5867 Epoch: 11 Global Step: 185400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:20:14,366-Speed 5176.84 samples/sec Loss 2.5471 Epoch: 11 Global Step: 185450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:20:24,269-Speed 5170.70 samples/sec Loss 2.5754 Epoch: 11 Global Step: 185500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 
2021-03-19 08:20:34,378-Speed 5064.86 samples/sec Loss 2.5756 Epoch: 11 Global Step: 185550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:20:44,357-Speed 5131.25 samples/sec Loss 2.5978 Epoch: 11 Global Step: 185600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:20:54,257-Speed 5172.05 samples/sec Loss 2.5703 Epoch: 11 Global Step: 185650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:21:04,082-Speed 5211.24 samples/sec Loss 2.5822 Epoch: 11 Global Step: 185700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:21:14,270-Speed 5025.82 samples/sec Loss 2.5494 Epoch: 11 Global Step: 185750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:21:24,422-Speed 5043.58 samples/sec Loss 2.5522 Epoch: 11 Global Step: 185800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:21:34,417-Speed 5123.21 samples/sec Loss 2.5696 Epoch: 11 Global Step: 185850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:21:44,288-Speed 5187.24 samples/sec Loss 2.5668 Epoch: 11 Global Step: 185900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:21:54,268-Speed 5130.41 samples/sec Loss 2.5746 Epoch: 11 Global Step: 185950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:22:04,306-Speed 5101.17 samples/sec Loss 2.5533 Epoch: 11 Global Step: 186000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:22:21,249-[lfw][186000]XNorm: 23.528333 Training: 2021-03-19 08:22:21,249-[lfw][186000]Accuracy-Flip: 0.99700+-0.00306 Training: 2021-03-19 08:22:21,249-[lfw][186000]Accuracy-Highest: 0.99750 Training: 2021-03-19 08:22:40,092-[cfp_fp][186000]XNorm: 19.378271 Training: 2021-03-19 08:22:40,092-[cfp_fp][186000]Accuracy-Flip: 0.96600+-0.01114 Training: 2021-03-19 08:22:40,093-[cfp_fp][186000]Accuracy-Highest: 0.96600 Training: 2021-03-19 08:22:56,304-[agedb_30][186000]XNorm: 22.643751 Training: 2021-03-19 08:22:56,304-[agedb_30][186000]Accuracy-Flip: 
0.97200+-0.00897 Training: 2021-03-19 08:22:56,304-[agedb_30][186000]Accuracy-Highest: 0.97200 Training: 2021-03-19 08:23:06,136-Speed 828.07 samples/sec Loss 2.5794 Epoch: 11 Global Step: 186050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:23:16,329-Speed 5023.43 samples/sec Loss 2.5475 Epoch: 11 Global Step: 186100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:23:26,286-Speed 5142.88 samples/sec Loss 2.5596 Epoch: 11 Global Step: 186150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:23:36,838-Speed 4852.25 samples/sec Loss 2.5645 Epoch: 11 Global Step: 186200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:23:46,875-Speed 5101.42 samples/sec Loss 2.5478 Epoch: 11 Global Step: 186250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:23:56,855-Speed 5130.86 samples/sec Loss 2.5410 Epoch: 11 Global Step: 186300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:24:06,832-Speed 5132.01 samples/sec Loss 2.5528 Epoch: 11 Global Step: 186350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:24:16,960-Speed 5055.92 samples/sec Loss 2.5305 Epoch: 11 Global Step: 186400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:24:27,162-Speed 5018.54 samples/sec Loss 2.5125 Epoch: 11 Global Step: 186450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:24:38,205-Speed 4636.79 samples/sec Loss 2.5631 Epoch: 11 Global Step: 186500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:24:48,305-Speed 5069.81 samples/sec Loss 2.5320 Epoch: 11 Global Step: 186550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:24:58,410-Speed 5067.05 samples/sec Loss 2.4908 Epoch: 11 Global Step: 186600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:25:08,457-Speed 5096.29 samples/sec Loss 2.5167 Epoch: 11 Global Step: 186650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:25:18,641-Speed 
5027.86 samples/sec Loss 2.5446 Epoch: 11 Global Step: 186700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:25:28,573-Speed 5155.27 samples/sec Loss 2.5240 Epoch: 11 Global Step: 186750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:25:38,599-Speed 5107.37 samples/sec Loss 2.5562 Epoch: 11 Global Step: 186800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:25:49,410-Speed 4735.84 samples/sec Loss 2.5273 Epoch: 11 Global Step: 186850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:25:59,251-Speed 5203.26 samples/sec Loss 2.4967 Epoch: 11 Global Step: 186900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:26:09,296-Speed 5097.30 samples/sec Loss 2.5227 Epoch: 11 Global Step: 186950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:26:18,992-Speed 5280.87 samples/sec Loss 2.5201 Epoch: 11 Global Step: 187000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:26:28,940-Speed 5146.90 samples/sec Loss 2.5147 Epoch: 11 Global Step: 187050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:26:39,757-Speed 4733.52 samples/sec Loss 2.4988 Epoch: 11 Global Step: 187100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:26:49,804-Speed 5096.47 samples/sec Loss 2.4834 Epoch: 11 Global Step: 187150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:26:59,829-Speed 5107.95 samples/sec Loss 2.5129 Epoch: 11 Global Step: 187200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:27:09,859-Speed 5105.07 samples/sec Loss 2.4879 Epoch: 11 Global Step: 187250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:27:20,518-Speed 4803.85 samples/sec Loss 2.5064 Epoch: 11 Global Step: 187300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:27:30,469-Speed 5145.50 samples/sec Loss 2.4952 Epoch: 11 Global Step: 187350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 
08:27:40,380-Speed 5166.42 samples/sec Loss 2.4975 Epoch: 11 Global Step: 187400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:27:50,277-Speed 5173.44 samples/sec Loss 2.4891 Epoch: 11 Global Step: 187450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:28:00,574-Speed 4972.79 samples/sec Loss 2.4789 Epoch: 11 Global Step: 187500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:28:10,600-Speed 5107.51 samples/sec Loss 2.5001 Epoch: 11 Global Step: 187550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:28:22,095-Speed 4454.26 samples/sec Loss 2.4981 Epoch: 11 Global Step: 187600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:28:32,026-Speed 5155.97 samples/sec Loss 2.4717 Epoch: 11 Global Step: 187650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:28:42,769-Speed 4766.40 samples/sec Loss 2.4722 Epoch: 11 Global Step: 187700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:28:53,412-Speed 4810.45 samples/sec Loss 2.4722 Epoch: 11 Global Step: 187750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:29:03,408-Speed 5122.79 samples/sec Loss 2.5030 Epoch: 11 Global Step: 187800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:29:13,356-Speed 5146.60 samples/sec Loss 2.4915 Epoch: 11 Global Step: 187850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:29:23,189-Speed 5207.58 samples/sec Loss 2.4827 Epoch: 11 Global Step: 187900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:29:33,113-Speed 5159.62 samples/sec Loss 2.4844 Epoch: 11 Global Step: 187950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:29:43,019-Speed 5169.08 samples/sec Loss 2.5103 Epoch: 11 Global Step: 188000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:29:59,910-[lfw][188000]XNorm: 23.518705 Training: 2021-03-19 08:29:59,910-[lfw][188000]Accuracy-Flip: 0.99717+-0.00259 Training: 
2021-03-19 08:29:59,910-[lfw][188000]Accuracy-Highest: 0.99750 Training: 2021-03-19 08:30:18,584-[cfp_fp][188000]XNorm: 19.407039 Training: 2021-03-19 08:30:18,584-[cfp_fp][188000]Accuracy-Flip: 0.97014+-0.01135 Training: 2021-03-19 08:30:18,584-[cfp_fp][188000]Accuracy-Highest: 0.97014 Training: 2021-03-19 08:30:34,716-[agedb_30][188000]XNorm: 22.571760 Training: 2021-03-19 08:30:34,716-[agedb_30][188000]Accuracy-Flip: 0.97417+-0.00761 Training: 2021-03-19 08:30:34,716-[agedb_30][188000]Accuracy-Highest: 0.97417 Training: 2021-03-19 08:30:44,367-Speed 834.59 samples/sec Loss 2.4847 Epoch: 11 Global Step: 188050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:30:54,304-Speed 5152.81 samples/sec Loss 2.4962 Epoch: 11 Global Step: 188100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:31:04,221-Speed 5162.76 samples/sec Loss 2.4834 Epoch: 11 Global Step: 188150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:31:14,162-Speed 5150.99 samples/sec Loss 2.4880 Epoch: 11 Global Step: 188200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:31:24,222-Speed 5089.58 samples/sec Loss 2.4650 Epoch: 11 Global Step: 188250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:31:34,221-Speed 5121.11 samples/sec Loss 2.4941 Epoch: 11 Global Step: 188300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:31:44,248-Speed 5106.46 samples/sec Loss 2.4704 Epoch: 11 Global Step: 188350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:31:54,360-Speed 5063.47 samples/sec Loss 2.4762 Epoch: 11 Global Step: 188400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:32:04,584-Speed 5008.38 samples/sec Loss 2.4547 Epoch: 11 Global Step: 188450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:32:14,585-Speed 5119.85 samples/sec Loss 2.4527 Epoch: 11 Global Step: 188500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:32:24,496-Speed 5165.97 
samples/sec Loss 2.4707 Epoch: 11 Global Step: 188550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:32:34,461-Speed 5138.30 samples/sec Loss 2.4740 Epoch: 11 Global Step: 188600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:32:44,343-Speed 5181.29 samples/sec Loss 2.4652 Epoch: 11 Global Step: 188650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:32:54,494-Speed 5044.11 samples/sec Loss 2.4375 Epoch: 11 Global Step: 188700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:33:04,524-Speed 5104.86 samples/sec Loss 2.4729 Epoch: 11 Global Step: 188750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:33:14,368-Speed 5201.66 samples/sec Loss 2.4906 Epoch: 11 Global Step: 188800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:33:24,519-Speed 5044.25 samples/sec Loss 2.4682 Epoch: 11 Global Step: 188850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:33:34,482-Speed 5138.83 samples/sec Loss 2.4660 Epoch: 11 Global Step: 188900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:33:44,721-Speed 5001.33 samples/sec Loss 2.4718 Epoch: 11 Global Step: 188950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:33:54,809-Speed 5075.22 samples/sec Loss 2.4705 Epoch: 11 Global Step: 189000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:34:04,570-Speed 5246.00 samples/sec Loss 2.4519 Epoch: 11 Global Step: 189050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:34:14,817-Speed 4996.68 samples/sec Loss 2.4613 Epoch: 11 Global Step: 189100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:34:24,825-Speed 5116.59 samples/sec Loss 2.4575 Epoch: 11 Global Step: 189150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:34:34,808-Speed 5129.15 samples/sec Loss 2.4603 Epoch: 11 Global Step: 189200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:34:44,784-Speed 
5132.53 samples/sec Loss 2.4870 Epoch: 11 Global Step: 189250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:34:54,672-Speed 5178.34 samples/sec Loss 2.4778 Epoch: 11 Global Step: 189300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:35:04,499-Speed 5210.76 samples/sec Loss 2.4520 Epoch: 11 Global Step: 189350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:35:14,557-Speed 5090.67 samples/sec Loss 2.4587 Epoch: 11 Global Step: 189400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:35:24,390-Speed 5207.40 samples/sec Loss 2.4630 Epoch: 11 Global Step: 189450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:35:34,366-Speed 5132.64 samples/sec Loss 2.4689 Epoch: 11 Global Step: 189500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:35:44,437-Speed 5083.88 samples/sec Loss 2.4922 Epoch: 11 Global Step: 189550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:35:54,402-Speed 5138.19 samples/sec Loss 2.4510 Epoch: 11 Global Step: 189600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:36:04,383-Speed 5130.25 samples/sec Loss 2.4305 Epoch: 11 Global Step: 189650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:36:14,539-Speed 5041.89 samples/sec Loss 2.4496 Epoch: 11 Global Step: 189700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:36:24,500-Speed 5140.34 samples/sec Loss 2.4400 Epoch: 11 Global Step: 189750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:36:34,481-Speed 5129.67 samples/sec Loss 2.4376 Epoch: 11 Global Step: 189800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:36:45,457-Speed 4664.94 samples/sec Loss 2.4466 Epoch: 11 Global Step: 189850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:36:55,312-Speed 5195.51 samples/sec Loss 2.4530 Epoch: 11 Global Step: 189900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 
08:37:05,237-Speed 5159.04 samples/sec Loss 2.4301 Epoch: 11 Global Step: 189950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:37:15,225-Speed 5126.27 samples/sec Loss 2.4395 Epoch: 11 Global Step: 190000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:37:32,021-[lfw][190000]XNorm: 23.489651 Training: 2021-03-19 08:37:32,022-[lfw][190000]Accuracy-Flip: 0.99683+-0.00241 Training: 2021-03-19 08:37:32,022-[lfw][190000]Accuracy-Highest: 0.99750 Training: 2021-03-19 08:37:50,802-[cfp_fp][190000]XNorm: 19.272852 Training: 2021-03-19 08:37:50,802-[cfp_fp][190000]Accuracy-Flip: 0.97114+-0.01047 Training: 2021-03-19 08:37:50,802-[cfp_fp][190000]Accuracy-Highest: 0.97114 Training: 2021-03-19 08:38:07,060-[agedb_30][190000]XNorm: 22.600507 Training: 2021-03-19 08:38:07,060-[agedb_30][190000]Accuracy-Flip: 0.97400+-0.00790 Training: 2021-03-19 08:38:07,060-[agedb_30][190000]Accuracy-Highest: 0.97417 Training: 2021-03-19 08:38:16,691-Speed 832.99 samples/sec Loss 2.4768 Epoch: 11 Global Step: 190050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:38:26,562-Speed 5187.01 samples/sec Loss 2.4421 Epoch: 11 Global Step: 190100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:38:36,732-Speed 5034.90 samples/sec Loss 2.4380 Epoch: 11 Global Step: 190150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:38:47,588-Speed 4716.32 samples/sec Loss 2.4393 Epoch: 11 Global Step: 190200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:38:57,711-Speed 5058.20 samples/sec Loss 2.4467 Epoch: 11 Global Step: 190250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:39:07,737-Speed 5106.75 samples/sec Loss 2.4487 Epoch: 11 Global Step: 190300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:39:17,898-Speed 5039.27 samples/sec Loss 2.4760 Epoch: 11 Global Step: 190350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:39:27,758-Speed 5192.90 samples/sec 
Loss 2.4362 Epoch: 11 Global Step: 190400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:39:37,550-Speed 5229.01 samples/sec Loss 2.4131 Epoch: 11 Global Step: 190450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:39:48,927-Speed 4500.63 samples/sec Loss 2.4021 Epoch: 11 Global Step: 190500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:39:58,985-Speed 5090.67 samples/sec Loss 2.4214 Epoch: 11 Global Step: 190550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:40:09,110-Speed 5057.05 samples/sec Loss 2.3986 Epoch: 11 Global Step: 190600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:40:19,106-Speed 5122.53 samples/sec Loss 2.4544 Epoch: 11 Global Step: 190650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:40:29,743-Speed 4813.78 samples/sec Loss 2.4234 Epoch: 11 Global Step: 190700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:40:39,701-Speed 5141.87 samples/sec Loss 2.4502 Epoch: 11 Global Step: 190750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:40:49,746-Speed 5097.15 samples/sec Loss 2.4304 Epoch: 11 Global Step: 190800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:40:59,525-Speed 5236.10 samples/sec Loss 2.4334 Epoch: 11 Global Step: 190850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:41:09,679-Speed 5042.36 samples/sec Loss 2.4316 Epoch: 11 Global Step: 190900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:41:19,702-Speed 5108.80 samples/sec Loss 2.4370 Epoch: 11 Global Step: 190950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:41:30,603-Speed 4697.12 samples/sec Loss 2.4220 Epoch: 11 Global Step: 191000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:41:41,514-Speed 4692.90 samples/sec Loss 2.4385 Epoch: 11 Global Step: 191050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:41:52,137-Speed 4819.96 
samples/sec Loss 2.4113 Epoch: 11 Global Step: 191100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:42:02,860-Speed 4775.14 samples/sec Loss 2.4375 Epoch: 11 Global Step: 191150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:42:12,729-Speed 5188.16 samples/sec Loss 2.4267 Epoch: 11 Global Step: 191200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:42:22,842-Speed 5062.81 samples/sec Loss 2.3864 Epoch: 11 Global Step: 191250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:42:32,788-Speed 5148.24 samples/sec Loss 2.4060 Epoch: 11 Global Step: 191300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:42:42,704-Speed 5163.73 samples/sec Loss 2.4058 Epoch: 11 Global Step: 191350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:42:52,677-Speed 5134.10 samples/sec Loss 2.4458 Epoch: 11 Global Step: 191400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:43:02,563-Speed 5179.88 samples/sec Loss 2.4347 Epoch: 11 Global Step: 191450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:43:12,689-Speed 5056.71 samples/sec Loss 2.4223 Epoch: 11 Global Step: 191500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:43:22,659-Speed 5135.77 samples/sec Loss 2.4170 Epoch: 11 Global Step: 191550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:43:32,620-Speed 5140.15 samples/sec Loss 2.4256 Epoch: 11 Global Step: 191600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:43:42,706-Speed 5076.65 samples/sec Loss 2.4078 Epoch: 11 Global Step: 191650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:43:52,834-Speed 5055.73 samples/sec Loss 2.4007 Epoch: 11 Global Step: 191700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:44:02,904-Speed 5084.74 samples/sec Loss 2.4066 Epoch: 11 Global Step: 191750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:44:13,083-Speed 
5029.98 samples/sec Loss 2.4222 Epoch: 11 Global Step: 191800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:44:23,225-Speed 5048.85 samples/sec Loss 2.4239 Epoch: 11 Global Step: 191850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:44:33,283-Speed 5090.45 samples/sec Loss 2.4111 Epoch: 11 Global Step: 191900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:44:43,390-Speed 5066.13 samples/sec Loss 2.4169 Epoch: 11 Global Step: 191950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:44:53,423-Speed 5103.61 samples/sec Loss 2.4197 Epoch: 11 Global Step: 192000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:45:10,242-[lfw][192000]XNorm: 23.657688 Training: 2021-03-19 08:45:10,242-[lfw][192000]Accuracy-Flip: 0.99717+-0.00289 Training: 2021-03-19 08:45:10,242-[lfw][192000]Accuracy-Highest: 0.99750 Training: 2021-03-19 08:45:29,068-[cfp_fp][192000]XNorm: 19.577669 Training: 2021-03-19 08:45:29,068-[cfp_fp][192000]Accuracy-Flip: 0.96900+-0.01262 Training: 2021-03-19 08:45:29,068-[cfp_fp][192000]Accuracy-Highest: 0.97114 Training: 2021-03-19 08:45:45,507-[agedb_30][192000]XNorm: 22.862839 Training: 2021-03-19 08:45:45,508-[agedb_30][192000]Accuracy-Flip: 0.97467+-0.00795 Training: 2021-03-19 08:45:45,508-[agedb_30][192000]Accuracy-Highest: 0.97467 Training: 2021-03-19 08:45:55,270-Speed 827.85 samples/sec Loss 2.4214 Epoch: 11 Global Step: 192050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:46:05,426-Speed 5041.86 samples/sec Loss 2.4001 Epoch: 11 Global Step: 192100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:46:15,489-Speed 5087.96 samples/sec Loss 2.4238 Epoch: 11 Global Step: 192150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:46:25,803-Speed 4964.91 samples/sec Loss 2.4304 Epoch: 11 Global Step: 192200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:46:35,913-Speed 5064.39 samples/sec Loss 2.4291 Epoch: 11 
Global Step: 192250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:46:46,237-Speed 4959.86 samples/sec Loss 2.4330 Epoch: 11 Global Step: 192300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:46:56,611-Speed 4935.91 samples/sec Loss 2.4276 Epoch: 11 Global Step: 192350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:47:06,998-Speed 4929.14 samples/sec Loss 2.3954 Epoch: 11 Global Step: 192400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:47:16,916-Speed 5162.55 samples/sec Loss 2.3920 Epoch: 11 Global Step: 192450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:47:26,663-Speed 5253.58 samples/sec Loss 2.4142 Epoch: 11 Global Step: 192500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:47:36,836-Speed 5032.82 samples/sec Loss 2.4323 Epoch: 11 Global Step: 192550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:47:46,620-Speed 5233.40 samples/sec Loss 2.4000 Epoch: 11 Global Step: 192600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:47:56,907-Speed 4977.59 samples/sec Loss 2.3911 Epoch: 11 Global Step: 192650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:48:07,150-Speed 4998.94 samples/sec Loss 2.4166 Epoch: 11 Global Step: 192700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:48:17,214-Speed 5088.15 samples/sec Loss 2.4064 Epoch: 11 Global Step: 192750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:48:27,410-Speed 5021.80 samples/sec Loss 2.3826 Epoch: 11 Global Step: 192800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:48:37,527-Speed 5060.87 samples/sec Loss 2.4064 Epoch: 11 Global Step: 192850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:48:47,617-Speed 5074.78 samples/sec Loss 2.4352 Epoch: 11 Global Step: 192900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:48:57,748-Speed 5054.29 samples/sec Loss 2.4229 Epoch: 
11 Global Step: 192950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:49:07,729-Speed 5129.75 samples/sec Loss 2.3983 Epoch: 11 Global Step: 193000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:49:17,784-Speed 5092.29 samples/sec Loss 2.3945 Epoch: 11 Global Step: 193050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:49:27,797-Speed 5113.66 samples/sec Loss 2.3974 Epoch: 11 Global Step: 193100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:49:37,870-Speed 5083.02 samples/sec Loss 2.4278 Epoch: 11 Global Step: 193150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:49:47,915-Speed 5097.59 samples/sec Loss 2.4117 Epoch: 11 Global Step: 193200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:49:58,849-Speed 4682.68 samples/sec Loss 2.4032 Epoch: 11 Global Step: 193250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:50:09,018-Speed 5035.32 samples/sec Loss 2.3758 Epoch: 11 Global Step: 193300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:50:19,070-Speed 5093.77 samples/sec Loss 2.3903 Epoch: 11 Global Step: 193350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:50:29,165-Speed 5072.47 samples/sec Loss 2.3824 Epoch: 11 Global Step: 193400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:50:39,067-Speed 5170.80 samples/sec Loss 2.3932 Epoch: 11 Global Step: 193450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:50:49,064-Speed 5122.03 samples/sec Loss 2.3940 Epoch: 11 Global Step: 193500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:50:59,115-Speed 5094.47 samples/sec Loss 2.4010 Epoch: 11 Global Step: 193550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:51:09,871-Speed 4760.40 samples/sec Loss 2.4196 Epoch: 11 Global Step: 193600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:51:20,107-Speed 5002.09 samples/sec Loss 2.3851 
Epoch: 11 Global Step: 193650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:51:30,317-Speed 5015.20 samples/sec Loss 2.3871 Epoch: 11 Global Step: 193700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:51:40,574-Speed 4991.82 samples/sec Loss 2.3905 Epoch: 11 Global Step: 193750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:51:50,671-Speed 5071.47 samples/sec Loss 2.3836 Epoch: 11 Global Step: 193800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:52:00,613-Speed 5150.17 samples/sec Loss 2.3958 Epoch: 11 Global Step: 193850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:52:10,730-Speed 5060.94 samples/sec Loss 2.3641 Epoch: 11 Global Step: 193900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:52:21,479-Speed 4763.64 samples/sec Loss 2.3985 Epoch: 11 Global Step: 193950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:52:31,460-Speed 5130.14 samples/sec Loss 2.4190 Epoch: 11 Global Step: 194000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:52:47,688-[lfw][194000]XNorm: 23.337431 Training: 2021-03-19 08:52:47,688-[lfw][194000]Accuracy-Flip: 0.99667+-0.00289 Training: 2021-03-19 08:52:47,688-[lfw][194000]Accuracy-Highest: 0.99750 Training: 2021-03-19 08:53:06,218-[cfp_fp][194000]XNorm: 19.359059 Training: 2021-03-19 08:53:06,219-[cfp_fp][194000]Accuracy-Flip: 0.96929+-0.01170 Training: 2021-03-19 08:53:06,219-[cfp_fp][194000]Accuracy-Highest: 0.97114 Training: 2021-03-19 08:53:22,337-[agedb_30][194000]XNorm: 22.515982 Training: 2021-03-19 08:53:22,337-[agedb_30][194000]Accuracy-Flip: 0.97400+-0.00655 Training: 2021-03-19 08:53:22,338-[agedb_30][194000]Accuracy-Highest: 0.97467 Training: 2021-03-19 08:53:33,028-Speed 831.61 samples/sec Loss 2.3703 Epoch: 11 Global Step: 194050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:53:42,961-Speed 5154.71 samples/sec Loss 2.3798 Epoch: 11 Global Step: 194100 Fp16 Grad 
Scale: 16384 Required: 9 hours Training: 2021-03-19 08:53:53,129-Speed 5035.84 samples/sec Loss 2.3592 Epoch: 11 Global Step: 194150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:54:03,303-Speed 5032.42 samples/sec Loss 2.4038 Epoch: 11 Global Step: 194200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:54:13,314-Speed 5115.02 samples/sec Loss 2.3759 Epoch: 11 Global Step: 194250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:54:23,137-Speed 5212.66 samples/sec Loss 2.3698 Epoch: 11 Global Step: 194300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:54:33,947-Speed 4736.44 samples/sec Loss 2.3768 Epoch: 11 Global Step: 194350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:54:44,979-Speed 4641.22 samples/sec Loss 2.3788 Epoch: 11 Global Step: 194400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:54:55,283-Speed 4969.51 samples/sec Loss 2.4115 Epoch: 11 Global Step: 194450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:55:06,861-Speed 4422.29 samples/sec Loss 2.3830 Epoch: 11 Global Step: 194500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:55:16,770-Speed 5167.11 samples/sec Loss 2.4231 Epoch: 11 Global Step: 194550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:55:27,009-Speed 5001.09 samples/sec Loss 2.3960 Epoch: 11 Global Step: 194600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:55:36,888-Speed 5182.73 samples/sec Loss 2.3719 Epoch: 11 Global Step: 194650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:55:46,945-Speed 5091.22 samples/sec Loss 2.3818 Epoch: 11 Global Step: 194700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:55:56,987-Speed 5099.15 samples/sec Loss 2.3917 Epoch: 11 Global Step: 194750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:56:06,948-Speed 5140.04 samples/sec Loss 2.3892 Epoch: 11 Global Step: 194800 Fp16 
Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:56:17,148-Speed 5020.19 samples/sec Loss 2.3629 Epoch: 11 Global Step: 194850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:56:27,601-Speed 4898.14 samples/sec Loss 2.4094 Epoch: 11 Global Step: 194900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:56:37,614-Speed 5114.21 samples/sec Loss 2.3923 Epoch: 11 Global Step: 194950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:56:47,583-Speed 5135.91 samples/sec Loss 2.3771 Epoch: 11 Global Step: 195000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:56:57,808-Speed 5007.95 samples/sec Loss 2.3820 Epoch: 11 Global Step: 195050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:57:08,065-Speed 4991.89 samples/sec Loss 2.4151 Epoch: 11 Global Step: 195100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:57:17,884-Speed 5214.35 samples/sec Loss 2.3561 Epoch: 11 Global Step: 195150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:57:27,896-Speed 5114.31 samples/sec Loss 2.3812 Epoch: 11 Global Step: 195200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:57:38,051-Speed 5042.36 samples/sec Loss 2.3788 Epoch: 11 Global Step: 195250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:57:48,190-Speed 5049.74 samples/sec Loss 2.3994 Epoch: 11 Global Step: 195300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:57:58,244-Speed 5092.92 samples/sec Loss 2.3464 Epoch: 11 Global Step: 195350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:58:08,230-Speed 5127.38 samples/sec Loss 2.3758 Epoch: 11 Global Step: 195400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:58:18,389-Speed 5040.29 samples/sec Loss 2.3605 Epoch: 11 Global Step: 195450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:58:28,384-Speed 5122.73 samples/sec Loss 2.3651 Epoch: 11 Global Step: 195500 
Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:58:38,323-Speed 5152.02 samples/sec Loss 2.3867 Epoch: 11 Global Step: 195550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:58:48,430-Speed 5065.93 samples/sec Loss 2.3645 Epoch: 11 Global Step: 195600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:58:58,367-Speed 5152.43 samples/sec Loss 2.3678 Epoch: 11 Global Step: 195650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:59:08,432-Speed 5087.39 samples/sec Loss 2.3507 Epoch: 11 Global Step: 195700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:59:18,483-Speed 5094.24 samples/sec Loss 2.3733 Epoch: 11 Global Step: 195750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:59:28,593-Speed 5064.99 samples/sec Loss 2.3613 Epoch: 11 Global Step: 195800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:59:38,615-Speed 5108.91 samples/sec Loss 2.3304 Epoch: 11 Global Step: 195850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:59:48,643-Speed 5105.75 samples/sec Loss 2.3577 Epoch: 11 Global Step: 195900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 08:59:58,671-Speed 5106.16 samples/sec Loss 2.3713 Epoch: 11 Global Step: 195950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:00:08,602-Speed 5156.27 samples/sec Loss 2.3561 Epoch: 11 Global Step: 196000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:00:25,136-[lfw][196000]XNorm: 23.437355 Training: 2021-03-19 09:00:25,136-[lfw][196000]Accuracy-Flip: 0.99667+-0.00279 Training: 2021-03-19 09:00:25,136-[lfw][196000]Accuracy-Highest: 0.99750 Training: 2021-03-19 09:00:43,863-[cfp_fp][196000]XNorm: 19.402259 Training: 2021-03-19 09:00:43,863-[cfp_fp][196000]Accuracy-Flip: 0.96971+-0.00951 Training: 2021-03-19 09:00:43,863-[cfp_fp][196000]Accuracy-Highest: 0.97114 Training: 2021-03-19 09:01:00,037-[agedb_30][196000]XNorm: 22.622314 Training: 
2021-03-19 09:01:00,037-[agedb_30][196000]Accuracy-Flip: 0.97583+-0.00783 Training: 2021-03-19 09:01:00,037-[agedb_30][196000]Accuracy-Highest: 0.97583 Training: 2021-03-19 09:01:09,812-Speed 836.47 samples/sec Loss 2.3802 Epoch: 11 Global Step: 196050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:01:19,895-Speed 5078.18 samples/sec Loss 2.4010 Epoch: 11 Global Step: 196100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:01:30,120-Speed 5007.77 samples/sec Loss 2.3602 Epoch: 11 Global Step: 196150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:01:40,214-Speed 5072.59 samples/sec Loss 2.3803 Epoch: 11 Global Step: 196200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:01:50,299-Speed 5077.14 samples/sec Loss 2.3655 Epoch: 11 Global Step: 196250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:02:00,362-Speed 5088.39 samples/sec Loss 2.3583 Epoch: 11 Global Step: 196300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:02:10,516-Speed 5042.54 samples/sec Loss 2.3342 Epoch: 11 Global Step: 196350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:02:20,503-Speed 5126.73 samples/sec Loss 2.3617 Epoch: 11 Global Step: 196400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:02:30,510-Speed 5117.09 samples/sec Loss 2.3952 Epoch: 11 Global Step: 196450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:02:40,493-Speed 5128.75 samples/sec Loss 2.3580 Epoch: 11 Global Step: 196500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:02:50,397-Speed 5170.31 samples/sec Loss 2.3602 Epoch: 11 Global Step: 196550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:03:01,116-Speed 4776.98 samples/sec Loss 2.3204 Epoch: 11 Global Step: 196600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:03:10,877-Speed 5245.51 samples/sec Loss 2.3789 Epoch: 11 Global Step: 196650 Fp16 Grad Scale: 16384 
Required: 9 hours Training: 2021-03-19 09:03:20,964-Speed 5076.21 samples/sec Loss 2.3728 Epoch: 11 Global Step: 196700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:03:31,045-Speed 5079.28 samples/sec Loss 2.3646 Epoch: 11 Global Step: 196750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:03:40,961-Speed 5163.44 samples/sec Loss 2.3562 Epoch: 11 Global Step: 196800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:03:50,783-Speed 5212.92 samples/sec Loss 2.3873 Epoch: 11 Global Step: 196850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:04:00,601-Speed 5215.62 samples/sec Loss 2.3878 Epoch: 11 Global Step: 196900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:04:11,341-Speed 4767.16 samples/sec Loss 2.3658 Epoch: 11 Global Step: 196950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:04:21,272-Speed 5156.05 samples/sec Loss 2.3888 Epoch: 11 Global Step: 197000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:04:31,353-Speed 5079.67 samples/sec Loss 2.3936 Epoch: 11 Global Step: 197050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:04:41,285-Speed 5155.11 samples/sec Loss 2.4133 Epoch: 11 Global Step: 197100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:04:51,253-Speed 5137.17 samples/sec Loss 2.3501 Epoch: 11 Global Step: 197150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:05:01,238-Speed 5127.92 samples/sec Loss 2.3440 Epoch: 11 Global Step: 197200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:05:11,945-Speed 4782.32 samples/sec Loss 2.3689 Epoch: 11 Global Step: 197250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:05:22,046-Speed 5068.92 samples/sec Loss 2.3640 Epoch: 11 Global Step: 197300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:05:32,618-Speed 4843.20 samples/sec Loss 2.3787 Epoch: 11 Global Step: 197350 Fp16 Grad Scale: 
16384 Required: 9 hours Training: 2021-03-19 09:05:42,448-Speed 5209.01 samples/sec Loss 2.3543 Epoch: 11 Global Step: 197400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:05:52,596-Speed 5045.32 samples/sec Loss 2.3609 Epoch: 11 Global Step: 197450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:06:02,704-Speed 5065.96 samples/sec Loss 2.3660 Epoch: 11 Global Step: 197500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:06:12,967-Speed 4988.80 samples/sec Loss 2.3653 Epoch: 11 Global Step: 197550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:06:23,233-Speed 4988.12 samples/sec Loss 2.3712 Epoch: 11 Global Step: 197600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:06:33,363-Speed 5054.42 samples/sec Loss 2.3604 Epoch: 11 Global Step: 197650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:06:44,026-Speed 4801.80 samples/sec Loss 2.3407 Epoch: 11 Global Step: 197700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:06:54,954-Speed 4685.87 samples/sec Loss 2.3609 Epoch: 11 Global Step: 197750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:07:05,595-Speed 4811.49 samples/sec Loss 2.3988 Epoch: 11 Global Step: 197800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:07:15,676-Speed 5079.31 samples/sec Loss 2.3428 Epoch: 11 Global Step: 197850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:07:26,613-Speed 4681.49 samples/sec Loss 2.3793 Epoch: 11 Global Step: 197900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:07:36,564-Speed 5145.98 samples/sec Loss 2.3728 Epoch: 11 Global Step: 197950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:07:46,368-Speed 5222.64 samples/sec Loss 2.3580 Epoch: 11 Global Step: 198000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:08:02,977-[lfw][198000]XNorm: 23.097170 Training: 2021-03-19 
09:08:02,978-[lfw][198000]Accuracy-Flip: 0.99567+-0.00291 Training: 2021-03-19 09:08:02,978-[lfw][198000]Accuracy-Highest: 0.99750 Training: 2021-03-19 09:08:21,608-[cfp_fp][198000]XNorm: 19.216094 Training: 2021-03-19 09:08:21,608-[cfp_fp][198000]Accuracy-Flip: 0.97229+-0.01052 Training: 2021-03-19 09:08:21,608-[cfp_fp][198000]Accuracy-Highest: 0.97229 Training: 2021-03-19 09:08:37,711-[agedb_30][198000]XNorm: 22.273975 Training: 2021-03-19 09:08:37,712-[agedb_30][198000]Accuracy-Flip: 0.97317+-0.00818 Training: 2021-03-19 09:08:37,712-[agedb_30][198000]Accuracy-Highest: 0.97583 Training: 2021-03-19 09:08:47,599-Speed 836.19 samples/sec Loss 2.3462 Epoch: 11 Global Step: 198050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:08:57,641-Speed 5098.75 samples/sec Loss 2.3473 Epoch: 11 Global Step: 198100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:09:07,739-Speed 5070.27 samples/sec Loss 2.3558 Epoch: 11 Global Step: 198150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:09:17,818-Speed 5080.41 samples/sec Loss 2.3559 Epoch: 11 Global Step: 198200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:09:27,927-Speed 5064.94 samples/sec Loss 2.3558 Epoch: 11 Global Step: 198250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:09:38,136-Speed 5015.74 samples/sec Loss 2.3420 Epoch: 11 Global Step: 198300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:09:48,193-Speed 5090.79 samples/sec Loss 2.3664 Epoch: 11 Global Step: 198350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:09:57,986-Speed 5228.62 samples/sec Loss 2.3491 Epoch: 11 Global Step: 198400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:10:08,157-Speed 5034.49 samples/sec Loss 2.3182 Epoch: 11 Global Step: 198450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:10:18,354-Speed 5021.46 samples/sec Loss 2.3351 Epoch: 11 Global Step: 198500 Fp16 Grad Scale: 16384 
Required: 9 hours Training: 2021-03-19 09:10:28,355-Speed 5119.75 samples/sec Loss 2.3454 Epoch: 11 Global Step: 198550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:10:38,491-Speed 5051.95 samples/sec Loss 2.3494 Epoch: 11 Global Step: 198600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:10:48,440-Speed 5146.38 samples/sec Loss 2.3385 Epoch: 11 Global Step: 198650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:10:58,381-Speed 5150.50 samples/sec Loss 2.3713 Epoch: 11 Global Step: 198700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:11:08,425-Speed 5098.12 samples/sec Loss 2.3667 Epoch: 11 Global Step: 198750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:11:18,560-Speed 5052.18 samples/sec Loss 2.3336 Epoch: 11 Global Step: 198800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:11:28,663-Speed 5067.94 samples/sec Loss 2.3746 Epoch: 11 Global Step: 198850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:11:38,632-Speed 5136.41 samples/sec Loss 2.3660 Epoch: 11 Global Step: 198900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:11:48,752-Speed 5059.39 samples/sec Loss 2.3519 Epoch: 11 Global Step: 198950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:11:58,705-Speed 5144.47 samples/sec Loss 2.3117 Epoch: 11 Global Step: 199000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:12:08,956-Speed 4994.84 samples/sec Loss 2.3157 Epoch: 11 Global Step: 199050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:12:18,924-Speed 5137.10 samples/sec Loss 2.3614 Epoch: 11 Global Step: 199100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:12:29,024-Speed 5069.47 samples/sec Loss 2.3364 Epoch: 11 Global Step: 199150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:12:38,937-Speed 5165.32 samples/sec Loss 2.3655 Epoch: 11 Global Step: 199200 Fp16 Grad Scale: 
16384 Required: 9 hours Training: 2021-03-19 09:12:49,092-Speed 5041.99 samples/sec Loss 2.3337 Epoch: 11 Global Step: 199250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:12:59,259-Speed 5036.09 samples/sec Loss 2.3897 Epoch: 11 Global Step: 199300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:13:09,373-Speed 5062.62 samples/sec Loss 2.3257 Epoch: 11 Global Step: 199350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:13:19,443-Speed 5084.82 samples/sec Loss 2.3466 Epoch: 11 Global Step: 199400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:13:29,497-Speed 5092.89 samples/sec Loss 2.3394 Epoch: 11 Global Step: 199450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:13:39,378-Speed 5181.95 samples/sec Loss 2.3657 Epoch: 11 Global Step: 199500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:13:49,629-Speed 4994.88 samples/sec Loss 2.3401 Epoch: 11 Global Step: 199550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:13:59,298-Speed 5295.37 samples/sec Loss 2.3330 Epoch: 11 Global Step: 199600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:14:09,241-Speed 5149.72 samples/sec Loss 2.3724 Epoch: 11 Global Step: 199650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:14:19,331-Speed 5074.69 samples/sec Loss 2.3297 Epoch: 11 Global Step: 199700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:14:29,528-Speed 5021.19 samples/sec Loss 2.3740 Epoch: 11 Global Step: 199750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:14:39,757-Speed 5005.81 samples/sec Loss 2.3341 Epoch: 11 Global Step: 199800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:14:49,707-Speed 5145.67 samples/sec Loss 2.3630 Epoch: 11 Global Step: 199850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:15:00,052-Speed 4949.68 samples/sec Loss 2.3062 Epoch: 11 Global Step: 199900 Fp16 Grad 
Scale: 16384 Required: 9 hours Training: 2021-03-19 09:15:10,958-Speed 4694.86 samples/sec Loss 2.3356 Epoch: 11 Global Step: 199950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:15:20,939-Speed 5130.08 samples/sec Loss 2.3615 Epoch: 11 Global Step: 200000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:15:37,439-[lfw][200000]XNorm: 23.185830 Training: 2021-03-19 09:15:37,439-[lfw][200000]Accuracy-Flip: 0.99633+-0.00306 Training: 2021-03-19 09:15:37,439-[lfw][200000]Accuracy-Highest: 0.99750 Training: 2021-03-19 09:15:56,011-[cfp_fp][200000]XNorm: 19.317410 Training: 2021-03-19 09:15:56,011-[cfp_fp][200000]Accuracy-Flip: 0.97200+-0.01152 Training: 2021-03-19 09:15:56,011-[cfp_fp][200000]Accuracy-Highest: 0.97229 Training: 2021-03-19 09:16:12,112-[agedb_30][200000]XNorm: 22.342473 Training: 2021-03-19 09:16:12,112-[agedb_30][200000]Accuracy-Flip: 0.97500+-0.00726 Training: 2021-03-19 09:16:12,112-[agedb_30][200000]Accuracy-Highest: 0.97583 Training: 2021-03-19 09:16:22,030-Speed 838.11 samples/sec Loss 2.3726 Epoch: 11 Global Step: 200050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:16:32,025-Speed 5122.73 samples/sec Loss 2.3308 Epoch: 11 Global Step: 200100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:16:42,149-Speed 5057.76 samples/sec Loss 2.3469 Epoch: 11 Global Step: 200150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:16:52,276-Speed 5056.25 samples/sec Loss 2.3719 Epoch: 11 Global Step: 200200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:17:02,516-Speed 5000.11 samples/sec Loss 2.3120 Epoch: 11 Global Step: 200250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:17:25,297-Speed 2247.54 samples/sec Loss 2.3430 Epoch: 12 Global Step: 200300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:17:35,850-Speed 4852.28 samples/sec Loss 2.2342 Epoch: 12 Global Step: 200350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 
2021-03-19 09:17:45,892-Speed 5098.49 samples/sec Loss 2.2254 Epoch: 12 Global Step: 200400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:17:56,003-Speed 5064.54 samples/sec Loss 2.2229 Epoch: 12 Global Step: 200450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-19 09:18:06,456-Speed 4898.39 samples/sec Loss 2.2045 Epoch: 12 Global Step: 200500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:18:16,775-Speed 4961.95 samples/sec Loss 2.2449 Epoch: 12 Global Step: 200550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:18:27,846-Speed 4624.93 samples/sec Loss 2.2280 Epoch: 12 Global Step: 200600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:18:37,745-Speed 5172.57 samples/sec Loss 2.2387 Epoch: 12 Global Step: 200650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:18:48,731-Speed 4661.03 samples/sec Loss 2.2324 Epoch: 12 Global Step: 200700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:18:58,937-Speed 5016.69 samples/sec Loss 2.2473 Epoch: 12 Global Step: 200750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:19:08,888-Speed 5145.26 samples/sec Loss 2.2209 Epoch: 12 Global Step: 200800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:19:18,948-Speed 5089.92 samples/sec Loss 2.2222 Epoch: 12 Global Step: 200850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:19:29,190-Speed 4999.07 samples/sec Loss 2.2345 Epoch: 12 Global Step: 200900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:19:39,138-Speed 5147.72 samples/sec Loss 2.2340 Epoch: 12 Global Step: 200950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:19:49,123-Speed 5127.90 samples/sec Loss 2.2262 Epoch: 12 Global Step: 201000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:19:59,103-Speed 5130.52 samples/sec Loss 2.2167 Epoch: 12 Global Step: 201050 Fp16 Grad Scale: 16384 Required: 8 hours 
Training: 2021-03-19 09:20:09,059-Speed 5142.61 samples/sec Loss 2.2155 Epoch: 12 Global Step: 201100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:20:20,159-Speed 4613.24 samples/sec Loss 2.2160 Epoch: 12 Global Step: 201150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:20:31,712-Speed 4431.78 samples/sec Loss 2.2608 Epoch: 12 Global Step: 201200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:20:42,053-Speed 4951.59 samples/sec Loss 2.2118 Epoch: 12 Global Step: 201250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:20:52,027-Speed 5133.46 samples/sec Loss 2.2235 Epoch: 12 Global Step: 201300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:21:02,857-Speed 4728.11 samples/sec Loss 2.2211 Epoch: 12 Global Step: 201350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:21:12,874-Speed 5111.37 samples/sec Loss 2.2337 Epoch: 12 Global Step: 201400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:21:22,950-Speed 5082.10 samples/sec Loss 2.2337 Epoch: 12 Global Step: 201450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:21:32,909-Speed 5141.18 samples/sec Loss 2.2201 Epoch: 12 Global Step: 201500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:21:42,886-Speed 5132.34 samples/sec Loss 2.2444 Epoch: 12 Global Step: 201550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:21:53,000-Speed 5062.37 samples/sec Loss 2.2049 Epoch: 12 Global Step: 201600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:22:03,035-Speed 5102.20 samples/sec Loss 2.2258 Epoch: 12 Global Step: 201650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:22:12,955-Speed 5161.59 samples/sec Loss 2.1950 Epoch: 12 Global Step: 201700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:22:22,957-Speed 5119.20 samples/sec Loss 2.2513 Epoch: 12 Global Step: 201750 Fp16 Grad Scale: 16384 Required: 8 
hours Training: 2021-03-19 09:22:33,058-Speed 5069.14 samples/sec Loss 2.2171 Epoch: 12 Global Step: 201800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:22:43,180-Speed 5058.96 samples/sec Loss 2.2287 Epoch: 12 Global Step: 201850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:22:53,328-Speed 5045.68 samples/sec Loss 2.2127 Epoch: 12 Global Step: 201900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:23:03,319-Speed 5124.67 samples/sec Loss 2.2296 Epoch: 12 Global Step: 201950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:23:13,480-Speed 5039.16 samples/sec Loss 2.2107 Epoch: 12 Global Step: 202000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:23:30,162-[lfw][202000]XNorm: 23.461711 Training: 2021-03-19 09:23:30,162-[lfw][202000]Accuracy-Flip: 0.99650+-0.00263 Training: 2021-03-19 09:23:30,162-[lfw][202000]Accuracy-Highest: 0.99750 Training: 2021-03-19 09:23:49,000-[cfp_fp][202000]XNorm: 19.541470 Training: 2021-03-19 09:23:49,000-[cfp_fp][202000]Accuracy-Flip: 0.97171+-0.01181 Training: 2021-03-19 09:23:49,000-[cfp_fp][202000]Accuracy-Highest: 0.97229 Training: 2021-03-19 09:24:05,285-[agedb_30][202000]XNorm: 22.648094 Training: 2021-03-19 09:24:05,285-[agedb_30][202000]Accuracy-Flip: 0.97517+-0.00724 Training: 2021-03-19 09:24:05,285-[agedb_30][202000]Accuracy-Highest: 0.97583 Training: 2021-03-19 09:24:15,197-Speed 829.59 samples/sec Loss 2.2533 Epoch: 12 Global Step: 202050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:24:25,614-Speed 4915.21 samples/sec Loss 2.2563 Epoch: 12 Global Step: 202100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:24:35,675-Speed 5089.23 samples/sec Loss 2.2565 Epoch: 12 Global Step: 202150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:24:45,816-Speed 5049.43 samples/sec Loss 2.2704 Epoch: 12 Global Step: 202200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 
09:24:55,954-Speed 5050.33 samples/sec Loss 2.2278 Epoch: 12 Global Step: 202250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:25:05,976-Speed 5109.26 samples/sec Loss 2.2194 Epoch: 12 Global Step: 202300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:25:16,107-Speed 5053.89 samples/sec Loss 2.2180 Epoch: 12 Global Step: 202350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:25:26,182-Speed 5082.56 samples/sec Loss 2.2352 Epoch: 12 Global Step: 202400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:25:36,415-Speed 5003.72 samples/sec Loss 2.2220 Epoch: 12 Global Step: 202450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:25:46,428-Speed 5113.39 samples/sec Loss 2.2125 Epoch: 12 Global Step: 202500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:25:56,499-Speed 5084.51 samples/sec Loss 2.2149 Epoch: 12 Global Step: 202550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:26:06,477-Speed 5131.22 samples/sec Loss 2.2670 Epoch: 12 Global Step: 202600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:26:16,441-Speed 5139.25 samples/sec Loss 2.2194 Epoch: 12 Global Step: 202650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:26:26,438-Speed 5122.02 samples/sec Loss 2.2753 Epoch: 12 Global Step: 202700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:26:36,655-Speed 5011.27 samples/sec Loss 2.2262 Epoch: 12 Global Step: 202750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:26:46,779-Speed 5057.40 samples/sec Loss 2.2380 Epoch: 12 Global Step: 202800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:26:56,873-Speed 5072.80 samples/sec Loss 2.2164 Epoch: 12 Global Step: 202850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:27:07,058-Speed 5027.56 samples/sec Loss 2.2136 Epoch: 12 Global Step: 202900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 
2021-03-19 09:27:17,222-Speed 5037.44 samples/sec Loss 2.2511 Epoch: 12 Global Step: 202950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:27:27,323-Speed 5069.46 samples/sec Loss 2.2510 Epoch: 12 Global Step: 203000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:27:37,272-Speed 5146.28 samples/sec Loss 2.2283 Epoch: 12 Global Step: 203050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:27:47,387-Speed 5062.14 samples/sec Loss 2.2263 Epoch: 12 Global Step: 203100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:27:57,559-Speed 5033.69 samples/sec Loss 2.2252 Epoch: 12 Global Step: 203150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:28:07,566-Speed 5116.97 samples/sec Loss 2.2353 Epoch: 12 Global Step: 203200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:28:17,760-Speed 5023.13 samples/sec Loss 2.2343 Epoch: 12 Global Step: 203250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:28:27,879-Speed 5059.64 samples/sec Loss 2.2518 Epoch: 12 Global Step: 203300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:28:38,949-Speed 4625.68 samples/sec Loss 2.2181 Epoch: 12 Global Step: 203350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:28:49,216-Speed 4987.21 samples/sec Loss 2.2539 Epoch: 12 Global Step: 203400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:28:59,103-Speed 5178.80 samples/sec Loss 2.2435 Epoch: 12 Global Step: 203450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:29:09,276-Speed 5033.04 samples/sec Loss 2.2198 Epoch: 12 Global Step: 203500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:29:19,410-Speed 5052.58 samples/sec Loss 2.2381 Epoch: 12 Global Step: 203550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:29:29,661-Speed 4994.94 samples/sec Loss 2.2143 Epoch: 12 Global Step: 203600 Fp16 Grad Scale: 16384 Required: 8 hours 
Training: 2021-03-19 09:29:39,717-Speed 5092.14 samples/sec Loss 2.2464 Epoch: 12 Global Step: 203650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:29:50,628-Speed 4692.66 samples/sec Loss 2.2463 Epoch: 12 Global Step: 203700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:30:00,794-Speed 5036.97 samples/sec Loss 2.2232 Epoch: 12 Global Step: 203750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:30:10,827-Speed 5103.52 samples/sec Loss 2.2152 Epoch: 12 Global Step: 203800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:30:20,905-Speed 5080.97 samples/sec Loss 2.2399 Epoch: 12 Global Step: 203850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:30:31,014-Speed 5064.75 samples/sec Loss 2.2444 Epoch: 12 Global Step: 203900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:30:41,836-Speed 4731.53 samples/sec Loss 2.2154 Epoch: 12 Global Step: 203950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:30:51,790-Speed 5143.98 samples/sec Loss 2.2230 Epoch: 12 Global Step: 204000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:31:08,509-[lfw][204000]XNorm: 23.325831 Training: 2021-03-19 09:31:08,509-[lfw][204000]Accuracy-Flip: 0.99633+-0.00306 Training: 2021-03-19 09:31:08,509-[lfw][204000]Accuracy-Highest: 0.99750 Training: 2021-03-19 09:31:27,170-[cfp_fp][204000]XNorm: 19.451352 Training: 2021-03-19 09:31:27,170-[cfp_fp][204000]Accuracy-Flip: 0.97357+-0.00904 Training: 2021-03-19 09:31:27,170-[cfp_fp][204000]Accuracy-Highest: 0.97357 Training: 2021-03-19 09:31:43,271-[agedb_30][204000]XNorm: 22.501246 Training: 2021-03-19 09:31:43,271-[agedb_30][204000]Accuracy-Flip: 0.97583+-0.00797 Training: 2021-03-19 09:31:43,271-[agedb_30][204000]Accuracy-Highest: 0.97583 Training: 2021-03-19 09:31:53,298-Speed 832.42 samples/sec Loss 2.2600 Epoch: 12 Global Step: 204050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:32:04,157-Speed 
4714.90 samples/sec Loss 2.2233 Epoch: 12 Global Step: 204100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:32:14,319-Speed 5039.00 samples/sec Loss 2.2814 Epoch: 12 Global Step: 204150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:32:24,496-Speed 5031.46 samples/sec Loss 2.2352 Epoch: 12 Global Step: 204200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:32:34,787-Speed 4975.33 samples/sec Loss 2.2390 Epoch: 12 Global Step: 204250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:32:45,010-Speed 5008.41 samples/sec Loss 2.2534 Epoch: 12 Global Step: 204300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:32:55,201-Speed 5024.80 samples/sec Loss 2.2481 Epoch: 12 Global Step: 204350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:33:05,226-Speed 5107.32 samples/sec Loss 2.1980 Epoch: 12 Global Step: 204400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:33:15,057-Speed 5208.71 samples/sec Loss 2.2204 Epoch: 12 Global Step: 204450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:33:25,868-Speed 4735.94 samples/sec Loss 2.2438 Epoch: 12 Global Step: 204500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:33:36,453-Speed 4837.15 samples/sec Loss 2.2562 Epoch: 12 Global Step: 204550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:33:47,186-Speed 4770.72 samples/sec Loss 2.2420 Epoch: 12 Global Step: 204600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:33:57,154-Speed 5136.49 samples/sec Loss 2.2193 Epoch: 12 Global Step: 204650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:34:07,390-Speed 5002.20 samples/sec Loss 2.2805 Epoch: 12 Global Step: 204700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:34:18,108-Speed 4777.42 samples/sec Loss 2.2447 Epoch: 12 Global Step: 204750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 
09:34:28,453-Speed 4949.67 samples/sec Loss 2.2010 Epoch: 12 Global Step: 204800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:34:38,467-Speed 5112.80 samples/sec Loss 2.2197 Epoch: 12 Global Step: 204850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:34:48,575-Speed 5065.59 samples/sec Loss 2.2192 Epoch: 12 Global Step: 204900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:34:58,839-Speed 4988.74 samples/sec Loss 2.2483 Epoch: 12 Global Step: 204950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:35:08,762-Speed 5159.94 samples/sec Loss 2.2357 Epoch: 12 Global Step: 205000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:35:18,803-Speed 5099.51 samples/sec Loss 2.2242 Epoch: 12 Global Step: 205050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:35:28,792-Speed 5126.04 samples/sec Loss 2.2424 Epoch: 12 Global Step: 205100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:35:39,086-Speed 4974.25 samples/sec Loss 2.2390 Epoch: 12 Global Step: 205150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:35:49,153-Speed 5086.18 samples/sec Loss 2.2429 Epoch: 12 Global Step: 205200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:35:59,202-Speed 5095.22 samples/sec Loss 2.2580 Epoch: 12 Global Step: 205250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:36:09,426-Speed 5008.11 samples/sec Loss 2.2350 Epoch: 12 Global Step: 205300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:36:19,772-Speed 4949.03 samples/sec Loss 2.2328 Epoch: 12 Global Step: 205350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:36:29,814-Speed 5098.63 samples/sec Loss 2.2225 Epoch: 12 Global Step: 205400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:36:40,109-Speed 4973.53 samples/sec Loss 2.2431 Epoch: 12 Global Step: 205450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 
2021-03-19 09:36:50,155-Speed 5097.28 samples/sec Loss 2.2113 Epoch: 12 Global Step: 205500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:37:00,475-Speed 4961.50 samples/sec Loss 2.2289 Epoch: 12 Global Step: 205550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:37:10,371-Speed 5173.84 samples/sec Loss 2.2492 Epoch: 12 Global Step: 205600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:37:20,536-Speed 5037.36 samples/sec Loss 2.2444 Epoch: 12 Global Step: 205650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:37:30,550-Speed 5113.14 samples/sec Loss 2.2610 Epoch: 12 Global Step: 205700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:37:40,805-Speed 4992.73 samples/sec Loss 2.1894 Epoch: 12 Global Step: 205750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:37:50,832-Speed 5106.52 samples/sec Loss 2.2156 Epoch: 12 Global Step: 205800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:38:00,926-Speed 5072.76 samples/sec Loss 2.2569 Epoch: 12 Global Step: 205850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:38:10,790-Speed 5191.14 samples/sec Loss 2.2267 Epoch: 12 Global Step: 205900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:38:20,645-Speed 5195.31 samples/sec Loss 2.2481 Epoch: 12 Global Step: 205950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:38:30,627-Speed 5129.83 samples/sec Loss 2.2111 Epoch: 12 Global Step: 206000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:38:47,548-[lfw][206000]XNorm: 22.997089 Training: 2021-03-19 09:38:47,548-[lfw][206000]Accuracy-Flip: 0.99750+-0.00291 Training: 2021-03-19 09:38:47,548-[lfw][206000]Accuracy-Highest: 0.99750 Training: 2021-03-19 09:39:06,268-[cfp_fp][206000]XNorm: 19.192052 Training: 2021-03-19 09:39:06,269-[cfp_fp][206000]Accuracy-Flip: 0.97457+-0.00899 Training: 2021-03-19 09:39:06,270-[cfp_fp][206000]Accuracy-Highest: 
0.97457 Training: 2021-03-19 09:39:22,468-[agedb_30][206000]XNorm: 22.154296 Training: 2021-03-19 09:39:22,468-[agedb_30][206000]Accuracy-Flip: 0.97367+-0.00799 Training: 2021-03-19 09:39:22,468-[agedb_30][206000]Accuracy-Highest: 0.97583 Training: 2021-03-19 09:39:32,310-Speed 830.06 samples/sec Loss 2.2276 Epoch: 12 Global Step: 206050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:39:42,381-Speed 5083.97 samples/sec Loss 2.2156 Epoch: 12 Global Step: 206100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:39:52,458-Speed 5081.44 samples/sec Loss 2.2459 Epoch: 12 Global Step: 206150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:40:02,546-Speed 5075.48 samples/sec Loss 2.2435 Epoch: 12 Global Step: 206200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:40:12,679-Speed 5052.90 samples/sec Loss 2.2157 Epoch: 12 Global Step: 206250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:40:22,766-Speed 5076.38 samples/sec Loss 2.1934 Epoch: 12 Global Step: 206300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:40:32,904-Speed 5050.20 samples/sec Loss 2.2301 Epoch: 12 Global Step: 206350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:40:43,143-Speed 5000.68 samples/sec Loss 2.2216 Epoch: 12 Global Step: 206400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:40:53,058-Speed 5164.53 samples/sec Loss 2.2046 Epoch: 12 Global Step: 206450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:41:03,259-Speed 5019.48 samples/sec Loss 2.2106 Epoch: 12 Global Step: 206500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:41:13,390-Speed 5053.77 samples/sec Loss 2.2369 Epoch: 12 Global Step: 206550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:41:23,549-Speed 5040.15 samples/sec Loss 2.2572 Epoch: 12 Global Step: 206600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:41:33,784-Speed 
5002.86 samples/sec Loss 2.2156 Epoch: 12 Global Step: 206650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:41:43,866-Speed 5078.81 samples/sec Loss 2.2663 Epoch: 12 Global Step: 206700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:41:54,949-Speed 4619.92 samples/sec Loss 2.2686 Epoch: 12 Global Step: 206750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:42:04,943-Speed 5123.04 samples/sec Loss 2.2201 Epoch: 12 Global Step: 206800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:42:15,134-Speed 5024.39 samples/sec Loss 2.2303 Epoch: 12 Global Step: 206850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:42:25,225-Speed 5074.56 samples/sec Loss 2.2325 Epoch: 12 Global Step: 206900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:42:35,307-Speed 5078.43 samples/sec Loss 2.2143 Epoch: 12 Global Step: 206950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:42:45,279-Speed 5135.09 samples/sec Loss 2.2368 Epoch: 12 Global Step: 207000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:42:55,298-Speed 5110.38 samples/sec Loss 2.2505 Epoch: 12 Global Step: 207050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:43:05,926-Speed 4817.92 samples/sec Loss 2.2364 Epoch: 12 Global Step: 207100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:43:16,044-Speed 5060.27 samples/sec Loss 2.2126 Epoch: 12 Global Step: 207150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:43:26,226-Speed 5029.08 samples/sec Loss 2.2280 Epoch: 12 Global Step: 207200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:43:36,250-Speed 5107.70 samples/sec Loss 2.2191 Epoch: 12 Global Step: 207250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:43:46,215-Speed 5138.65 samples/sec Loss 2.2401 Epoch: 12 Global Step: 207300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 
09:43:56,234-Speed 5110.35 samples/sec Loss 2.2322 Epoch: 12 Global Step: 207350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:44:06,938-Speed 4783.61 samples/sec Loss 2.2356 Epoch: 12 Global Step: 207400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:44:17,183-Speed 4997.73 samples/sec Loss 2.2314 Epoch: 12 Global Step: 207450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:44:27,153-Speed 5136.09 samples/sec Loss 2.2164 Epoch: 12 Global Step: 207500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:44:37,920-Speed 4755.42 samples/sec Loss 2.2246 Epoch: 12 Global Step: 207550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:44:48,123-Speed 5018.22 samples/sec Loss 2.2523 Epoch: 12 Global Step: 207600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:44:58,188-Speed 5087.26 samples/sec Loss 2.2365 Epoch: 12 Global Step: 207650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:45:08,122-Speed 5154.21 samples/sec Loss 2.2511 Epoch: 12 Global Step: 207700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:45:18,741-Speed 4822.11 samples/sec Loss 2.2363 Epoch: 12 Global Step: 207750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:45:28,712-Speed 5135.05 samples/sec Loss 2.2405 Epoch: 12 Global Step: 207800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:45:38,787-Speed 5082.43 samples/sec Loss 2.2244 Epoch: 12 Global Step: 207850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:45:49,627-Speed 4723.28 samples/sec Loss 2.2171 Epoch: 12 Global Step: 207900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:45:59,785-Speed 5041.00 samples/sec Loss 2.2442 Epoch: 12 Global Step: 207950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:46:10,537-Speed 4761.78 samples/sec Loss 2.2239 Epoch: 12 Global Step: 208000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 
2021-03-19 09:46:27,359-[lfw][208000]XNorm: 23.368886 Training: 2021-03-19 09:46:27,359-[lfw][208000]Accuracy-Flip: 0.99767+-0.00271 Training: 2021-03-19 09:46:27,359-[lfw][208000]Accuracy-Highest: 0.99767 Training: 2021-03-19 09:46:46,082-[cfp_fp][208000]XNorm: 19.474732 Training: 2021-03-19 09:46:46,082-[cfp_fp][208000]Accuracy-Flip: 0.97371+-0.01140 Training: 2021-03-19 09:46:46,082-[cfp_fp][208000]Accuracy-Highest: 0.97457 Training: 2021-03-19 09:47:02,205-[agedb_30][208000]XNorm: 22.571777 Training: 2021-03-19 09:47:02,205-[agedb_30][208000]Accuracy-Flip: 0.97567+-0.00775 Training: 2021-03-19 09:47:02,205-[agedb_30][208000]Accuracy-Highest: 0.97583 Training: 2021-03-19 09:47:12,092-Speed 831.78 samples/sec Loss 2.2210 Epoch: 12 Global Step: 208050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:47:23,004-Speed 4692.39 samples/sec Loss 2.2378 Epoch: 12 Global Step: 208100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:47:33,052-Speed 5096.00 samples/sec Loss 2.2290 Epoch: 12 Global Step: 208150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:47:43,075-Speed 5108.53 samples/sec Loss 2.2151 Epoch: 12 Global Step: 208200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:47:53,205-Speed 5054.39 samples/sec Loss 2.2451 Epoch: 12 Global Step: 208250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:48:03,268-Speed 5088.54 samples/sec Loss 2.2220 Epoch: 12 Global Step: 208300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:48:13,497-Speed 5005.54 samples/sec Loss 2.2421 Epoch: 12 Global Step: 208350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:48:23,322-Speed 5211.77 samples/sec Loss 2.2375 Epoch: 12 Global Step: 208400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:48:33,454-Speed 5053.56 samples/sec Loss 2.2299 Epoch: 12 Global Step: 208450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:48:43,751-Speed 4972.54 
samples/sec Loss 2.2326 Epoch: 12 Global Step: 208500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:48:53,748-Speed 5121.66 samples/sec Loss 2.2385 Epoch: 12 Global Step: 208550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:49:03,861-Speed 5063.08 samples/sec Loss 2.2382 Epoch: 12 Global Step: 208600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:49:13,933-Speed 5084.05 samples/sec Loss 2.2015 Epoch: 12 Global Step: 208650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:49:24,004-Speed 5084.31 samples/sec Loss 2.2430 Epoch: 12 Global Step: 208700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:49:34,014-Speed 5114.94 samples/sec Loss 2.2083 Epoch: 12 Global Step: 208750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:49:44,129-Speed 5062.01 samples/sec Loss 2.2031 Epoch: 12 Global Step: 208800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:49:54,010-Speed 5181.99 samples/sec Loss 2.2471 Epoch: 12 Global Step: 208850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:50:03,932-Speed 5161.02 samples/sec Loss 2.2345 Epoch: 12 Global Step: 208900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:50:14,211-Speed 4981.38 samples/sec Loss 2.2382 Epoch: 12 Global Step: 208950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:50:24,424-Speed 5013.43 samples/sec Loss 2.2193 Epoch: 12 Global Step: 209000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:50:34,497-Speed 5083.38 samples/sec Loss 2.2264 Epoch: 12 Global Step: 209050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:50:44,760-Speed 4989.08 samples/sec Loss 2.2107 Epoch: 12 Global Step: 209100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:50:54,990-Speed 5004.87 samples/sec Loss 2.2061 Epoch: 12 Global Step: 209150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:51:04,899-Speed 
5167.69 samples/sec Loss 2.2378 Epoch: 12 Global Step: 209200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:51:14,977-Speed 5080.64 samples/sec Loss 2.2335 Epoch: 12 Global Step: 209250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:51:25,206-Speed 5005.70 samples/sec Loss 2.2331 Epoch: 12 Global Step: 209300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:51:35,140-Speed 5154.56 samples/sec Loss 2.2142 Epoch: 12 Global Step: 209350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:51:45,190-Speed 5094.56 samples/sec Loss 2.2398 Epoch: 12 Global Step: 209400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:51:55,138-Speed 5147.37 samples/sec Loss 2.2361 Epoch: 12 Global Step: 209450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:52:05,199-Speed 5089.32 samples/sec Loss 2.2186 Epoch: 12 Global Step: 209500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:52:15,361-Speed 5038.83 samples/sec Loss 2.2113 Epoch: 12 Global Step: 209550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:52:25,288-Speed 5157.66 samples/sec Loss 2.2482 Epoch: 12 Global Step: 209600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:52:35,295-Speed 5117.10 samples/sec Loss 2.2404 Epoch: 12 Global Step: 209650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:52:45,298-Speed 5118.69 samples/sec Loss 2.2252 Epoch: 12 Global Step: 209700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:52:55,394-Speed 5071.72 samples/sec Loss 2.1966 Epoch: 12 Global Step: 209750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:53:05,671-Speed 4982.45 samples/sec Loss 2.2395 Epoch: 12 Global Step: 209800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:53:15,650-Speed 5130.87 samples/sec Loss 2.2288 Epoch: 12 Global Step: 209850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 
09:53:25,651-Speed 5119.97 samples/sec Loss 2.2296 Epoch: 12 Global Step: 209900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:53:35,636-Speed 5128.14 samples/sec Loss 2.2214 Epoch: 12 Global Step: 209950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:53:46,429-Speed 4743.88 samples/sec Loss 2.2322 Epoch: 12 Global Step: 210000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:54:02,631-[lfw][210000]XNorm: 23.086464 Training: 2021-03-19 09:54:02,632-[lfw][210000]Accuracy-Flip: 0.99750+-0.00261 Training: 2021-03-19 09:54:02,632-[lfw][210000]Accuracy-Highest: 0.99767 Training: 2021-03-19 09:54:21,309-[cfp_fp][210000]XNorm: 19.252603 Training: 2021-03-19 09:54:21,310-[cfp_fp][210000]Accuracy-Flip: 0.97171+-0.01004 Training: 2021-03-19 09:54:21,310-[cfp_fp][210000]Accuracy-Highest: 0.97457 Training: 2021-03-19 09:54:37,469-[agedb_30][210000]XNorm: 22.255191 Training: 2021-03-19 09:54:37,469-[agedb_30][210000]Accuracy-Flip: 0.97350+-0.00736 Training: 2021-03-19 09:54:37,470-[agedb_30][210000]Accuracy-Highest: 0.97583 Training: 2021-03-19 09:54:47,273-Speed 841.50 samples/sec Loss 2.2263 Epoch: 12 Global Step: 210050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:54:57,303-Speed 5105.07 samples/sec Loss 2.2078 Epoch: 12 Global Step: 210100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:55:07,732-Speed 4909.56 samples/sec Loss 2.2023 Epoch: 12 Global Step: 210150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:55:17,837-Speed 5067.11 samples/sec Loss 2.2441 Epoch: 12 Global Step: 210200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:55:27,939-Speed 5068.83 samples/sec Loss 2.2143 Epoch: 12 Global Step: 210250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:55:37,916-Speed 5131.98 samples/sec Loss 2.2374 Epoch: 12 Global Step: 210300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:55:47,761-Speed 5201.14 samples/sec 
Loss 2.2414 Epoch: 12 Global Step: 210350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:55:58,415-Speed 4806.19 samples/sec Loss 2.2105 Epoch: 12 Global Step: 210400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:56:08,506-Speed 5073.79 samples/sec Loss 2.2070 Epoch: 12 Global Step: 210450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:56:18,471-Speed 5138.24 samples/sec Loss 2.2388 Epoch: 12 Global Step: 210500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:56:28,407-Speed 5153.44 samples/sec Loss 2.2541 Epoch: 12 Global Step: 210550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:56:38,352-Speed 5148.80 samples/sec Loss 2.2396 Epoch: 12 Global Step: 210600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:56:48,574-Speed 5009.15 samples/sec Loss 2.1929 Epoch: 12 Global Step: 210650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:56:58,775-Speed 5019.34 samples/sec Loss 2.2435 Epoch: 12 Global Step: 210700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:57:08,842-Speed 5086.14 samples/sec Loss 2.2392 Epoch: 12 Global Step: 210750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:57:19,597-Speed 4760.89 samples/sec Loss 2.2238 Epoch: 12 Global Step: 210800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:57:29,616-Speed 5110.16 samples/sec Loss 2.2613 Epoch: 12 Global Step: 210850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:57:40,234-Speed 4822.26 samples/sec Loss 2.2436 Epoch: 12 Global Step: 210900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:57:50,200-Speed 5137.81 samples/sec Loss 2.2514 Epoch: 12 Global Step: 210950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:58:00,889-Speed 4790.25 samples/sec Loss 2.2531 Epoch: 12 Global Step: 211000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:58:10,798-Speed 5167.13 
samples/sec Loss 2.2104 Epoch: 12 Global Step: 211050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:58:20,895-Speed 5071.28 samples/sec Loss 2.2446 Epoch: 12 Global Step: 211100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:58:30,754-Speed 5193.36 samples/sec Loss 2.2160 Epoch: 12 Global Step: 211150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:58:41,791-Speed 4639.40 samples/sec Loss 2.1930 Epoch: 12 Global Step: 211200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:58:51,877-Speed 5076.60 samples/sec Loss 2.2216 Epoch: 12 Global Step: 211250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:59:02,779-Speed 4696.67 samples/sec Loss 2.2095 Epoch: 12 Global Step: 211300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:59:12,845-Speed 5086.55 samples/sec Loss 2.2256 Epoch: 12 Global Step: 211350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:59:23,639-Speed 4744.06 samples/sec Loss 2.2400 Epoch: 12 Global Step: 211400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:59:33,782-Speed 5048.09 samples/sec Loss 2.2359 Epoch: 12 Global Step: 211450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:59:43,849-Speed 5086.50 samples/sec Loss 2.2345 Epoch: 12 Global Step: 211500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 09:59:54,155-Speed 4968.02 samples/sec Loss 2.2364 Epoch: 12 Global Step: 211550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:00:04,177-Speed 5109.22 samples/sec Loss 2.1899 Epoch: 12 Global Step: 211600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:00:14,100-Speed 5159.96 samples/sec Loss 2.2211 Epoch: 12 Global Step: 211650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:00:24,501-Speed 4922.71 samples/sec Loss 2.2321 Epoch: 12 Global Step: 211700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:00:34,923-Speed 
4913.09 samples/sec Loss 2.2412 Epoch: 12 Global Step: 211750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:00:45,006-Speed 5078.48 samples/sec Loss 2.2110 Epoch: 12 Global Step: 211800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:00:55,174-Speed 5035.21 samples/sec Loss 2.2364 Epoch: 12 Global Step: 211850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:01:05,276-Speed 5068.77 samples/sec Loss 2.2301 Epoch: 12 Global Step: 211900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:01:15,010-Speed 5260.30 samples/sec Loss 2.2338 Epoch: 12 Global Step: 211950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:01:25,233-Speed 5008.62 samples/sec Loss 2.2515 Epoch: 12 Global Step: 212000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:01:42,006-[lfw][212000]XNorm: 23.330650 Training: 2021-03-19 10:01:42,007-[lfw][212000]Accuracy-Flip: 0.99750+-0.00271 Training: 2021-03-19 10:01:42,007-[lfw][212000]Accuracy-Highest: 0.99767 Training: 2021-03-19 10:02:00,675-[cfp_fp][212000]XNorm: 19.449392 Training: 2021-03-19 10:02:00,675-[cfp_fp][212000]Accuracy-Flip: 0.97371+-0.01075 Training: 2021-03-19 10:02:00,675-[cfp_fp][212000]Accuracy-Highest: 0.97457 Training: 2021-03-19 10:02:16,781-[agedb_30][212000]XNorm: 22.549505 Training: 2021-03-19 10:02:16,781-[agedb_30][212000]Accuracy-Flip: 0.97567+-0.00731 Training: 2021-03-19 10:02:16,782-[agedb_30][212000]Accuracy-Highest: 0.97583 Training: 2021-03-19 10:02:26,562-Speed 834.86 samples/sec Loss 2.2472 Epoch: 12 Global Step: 212050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:02:36,532-Speed 5135.53 samples/sec Loss 2.2362 Epoch: 12 Global Step: 212100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:02:46,500-Speed 5137.01 samples/sec Loss 2.2653 Epoch: 12 Global Step: 212150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:02:56,629-Speed 5055.04 samples/sec Loss 2.2104 Epoch: 12 
Global Step: 212200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:03:06,593-Speed 5138.78 samples/sec Loss 2.2464 Epoch: 12 Global Step: 212250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:03:16,491-Speed 5173.08 samples/sec Loss 2.1938 Epoch: 12 Global Step: 212300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:03:26,380-Speed 5177.97 samples/sec Loss 2.2433 Epoch: 12 Global Step: 212350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:03:36,636-Speed 4992.43 samples/sec Loss 2.2427 Epoch: 12 Global Step: 212400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:03:46,706-Speed 5084.44 samples/sec Loss 2.2252 Epoch: 12 Global Step: 212450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:03:56,776-Speed 5084.84 samples/sec Loss 2.2134 Epoch: 12 Global Step: 212500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:04:07,002-Speed 5007.27 samples/sec Loss 2.2573 Epoch: 12 Global Step: 212550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:04:16,974-Speed 5134.24 samples/sec Loss 2.2627 Epoch: 12 Global Step: 212600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:04:27,127-Speed 5043.32 samples/sec Loss 2.2065 Epoch: 12 Global Step: 212650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:04:37,308-Speed 5029.43 samples/sec Loss 2.2220 Epoch: 12 Global Step: 212700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:04:47,474-Speed 5036.68 samples/sec Loss 2.2712 Epoch: 12 Global Step: 212750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:04:57,608-Speed 5052.44 samples/sec Loss 2.2479 Epoch: 12 Global Step: 212800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:05:07,759-Speed 5043.94 samples/sec Loss 2.2229 Epoch: 12 Global Step: 212850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:05:17,944-Speed 5027.57 samples/sec Loss 2.2136 Epoch: 
12 Global Step: 212900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:05:28,033-Speed 5075.28 samples/sec Loss 2.2426 Epoch: 12 Global Step: 212950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:05:37,927-Speed 5175.10 samples/sec Loss 2.2440 Epoch: 12 Global Step: 213000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:05:47,970-Speed 5098.23 samples/sec Loss 2.2253 Epoch: 12 Global Step: 213050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:05:57,947-Speed 5131.96 samples/sec Loss 2.2413 Epoch: 12 Global Step: 213100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:06:08,101-Speed 5042.81 samples/sec Loss 2.2290 Epoch: 12 Global Step: 213150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:06:18,194-Speed 5072.90 samples/sec Loss 2.2360 Epoch: 12 Global Step: 213200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:06:28,314-Speed 5060.01 samples/sec Loss 2.2219 Epoch: 12 Global Step: 213250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:06:39,231-Speed 4690.01 samples/sec Loss 2.1842 Epoch: 12 Global Step: 213300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:06:49,157-Speed 5158.83 samples/sec Loss 2.2191 Epoch: 12 Global Step: 213350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:06:59,102-Speed 5148.42 samples/sec Loss 2.1934 Epoch: 12 Global Step: 213400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:07:09,101-Speed 5120.59 samples/sec Loss 2.2220 Epoch: 12 Global Step: 213450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:07:19,273-Speed 5033.87 samples/sec Loss 2.2425 Epoch: 12 Global Step: 213500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:07:29,344-Speed 5084.26 samples/sec Loss 2.2034 Epoch: 12 Global Step: 213550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:07:39,305-Speed 5140.01 samples/sec Loss 2.2461 
Epoch: 12 Global Step: 213600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:07:49,318-Speed 5113.72 samples/sec Loss 2.2112 Epoch: 12 Global Step: 213650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:07:59,141-Speed 5212.86 samples/sec Loss 2.2139 Epoch: 12 Global Step: 213700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:08:09,978-Speed 4724.54 samples/sec Loss 2.2032 Epoch: 12 Global Step: 213750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:08:20,154-Speed 5032.17 samples/sec Loss 2.2141 Epoch: 12 Global Step: 213800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:08:30,055-Speed 5171.12 samples/sec Loss 2.2313 Epoch: 12 Global Step: 213850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:08:40,026-Speed 5135.65 samples/sec Loss 2.2339 Epoch: 12 Global Step: 213900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:08:50,151-Speed 5056.90 samples/sec Loss 2.2112 Epoch: 12 Global Step: 213950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:09:00,125-Speed 5133.87 samples/sec Loss 2.2408 Epoch: 12 Global Step: 214000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:09:16,990-[lfw][214000]XNorm: 23.300106 Training: 2021-03-19 10:09:16,990-[lfw][214000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 10:09:16,991-[lfw][214000]Accuracy-Highest: 0.99767 Training: 2021-03-19 10:09:35,698-[cfp_fp][214000]XNorm: 19.530208 Training: 2021-03-19 10:09:35,698-[cfp_fp][214000]Accuracy-Flip: 0.97371+-0.00883 Training: 2021-03-19 10:09:35,698-[cfp_fp][214000]Accuracy-Highest: 0.97457 Training: 2021-03-19 10:09:51,902-[agedb_30][214000]XNorm: 22.559852 Training: 2021-03-19 10:09:51,902-[agedb_30][214000]Accuracy-Flip: 0.97317+-0.00855 Training: 2021-03-19 10:09:51,902-[agedb_30][214000]Accuracy-Highest: 0.97583 Training: 2021-03-19 10:10:01,759-Speed 830.72 samples/sec Loss 2.2499 Epoch: 12 Global Step: 214050 Fp16 Grad 
Scale: 16384 Required: 8 hours Training: 2021-03-19 10:10:11,936-Speed 5031.50 samples/sec Loss 2.2086 Epoch: 12 Global Step: 214100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:10:22,793-Speed 4715.79 samples/sec Loss 2.2217 Epoch: 12 Global Step: 214150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:10:32,936-Speed 5048.51 samples/sec Loss 2.2303 Epoch: 12 Global Step: 214200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:10:42,991-Speed 5092.15 samples/sec Loss 2.2182 Epoch: 12 Global Step: 214250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:10:53,759-Speed 4754.96 samples/sec Loss 2.2184 Epoch: 12 Global Step: 214300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:11:03,858-Speed 5070.67 samples/sec Loss 2.2321 Epoch: 12 Global Step: 214350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:11:14,687-Speed 4728.23 samples/sec Loss 2.2090 Epoch: 12 Global Step: 214400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:11:24,704-Speed 5111.70 samples/sec Loss 2.2357 Epoch: 12 Global Step: 214450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:11:34,867-Speed 5037.83 samples/sec Loss 2.2419 Epoch: 12 Global Step: 214500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:11:45,951-Speed 4619.66 samples/sec Loss 2.2056 Epoch: 12 Global Step: 214550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:11:56,050-Speed 5070.30 samples/sec Loss 2.2310 Epoch: 12 Global Step: 214600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:12:07,052-Speed 4654.08 samples/sec Loss 2.2381 Epoch: 12 Global Step: 214650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:12:16,909-Speed 5194.52 samples/sec Loss 2.2075 Epoch: 12 Global Step: 214700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:12:26,743-Speed 5206.46 samples/sec Loss 2.2360 Epoch: 12 Global Step: 214750 Fp16 
Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:12:37,418-Speed 4796.39 samples/sec Loss 2.2452 Epoch: 12 Global Step: 214800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:12:47,366-Speed 5146.98 samples/sec Loss 2.2267 Epoch: 12 Global Step: 214850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:12:57,413-Speed 5096.52 samples/sec Loss 2.2326 Epoch: 12 Global Step: 214900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:13:07,484-Speed 5084.38 samples/sec Loss 2.2263 Epoch: 12 Global Step: 214950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:13:17,466-Speed 5129.64 samples/sec Loss 2.1925 Epoch: 12 Global Step: 215000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:13:27,506-Speed 5099.76 samples/sec Loss 2.2182 Epoch: 12 Global Step: 215050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:13:37,534-Speed 5106.25 samples/sec Loss 2.2487 Epoch: 12 Global Step: 215100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:13:47,672-Speed 5050.54 samples/sec Loss 2.2223 Epoch: 12 Global Step: 215150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:13:57,596-Speed 5159.84 samples/sec Loss 2.1904 Epoch: 12 Global Step: 215200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:14:07,694-Speed 5070.07 samples/sec Loss 2.2364 Epoch: 12 Global Step: 215250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:14:17,685-Speed 5124.97 samples/sec Loss 2.2209 Epoch: 12 Global Step: 215300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:14:27,667-Speed 5129.58 samples/sec Loss 2.2009 Epoch: 12 Global Step: 215350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:14:37,927-Speed 4990.37 samples/sec Loss 2.2248 Epoch: 12 Global Step: 215400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:14:48,174-Speed 4997.14 samples/sec Loss 2.2291 Epoch: 12 Global Step: 215450 
Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:14:58,254-Speed 5079.66 samples/sec Loss 2.2105 Epoch: 12 Global Step: 215500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:15:08,304-Speed 5094.61 samples/sec Loss 2.2453 Epoch: 12 Global Step: 215550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:15:18,224-Speed 5161.44 samples/sec Loss 2.2461 Epoch: 12 Global Step: 215600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:15:28,248-Speed 5108.26 samples/sec Loss 2.2007 Epoch: 12 Global Step: 215650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:15:37,971-Speed 5266.36 samples/sec Loss 2.2396 Epoch: 12 Global Step: 215700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:15:47,831-Speed 5192.72 samples/sec Loss 2.2141 Epoch: 12 Global Step: 215750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:15:57,890-Speed 5090.24 samples/sec Loss 2.2157 Epoch: 12 Global Step: 215800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:16:08,033-Speed 5048.16 samples/sec Loss 2.2614 Epoch: 12 Global Step: 215850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:16:18,104-Speed 5084.28 samples/sec Loss 2.2213 Epoch: 12 Global Step: 215900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:16:27,914-Speed 5219.39 samples/sec Loss 2.2200 Epoch: 12 Global Step: 215950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:16:38,097-Speed 5028.22 samples/sec Loss 2.2363 Epoch: 12 Global Step: 216000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:16:54,862-[lfw][216000]XNorm: 23.496411 Training: 2021-03-19 10:16:54,863-[lfw][216000]Accuracy-Flip: 0.99667+-0.00258 Training: 2021-03-19 10:16:54,863-[lfw][216000]Accuracy-Highest: 0.99767 Training: 2021-03-19 10:17:13,491-[cfp_fp][216000]XNorm: 19.691014 Training: 2021-03-19 10:17:13,491-[cfp_fp][216000]Accuracy-Flip: 0.97200+-0.01038 Training: 2021-03-19 
10:17:13,491-[cfp_fp][216000]Accuracy-Highest: 0.97457 Training: 2021-03-19 10:17:29,608-[agedb_30][216000]XNorm: 22.840613 Training: 2021-03-19 10:17:29,608-[agedb_30][216000]Accuracy-Flip: 0.97550+-0.00799 Training: 2021-03-19 10:17:29,608-[agedb_30][216000]Accuracy-Highest: 0.97583 Training: 2021-03-19 10:17:39,460-Speed 834.39 samples/sec Loss 2.2222 Epoch: 12 Global Step: 216050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:17:49,504-Speed 5097.65 samples/sec Loss 2.2419 Epoch: 12 Global Step: 216100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-19 10:17:59,581-Speed 5081.06 samples/sec Loss 2.2144 Epoch: 12 Global Step: 216150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:18:09,787-Speed 5017.05 samples/sec Loss 2.2322 Epoch: 12 Global Step: 216200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:18:19,807-Speed 5110.18 samples/sec Loss 2.2628 Epoch: 12 Global Step: 216250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:18:29,850-Speed 5098.13 samples/sec Loss 2.2563 Epoch: 12 Global Step: 216300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:18:40,172-Speed 4960.98 samples/sec Loss 2.2121 Epoch: 12 Global Step: 216350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:18:50,197-Speed 5107.55 samples/sec Loss 2.2071 Epoch: 12 Global Step: 216400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:18:59,940-Speed 5255.53 samples/sec Loss 2.2212 Epoch: 12 Global Step: 216450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:19:10,054-Speed 5062.26 samples/sec Loss 2.2302 Epoch: 12 Global Step: 216500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:19:20,201-Speed 5046.30 samples/sec Loss 2.2305 Epoch: 12 Global Step: 216550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:19:30,240-Speed 5100.38 samples/sec Loss 2.1946 Epoch: 12 Global Step: 216600 Fp16 Grad Scale: 16384 Required: 7 
hours Training: 2021-03-19 10:19:41,019-Speed 4750.16 samples/sec Loss 2.2517 Epoch: 12 Global Step: 216650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:19:51,024-Speed 5117.78 samples/sec Loss 2.2498 Epoch: 12 Global Step: 216700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:20:00,826-Speed 5224.02 samples/sec Loss 2.2042 Epoch: 12 Global Step: 216750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:20:10,748-Speed 5160.47 samples/sec Loss 2.1916 Epoch: 12 Global Step: 216800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:20:20,858-Speed 5064.55 samples/sec Loss 2.2614 Epoch: 12 Global Step: 216850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:20:31,049-Speed 5024.47 samples/sec Loss 2.1785 Epoch: 12 Global Step: 216900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:20:41,066-Speed 5111.53 samples/sec Loss 2.2287 Epoch: 12 Global Step: 216950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:21:03,763-Speed 2255.85 samples/sec Loss 2.2034 Epoch: 13 Global Step: 217000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:21:14,503-Speed 4768.06 samples/sec Loss 2.1096 Epoch: 13 Global Step: 217050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:21:24,852-Speed 4947.36 samples/sec Loss 2.1457 Epoch: 13 Global Step: 217100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:21:36,270-Speed 4484.59 samples/sec Loss 2.1211 Epoch: 13 Global Step: 217150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:21:47,073-Speed 4739.77 samples/sec Loss 2.0818 Epoch: 13 Global Step: 217200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:21:57,666-Speed 4833.53 samples/sec Loss 2.1172 Epoch: 13 Global Step: 217250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:22:08,402-Speed 4769.59 samples/sec Loss 2.0972 Epoch: 13 Global Step: 217300 Fp16 Grad Scale: 16384 Required: 
7 hours Training: 2021-03-19 10:22:19,094-Speed 4788.84 samples/sec Loss 2.1109 Epoch: 13 Global Step: 217350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:22:29,831-Speed 4768.76 samples/sec Loss 2.1227 Epoch: 13 Global Step: 217400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:22:40,427-Speed 4832.30 samples/sec Loss 2.0953 Epoch: 13 Global Step: 217450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:22:51,889-Speed 4467.54 samples/sec Loss 2.0840 Epoch: 13 Global Step: 217500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:23:02,164-Speed 4983.11 samples/sec Loss 2.1067 Epoch: 13 Global Step: 217550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:23:13,537-Speed 4502.23 samples/sec Loss 2.0948 Epoch: 13 Global Step: 217600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:23:23,864-Speed 4958.23 samples/sec Loss 2.1075 Epoch: 13 Global Step: 217650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:23:34,135-Speed 4985.27 samples/sec Loss 2.1511 Epoch: 13 Global Step: 217700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:23:44,856-Speed 4775.94 samples/sec Loss 2.1217 Epoch: 13 Global Step: 217750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:23:55,269-Speed 4917.54 samples/sec Loss 2.1145 Epoch: 13 Global Step: 217800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:24:06,310-Speed 4637.21 samples/sec Loss 2.1392 Epoch: 13 Global Step: 217850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:24:16,608-Speed 4972.30 samples/sec Loss 2.0989 Epoch: 13 Global Step: 217900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:24:27,112-Speed 4874.60 samples/sec Loss 2.1227 Epoch: 13 Global Step: 217950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:24:37,259-Speed 5046.51 samples/sec Loss 2.1032 Epoch: 13 Global Step: 218000 Fp16 Grad Scale: 16384 
Required: 7 hours Training: 2021-03-19 10:24:53,718-[lfw][218000]XNorm: 23.034162 Training: 2021-03-19 10:24:53,718-[lfw][218000]Accuracy-Flip: 0.99650+-0.00302 Training: 2021-03-19 10:24:53,718-[lfw][218000]Accuracy-Highest: 0.99767 Training: 2021-03-19 10:25:12,277-[cfp_fp][218000]XNorm: 19.338641 Training: 2021-03-19 10:25:12,278-[cfp_fp][218000]Accuracy-Flip: 0.97329+-0.00992 Training: 2021-03-19 10:25:12,278-[cfp_fp][218000]Accuracy-Highest: 0.97457 Training: 2021-03-19 10:25:28,412-[agedb_30][218000]XNorm: 22.295472 Training: 2021-03-19 10:25:28,413-[agedb_30][218000]Accuracy-Flip: 0.97383+-0.00749 Training: 2021-03-19 10:25:28,413-[agedb_30][218000]Accuracy-Highest: 0.97583 Training: 2021-03-19 10:25:38,363-Speed 837.92 samples/sec Loss 2.1190 Epoch: 13 Global Step: 218050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:25:49,336-Speed 4666.06 samples/sec Loss 2.1149 Epoch: 13 Global Step: 218100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:26:00,314-Speed 4664.23 samples/sec Loss 2.1252 Epoch: 13 Global Step: 218150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:26:11,437-Speed 4603.64 samples/sec Loss 2.1355 Epoch: 13 Global Step: 218200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:26:21,966-Speed 4863.01 samples/sec Loss 2.1159 Epoch: 13 Global Step: 218250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:26:32,420-Speed 4897.57 samples/sec Loss 2.1512 Epoch: 13 Global Step: 218300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:26:42,774-Speed 4945.48 samples/sec Loss 2.1312 Epoch: 13 Global Step: 218350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:26:53,004-Speed 5005.33 samples/sec Loss 2.1146 Epoch: 13 Global Step: 218400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:27:03,643-Speed 4812.57 samples/sec Loss 2.1195 Epoch: 13 Global Step: 218450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 
10:27:14,400-Speed 4759.95 samples/sec Loss 2.1203 Epoch: 13 Global Step: 218500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:27:24,700-Speed 4971.60 samples/sec Loss 2.1274 Epoch: 13 Global Step: 218550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:27:35,206-Speed 4873.87 samples/sec Loss 2.1127 Epoch: 13 Global Step: 218600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:27:45,857-Speed 4807.13 samples/sec Loss 2.1248 Epoch: 13 Global Step: 218650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:27:56,284-Speed 4910.72 samples/sec Loss 2.1360 Epoch: 13 Global Step: 218700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:28:06,487-Speed 5018.39 samples/sec Loss 2.1329 Epoch: 13 Global Step: 218750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:28:16,725-Speed 5001.42 samples/sec Loss 2.1157 Epoch: 13 Global Step: 218800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:28:27,104-Speed 4933.28 samples/sec Loss 2.1165 Epoch: 13 Global Step: 218850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:28:37,944-Speed 4723.53 samples/sec Loss 2.1085 Epoch: 13 Global Step: 218900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:28:48,182-Speed 5001.47 samples/sec Loss 2.1205 Epoch: 13 Global Step: 218950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:28:58,535-Speed 4945.58 samples/sec Loss 2.1265 Epoch: 13 Global Step: 219000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:29:08,945-Speed 4918.92 samples/sec Loss 2.1207 Epoch: 13 Global Step: 219050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:29:19,329-Speed 4930.83 samples/sec Loss 2.1224 Epoch: 13 Global Step: 219100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:29:29,984-Speed 4805.47 samples/sec Loss 2.1468 Epoch: 13 Global Step: 219150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 
2021-03-19 10:29:40,576-Speed 4834.24 samples/sec Loss 2.1183 Epoch: 13 Global Step: 219200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:29:50,730-Speed 5042.42 samples/sec Loss 2.1170 Epoch: 13 Global Step: 219250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:30:01,332-Speed 4829.96 samples/sec Loss 2.1115 Epoch: 13 Global Step: 219300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:30:11,418-Speed 5076.32 samples/sec Loss 2.1547 Epoch: 13 Global Step: 219350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:30:21,530-Speed 5063.96 samples/sec Loss 2.1247 Epoch: 13 Global Step: 219400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:30:31,907-Speed 4934.18 samples/sec Loss 2.1125 Epoch: 13 Global Step: 219450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:30:42,280-Speed 4936.43 samples/sec Loss 2.1364 Epoch: 13 Global Step: 219500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:30:52,781-Speed 4875.73 samples/sec Loss 2.1346 Epoch: 13 Global Step: 219550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:31:03,458-Speed 4795.84 samples/sec Loss 2.1357 Epoch: 13 Global Step: 219600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:31:13,610-Speed 5043.42 samples/sec Loss 2.1439 Epoch: 13 Global Step: 219650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:31:24,026-Speed 4915.79 samples/sec Loss 2.1535 Epoch: 13 Global Step: 219700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:31:34,037-Speed 5114.79 samples/sec Loss 2.1269 Epoch: 13 Global Step: 219750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:31:44,396-Speed 4942.63 samples/sec Loss 2.1188 Epoch: 13 Global Step: 219800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:31:54,543-Speed 5046.37 samples/sec Loss 2.0971 Epoch: 13 Global Step: 219850 Fp16 Grad Scale: 16384 Required: 7 hours 
Training: 2021-03-19 10:32:04,818-Speed 4983.40 samples/sec Loss 2.1337 Epoch: 13 Global Step: 219900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:32:15,077-Speed 4990.76 samples/sec Loss 2.1444 Epoch: 13 Global Step: 219950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:32:25,282-Speed 5017.33 samples/sec Loss 2.0894 Epoch: 13 Global Step: 220000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:32:42,028-[lfw][220000]XNorm: 23.454123 Training: 2021-03-19 10:32:42,028-[lfw][220000]Accuracy-Flip: 0.99633+-0.00287 Training: 2021-03-19 10:32:42,028-[lfw][220000]Accuracy-Highest: 0.99767 Training: 2021-03-19 10:33:00,676-[cfp_fp][220000]XNorm: 19.652305 Training: 2021-03-19 10:33:00,677-[cfp_fp][220000]Accuracy-Flip: 0.97486+-0.01004 Training: 2021-03-19 10:33:00,677-[cfp_fp][220000]Accuracy-Highest: 0.97486 Training: 2021-03-19 10:33:16,916-[agedb_30][220000]XNorm: 22.729728 Training: 2021-03-19 10:33:16,916-[agedb_30][220000]Accuracy-Flip: 0.97633+-0.00843 Training: 2021-03-19 10:33:16,916-[agedb_30][220000]Accuracy-Highest: 0.97633 Training: 2021-03-19 10:33:28,723-Speed 807.06 samples/sec Loss 2.1449 Epoch: 13 Global Step: 220050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:33:38,928-Speed 5017.34 samples/sec Loss 2.1300 Epoch: 13 Global Step: 220100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:33:49,409-Speed 4885.62 samples/sec Loss 2.1262 Epoch: 13 Global Step: 220150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:33:59,433-Speed 5107.93 samples/sec Loss 2.1377 Epoch: 13 Global Step: 220200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:34:09,922-Speed 4881.68 samples/sec Loss 2.1336 Epoch: 13 Global Step: 220250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:34:20,321-Speed 4923.66 samples/sec Loss 2.1324 Epoch: 13 Global Step: 220300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:34:30,840-Speed 
4867.45 samples/sec Loss 2.1346 Epoch: 13 Global Step: 220350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:34:41,296-Speed 4897.60 samples/sec Loss 2.1464 Epoch: 13 Global Step: 220400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:34:51,492-Speed 5021.71 samples/sec Loss 2.1390 Epoch: 13 Global Step: 220450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:35:02,278-Speed 4746.99 samples/sec Loss 2.1197 Epoch: 13 Global Step: 220500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:35:12,388-Speed 5064.68 samples/sec Loss 2.1479 Epoch: 13 Global Step: 220550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:35:22,716-Speed 4957.62 samples/sec Loss 2.1292 Epoch: 13 Global Step: 220600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:35:32,820-Speed 5067.98 samples/sec Loss 2.1121 Epoch: 13 Global Step: 220650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:35:42,766-Speed 5147.78 samples/sec Loss 2.1554 Epoch: 13 Global Step: 220700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:35:53,434-Speed 4799.49 samples/sec Loss 2.1237 Epoch: 13 Global Step: 220750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:36:03,512-Speed 5080.67 samples/sec Loss 2.1353 Epoch: 13 Global Step: 220800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:36:14,499-Speed 4660.39 samples/sec Loss 2.1410 Epoch: 13 Global Step: 220850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:36:24,642-Speed 5048.49 samples/sec Loss 2.1092 Epoch: 13 Global Step: 220900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:36:34,804-Speed 5038.68 samples/sec Loss 2.1408 Epoch: 13 Global Step: 220950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:36:46,026-Speed 4562.70 samples/sec Loss 2.1390 Epoch: 13 Global Step: 221000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 
10:36:56,192-Speed 5036.83 samples/sec Loss 2.1413 Epoch: 13 Global Step: 221050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:37:06,464-Speed 4984.53 samples/sec Loss 2.1439 Epoch: 13 Global Step: 221100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:37:17,311-Speed 4720.36 samples/sec Loss 2.1472 Epoch: 13 Global Step: 221150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:37:28,569-Speed 4547.93 samples/sec Loss 2.1402 Epoch: 13 Global Step: 221200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:37:38,826-Speed 4992.37 samples/sec Loss 2.1175 Epoch: 13 Global Step: 221250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:37:49,310-Speed 4883.56 samples/sec Loss 2.1767 Epoch: 13 Global Step: 221300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:37:59,495-Speed 5027.69 samples/sec Loss 2.1184 Epoch: 13 Global Step: 221350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:38:10,533-Speed 4638.78 samples/sec Loss 2.1432 Epoch: 13 Global Step: 221400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:38:20,823-Speed 4976.23 samples/sec Loss 2.1262 Epoch: 13 Global Step: 221450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:38:31,189-Speed 4939.24 samples/sec Loss 2.1422 Epoch: 13 Global Step: 221500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:38:41,296-Speed 5066.22 samples/sec Loss 2.1189 Epoch: 13 Global Step: 221550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:38:52,053-Speed 4759.96 samples/sec Loss 2.1253 Epoch: 13 Global Step: 221600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:39:02,204-Speed 5044.11 samples/sec Loss 2.1469 Epoch: 13 Global Step: 221650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:39:12,440-Speed 5002.30 samples/sec Loss 2.1424 Epoch: 13 Global Step: 221700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 
2021-03-19 10:39:22,794-Speed 4945.59 samples/sec Loss 2.1263 Epoch: 13 Global Step: 221750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:39:33,376-Speed 4838.30 samples/sec Loss 2.1023 Epoch: 13 Global Step: 221800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:39:43,611-Speed 5002.84 samples/sec Loss 2.1186 Epoch: 13 Global Step: 221850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:39:54,241-Speed 4817.28 samples/sec Loss 2.1589 Epoch: 13 Global Step: 221900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:40:04,651-Speed 4918.54 samples/sec Loss 2.1741 Epoch: 13 Global Step: 221950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:40:15,004-Speed 4945.46 samples/sec Loss 2.1575 Epoch: 13 Global Step: 222000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:40:31,706-[lfw][222000]XNorm: 23.242118 Training: 2021-03-19 10:40:31,706-[lfw][222000]Accuracy-Flip: 0.99717+-0.00248 Training: 2021-03-19 10:40:31,706-[lfw][222000]Accuracy-Highest: 0.99767 Training: 2021-03-19 10:40:50,377-[cfp_fp][222000]XNorm: 19.441014 Training: 2021-03-19 10:40:50,377-[cfp_fp][222000]Accuracy-Flip: 0.97386+-0.00933 Training: 2021-03-19 10:40:50,377-[cfp_fp][222000]Accuracy-Highest: 0.97486 Training: 2021-03-19 10:41:06,700-[agedb_30][222000]XNorm: 22.527400 Training: 2021-03-19 10:41:06,701-[agedb_30][222000]Accuracy-Flip: 0.97617+-0.00628 Training: 2021-03-19 10:41:06,701-[agedb_30][222000]Accuracy-Highest: 0.97633 Training: 2021-03-19 10:41:16,639-Speed 830.71 samples/sec Loss 2.1210 Epoch: 13 Global Step: 222050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:41:27,191-Speed 4852.63 samples/sec Loss 2.1328 Epoch: 13 Global Step: 222100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:41:37,268-Speed 5080.88 samples/sec Loss 2.1541 Epoch: 13 Global Step: 222150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:41:47,405-Speed 5051.41 
samples/sec Loss 2.1406 Epoch: 13 Global Step: 222200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:41:57,569-Speed 5037.70 samples/sec Loss 2.1224 Epoch: 13 Global Step: 222250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:42:07,486-Speed 5163.19 samples/sec Loss 2.1273 Epoch: 13 Global Step: 222300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:42:17,936-Speed 4899.86 samples/sec Loss 2.1215 Epoch: 13 Global Step: 222350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:42:28,220-Speed 4978.74 samples/sec Loss 2.1348 Epoch: 13 Global Step: 222400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:42:38,553-Speed 4955.53 samples/sec Loss 2.1473 Epoch: 13 Global Step: 222450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:42:48,789-Speed 5002.30 samples/sec Loss 2.1783 Epoch: 13 Global Step: 222500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:42:58,938-Speed 5045.18 samples/sec Loss 2.1292 Epoch: 13 Global Step: 222550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:43:09,419-Speed 4885.42 samples/sec Loss 2.1642 Epoch: 13 Global Step: 222600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:43:19,898-Speed 4886.25 samples/sec Loss 2.1464 Epoch: 13 Global Step: 222650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:43:30,095-Speed 5021.41 samples/sec Loss 2.1582 Epoch: 13 Global Step: 222700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:43:40,213-Speed 5060.32 samples/sec Loss 2.1702 Epoch: 13 Global Step: 222750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:43:50,515-Speed 4970.29 samples/sec Loss 2.1305 Epoch: 13 Global Step: 222800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:44:00,531-Speed 5112.09 samples/sec Loss 2.1174 Epoch: 13 Global Step: 222850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:44:10,892-Speed 
4942.42 samples/sec Loss 2.1336 Epoch: 13 Global Step: 222900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:44:21,383-Speed 4880.61 samples/sec Loss 2.1443 Epoch: 13 Global Step: 222950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:44:31,623-Speed 5000.26 samples/sec Loss 2.1407 Epoch: 13 Global Step: 223000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:44:41,783-Speed 5039.45 samples/sec Loss 2.1496 Epoch: 13 Global Step: 223050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:44:51,797-Speed 5113.43 samples/sec Loss 2.1880 Epoch: 13 Global Step: 223100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:45:02,177-Speed 4932.77 samples/sec Loss 2.1328 Epoch: 13 Global Step: 223150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:45:12,479-Speed 4970.27 samples/sec Loss 2.1517 Epoch: 13 Global Step: 223200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:45:22,905-Speed 4911.25 samples/sec Loss 2.1212 Epoch: 13 Global Step: 223250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:45:33,320-Speed 4916.49 samples/sec Loss 2.1416 Epoch: 13 Global Step: 223300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:45:43,706-Speed 4929.91 samples/sec Loss 2.1619 Epoch: 13 Global Step: 223350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:45:53,940-Speed 5002.94 samples/sec Loss 2.1614 Epoch: 13 Global Step: 223400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:46:05,445-Speed 4450.77 samples/sec Loss 2.1182 Epoch: 13 Global Step: 223450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:46:15,809-Speed 4940.16 samples/sec Loss 2.1305 Epoch: 13 Global Step: 223500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:46:25,873-Speed 5088.14 samples/sec Loss 2.1195 Epoch: 13 Global Step: 223550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 
10:46:36,051-Speed 5030.48 samples/sec Loss 2.1262 Epoch: 13 Global Step: 223600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:46:46,326-Speed 4983.78 samples/sec Loss 2.1398 Epoch: 13 Global Step: 223650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:46:56,495-Speed 5035.08 samples/sec Loss 2.1430 Epoch: 13 Global Step: 223700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:47:06,760-Speed 4988.02 samples/sec Loss 2.1627 Epoch: 13 Global Step: 223750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:47:16,846-Speed 5077.17 samples/sec Loss 2.1507 Epoch: 13 Global Step: 223800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:47:27,270-Speed 4912.07 samples/sec Loss 2.1540 Epoch: 13 Global Step: 223850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:47:38,269-Speed 4655.14 samples/sec Loss 2.1427 Epoch: 13 Global Step: 223900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:47:48,076-Speed 5221.43 samples/sec Loss 2.1592 Epoch: 13 Global Step: 223950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:47:58,039-Speed 5139.45 samples/sec Loss 2.1262 Epoch: 13 Global Step: 224000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:48:14,696-[lfw][224000]XNorm: 23.340932 Training: 2021-03-19 10:48:14,696-[lfw][224000]Accuracy-Flip: 0.99667+-0.00258 Training: 2021-03-19 10:48:14,696-[lfw][224000]Accuracy-Highest: 0.99767 Training: 2021-03-19 10:48:33,432-[cfp_fp][224000]XNorm: 19.568592 Training: 2021-03-19 10:48:33,432-[cfp_fp][224000]Accuracy-Flip: 0.97271+-0.00900 Training: 2021-03-19 10:48:33,433-[cfp_fp][224000]Accuracy-Highest: 0.97486 Training: 2021-03-19 10:48:49,470-[agedb_30][224000]XNorm: 22.597881 Training: 2021-03-19 10:48:49,471-[agedb_30][224000]Accuracy-Flip: 0.97483+-0.00825 Training: 2021-03-19 10:48:49,471-[agedb_30][224000]Accuracy-Highest: 0.97633 Training: 2021-03-19 10:48:59,382-Speed 834.66 samples/sec 
Loss 2.1442 Epoch: 13 Global Step: 224050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:49:09,506-Speed 5057.39 samples/sec Loss 2.1528 Epoch: 13 Global Step: 224100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:49:20,177-Speed 4798.35 samples/sec Loss 2.1493 Epoch: 13 Global Step: 224150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:49:31,137-Speed 4672.02 samples/sec Loss 2.1669 Epoch: 13 Global Step: 224200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:49:41,191-Speed 5092.82 samples/sec Loss 2.1532 Epoch: 13 Global Step: 224250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:49:51,214-Speed 5108.34 samples/sec Loss 2.1343 Epoch: 13 Global Step: 224300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:50:01,983-Speed 4754.96 samples/sec Loss 2.1386 Epoch: 13 Global Step: 224350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:50:12,437-Speed 4897.72 samples/sec Loss 2.1397 Epoch: 13 Global Step: 224400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:50:22,966-Speed 4863.20 samples/sec Loss 2.1430 Epoch: 13 Global Step: 224450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:50:33,101-Speed 5051.78 samples/sec Loss 2.1174 Epoch: 13 Global Step: 224500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:50:44,162-Speed 4629.47 samples/sec Loss 2.1653 Epoch: 13 Global Step: 224550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:50:55,028-Speed 4712.12 samples/sec Loss 2.1224 Epoch: 13 Global Step: 224600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:51:05,235-Speed 5016.60 samples/sec Loss 2.1365 Epoch: 13 Global Step: 224650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:51:15,305-Speed 5084.61 samples/sec Loss 2.1426 Epoch: 13 Global Step: 224700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:51:25,620-Speed 4963.89 
samples/sec Loss 2.1259 Epoch: 13 Global Step: 224750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:51:36,823-Speed 4570.64 samples/sec Loss 2.1413 Epoch: 13 Global Step: 224800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:51:46,990-Speed 5036.30 samples/sec Loss 2.1608 Epoch: 13 Global Step: 224850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:51:57,417-Speed 4910.49 samples/sec Loss 2.1548 Epoch: 13 Global Step: 224900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:52:08,369-Speed 4675.25 samples/sec Loss 2.1538 Epoch: 13 Global Step: 224950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:52:18,872-Speed 4875.09 samples/sec Loss 2.1381 Epoch: 13 Global Step: 225000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:52:29,079-Speed 5016.48 samples/sec Loss 2.1241 Epoch: 13 Global Step: 225050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:52:39,559-Speed 4885.71 samples/sec Loss 2.1716 Epoch: 13 Global Step: 225100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:52:49,988-Speed 4909.84 samples/sec Loss 2.1565 Epoch: 13 Global Step: 225150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:53:00,079-Speed 5074.47 samples/sec Loss 2.1570 Epoch: 13 Global Step: 225200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:53:10,227-Speed 5045.42 samples/sec Loss 2.1172 Epoch: 13 Global Step: 225250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:53:20,476-Speed 4996.29 samples/sec Loss 2.1498 Epoch: 13 Global Step: 225300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:53:30,925-Speed 4899.71 samples/sec Loss 2.1465 Epoch: 13 Global Step: 225350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:53:41,356-Speed 4908.75 samples/sec Loss 2.1308 Epoch: 13 Global Step: 225400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:53:51,625-Speed 
4986.16 samples/sec Loss 2.1591 Epoch: 13 Global Step: 225450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:54:01,789-Speed 5037.80 samples/sec Loss 2.1616 Epoch: 13 Global Step: 225500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:54:12,177-Speed 4929.09 samples/sec Loss 2.1502 Epoch: 13 Global Step: 225550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:54:22,186-Speed 5115.98 samples/sec Loss 2.1392 Epoch: 13 Global Step: 225600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:54:32,285-Speed 5070.27 samples/sec Loss 2.1405 Epoch: 13 Global Step: 225650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:54:42,263-Speed 5131.58 samples/sec Loss 2.1392 Epoch: 13 Global Step: 225700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:54:52,398-Speed 5051.96 samples/sec Loss 2.1473 Epoch: 13 Global Step: 225750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:55:02,523-Speed 5057.26 samples/sec Loss 2.1558 Epoch: 13 Global Step: 225800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:55:12,749-Speed 5006.95 samples/sec Loss 2.1687 Epoch: 13 Global Step: 225850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:55:22,736-Speed 5126.95 samples/sec Loss 2.1495 Epoch: 13 Global Step: 225900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:55:32,989-Speed 4994.07 samples/sec Loss 2.1350 Epoch: 13 Global Step: 225950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:55:43,196-Speed 5016.51 samples/sec Loss 2.1466 Epoch: 13 Global Step: 226000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:56:00,277-[lfw][226000]XNorm: 23.770960 Training: 2021-03-19 10:56:00,277-[lfw][226000]Accuracy-Flip: 0.99650+-0.00283 Training: 2021-03-19 10:56:00,277-[lfw][226000]Accuracy-Highest: 0.99767 Training: 2021-03-19 10:56:19,375-[cfp_fp][226000]XNorm: 19.825632 Training: 2021-03-19 
10:56:19,375-[cfp_fp][226000]Accuracy-Flip: 0.97400+-0.00945 Training: 2021-03-19 10:56:19,375-[cfp_fp][226000]Accuracy-Highest: 0.97486 Training: 2021-03-19 10:56:35,652-[agedb_30][226000]XNorm: 22.966916 Training: 2021-03-19 10:56:35,652-[agedb_30][226000]Accuracy-Flip: 0.97550+-0.00789 Training: 2021-03-19 10:56:35,654-[agedb_30][226000]Accuracy-Highest: 0.97633 Training: 2021-03-19 10:56:45,829-Speed 817.47 samples/sec Loss 2.1459 Epoch: 13 Global Step: 226050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:56:56,148-Speed 4962.10 samples/sec Loss 2.1487 Epoch: 13 Global Step: 226100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:57:06,446-Speed 4972.13 samples/sec Loss 2.1334 Epoch: 13 Global Step: 226150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:57:17,036-Speed 4835.22 samples/sec Loss 2.1441 Epoch: 13 Global Step: 226200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:57:27,596-Speed 4848.59 samples/sec Loss 2.1531 Epoch: 13 Global Step: 226250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:57:37,752-Speed 5041.99 samples/sec Loss 2.1974 Epoch: 13 Global Step: 226300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:57:47,759-Speed 5116.46 samples/sec Loss 2.1302 Epoch: 13 Global Step: 226350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:57:58,049-Speed 4976.48 samples/sec Loss 2.1380 Epoch: 13 Global Step: 226400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:58:08,253-Speed 5017.67 samples/sec Loss 2.1375 Epoch: 13 Global Step: 226450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:58:18,360-Speed 5066.17 samples/sec Loss 2.1394 Epoch: 13 Global Step: 226500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:58:28,562-Speed 5019.10 samples/sec Loss 2.1505 Epoch: 13 Global Step: 226550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:58:38,579-Speed 5111.40 samples/sec 
Loss 2.1628 Epoch: 13 Global Step: 226600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:58:48,799-Speed 5010.15 samples/sec Loss 2.1584 Epoch: 13 Global Step: 226650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:58:59,150-Speed 4946.77 samples/sec Loss 2.1670 Epoch: 13 Global Step: 226700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:59:09,748-Speed 4831.58 samples/sec Loss 2.1337 Epoch: 13 Global Step: 226750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:59:20,267-Speed 4867.63 samples/sec Loss 2.1648 Epoch: 13 Global Step: 226800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:59:31,363-Speed 4614.57 samples/sec Loss 2.1727 Epoch: 13 Global Step: 226850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:59:41,718-Speed 4944.83 samples/sec Loss 2.1334 Epoch: 13 Global Step: 226900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 10:59:51,909-Speed 5024.33 samples/sec Loss 2.1485 Epoch: 13 Global Step: 226950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:00:02,130-Speed 5009.60 samples/sec Loss 2.1647 Epoch: 13 Global Step: 227000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:00:12,330-Speed 5019.72 samples/sec Loss 2.1525 Epoch: 13 Global Step: 227050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:00:22,596-Speed 4987.69 samples/sec Loss 2.1288 Epoch: 13 Global Step: 227100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:00:32,848-Speed 4994.61 samples/sec Loss 2.1606 Epoch: 13 Global Step: 227150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:00:43,899-Speed 4633.19 samples/sec Loss 2.1644 Epoch: 13 Global Step: 227200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:00:54,062-Speed 5038.40 samples/sec Loss 2.1426 Epoch: 13 Global Step: 227250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:01:04,567-Speed 4873.98 
samples/sec Loss 2.1585 Epoch: 13 Global Step: 227300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:01:14,899-Speed 4955.78 samples/sec Loss 2.1493 Epoch: 13 Global Step: 227350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:01:25,316-Speed 4915.30 samples/sec Loss 2.1227 Epoch: 13 Global Step: 227400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:01:35,683-Speed 4939.38 samples/sec Loss 2.1508 Epoch: 13 Global Step: 227450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:01:45,858-Speed 5032.31 samples/sec Loss 2.1360 Epoch: 13 Global Step: 227500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:01:56,006-Speed 5045.26 samples/sec Loss 2.1379 Epoch: 13 Global Step: 227550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:02:07,215-Speed 4568.16 samples/sec Loss 2.1604 Epoch: 13 Global Step: 227600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:02:17,476-Speed 4989.97 samples/sec Loss 2.1808 Epoch: 13 Global Step: 227650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:02:27,761-Speed 4978.64 samples/sec Loss 2.1607 Epoch: 13 Global Step: 227700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:02:38,193-Speed 4908.24 samples/sec Loss 2.1595 Epoch: 13 Global Step: 227750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:02:48,833-Speed 4812.14 samples/sec Loss 2.1651 Epoch: 13 Global Step: 227800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:02:59,225-Speed 4927.03 samples/sec Loss 2.1839 Epoch: 13 Global Step: 227850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:03:10,265-Speed 4637.79 samples/sec Loss 2.1409 Epoch: 13 Global Step: 227900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:03:21,296-Speed 4642.12 samples/sec Loss 2.1646 Epoch: 13 Global Step: 227950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:03:31,502-Speed 
5016.61 samples/sec Loss 2.1530 Epoch: 13 Global Step: 228000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:03:47,953-[lfw][228000]XNorm: 23.465638 Training: 2021-03-19 11:03:47,953-[lfw][228000]Accuracy-Flip: 0.99717+-0.00259 Training: 2021-03-19 11:03:47,953-[lfw][228000]Accuracy-Highest: 0.99767 Training: 2021-03-19 11:04:06,517-[cfp_fp][228000]XNorm: 19.672331 Training: 2021-03-19 11:04:06,517-[cfp_fp][228000]Accuracy-Flip: 0.96971+-0.00815 Training: 2021-03-19 11:04:06,517-[cfp_fp][228000]Accuracy-Highest: 0.97486 Training: 2021-03-19 11:04:23,049-[agedb_30][228000]XNorm: 22.702170 Training: 2021-03-19 11:04:23,049-[agedb_30][228000]Accuracy-Flip: 0.97517+-0.00762 Training: 2021-03-19 11:04:23,049-[agedb_30][228000]Accuracy-Highest: 0.97633 Training: 2021-03-19 11:04:33,208-Speed 829.75 samples/sec Loss 2.1620 Epoch: 13 Global Step: 228050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:04:43,476-Speed 4986.84 samples/sec Loss 2.1709 Epoch: 13 Global Step: 228100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:04:54,388-Speed 4692.32 samples/sec Loss 2.1919 Epoch: 13 Global Step: 228150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:05:04,522-Speed 5052.79 samples/sec Loss 2.1629 Epoch: 13 Global Step: 228200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:05:14,609-Speed 5076.43 samples/sec Loss 2.1822 Epoch: 13 Global Step: 228250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:05:25,586-Speed 4664.65 samples/sec Loss 2.1463 Epoch: 13 Global Step: 228300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:05:35,824-Speed 5000.93 samples/sec Loss 2.1685 Epoch: 13 Global Step: 228350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:05:46,130-Speed 4968.19 samples/sec Loss 2.1251 Epoch: 13 Global Step: 228400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:05:56,189-Speed 5090.60 samples/sec Loss 2.1547 Epoch: 13 
Global Step: 228450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:06:06,297-Speed 5065.44 samples/sec Loss 2.1452 Epoch: 13 Global Step: 228500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:06:16,526-Speed 5005.96 samples/sec Loss 2.1478 Epoch: 13 Global Step: 228550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:06:27,145-Speed 4822.13 samples/sec Loss 2.1495 Epoch: 13 Global Step: 228600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:06:37,494-Speed 4947.38 samples/sec Loss 2.1558 Epoch: 13 Global Step: 228650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:06:48,090-Speed 4832.21 samples/sec Loss 2.1404 Epoch: 13 Global Step: 228700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:06:58,177-Speed 5076.07 samples/sec Loss 2.1462 Epoch: 13 Global Step: 228750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:07:08,521-Speed 4950.04 samples/sec Loss 2.1468 Epoch: 13 Global Step: 228800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:07:18,628-Speed 5066.04 samples/sec Loss 2.1502 Epoch: 13 Global Step: 228850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:07:28,842-Speed 5013.37 samples/sec Loss 2.1548 Epoch: 13 Global Step: 228900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:07:38,869-Speed 5106.27 samples/sec Loss 2.1461 Epoch: 13 Global Step: 228950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:07:49,319-Speed 4899.69 samples/sec Loss 2.1484 Epoch: 13 Global Step: 229000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:07:59,736-Speed 4915.34 samples/sec Loss 2.1337 Epoch: 13 Global Step: 229050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:08:10,330-Speed 4833.38 samples/sec Loss 2.1686 Epoch: 13 Global Step: 229100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:08:20,894-Speed 4846.64 samples/sec Loss 2.1750 Epoch: 
13 Global Step: 229150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:08:31,234-Speed 4952.21 samples/sec Loss 2.1568 Epoch: 13 Global Step: 229200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:08:41,496-Speed 4989.47 samples/sec Loss 2.1308 Epoch: 13 Global Step: 229250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:08:51,637-Speed 5049.51 samples/sec Loss 2.1642 Epoch: 13 Global Step: 229300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:09:02,004-Speed 4938.95 samples/sec Loss 2.1597 Epoch: 13 Global Step: 229350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:09:12,335-Speed 4956.05 samples/sec Loss 2.1624 Epoch: 13 Global Step: 229400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:09:22,348-Speed 5113.91 samples/sec Loss 2.1396 Epoch: 13 Global Step: 229450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:09:32,613-Speed 4988.17 samples/sec Loss 2.1695 Epoch: 13 Global Step: 229500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:09:42,889-Speed 4982.79 samples/sec Loss 2.1874 Epoch: 13 Global Step: 229550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:09:53,057-Speed 5035.39 samples/sec Loss 2.1493 Epoch: 13 Global Step: 229600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:10:03,193-Speed 5051.77 samples/sec Loss 2.1361 Epoch: 13 Global Step: 229650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:10:13,621-Speed 4910.14 samples/sec Loss 2.1723 Epoch: 13 Global Step: 229700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:10:23,687-Speed 5087.05 samples/sec Loss 2.1699 Epoch: 13 Global Step: 229750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:10:33,686-Speed 5120.78 samples/sec Loss 2.1763 Epoch: 13 Global Step: 229800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:10:44,031-Speed 4949.51 samples/sec Loss 2.1428 
Epoch: 13 Global Step: 229850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:10:54,169-Speed 5050.39 samples/sec Loss 2.1372 Epoch: 13 Global Step: 229900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:11:04,367-Speed 5021.39 samples/sec Loss 2.1415 Epoch: 13 Global Step: 229950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:11:14,421-Speed 5092.82 samples/sec Loss 2.1572 Epoch: 13 Global Step: 230000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:11:31,256-[lfw][230000]XNorm: 23.119006 Training: 2021-03-19 11:11:31,256-[lfw][230000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 11:11:31,256-[lfw][230000]Accuracy-Highest: 0.99767 Training: 2021-03-19 11:11:49,903-[cfp_fp][230000]XNorm: 19.330721 Training: 2021-03-19 11:11:49,903-[cfp_fp][230000]Accuracy-Flip: 0.97614+-0.01052 Training: 2021-03-19 11:11:49,903-[cfp_fp][230000]Accuracy-Highest: 0.97614 Training: 2021-03-19 11:12:06,016-[agedb_30][230000]XNorm: 22.393630 Training: 2021-03-19 11:12:06,016-[agedb_30][230000]Accuracy-Flip: 0.97533+-0.00912 Training: 2021-03-19 11:12:06,016-[agedb_30][230000]Accuracy-Highest: 0.97633 Training: 2021-03-19 11:12:16,179-Speed 829.04 samples/sec Loss 2.1455 Epoch: 13 Global Step: 230050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:12:26,404-Speed 5007.50 samples/sec Loss 2.1390 Epoch: 13 Global Step: 230100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:12:36,438-Speed 5103.00 samples/sec Loss 2.1535 Epoch: 13 Global Step: 230150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:12:46,468-Speed 5104.97 samples/sec Loss 2.1618 Epoch: 13 Global Step: 230200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:12:56,756-Speed 4976.93 samples/sec Loss 2.1552 Epoch: 13 Global Step: 230250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:13:07,635-Speed 4706.62 samples/sec Loss 2.1613 Epoch: 13 Global Step: 230300 Fp16 Grad 
Scale: 16384 Required: 7 hours Training: 2021-03-19 11:13:17,831-Speed 5022.06 samples/sec Loss 2.1488 Epoch: 13 Global Step: 230350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:13:28,174-Speed 4950.13 samples/sec Loss 2.1614 Epoch: 13 Global Step: 230400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:13:38,317-Speed 5048.21 samples/sec Loss 2.1566 Epoch: 13 Global Step: 230450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:13:48,477-Speed 5040.15 samples/sec Loss 2.1769 Epoch: 13 Global Step: 230500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:13:59,772-Speed 4533.11 samples/sec Loss 2.1782 Epoch: 13 Global Step: 230550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:14:09,988-Speed 5012.27 samples/sec Loss 2.1679 Epoch: 13 Global Step: 230600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:14:20,460-Speed 4889.34 samples/sec Loss 2.1538 Epoch: 13 Global Step: 230650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:14:30,840-Speed 4932.92 samples/sec Loss 2.1874 Epoch: 13 Global Step: 230700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:14:40,936-Speed 5071.70 samples/sec Loss 2.1726 Epoch: 13 Global Step: 230750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:14:51,537-Speed 4829.95 samples/sec Loss 2.1678 Epoch: 13 Global Step: 230800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:15:01,700-Speed 5038.14 samples/sec Loss 2.1643 Epoch: 13 Global Step: 230850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:15:12,422-Speed 4775.43 samples/sec Loss 2.1646 Epoch: 13 Global Step: 230900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:15:22,763-Speed 4951.43 samples/sec Loss 2.1716 Epoch: 13 Global Step: 230950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:15:32,943-Speed 5029.71 samples/sec Loss 2.1636 Epoch: 13 Global Step: 231000 Fp16 
Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:15:43,066-Speed 5057.83 samples/sec Loss 2.1580 Epoch: 13 Global Step: 231050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:15:53,338-Speed 4985.26 samples/sec Loss 2.1486 Epoch: 13 Global Step: 231100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:16:04,278-Speed 4680.16 samples/sec Loss 2.1909 Epoch: 13 Global Step: 231150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:16:14,782-Speed 4874.77 samples/sec Loss 2.1307 Epoch: 13 Global Step: 231200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:16:25,892-Speed 4608.70 samples/sec Loss 2.1369 Epoch: 13 Global Step: 231250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:16:37,064-Speed 4583.26 samples/sec Loss 2.1729 Epoch: 13 Global Step: 231300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:16:47,125-Speed 5089.35 samples/sec Loss 2.1356 Epoch: 13 Global Step: 231350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:16:57,381-Speed 4992.53 samples/sec Loss 2.1490 Epoch: 13 Global Step: 231400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:17:07,939-Speed 4849.84 samples/sec Loss 2.1468 Epoch: 13 Global Step: 231450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:17:19,039-Speed 4612.95 samples/sec Loss 2.1679 Epoch: 13 Global Step: 231500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:17:29,050-Speed 5114.40 samples/sec Loss 2.1485 Epoch: 13 Global Step: 231550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:17:39,364-Speed 4964.33 samples/sec Loss 2.1503 Epoch: 13 Global Step: 231600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:17:49,804-Speed 4904.65 samples/sec Loss 2.1759 Epoch: 13 Global Step: 231650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:18:00,126-Speed 4960.75 samples/sec Loss 2.1598 Epoch: 13 Global Step: 231700 
Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:18:10,830-Speed 4783.36 samples/sec Loss 2.1472 Epoch: 13 Global Step: 231750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:18:21,243-Speed 4917.40 samples/sec Loss 2.1622 Epoch: 13 Global Step: 231800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:18:31,428-Speed 5027.35 samples/sec Loss 2.1667 Epoch: 13 Global Step: 231850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-19 11:18:41,912-Speed 4884.19 samples/sec Loss 2.1676 Epoch: 13 Global Step: 231900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:18:52,378-Speed 4892.13 samples/sec Loss 2.1508 Epoch: 13 Global Step: 231950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:19:02,915-Speed 4859.15 samples/sec Loss 2.1664 Epoch: 13 Global Step: 232000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:19:19,718-[lfw][232000]XNorm: 23.313950 Training: 2021-03-19 11:19:19,719-[lfw][232000]Accuracy-Flip: 0.99717+-0.00269 Training: 2021-03-19 11:19:19,719-[lfw][232000]Accuracy-Highest: 0.99767 Training: 2021-03-19 11:19:38,408-[cfp_fp][232000]XNorm: 19.567316 Training: 2021-03-19 11:19:38,408-[cfp_fp][232000]Accuracy-Flip: 0.97286+-0.00862 Training: 2021-03-19 11:19:38,408-[cfp_fp][232000]Accuracy-Highest: 0.97614 Training: 2021-03-19 11:19:54,529-[agedb_30][232000]XNorm: 22.496442 Training: 2021-03-19 11:19:54,529-[agedb_30][232000]Accuracy-Flip: 0.97667+-0.00771 Training: 2021-03-19 11:19:54,529-[agedb_30][232000]Accuracy-Highest: 0.97667 Training: 2021-03-19 11:20:04,570-Speed 830.45 samples/sec Loss 2.1891 Epoch: 13 Global Step: 232050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:20:14,739-Speed 5035.13 samples/sec Loss 2.1280 Epoch: 13 Global Step: 232100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:20:25,289-Speed 4853.35 samples/sec Loss 2.1988 Epoch: 13 Global Step: 232150 Fp16 Grad Scale: 16384 Required: 6 hours 
Training: 2021-03-19 11:20:35,466-Speed 5031.46 samples/sec Loss 2.1600 Epoch: 13 Global Step: 232200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:20:45,599-Speed 5053.51 samples/sec Loss 2.1532 Epoch: 13 Global Step: 232250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:20:55,948-Speed 4947.59 samples/sec Loss 2.1604 Epoch: 13 Global Step: 232300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:21:06,277-Speed 4956.97 samples/sec Loss 2.1633 Epoch: 13 Global Step: 232350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:21:16,560-Speed 4979.70 samples/sec Loss 2.1670 Epoch: 13 Global Step: 232400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:21:26,792-Speed 5004.26 samples/sec Loss 2.2015 Epoch: 13 Global Step: 232450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:21:36,854-Speed 5089.04 samples/sec Loss 2.1856 Epoch: 13 Global Step: 232500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:21:46,989-Speed 5052.02 samples/sec Loss 2.1474 Epoch: 13 Global Step: 232550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:21:57,486-Speed 4877.58 samples/sec Loss 2.1494 Epoch: 13 Global Step: 232600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:22:07,844-Speed 4943.87 samples/sec Loss 2.1686 Epoch: 13 Global Step: 232650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:22:18,291-Speed 4900.89 samples/sec Loss 2.1695 Epoch: 13 Global Step: 232700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:22:28,414-Speed 5058.04 samples/sec Loss 2.1848 Epoch: 13 Global Step: 232750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:22:38,424-Speed 5115.57 samples/sec Loss 2.1708 Epoch: 13 Global Step: 232800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:22:48,991-Speed 4845.47 samples/sec Loss 2.1765 Epoch: 13 Global Step: 232850 Fp16 Grad Scale: 16384 Required: 6 
hours Training: 2021-03-19 11:22:59,342-Speed 4946.66 samples/sec Loss 2.1503 Epoch: 13 Global Step: 232900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:23:09,763-Speed 4913.42 samples/sec Loss 2.1541 Epoch: 13 Global Step: 232950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:23:20,112-Speed 4947.50 samples/sec Loss 2.1618 Epoch: 13 Global Step: 233000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:23:30,411-Speed 4971.94 samples/sec Loss 2.1689 Epoch: 13 Global Step: 233050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:23:40,520-Speed 5065.04 samples/sec Loss 2.1806 Epoch: 13 Global Step: 233100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:23:50,819-Speed 4971.53 samples/sec Loss 2.1592 Epoch: 13 Global Step: 233150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:24:01,071-Speed 4994.84 samples/sec Loss 2.1631 Epoch: 13 Global Step: 233200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:24:11,549-Speed 4886.88 samples/sec Loss 2.1728 Epoch: 13 Global Step: 233250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:24:21,944-Speed 4925.56 samples/sec Loss 2.1603 Epoch: 13 Global Step: 233300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:24:32,147-Speed 5018.56 samples/sec Loss 2.1602 Epoch: 13 Global Step: 233350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:24:42,486-Speed 4952.45 samples/sec Loss 2.1440 Epoch: 13 Global Step: 233400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:24:52,682-Speed 5021.80 samples/sec Loss 2.1874 Epoch: 13 Global Step: 233450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:25:03,089-Speed 4920.18 samples/sec Loss 2.1485 Epoch: 13 Global Step: 233500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:25:13,506-Speed 4915.37 samples/sec Loss 2.1615 Epoch: 13 Global Step: 233550 Fp16 Grad Scale: 16384 Required: 
6 hours Training: 2021-03-19 11:25:23,728-Speed 5009.31 samples/sec Loss 2.1699 Epoch: 13 Global Step: 233600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:25:34,035-Speed 4967.68 samples/sec Loss 2.1401 Epoch: 13 Global Step: 233650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:25:56,745-Speed 2254.58 samples/sec Loss 2.1048 Epoch: 14 Global Step: 233700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:26:07,573-Speed 4728.93 samples/sec Loss 2.0004 Epoch: 14 Global Step: 233750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:26:18,656-Speed 4619.83 samples/sec Loss 2.0156 Epoch: 14 Global Step: 233800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:26:28,816-Speed 5040.01 samples/sec Loss 2.0132 Epoch: 14 Global Step: 233850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:26:38,856-Speed 5100.12 samples/sec Loss 2.0006 Epoch: 14 Global Step: 233900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:26:48,900-Speed 5098.20 samples/sec Loss 2.0155 Epoch: 14 Global Step: 233950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:26:59,725-Speed 4729.80 samples/sec Loss 2.0245 Epoch: 14 Global Step: 234000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:27:16,516-[lfw][234000]XNorm: 23.168020 Training: 2021-03-19 11:27:16,516-[lfw][234000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 11:27:16,516-[lfw][234000]Accuracy-Highest: 0.99767 Training: 2021-03-19 11:27:35,303-[cfp_fp][234000]XNorm: 19.413744 Training: 2021-03-19 11:27:35,303-[cfp_fp][234000]Accuracy-Flip: 0.97557+-0.01003 Training: 2021-03-19 11:27:35,303-[cfp_fp][234000]Accuracy-Highest: 0.97614 Training: 2021-03-19 11:27:51,391-[agedb_30][234000]XNorm: 22.470234 Training: 2021-03-19 11:27:51,391-[agedb_30][234000]Accuracy-Flip: 0.97483+-0.00773 Training: 2021-03-19 11:27:51,391-[agedb_30][234000]Accuracy-Highest: 0.97667 Training: 2021-03-19 
11:28:01,461-Speed 829.35 samples/sec Loss 1.9734 Epoch: 14 Global Step: 234050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:28:11,444-Speed 5129.27 samples/sec Loss 1.9789 Epoch: 14 Global Step: 234100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:28:21,389-Speed 5148.74 samples/sec Loss 1.9816 Epoch: 14 Global Step: 234150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:28:33,615-Speed 4188.12 samples/sec Loss 1.9736 Epoch: 14 Global Step: 234200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:28:43,443-Speed 5209.46 samples/sec Loss 1.9835 Epoch: 14 Global Step: 234250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:28:53,599-Speed 5041.69 samples/sec Loss 2.0080 Epoch: 14 Global Step: 234300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:29:03,834-Speed 5002.72 samples/sec Loss 1.9580 Epoch: 14 Global Step: 234350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:29:13,857-Speed 5108.52 samples/sec Loss 1.9886 Epoch: 14 Global Step: 234400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:29:24,085-Speed 5006.39 samples/sec Loss 1.9668 Epoch: 14 Global Step: 234450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:29:34,070-Speed 5127.95 samples/sec Loss 1.9830 Epoch: 14 Global Step: 234500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:29:45,102-Speed 4641.16 samples/sec Loss 1.9751 Epoch: 14 Global Step: 234550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:29:56,003-Speed 4697.02 samples/sec Loss 1.9411 Epoch: 14 Global Step: 234600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:30:05,955-Speed 5145.17 samples/sec Loss 1.9448 Epoch: 14 Global Step: 234650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:30:16,730-Speed 4751.83 samples/sec Loss 1.9429 Epoch: 14 Global Step: 234700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-19 11:30:26,650-Speed 5161.45 samples/sec Loss 1.9683 Epoch: 14 Global Step: 234750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:30:36,886-Speed 5002.46 samples/sec Loss 1.9817 Epoch: 14 Global Step: 234800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:30:47,535-Speed 4808.14 samples/sec Loss 1.9846 Epoch: 14 Global Step: 234850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:30:57,687-Speed 5043.67 samples/sec Loss 1.9683 Epoch: 14 Global Step: 234900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:31:07,503-Speed 5215.90 samples/sec Loss 1.9327 Epoch: 14 Global Step: 234950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:31:17,795-Speed 4975.21 samples/sec Loss 1.9743 Epoch: 14 Global Step: 235000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:31:27,933-Speed 5050.61 samples/sec Loss 1.9714 Epoch: 14 Global Step: 235050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:31:38,092-Speed 5040.36 samples/sec Loss 1.9406 Epoch: 14 Global Step: 235100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:31:49,104-Speed 4649.42 samples/sec Loss 1.9900 Epoch: 14 Global Step: 235150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:31:59,219-Speed 5062.09 samples/sec Loss 1.9444 Epoch: 14 Global Step: 235200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:32:09,160-Speed 5151.02 samples/sec Loss 1.9606 Epoch: 14 Global Step: 235250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:32:19,032-Speed 5186.77 samples/sec Loss 1.9659 Epoch: 14 Global Step: 235300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:32:29,019-Speed 5126.85 samples/sec Loss 1.9760 Epoch: 14 Global Step: 235350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:32:38,834-Speed 5216.55 samples/sec Loss 1.9537 Epoch: 14 Global Step: 235400 Fp16 Grad Scale: 16384 Required: 6 hours 
Training: 2021-03-19 11:32:48,797-Speed 5139.56 samples/sec Loss 1.9736 Epoch: 14 Global Step: 235450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:32:58,731-Speed 5153.86 samples/sec Loss 1.9574 Epoch: 14 Global Step: 235500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:33:08,568-Speed 5205.47 samples/sec Loss 1.9646 Epoch: 14 Global Step: 235550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:33:18,728-Speed 5039.34 samples/sec Loss 1.9610 Epoch: 14 Global Step: 235600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:33:28,600-Speed 5186.73 samples/sec Loss 1.9665 Epoch: 14 Global Step: 235650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:33:38,450-Speed 5198.15 samples/sec Loss 1.9964 Epoch: 14 Global Step: 235700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:33:48,504-Speed 5093.09 samples/sec Loss 1.9637 Epoch: 14 Global Step: 235750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:33:58,521-Speed 5111.54 samples/sec Loss 1.9926 Epoch: 14 Global Step: 235800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:34:08,593-Speed 5083.74 samples/sec Loss 1.9382 Epoch: 14 Global Step: 235850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:34:18,584-Speed 5124.86 samples/sec Loss 1.9527 Epoch: 14 Global Step: 235900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:34:28,742-Speed 5040.46 samples/sec Loss 1.9448 Epoch: 14 Global Step: 235950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:34:38,497-Speed 5248.80 samples/sec Loss 1.9503 Epoch: 14 Global Step: 236000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:34:55,173-[lfw][236000]XNorm: 23.214380 Training: 2021-03-19 11:34:55,173-[lfw][236000]Accuracy-Flip: 0.99717+-0.00248 Training: 2021-03-19 11:34:55,173-[lfw][236000]Accuracy-Highest: 0.99767 Training: 2021-03-19 11:35:13,749-[cfp_fp][236000]XNorm: 19.464784 
Training: 2021-03-19 11:35:13,749-[cfp_fp][236000]Accuracy-Flip: 0.97571+-0.00926 Training: 2021-03-19 11:35:13,749-[cfp_fp][236000]Accuracy-Highest: 0.97614 Training: 2021-03-19 11:35:29,894-[agedb_30][236000]XNorm: 22.449815 Training: 2021-03-19 11:35:29,895-[agedb_30][236000]Accuracy-Flip: 0.97633+-0.00756 Training: 2021-03-19 11:35:29,895-[agedb_30][236000]Accuracy-Highest: 0.97667 Training: 2021-03-19 11:35:39,643-Speed 837.35 samples/sec Loss 1.9617 Epoch: 14 Global Step: 236050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:35:49,528-Speed 5180.04 samples/sec Loss 1.9405 Epoch: 14 Global Step: 236100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:35:59,378-Speed 5198.27 samples/sec Loss 1.9830 Epoch: 14 Global Step: 236150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:36:09,211-Speed 5206.88 samples/sec Loss 1.9511 Epoch: 14 Global Step: 236200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:36:19,107-Speed 5174.06 samples/sec Loss 1.9779 Epoch: 14 Global Step: 236250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:36:29,013-Speed 5169.09 samples/sec Loss 1.9487 Epoch: 14 Global Step: 236300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:36:38,906-Speed 5175.77 samples/sec Loss 1.9449 Epoch: 14 Global Step: 236350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:36:48,916-Speed 5115.14 samples/sec Loss 1.9558 Epoch: 14 Global Step: 236400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:36:58,886-Speed 5135.62 samples/sec Loss 1.9351 Epoch: 14 Global Step: 236450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:37:08,968-Speed 5078.61 samples/sec Loss 1.9373 Epoch: 14 Global Step: 236500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:37:19,089-Speed 5059.48 samples/sec Loss 1.9520 Epoch: 14 Global Step: 236550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 
11:37:29,128-Speed 5100.34 samples/sec Loss 1.9337 Epoch: 14 Global Step: 236600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:37:39,127-Speed 5120.71 samples/sec Loss 1.9289 Epoch: 14 Global Step: 236650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:37:49,297-Speed 5034.49 samples/sec Loss 1.9495 Epoch: 14 Global Step: 236700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:37:59,422-Speed 5057.27 samples/sec Loss 1.9702 Epoch: 14 Global Step: 236750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:38:09,552-Speed 5054.09 samples/sec Loss 1.9259 Epoch: 14 Global Step: 236800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:38:19,444-Speed 5176.49 samples/sec Loss 1.9582 Epoch: 14 Global Step: 236850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:38:29,349-Speed 5169.33 samples/sec Loss 1.9361 Epoch: 14 Global Step: 236900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:38:39,171-Speed 5213.35 samples/sec Loss 1.9488 Epoch: 14 Global Step: 236950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:38:48,922-Speed 5251.20 samples/sec Loss 1.9333 Epoch: 14 Global Step: 237000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:38:58,918-Speed 5122.26 samples/sec Loss 1.9295 Epoch: 14 Global Step: 237050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:39:08,909-Speed 5124.74 samples/sec Loss 1.9444 Epoch: 14 Global Step: 237100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:39:19,814-Speed 4695.42 samples/sec Loss 1.9556 Epoch: 14 Global Step: 237150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:39:29,924-Speed 5064.82 samples/sec Loss 1.9604 Epoch: 14 Global Step: 237200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:39:40,773-Speed 4719.39 samples/sec Loss 1.9311 Epoch: 14 Global Step: 237250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-19 11:39:50,981-Speed 5016.19 samples/sec Loss 1.9227 Epoch: 14 Global Step: 237300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:40:00,746-Speed 5243.45 samples/sec Loss 1.9446 Epoch: 14 Global Step: 237350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:40:11,115-Speed 4937.87 samples/sec Loss 1.9337 Epoch: 14 Global Step: 237400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:40:21,139-Speed 5108.44 samples/sec Loss 1.9810 Epoch: 14 Global Step: 237450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:40:30,982-Speed 5201.84 samples/sec Loss 1.9539 Epoch: 14 Global Step: 237500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:40:41,792-Speed 4736.91 samples/sec Loss 1.9390 Epoch: 14 Global Step: 237550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:40:51,616-Speed 5212.24 samples/sec Loss 1.9206 Epoch: 14 Global Step: 237600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:41:01,619-Speed 5118.45 samples/sec Loss 1.9466 Epoch: 14 Global Step: 237650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:41:11,663-Speed 5098.02 samples/sec Loss 1.9400 Epoch: 14 Global Step: 237700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:41:21,657-Speed 5123.27 samples/sec Loss 1.9316 Epoch: 14 Global Step: 237750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:41:31,860-Speed 5018.60 samples/sec Loss 1.9595 Epoch: 14 Global Step: 237800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:41:42,297-Speed 4906.14 samples/sec Loss 1.9440 Epoch: 14 Global Step: 237850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:41:52,449-Speed 5043.59 samples/sec Loss 1.9226 Epoch: 14 Global Step: 237900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:42:03,239-Speed 4745.29 samples/sec Loss 1.9420 Epoch: 14 Global Step: 237950 Fp16 Grad Scale: 16384 Required: 6 hours 
Training: 2021-03-19 11:42:14,238-Speed 4655.14 samples/sec Loss 1.9476 Epoch: 14 Global Step: 238000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:42:30,563-[lfw][238000]XNorm: 23.327937 Training: 2021-03-19 11:42:30,563-[lfw][238000]Accuracy-Flip: 0.99717+-0.00259 Training: 2021-03-19 11:42:30,563-[lfw][238000]Accuracy-Highest: 0.99767 Training: 2021-03-19 11:42:49,216-[cfp_fp][238000]XNorm: 19.581991 Training: 2021-03-19 11:42:49,216-[cfp_fp][238000]Accuracy-Flip: 0.97514+-0.00867 Training: 2021-03-19 11:42:49,216-[cfp_fp][238000]Accuracy-Highest: 0.97614 Training: 2021-03-19 11:43:05,353-[agedb_30][238000]XNorm: 22.620227 Training: 2021-03-19 11:43:05,353-[agedb_30][238000]Accuracy-Flip: 0.97517+-0.00841 Training: 2021-03-19 11:43:05,353-[agedb_30][238000]Accuracy-Highest: 0.97667 Training: 2021-03-19 11:43:16,019-Speed 828.75 samples/sec Loss 1.9382 Epoch: 14 Global Step: 238050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:43:26,175-Speed 5041.50 samples/sec Loss 1.9317 Epoch: 14 Global Step: 238100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:43:36,167-Speed 5124.35 samples/sec Loss 1.9613 Epoch: 14 Global Step: 238150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:43:47,142-Speed 4665.13 samples/sec Loss 1.9461 Epoch: 14 Global Step: 238200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:43:56,935-Speed 5228.98 samples/sec Loss 1.9675 Epoch: 14 Global Step: 238250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:44:06,881-Speed 5148.02 samples/sec Loss 1.9306 Epoch: 14 Global Step: 238300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:44:16,998-Speed 5060.79 samples/sec Loss 1.9396 Epoch: 14 Global Step: 238350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:44:26,777-Speed 5235.95 samples/sec Loss 1.9170 Epoch: 14 Global Step: 238400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:44:36,699-Speed 
5160.59 samples/sec Loss 1.9446 Epoch: 14 Global Step: 238450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:44:46,689-Speed 5125.19 samples/sec Loss 1.9527 Epoch: 14 Global Step: 238500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:44:57,422-Speed 4770.56 samples/sec Loss 1.9548 Epoch: 14 Global Step: 238550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:45:07,414-Speed 5124.32 samples/sec Loss 1.9353 Epoch: 14 Global Step: 238600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:45:17,536-Speed 5059.04 samples/sec Loss 1.9160 Epoch: 14 Global Step: 238650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:45:27,915-Speed 4933.00 samples/sec Loss 1.9440 Epoch: 14 Global Step: 238700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:45:37,760-Speed 5201.08 samples/sec Loss 1.9596 Epoch: 14 Global Step: 238750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:45:47,812-Speed 5094.11 samples/sec Loss 1.9434 Epoch: 14 Global Step: 238800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:45:57,703-Speed 5176.38 samples/sec Loss 1.9444 Epoch: 14 Global Step: 238850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:46:07,657-Speed 5144.18 samples/sec Loss 1.9318 Epoch: 14 Global Step: 238900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:46:17,731-Speed 5082.57 samples/sec Loss 1.9562 Epoch: 14 Global Step: 238950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:46:27,788-Speed 5091.23 samples/sec Loss 1.9464 Epoch: 14 Global Step: 239000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:46:37,801-Speed 5113.63 samples/sec Loss 1.9701 Epoch: 14 Global Step: 239050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:46:47,571-Speed 5240.61 samples/sec Loss 1.9620 Epoch: 14 Global Step: 239100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 
11:46:57,702-Speed 5054.59 samples/sec Loss 1.9491 Epoch: 14 Global Step: 239150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:47:07,550-Speed 5199.24 samples/sec Loss 1.9508 Epoch: 14 Global Step: 239200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:47:17,621-Speed 5084.00 samples/sec Loss 1.9392 Epoch: 14 Global Step: 239250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:47:27,563-Speed 5150.25 samples/sec Loss 1.9519 Epoch: 14 Global Step: 239300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:47:37,684-Speed 5059.23 samples/sec Loss 1.9452 Epoch: 14 Global Step: 239350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:47:47,777-Speed 5073.26 samples/sec Loss 1.9495 Epoch: 14 Global Step: 239400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:47:58,126-Speed 4947.69 samples/sec Loss 1.9381 Epoch: 14 Global Step: 239450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:48:08,325-Speed 5020.28 samples/sec Loss 1.9289 Epoch: 14 Global Step: 239500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:48:18,279-Speed 5143.96 samples/sec Loss 1.9598 Epoch: 14 Global Step: 239550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:48:28,316-Speed 5101.69 samples/sec Loss 1.9676 Epoch: 14 Global Step: 239600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:48:38,219-Speed 5170.18 samples/sec Loss 1.9386 Epoch: 14 Global Step: 239650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:48:48,237-Speed 5111.03 samples/sec Loss 1.9246 Epoch: 14 Global Step: 239700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:48:58,235-Speed 5121.40 samples/sec Loss 1.9432 Epoch: 14 Global Step: 239750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:49:08,281-Speed 5097.05 samples/sec Loss 1.9192 Epoch: 14 Global Step: 239800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-19 11:49:18,516-Speed 5002.82 samples/sec Loss 1.9269 Epoch: 14 Global Step: 239850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:49:28,664-Speed 5045.42 samples/sec Loss 1.9500 Epoch: 14 Global Step: 239900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:49:38,698-Speed 5102.92 samples/sec Loss 1.9292 Epoch: 14 Global Step: 239950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:49:48,632-Speed 5154.46 samples/sec Loss 1.9354 Epoch: 14 Global Step: 240000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:50:05,464-[lfw][240000]XNorm: 23.194585 Training: 2021-03-19 11:50:05,464-[lfw][240000]Accuracy-Flip: 0.99750+-0.00271 Training: 2021-03-19 11:50:05,465-[lfw][240000]Accuracy-Highest: 0.99767 Training: 2021-03-19 11:50:24,202-[cfp_fp][240000]XNorm: 19.472981 Training: 2021-03-19 11:50:24,202-[cfp_fp][240000]Accuracy-Flip: 0.97529+-0.00913 Training: 2021-03-19 11:50:24,202-[cfp_fp][240000]Accuracy-Highest: 0.97614 Training: 2021-03-19 11:50:40,365-[agedb_30][240000]XNorm: 22.444285 Training: 2021-03-19 11:50:40,365-[agedb_30][240000]Accuracy-Flip: 0.97433+-0.00797 Training: 2021-03-19 11:50:40,365-[agedb_30][240000]Accuracy-Highest: 0.97667 Training: 2021-03-19 11:50:50,118-Speed 832.72 samples/sec Loss 1.9263 Epoch: 14 Global Step: 240050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:51:00,040-Speed 5160.38 samples/sec Loss 1.9434 Epoch: 14 Global Step: 240100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:51:09,926-Speed 5179.39 samples/sec Loss 1.9358 Epoch: 14 Global Step: 240150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:51:19,947-Speed 5109.76 samples/sec Loss 1.9251 Epoch: 14 Global Step: 240200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:51:29,948-Speed 5119.78 samples/sec Loss 1.9622 Epoch: 14 Global Step: 240250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:51:39,840-Speed 5176.13 
samples/sec Loss 1.9112 Epoch: 14 Global Step: 240300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:51:50,037-Speed 5020.98 samples/sec Loss 1.9323 Epoch: 14 Global Step: 240350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:51:59,878-Speed 5203.09 samples/sec Loss 1.9369 Epoch: 14 Global Step: 240400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:52:09,562-Speed 5287.57 samples/sec Loss 1.9312 Epoch: 14 Global Step: 240450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:52:20,116-Speed 4851.71 samples/sec Loss 1.9381 Epoch: 14 Global Step: 240500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:52:30,351-Speed 5002.68 samples/sec Loss 1.9367 Epoch: 14 Global Step: 240550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:52:41,066-Speed 4778.45 samples/sec Loss 1.9474 Epoch: 14 Global Step: 240600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:52:51,238-Speed 5033.68 samples/sec Loss 1.9630 Epoch: 14 Global Step: 240650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:53:01,270-Speed 5104.32 samples/sec Loss 1.9220 Epoch: 14 Global Step: 240700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:53:11,234-Speed 5138.82 samples/sec Loss 1.9074 Epoch: 14 Global Step: 240750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:53:21,365-Speed 5053.92 samples/sec Loss 1.9221 Epoch: 14 Global Step: 240800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:53:31,524-Speed 5040.53 samples/sec Loss 1.9523 Epoch: 14 Global Step: 240850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:53:41,396-Speed 5186.65 samples/sec Loss 1.9271 Epoch: 14 Global Step: 240900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:53:51,452-Speed 5091.82 samples/sec Loss 1.9244 Epoch: 14 Global Step: 240950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:54:02,212-Speed 
4758.95 samples/sec Loss 1.9228 Epoch: 14 Global Step: 241000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:54:12,058-Speed 5200.02 samples/sec Loss 1.9367 Epoch: 14 Global Step: 241050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:54:21,983-Speed 5159.09 samples/sec Loss 1.9215 Epoch: 14 Global Step: 241100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:54:31,815-Speed 5207.90 samples/sec Loss 1.9266 Epoch: 14 Global Step: 241150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:54:41,811-Speed 5122.45 samples/sec Loss 1.9336 Epoch: 14 Global Step: 241200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:54:51,667-Speed 5194.71 samples/sec Loss 1.9489 Epoch: 14 Global Step: 241250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:55:02,353-Speed 4792.10 samples/sec Loss 1.9325 Epoch: 14 Global Step: 241300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:55:13,206-Speed 4717.66 samples/sec Loss 1.9403 Epoch: 14 Global Step: 241350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:55:23,959-Speed 4761.53 samples/sec Loss 1.9559 Epoch: 14 Global Step: 241400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:55:33,889-Speed 5156.83 samples/sec Loss 1.9769 Epoch: 14 Global Step: 241450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:55:44,819-Speed 4684.52 samples/sec Loss 1.9284 Epoch: 14 Global Step: 241500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:55:54,526-Speed 5274.61 samples/sec Loss 1.9329 Epoch: 14 Global Step: 241550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:56:04,459-Speed 5154.95 samples/sec Loss 1.9186 Epoch: 14 Global Step: 241600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:56:14,450-Speed 5125.13 samples/sec Loss 1.9573 Epoch: 14 Global Step: 241650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 
11:56:24,473-Speed 5108.23 samples/sec Loss 1.9271 Epoch: 14 Global Step: 241700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:56:34,333-Speed 5192.95 samples/sec Loss 1.9396 Epoch: 14 Global Step: 241750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:56:44,169-Speed 5205.82 samples/sec Loss 1.9363 Epoch: 14 Global Step: 241800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:56:54,224-Speed 5092.00 samples/sec Loss 1.9586 Epoch: 14 Global Step: 241850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:57:04,498-Speed 4983.88 samples/sec Loss 1.9581 Epoch: 14 Global Step: 241900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:57:15,259-Speed 4758.05 samples/sec Loss 1.9376 Epoch: 14 Global Step: 241950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:57:25,394-Speed 5052.07 samples/sec Loss 1.9375 Epoch: 14 Global Step: 242000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:57:42,086-[lfw][242000]XNorm: 23.162753 Training: 2021-03-19 11:57:42,086-[lfw][242000]Accuracy-Flip: 0.99667+-0.00269 Training: 2021-03-19 11:57:42,086-[lfw][242000]Accuracy-Highest: 0.99767 Training: 2021-03-19 11:58:00,771-[cfp_fp][242000]XNorm: 19.443699 Training: 2021-03-19 11:58:00,772-[cfp_fp][242000]Accuracy-Flip: 0.97300+-0.00882 Training: 2021-03-19 11:58:00,772-[cfp_fp][242000]Accuracy-Highest: 0.97614 Training: 2021-03-19 11:58:16,940-[agedb_30][242000]XNorm: 22.454156 Training: 2021-03-19 11:58:16,940-[agedb_30][242000]Accuracy-Flip: 0.97467+-0.00849 Training: 2021-03-19 11:58:16,940-[agedb_30][242000]Accuracy-Highest: 0.97667 Training: 2021-03-19 11:58:26,866-Speed 832.91 samples/sec Loss 1.9447 Epoch: 14 Global Step: 242050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:58:36,899-Speed 5103.70 samples/sec Loss 1.9297 Epoch: 14 Global Step: 242100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:58:46,914-Speed 5112.64 samples/sec 
Loss 1.9288 Epoch: 14 Global Step: 242150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:58:57,235-Speed 4960.71 samples/sec Loss 1.9627 Epoch: 14 Global Step: 242200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:59:07,367-Speed 5053.71 samples/sec Loss 1.9185 Epoch: 14 Global Step: 242250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:59:17,492-Speed 5056.83 samples/sec Loss 1.9306 Epoch: 14 Global Step: 242300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:59:27,782-Speed 4976.08 samples/sec Loss 1.9480 Epoch: 14 Global Step: 242350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:59:37,908-Speed 5056.92 samples/sec Loss 1.9408 Epoch: 14 Global Step: 242400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:59:47,998-Speed 5074.34 samples/sec Loss 1.9309 Epoch: 14 Global Step: 242450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 11:59:57,891-Speed 5175.85 samples/sec Loss 1.9542 Epoch: 14 Global Step: 242500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:00:07,889-Speed 5121.62 samples/sec Loss 1.9503 Epoch: 14 Global Step: 242550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:00:17,775-Speed 5179.05 samples/sec Loss 1.9370 Epoch: 14 Global Step: 242600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:00:27,777-Speed 5119.19 samples/sec Loss 1.9674 Epoch: 14 Global Step: 242650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:00:38,069-Speed 4975.11 samples/sec Loss 1.9351 Epoch: 14 Global Step: 242700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:00:47,905-Speed 5205.22 samples/sec Loss 1.9203 Epoch: 14 Global Step: 242750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:00:57,873-Speed 5137.12 samples/sec Loss 1.9168 Epoch: 14 Global Step: 242800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:01:07,803-Speed 5156.14 
samples/sec Loss 1.9338 Epoch: 14 Global Step: 242850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:01:18,333-Speed 4862.38 samples/sec Loss 1.9368 Epoch: 14 Global Step: 242900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:01:28,301-Speed 5136.93 samples/sec Loss 1.9508 Epoch: 14 Global Step: 242950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:01:38,186-Speed 5180.03 samples/sec Loss 1.9185 Epoch: 14 Global Step: 243000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:01:48,203-Speed 5111.39 samples/sec Loss 1.9158 Epoch: 14 Global Step: 243050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:01:58,337-Speed 5052.65 samples/sec Loss 1.9466 Epoch: 14 Global Step: 243100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:02:08,470-Speed 5053.37 samples/sec Loss 1.9620 Epoch: 14 Global Step: 243150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:02:18,459-Speed 5125.67 samples/sec Loss 1.9264 Epoch: 14 Global Step: 243200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:02:28,734-Speed 4983.42 samples/sec Loss 1.9588 Epoch: 14 Global Step: 243250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:02:38,836-Speed 5068.15 samples/sec Loss 1.9427 Epoch: 14 Global Step: 243300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:02:48,716-Speed 5182.60 samples/sec Loss 1.9397 Epoch: 14 Global Step: 243350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:02:58,666-Speed 5146.08 samples/sec Loss 1.8918 Epoch: 14 Global Step: 243400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:03:08,649-Speed 5129.20 samples/sec Loss 1.9410 Epoch: 14 Global Step: 243450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:03:18,582-Speed 5154.97 samples/sec Loss 1.9440 Epoch: 14 Global Step: 243500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:03:28,474-Speed 
5176.19 samples/sec Loss 1.9381 Epoch: 14 Global Step: 243550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:03:38,299-Speed 5211.27 samples/sec Loss 1.9401 Epoch: 14 Global Step: 243600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:03:48,431-Speed 5053.80 samples/sec Loss 1.9342 Epoch: 14 Global Step: 243650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:03:58,370-Speed 5151.63 samples/sec Loss 1.9081 Epoch: 14 Global Step: 243700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:04:08,388-Speed 5110.90 samples/sec Loss 1.9466 Epoch: 14 Global Step: 243750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:04:18,488-Speed 5069.92 samples/sec Loss 1.9345 Epoch: 14 Global Step: 243800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:04:28,372-Speed 5180.27 samples/sec Loss 1.9258 Epoch: 14 Global Step: 243850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:04:39,333-Speed 4671.60 samples/sec Loss 1.9263 Epoch: 14 Global Step: 243900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:04:50,357-Speed 4644.63 samples/sec Loss 1.9058 Epoch: 14 Global Step: 243950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:05:00,184-Speed 5210.14 samples/sec Loss 1.9546 Epoch: 14 Global Step: 244000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:05:16,830-[lfw][244000]XNorm: 23.335346 Training: 2021-03-19 12:05:16,830-[lfw][244000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 12:05:16,830-[lfw][244000]Accuracy-Highest: 0.99767 Training: 2021-03-19 12:05:35,553-[cfp_fp][244000]XNorm: 19.608338 Training: 2021-03-19 12:05:35,553-[cfp_fp][244000]Accuracy-Flip: 0.97529+-0.00965 Training: 2021-03-19 12:05:35,553-[cfp_fp][244000]Accuracy-Highest: 0.97614 Training: 2021-03-19 12:05:51,746-[agedb_30][244000]XNorm: 22.589933 Training: 2021-03-19 12:05:51,746-[agedb_30][244000]Accuracy-Flip: 0.97517+-0.00740 Training: 
2021-03-19 12:05:51,746-[agedb_30][244000]Accuracy-Highest: 0.97667 Training: 2021-03-19 12:06:01,589-Speed 833.82 samples/sec Loss 1.9346 Epoch: 14 Global Step: 244050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:06:11,776-Speed 5026.64 samples/sec Loss 1.9558 Epoch: 14 Global Step: 244100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:06:21,865-Speed 5075.11 samples/sec Loss 1.9539 Epoch: 14 Global Step: 244150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:06:31,722-Speed 5194.63 samples/sec Loss 1.9270 Epoch: 14 Global Step: 244200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:06:41,560-Speed 5204.40 samples/sec Loss 1.9402 Epoch: 14 Global Step: 244250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:06:51,343-Speed 5233.94 samples/sec Loss 1.9183 Epoch: 14 Global Step: 244300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:07:02,248-Speed 4695.35 samples/sec Loss 1.9424 Epoch: 14 Global Step: 244350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:07:12,182-Speed 5154.62 samples/sec Loss 1.9331 Epoch: 14 Global Step: 244400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:07:22,259-Speed 5080.96 samples/sec Loss 1.9494 Epoch: 14 Global Step: 244450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:07:32,219-Speed 5141.05 samples/sec Loss 1.9397 Epoch: 14 Global Step: 244500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:07:42,423-Speed 5017.68 samples/sec Loss 1.9473 Epoch: 14 Global Step: 244550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:07:52,518-Speed 5072.46 samples/sec Loss 1.9202 Epoch: 14 Global Step: 244600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:08:02,476-Speed 5141.55 samples/sec Loss 1.9080 Epoch: 14 Global Step: 244650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:08:14,019-Speed 4436.05 samples/sec Loss 1.9158 
Epoch: 14 Global Step: 244700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:08:23,968-Speed 5146.16 samples/sec Loss 1.9371 Epoch: 14 Global Step: 244750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:08:34,577-Speed 4826.59 samples/sec Loss 1.9367 Epoch: 14 Global Step: 244800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:08:45,124-Speed 4854.29 samples/sec Loss 1.9378 Epoch: 14 Global Step: 244850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:08:55,229-Speed 5067.57 samples/sec Loss 1.9451 Epoch: 14 Global Step: 244900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:09:05,365-Speed 5051.59 samples/sec Loss 1.9507 Epoch: 14 Global Step: 244950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:09:15,410-Speed 5097.28 samples/sec Loss 1.9316 Epoch: 14 Global Step: 245000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:09:25,445-Speed 5102.49 samples/sec Loss 1.9468 Epoch: 14 Global Step: 245050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:09:35,304-Speed 5193.35 samples/sec Loss 1.9217 Epoch: 14 Global Step: 245100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:09:45,317-Speed 5113.90 samples/sec Loss 1.9459 Epoch: 14 Global Step: 245150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:09:55,401-Speed 5077.34 samples/sec Loss 1.9380 Epoch: 14 Global Step: 245200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:10:05,377-Speed 5132.70 samples/sec Loss 1.9432 Epoch: 14 Global Step: 245250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:10:15,405-Speed 5105.80 samples/sec Loss 1.9133 Epoch: 14 Global Step: 245300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:10:26,113-Speed 4781.71 samples/sec Loss 1.9423 Epoch: 14 Global Step: 245350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:10:36,302-Speed 5025.33 samples/sec Loss 
1.9325 Epoch: 14 Global Step: 245400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:10:46,532-Speed 5005.45 samples/sec Loss 1.9364 Epoch: 14 Global Step: 245450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:10:56,781-Speed 4995.84 samples/sec Loss 1.9413 Epoch: 14 Global Step: 245500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:11:06,820-Speed 5100.32 samples/sec Loss 1.9426 Epoch: 14 Global Step: 245550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:11:16,879-Speed 5090.12 samples/sec Loss 1.9292 Epoch: 14 Global Step: 245600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:11:27,127-Speed 4996.21 samples/sec Loss 1.9538 Epoch: 14 Global Step: 245650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:11:37,104-Speed 5132.14 samples/sec Loss 1.9093 Epoch: 14 Global Step: 245700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:11:47,110-Speed 5117.33 samples/sec Loss 1.9143 Epoch: 14 Global Step: 245750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:11:57,296-Speed 5026.88 samples/sec Loss 1.9234 Epoch: 14 Global Step: 245800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:12:07,533-Speed 5002.01 samples/sec Loss 1.9392 Epoch: 14 Global Step: 245850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:12:17,610-Speed 5080.82 samples/sec Loss 1.9259 Epoch: 14 Global Step: 245900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:12:27,865-Speed 4993.17 samples/sec Loss 1.9462 Epoch: 14 Global Step: 245950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:12:37,922-Speed 5091.50 samples/sec Loss 1.9472 Epoch: 14 Global Step: 246000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:12:54,632-[lfw][246000]XNorm: 23.316055 Training: 2021-03-19 12:12:54,633-[lfw][246000]Accuracy-Flip: 0.99700+-0.00306 Training: 2021-03-19 
12:12:54,633-[lfw][246000]Accuracy-Highest: 0.99767 Training: 2021-03-19 12:13:13,254-[cfp_fp][246000]XNorm: 19.645916 Training: 2021-03-19 12:13:13,254-[cfp_fp][246000]Accuracy-Flip: 0.97614+-0.00941 Training: 2021-03-19 12:13:13,254-[cfp_fp][246000]Accuracy-Highest: 0.97614 Training: 2021-03-19 12:13:29,384-[agedb_30][246000]XNorm: 22.621852 Training: 2021-03-19 12:13:29,384-[agedb_30][246000]Accuracy-Flip: 0.97650+-0.00797 Training: 2021-03-19 12:13:29,384-[agedb_30][246000]Accuracy-Highest: 0.97667 Training: 2021-03-19 12:13:39,106-Speed 836.82 samples/sec Loss 1.9092 Epoch: 14 Global Step: 246050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:13:49,492-Speed 4930.29 samples/sec Loss 1.9109 Epoch: 14 Global Step: 246100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:13:59,715-Speed 5008.30 samples/sec Loss 1.9266 Epoch: 14 Global Step: 246150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:14:09,755-Speed 5099.97 samples/sec Loss 1.9415 Epoch: 14 Global Step: 246200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:14:19,693-Speed 5152.08 samples/sec Loss 1.9159 Epoch: 14 Global Step: 246250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:14:30,091-Speed 4924.43 samples/sec Loss 1.9467 Epoch: 14 Global Step: 246300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:14:40,392-Speed 4970.67 samples/sec Loss 1.9160 Epoch: 14 Global Step: 246350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:14:50,469-Speed 5081.30 samples/sec Loss 1.9185 Epoch: 14 Global Step: 246400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:15:00,350-Speed 5182.35 samples/sec Loss 1.9151 Epoch: 14 Global Step: 246450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:15:10,547-Speed 5021.33 samples/sec Loss 1.9300 Epoch: 14 Global Step: 246500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:15:20,659-Speed 5063.46 samples/sec 
Loss 1.9149 Epoch: 14 Global Step: 246550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:15:30,543-Speed 5180.37 samples/sec Loss 1.9219 Epoch: 14 Global Step: 246600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:15:40,732-Speed 5025.52 samples/sec Loss 1.9178 Epoch: 14 Global Step: 246650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:15:50,633-Speed 5171.21 samples/sec Loss 1.9208 Epoch: 14 Global Step: 246700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:16:00,685-Speed 5094.17 samples/sec Loss 1.9183 Epoch: 14 Global Step: 246750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:16:10,760-Speed 5082.18 samples/sec Loss 1.9306 Epoch: 14 Global Step: 246800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:16:20,604-Speed 5201.35 samples/sec Loss 1.9558 Epoch: 14 Global Step: 246850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:16:30,754-Speed 5044.75 samples/sec Loss 1.9359 Epoch: 14 Global Step: 246900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:16:41,059-Speed 4969.07 samples/sec Loss 1.9426 Epoch: 14 Global Step: 246950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:16:50,879-Speed 5213.77 samples/sec Loss 1.9405 Epoch: 14 Global Step: 247000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:17:00,992-Speed 5063.09 samples/sec Loss 1.9326 Epoch: 14 Global Step: 247050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:17:10,983-Speed 5125.04 samples/sec Loss 1.9145 Epoch: 14 Global Step: 247100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:17:20,851-Speed 5189.05 samples/sec Loss 1.9466 Epoch: 14 Global Step: 247150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:17:30,995-Speed 5047.44 samples/sec Loss 1.9525 Epoch: 14 Global Step: 247200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:17:41,074-Speed 5080.27 
samples/sec Loss 1.9264 Epoch: 14 Global Step: 247250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:17:52,825-Speed 4357.23 samples/sec Loss 1.9408 Epoch: 14 Global Step: 247300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:18:02,916-Speed 5073.95 samples/sec Loss 1.9368 Epoch: 14 Global Step: 247350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:18:12,651-Speed 5259.74 samples/sec Loss 1.9204 Epoch: 14 Global Step: 247400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:18:22,818-Speed 5036.33 samples/sec Loss 1.9429 Epoch: 14 Global Step: 247450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:18:32,950-Speed 5053.38 samples/sec Loss 1.9339 Epoch: 14 Global Step: 247500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:18:42,746-Speed 5227.02 samples/sec Loss 1.9255 Epoch: 14 Global Step: 247550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-19 12:18:52,627-Speed 5182.16 samples/sec Loss 1.9165 Epoch: 14 Global Step: 247600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:19:02,551-Speed 5159.25 samples/sec Loss 1.9240 Epoch: 14 Global Step: 247650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:19:13,397-Speed 4720.82 samples/sec Loss 1.9555 Epoch: 14 Global Step: 247700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:19:23,470-Speed 5083.13 samples/sec Loss 1.9191 Epoch: 14 Global Step: 247750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:19:33,443-Speed 5134.09 samples/sec Loss 1.9523 Epoch: 14 Global Step: 247800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:19:43,777-Speed 4954.58 samples/sec Loss 1.9379 Epoch: 14 Global Step: 247850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:19:53,898-Speed 5059.11 samples/sec Loss 1.9096 Epoch: 14 Global Step: 247900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:20:03,921-Speed 
5108.71 samples/sec Loss 1.9144 Epoch: 14 Global Step: 247950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:20:13,808-Speed 5178.64 samples/sec Loss 1.9206 Epoch: 14 Global Step: 248000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:20:30,599-[lfw][248000]XNorm: 23.217025 Training: 2021-03-19 12:20:30,599-[lfw][248000]Accuracy-Flip: 0.99667+-0.00289 Training: 2021-03-19 12:20:30,599-[lfw][248000]Accuracy-Highest: 0.99767 Training: 2021-03-19 12:20:49,449-[cfp_fp][248000]XNorm: 19.488873 Training: 2021-03-19 12:20:49,449-[cfp_fp][248000]Accuracy-Flip: 0.97686+-0.00914 Training: 2021-03-19 12:20:49,449-[cfp_fp][248000]Accuracy-Highest: 0.97686 Training: 2021-03-19 12:21:05,746-[agedb_30][248000]XNorm: 22.505844 Training: 2021-03-19 12:21:05,746-[agedb_30][248000]Accuracy-Flip: 0.97600+-0.00793 Training: 2021-03-19 12:21:05,746-[agedb_30][248000]Accuracy-Highest: 0.97667 Training: 2021-03-19 12:21:15,447-Speed 830.65 samples/sec Loss 1.9281 Epoch: 14 Global Step: 248050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:21:27,123-Speed 4385.27 samples/sec Loss 1.9193 Epoch: 14 Global Step: 248100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:21:37,763-Speed 4812.05 samples/sec Loss 1.9249 Epoch: 14 Global Step: 248150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:21:47,938-Speed 5032.40 samples/sec Loss 1.9574 Epoch: 14 Global Step: 248200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:21:58,701-Speed 4757.29 samples/sec Loss 1.9064 Epoch: 14 Global Step: 248250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:22:08,610-Speed 5167.53 samples/sec Loss 1.9553 Epoch: 14 Global Step: 248300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:22:18,536-Speed 5158.21 samples/sec Loss 1.9275 Epoch: 14 Global Step: 248350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:22:28,342-Speed 5221.81 samples/sec Loss 1.9223 Epoch: 14 
Global Step: 248400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:22:38,551-Speed 5015.70 samples/sec Loss 1.9095 Epoch: 14 Global Step: 248450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:22:48,327-Speed 5237.52 samples/sec Loss 1.9386 Epoch: 14 Global Step: 248500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:22:58,552-Speed 5007.65 samples/sec Loss 1.9440 Epoch: 14 Global Step: 248550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:23:08,501-Speed 5146.30 samples/sec Loss 1.9357 Epoch: 14 Global Step: 248600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:23:18,483-Speed 5129.50 samples/sec Loss 1.9249 Epoch: 14 Global Step: 248650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:23:29,343-Speed 4715.01 samples/sec Loss 1.9444 Epoch: 14 Global Step: 248700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:23:39,277-Speed 5154.36 samples/sec Loss 1.9223 Epoch: 14 Global Step: 248750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:23:49,289-Speed 5114.41 samples/sec Loss 1.9291 Epoch: 14 Global Step: 248800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:23:59,423-Speed 5052.42 samples/sec Loss 1.9352 Epoch: 14 Global Step: 248850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:24:09,348-Speed 5158.84 samples/sec Loss 1.9244 Epoch: 14 Global Step: 248900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:24:19,489-Speed 5049.11 samples/sec Loss 1.9344 Epoch: 14 Global Step: 248950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:24:29,463-Speed 5134.03 samples/sec Loss 1.9367 Epoch: 14 Global Step: 249000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:24:39,562-Speed 5069.72 samples/sec Loss 1.9207 Epoch: 14 Global Step: 249050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:24:49,635-Speed 5083.37 samples/sec Loss 1.9320 Epoch: 
14 Global Step: 249100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:24:59,760-Speed 5057.00 samples/sec Loss 1.9449 Epoch: 14 Global Step: 249150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:25:09,932-Speed 5033.69 samples/sec Loss 1.9185 Epoch: 14 Global Step: 249200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:25:20,077-Speed 5047.34 samples/sec Loss 1.9176 Epoch: 14 Global Step: 249250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:25:30,287-Speed 5014.71 samples/sec Loss 1.9324 Epoch: 14 Global Step: 249300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:25:40,336-Speed 5095.11 samples/sec Loss 1.9337 Epoch: 14 Global Step: 249350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:25:50,417-Speed 5079.12 samples/sec Loss 1.9399 Epoch: 14 Global Step: 249400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:26:00,409-Speed 5124.50 samples/sec Loss 1.9423 Epoch: 14 Global Step: 249450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:26:10,556-Speed 5046.16 samples/sec Loss 1.8894 Epoch: 14 Global Step: 249500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:26:20,813-Speed 4992.00 samples/sec Loss 1.9329 Epoch: 14 Global Step: 249550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:26:31,036-Speed 5008.54 samples/sec Loss 1.9455 Epoch: 14 Global Step: 249600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:26:41,142-Speed 5066.99 samples/sec Loss 1.9405 Epoch: 14 Global Step: 249650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:26:51,098-Speed 5142.92 samples/sec Loss 1.9179 Epoch: 14 Global Step: 249700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:27:01,452-Speed 4944.92 samples/sec Loss 1.9552 Epoch: 14 Global Step: 249750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:27:11,593-Speed 5049.06 samples/sec Loss 1.9185 
Epoch: 14 Global Step: 249800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:27:21,704-Speed 5064.25 samples/sec Loss 1.9325 Epoch: 14 Global Step: 249850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:27:31,741-Speed 5101.45 samples/sec Loss 1.9162 Epoch: 14 Global Step: 249900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:27:41,599-Speed 5193.81 samples/sec Loss 1.9178 Epoch: 14 Global Step: 249950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:27:51,757-Speed 5040.96 samples/sec Loss 1.9385 Epoch: 14 Global Step: 250000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:28:08,612-[lfw][250000]XNorm: 23.265305 Training: 2021-03-19 12:28:08,612-[lfw][250000]Accuracy-Flip: 0.99733+-0.00291 Training: 2021-03-19 12:28:08,612-[lfw][250000]Accuracy-Highest: 0.99767 Training: 2021-03-19 12:28:27,394-[cfp_fp][250000]XNorm: 19.523770 Training: 2021-03-19 12:28:27,394-[cfp_fp][250000]Accuracy-Flip: 0.97586+-0.00773 Training: 2021-03-19 12:28:27,394-[cfp_fp][250000]Accuracy-Highest: 0.97686 Training: 2021-03-19 12:28:43,666-[agedb_30][250000]XNorm: 22.530855 Training: 2021-03-19 12:28:43,667-[agedb_30][250000]Accuracy-Flip: 0.97433+-0.00768 Training: 2021-03-19 12:28:43,667-[agedb_30][250000]Accuracy-Highest: 0.97667 Training: 2021-03-19 12:28:53,687-Speed 826.74 samples/sec Loss 1.9344 Epoch: 14 Global Step: 250050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:29:03,900-Speed 5013.35 samples/sec Loss 1.9196 Epoch: 14 Global Step: 250100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:29:13,638-Speed 5258.39 samples/sec Loss 1.9568 Epoch: 14 Global Step: 250150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:29:23,513-Speed 5185.28 samples/sec Loss 1.9210 Epoch: 14 Global Step: 250200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:29:33,666-Speed 5042.89 samples/sec Loss 1.9178 Epoch: 14 Global Step: 250250 Fp16 Grad 
Scale: 16384 Required: 5 hours Training: 2021-03-19 12:29:43,795-Speed 5055.17 samples/sec Loss 1.9199 Epoch: 14 Global Step: 250300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:29:53,914-Speed 5059.95 samples/sec Loss 1.9451 Epoch: 14 Global Step: 250350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:30:16,689-Speed 2248.17 samples/sec Loss 1.9000 Epoch: 15 Global Step: 250400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:30:27,672-Speed 4662.21 samples/sec Loss 1.9019 Epoch: 15 Global Step: 250450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:30:38,506-Speed 4726.24 samples/sec Loss 1.9183 Epoch: 15 Global Step: 250500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:30:49,113-Speed 4827.28 samples/sec Loss 1.9102 Epoch: 15 Global Step: 250550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:30:59,570-Speed 4896.64 samples/sec Loss 1.9199 Epoch: 15 Global Step: 250600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:31:11,209-Speed 4399.21 samples/sec Loss 1.9260 Epoch: 15 Global Step: 250650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:31:22,123-Speed 4691.67 samples/sec Loss 1.9152 Epoch: 15 Global Step: 250700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:31:34,098-Speed 4275.87 samples/sec Loss 1.9122 Epoch: 15 Global Step: 250750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:31:45,081-Speed 4661.72 samples/sec Loss 1.9343 Epoch: 15 Global Step: 250800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:31:55,526-Speed 4902.58 samples/sec Loss 1.9058 Epoch: 15 Global Step: 250850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:32:06,214-Speed 4790.46 samples/sec Loss 1.9123 Epoch: 15 Global Step: 250900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:32:16,612-Speed 4924.61 samples/sec Loss 1.9320 Epoch: 15 Global Step: 250950 Fp16 
Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:32:27,088-Speed 4887.49 samples/sec Loss 1.8882 Epoch: 15 Global Step: 251000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:32:37,823-Speed 4769.57 samples/sec Loss 1.9203 Epoch: 15 Global Step: 251050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:32:49,133-Speed 4527.39 samples/sec Loss 1.9097 Epoch: 15 Global Step: 251100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:32:59,787-Speed 4805.93 samples/sec Loss 1.9055 Epoch: 15 Global Step: 251150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:33:10,131-Speed 4949.99 samples/sec Loss 1.9147 Epoch: 15 Global Step: 251200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:33:20,928-Speed 4742.51 samples/sec Loss 1.9048 Epoch: 15 Global Step: 251250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:33:31,473-Speed 4855.44 samples/sec Loss 1.9058 Epoch: 15 Global Step: 251300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:33:41,786-Speed 4964.85 samples/sec Loss 1.9149 Epoch: 15 Global Step: 251350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:33:52,300-Speed 4870.19 samples/sec Loss 1.9239 Epoch: 15 Global Step: 251400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:34:05,180-Speed 3975.33 samples/sec Loss 1.9127 Epoch: 15 Global Step: 251450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:34:16,495-Speed 4525.48 samples/sec Loss 1.9323 Epoch: 15 Global Step: 251500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:34:27,711-Speed 4565.17 samples/sec Loss 1.9153 Epoch: 15 Global Step: 251550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:34:38,266-Speed 4850.99 samples/sec Loss 1.9383 Epoch: 15 Global Step: 251600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:34:49,340-Speed 4623.86 samples/sec Loss 1.9083 Epoch: 15 Global Step: 251650 
Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:34:59,552-Speed 5013.73 samples/sec Loss 1.8963 Epoch: 15 Global Step: 251700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:35:09,705-Speed 5043.57 samples/sec Loss 1.9171 Epoch: 15 Global Step: 251750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:35:19,851-Speed 5046.32 samples/sec Loss 1.9028 Epoch: 15 Global Step: 251800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:35:30,047-Speed 5021.68 samples/sec Loss 1.9051 Epoch: 15 Global Step: 251850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:35:40,356-Speed 4966.69 samples/sec Loss 1.9111 Epoch: 15 Global Step: 251900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:35:50,392-Speed 5102.22 samples/sec Loss 1.9230 Epoch: 15 Global Step: 251950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:36:00,653-Speed 4990.16 samples/sec Loss 1.9112 Epoch: 15 Global Step: 252000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:36:17,320-[lfw][252000]XNorm: 23.291739 Training: 2021-03-19 12:36:17,320-[lfw][252000]Accuracy-Flip: 0.99717+-0.00279 Training: 2021-03-19 12:36:17,320-[lfw][252000]Accuracy-Highest: 0.99767 Training: 2021-03-19 12:36:36,050-[cfp_fp][252000]XNorm: 19.550662 Training: 2021-03-19 12:36:36,050-[cfp_fp][252000]Accuracy-Flip: 0.97371+-0.01033 Training: 2021-03-19 12:36:36,050-[cfp_fp][252000]Accuracy-Highest: 0.97686 Training: 2021-03-19 12:36:52,208-[agedb_30][252000]XNorm: 22.549164 Training: 2021-03-19 12:36:52,209-[agedb_30][252000]Accuracy-Flip: 0.97433+-0.00810 Training: 2021-03-19 12:36:52,209-[agedb_30][252000]Accuracy-Highest: 0.97667 Training: 2021-03-19 12:37:02,092-Speed 833.36 samples/sec Loss 1.8823 Epoch: 15 Global Step: 252050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:37:12,888-Speed 4742.79 samples/sec Loss 1.9229 Epoch: 15 Global Step: 252100 Fp16 Grad Scale: 16384 Required: 5 hours 
Training: 2021-03-19 12:37:22,972-Speed 5077.88 samples/sec Loss 1.9239 Epoch: 15 Global Step: 252150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:37:33,029-Speed 5091.77 samples/sec Loss 1.8934 Epoch: 15 Global Step: 252200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:37:43,180-Speed 5043.73 samples/sec Loss 1.9130 Epoch: 15 Global Step: 252250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:37:53,254-Speed 5083.06 samples/sec Loss 1.8998 Epoch: 15 Global Step: 252300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:38:03,051-Speed 5226.22 samples/sec Loss 1.9188 Epoch: 15 Global Step: 252350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:38:13,093-Speed 5098.72 samples/sec Loss 1.8826 Epoch: 15 Global Step: 252400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:38:22,992-Speed 5172.59 samples/sec Loss 1.9023 Epoch: 15 Global Step: 252450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:38:33,211-Speed 5010.63 samples/sec Loss 1.8938 Epoch: 15 Global Step: 252500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:38:43,343-Speed 5053.64 samples/sec Loss 1.9173 Epoch: 15 Global Step: 252550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:38:53,519-Speed 5031.66 samples/sec Loss 1.9063 Epoch: 15 Global Step: 252600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:39:03,622-Speed 5068.08 samples/sec Loss 1.9049 Epoch: 15 Global Step: 252650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:39:13,774-Speed 5043.72 samples/sec Loss 1.9238 Epoch: 15 Global Step: 252700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:39:23,744-Speed 5136.21 samples/sec Loss 1.9156 Epoch: 15 Global Step: 252750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:39:33,862-Speed 5060.28 samples/sec Loss 1.9257 Epoch: 15 Global Step: 252800 Fp16 Grad Scale: 16384 Required: 5 
hours Training: 2021-03-19 12:39:43,959-Speed 5071.33 samples/sec Loss 1.9112 Epoch: 15 Global Step: 252850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:39:54,172-Speed 5013.12 samples/sec Loss 1.9243 Epoch: 15 Global Step: 252900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:40:04,249-Speed 5081.36 samples/sec Loss 1.9122 Epoch: 15 Global Step: 252950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:40:14,378-Speed 5055.13 samples/sec Loss 1.9137 Epoch: 15 Global Step: 253000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:40:24,481-Speed 5068.03 samples/sec Loss 1.9186 Epoch: 15 Global Step: 253050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:40:34,518-Speed 5101.40 samples/sec Loss 1.9046 Epoch: 15 Global Step: 253100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:40:44,530-Speed 5114.37 samples/sec Loss 1.8929 Epoch: 15 Global Step: 253150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:40:54,769-Speed 5000.87 samples/sec Loss 1.9062 Epoch: 15 Global Step: 253200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:41:05,065-Speed 4973.14 samples/sec Loss 1.8991 Epoch: 15 Global Step: 253250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:41:15,010-Speed 5148.20 samples/sec Loss 1.9278 Epoch: 15 Global Step: 253300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:41:25,381-Speed 4937.29 samples/sec Loss 1.8955 Epoch: 15 Global Step: 253350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:41:35,645-Speed 4988.37 samples/sec Loss 1.9130 Epoch: 15 Global Step: 253400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:41:45,749-Speed 5068.02 samples/sec Loss 1.8777 Epoch: 15 Global Step: 253450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:41:55,988-Speed 5000.48 samples/sec Loss 1.9047 Epoch: 15 Global Step: 253500 Fp16 Grad Scale: 16384 Required: 
5 hours Training: 2021-03-19 12:42:06,111-Speed 5058.27 samples/sec Loss 1.8956 Epoch: 15 Global Step: 253550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:42:16,116-Speed 5117.88 samples/sec Loss 1.9136 Epoch: 15 Global Step: 253600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:42:25,743-Speed 5318.46 samples/sec Loss 1.9157 Epoch: 15 Global Step: 253650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:42:35,847-Speed 5067.43 samples/sec Loss 1.9268 Epoch: 15 Global Step: 253700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:42:45,815-Speed 5137.06 samples/sec Loss 1.9382 Epoch: 15 Global Step: 253750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:42:55,989-Speed 5032.55 samples/sec Loss 1.8848 Epoch: 15 Global Step: 253800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:43:06,097-Speed 5065.50 samples/sec Loss 1.9024 Epoch: 15 Global Step: 253850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:43:16,204-Speed 5066.32 samples/sec Loss 1.9116 Epoch: 15 Global Step: 253900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:43:26,344-Speed 5049.40 samples/sec Loss 1.9109 Epoch: 15 Global Step: 253950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:43:37,832-Speed 4457.35 samples/sec Loss 1.9042 Epoch: 15 Global Step: 254000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:43:54,615-[lfw][254000]XNorm: 23.146666 Training: 2021-03-19 12:43:54,616-[lfw][254000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 12:43:54,616-[lfw][254000]Accuracy-Highest: 0.99767 Training: 2021-03-19 12:44:13,317-[cfp_fp][254000]XNorm: 19.446431 Training: 2021-03-19 12:44:13,318-[cfp_fp][254000]Accuracy-Flip: 0.97443+-0.00882 Training: 2021-03-19 12:44:13,318-[cfp_fp][254000]Accuracy-Highest: 0.97686 Training: 2021-03-19 12:44:29,483-[agedb_30][254000]XNorm: 22.444705 Training: 2021-03-19 
12:44:29,484-[agedb_30][254000]Accuracy-Flip: 0.97500+-0.00785 Training: 2021-03-19 12:44:29,484-[agedb_30][254000]Accuracy-Highest: 0.97667 Training: 2021-03-19 12:44:39,547-Speed 829.62 samples/sec Loss 1.9061 Epoch: 15 Global Step: 254050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:44:49,472-Speed 5159.03 samples/sec Loss 1.9162 Epoch: 15 Global Step: 254100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:45:00,498-Speed 4643.93 samples/sec Loss 1.9087 Epoch: 15 Global Step: 254150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:45:10,574-Speed 5082.01 samples/sec Loss 1.9264 Epoch: 15 Global Step: 254200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:45:20,603-Speed 5105.17 samples/sec Loss 1.9276 Epoch: 15 Global Step: 254250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:45:30,635-Speed 5104.09 samples/sec Loss 1.9272 Epoch: 15 Global Step: 254300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:45:40,808-Speed 5033.03 samples/sec Loss 1.9137 Epoch: 15 Global Step: 254350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:45:50,784-Speed 5132.88 samples/sec Loss 1.9229 Epoch: 15 Global Step: 254400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:46:01,784-Speed 4654.75 samples/sec Loss 1.9126 Epoch: 15 Global Step: 254450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:46:11,810-Speed 5107.33 samples/sec Loss 1.9071 Epoch: 15 Global Step: 254500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:46:21,707-Speed 5173.82 samples/sec Loss 1.9318 Epoch: 15 Global Step: 254550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:46:31,802-Speed 5071.88 samples/sec Loss 1.8993 Epoch: 15 Global Step: 254600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:46:41,710-Speed 5168.14 samples/sec Loss 1.8971 Epoch: 15 Global Step: 254650 Fp16 Grad Scale: 16384 Required: 5 hours 
Training: 2021-03-19 12:46:51,956-Speed 4997.34 samples/sec Loss 1.9200 Epoch: 15 Global Step: 254700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:47:02,949-Speed 4657.94 samples/sec Loss 1.9141 Epoch: 15 Global Step: 254750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:47:12,855-Speed 5168.45 samples/sec Loss 1.9083 Epoch: 15 Global Step: 254800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:47:22,919-Speed 5087.88 samples/sec Loss 1.8918 Epoch: 15 Global Step: 254850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:47:34,362-Speed 4474.64 samples/sec Loss 1.9186 Epoch: 15 Global Step: 254900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:47:44,438-Speed 5081.59 samples/sec Loss 1.9203 Epoch: 15 Global Step: 254950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:47:54,697-Speed 4990.83 samples/sec Loss 1.8988 Epoch: 15 Global Step: 255000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:48:04,951-Speed 4993.76 samples/sec Loss 1.9061 Epoch: 15 Global Step: 255050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:48:15,762-Speed 4736.15 samples/sec Loss 1.8992 Epoch: 15 Global Step: 255100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:48:25,753-Speed 5124.91 samples/sec Loss 1.8827 Epoch: 15 Global Step: 255150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:48:35,904-Speed 5044.24 samples/sec Loss 1.9083 Epoch: 15 Global Step: 255200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:48:46,039-Speed 5051.87 samples/sec Loss 1.9024 Epoch: 15 Global Step: 255250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:48:56,260-Speed 5009.52 samples/sec Loss 1.8860 Epoch: 15 Global Step: 255300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:49:06,292-Speed 5104.13 samples/sec Loss 1.8930 Epoch: 15 Global Step: 255350 Fp16 Grad Scale: 16384 Required: 5 
hours Training: 2021-03-19 12:49:16,037-Speed 5254.63 samples/sec Loss 1.9064 Epoch: 15 Global Step: 255400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:49:26,198-Speed 5039.25 samples/sec Loss 1.9367 Epoch: 15 Global Step: 255450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:49:36,997-Speed 4741.42 samples/sec Loss 1.9065 Epoch: 15 Global Step: 255500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:49:47,118-Speed 5059.03 samples/sec Loss 1.8964 Epoch: 15 Global Step: 255550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:49:57,086-Speed 5136.74 samples/sec Loss 1.9311 Epoch: 15 Global Step: 255600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:50:06,996-Speed 5166.61 samples/sec Loss 1.9003 Epoch: 15 Global Step: 255650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:50:16,857-Speed 5192.68 samples/sec Loss 1.9188 Epoch: 15 Global Step: 255700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:50:26,835-Speed 5131.45 samples/sec Loss 1.9186 Epoch: 15 Global Step: 255750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:50:37,028-Speed 5023.30 samples/sec Loss 1.9269 Epoch: 15 Global Step: 255800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:50:47,132-Speed 5067.63 samples/sec Loss 1.9073 Epoch: 15 Global Step: 255850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:50:57,074-Speed 5150.28 samples/sec Loss 1.9383 Epoch: 15 Global Step: 255900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:51:07,307-Speed 5003.66 samples/sec Loss 1.8872 Epoch: 15 Global Step: 255950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:51:17,205-Speed 5172.87 samples/sec Loss 1.8997 Epoch: 15 Global Step: 256000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:51:34,150-[lfw][256000]XNorm: 23.311678 Training: 2021-03-19 12:51:34,150-[lfw][256000]Accuracy-Flip: 
0.99667+-0.00289 Training: 2021-03-19 12:51:34,150-[lfw][256000]Accuracy-Highest: 0.99767 Training: 2021-03-19 12:51:52,895-[cfp_fp][256000]XNorm: 19.543204 Training: 2021-03-19 12:51:52,895-[cfp_fp][256000]Accuracy-Flip: 0.97500+-0.00846 Training: 2021-03-19 12:51:52,896-[cfp_fp][256000]Accuracy-Highest: 0.97686 Training: 2021-03-19 12:52:09,098-[agedb_30][256000]XNorm: 22.590416 Training: 2021-03-19 12:52:09,098-[agedb_30][256000]Accuracy-Flip: 0.97450+-0.00764 Training: 2021-03-19 12:52:09,098-[agedb_30][256000]Accuracy-Highest: 0.97667 Training: 2021-03-19 12:52:18,979-Speed 828.85 samples/sec Loss 1.9654 Epoch: 15 Global Step: 256050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:52:29,070-Speed 5074.28 samples/sec Loss 1.9003 Epoch: 15 Global Step: 256100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:52:39,097-Speed 5106.50 samples/sec Loss 1.9130 Epoch: 15 Global Step: 256150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:52:49,069-Speed 5134.67 samples/sec Loss 1.9176 Epoch: 15 Global Step: 256200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:52:59,389-Speed 4961.06 samples/sec Loss 1.9313 Epoch: 15 Global Step: 256250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:53:09,483-Speed 5072.59 samples/sec Loss 1.9596 Epoch: 15 Global Step: 256300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:53:19,469-Speed 5127.62 samples/sec Loss 1.8672 Epoch: 15 Global Step: 256350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:53:29,480-Speed 5114.53 samples/sec Loss 1.9338 Epoch: 15 Global Step: 256400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:53:39,396-Speed 5163.83 samples/sec Loss 1.9114 Epoch: 15 Global Step: 256450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:53:49,614-Speed 5011.17 samples/sec Loss 1.8882 Epoch: 15 Global Step: 256500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 
12:53:59,492-Speed 5183.44 samples/sec Loss 1.8834 Epoch: 15 Global Step: 256550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:54:09,514-Speed 5109.20 samples/sec Loss 1.9234 Epoch: 15 Global Step: 256600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:54:19,530-Speed 5112.14 samples/sec Loss 1.9071 Epoch: 15 Global Step: 256650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:54:29,486-Speed 5142.91 samples/sec Loss 1.8990 Epoch: 15 Global Step: 256700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:54:39,430-Speed 5149.00 samples/sec Loss 1.9040 Epoch: 15 Global Step: 256750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:54:49,409-Speed 5131.13 samples/sec Loss 1.8983 Epoch: 15 Global Step: 256800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:54:59,536-Speed 5056.31 samples/sec Loss 1.9119 Epoch: 15 Global Step: 256850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:55:09,525-Speed 5125.69 samples/sec Loss 1.9022 Epoch: 15 Global Step: 256900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:55:19,470-Speed 5148.79 samples/sec Loss 1.9153 Epoch: 15 Global Step: 256950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:55:29,396-Speed 5158.30 samples/sec Loss 1.9161 Epoch: 15 Global Step: 257000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:55:39,590-Speed 5022.90 samples/sec Loss 1.8892 Epoch: 15 Global Step: 257050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:55:49,700-Speed 5064.54 samples/sec Loss 1.9026 Epoch: 15 Global Step: 257100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:55:59,674-Speed 5133.77 samples/sec Loss 1.9178 Epoch: 15 Global Step: 257150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:56:09,907-Speed 5003.97 samples/sec Loss 1.8920 Epoch: 15 Global Step: 257200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 
2021-03-19 12:56:19,797-Speed 5176.90 samples/sec Loss 1.9042 Epoch: 15 Global Step: 257250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:56:29,643-Speed 5200.59 samples/sec Loss 1.9030 Epoch: 15 Global Step: 257300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:56:40,878-Speed 4557.11 samples/sec Loss 1.9145 Epoch: 15 Global Step: 257350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:56:51,088-Speed 5015.15 samples/sec Loss 1.9005 Epoch: 15 Global Step: 257400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:57:01,051-Speed 5139.13 samples/sec Loss 1.8926 Epoch: 15 Global Step: 257450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:57:11,129-Speed 5080.70 samples/sec Loss 1.9236 Epoch: 15 Global Step: 257500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:57:21,354-Speed 5007.89 samples/sec Loss 1.9143 Epoch: 15 Global Step: 257550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:57:31,948-Speed 4833.12 samples/sec Loss 1.9119 Epoch: 15 Global Step: 257600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:57:41,885-Speed 5152.77 samples/sec Loss 1.8918 Epoch: 15 Global Step: 257650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:57:51,979-Speed 5072.48 samples/sec Loss 1.8887 Epoch: 15 Global Step: 257700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:58:02,845-Speed 4712.48 samples/sec Loss 1.9150 Epoch: 15 Global Step: 257750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:58:12,862-Speed 5111.37 samples/sec Loss 1.9195 Epoch: 15 Global Step: 257800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:58:22,882-Speed 5110.48 samples/sec Loss 1.9365 Epoch: 15 Global Step: 257850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:58:32,944-Speed 5088.74 samples/sec Loss 1.9175 Epoch: 15 Global Step: 257900 Fp16 Grad Scale: 16384 Required: 5 hours 
Training: 2021-03-19 12:58:42,743-Speed 5225.44 samples/sec Loss 1.8974 Epoch: 15 Global Step: 257950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:58:52,593-Speed 5198.23 samples/sec Loss 1.9259 Epoch: 15 Global Step: 258000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 12:59:09,354-[lfw][258000]XNorm: 23.057409 Training: 2021-03-19 12:59:09,355-[lfw][258000]Accuracy-Flip: 0.99667+-0.00269 Training: 2021-03-19 12:59:09,355-[lfw][258000]Accuracy-Highest: 0.99767 Training: 2021-03-19 12:59:27,990-[cfp_fp][258000]XNorm: 19.396078 Training: 2021-03-19 12:59:27,990-[cfp_fp][258000]Accuracy-Flip: 0.97543+-0.00880 Training: 2021-03-19 12:59:27,990-[cfp_fp][258000]Accuracy-Highest: 0.97686 Training: 2021-03-19 12:59:44,201-[agedb_30][258000]XNorm: 22.344299 Training: 2021-03-19 12:59:44,201-[agedb_30][258000]Accuracy-Flip: 0.97433+-0.00772 Training: 2021-03-19 12:59:44,201-[agedb_30][258000]Accuracy-Highest: 0.97667 Training: 2021-03-19 12:59:54,838-Speed 822.57 samples/sec Loss 1.9088 Epoch: 15 Global Step: 258050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:00:04,773-Speed 5153.77 samples/sec Loss 1.9283 Epoch: 15 Global Step: 258100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:00:14,874-Speed 5069.18 samples/sec Loss 1.9283 Epoch: 15 Global Step: 258150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:00:25,024-Speed 5044.77 samples/sec Loss 1.8887 Epoch: 15 Global Step: 258200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:00:35,914-Speed 4701.56 samples/sec Loss 1.9173 Epoch: 15 Global Step: 258250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:00:46,648-Speed 4770.32 samples/sec Loss 1.8779 Epoch: 15 Global Step: 258300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:00:56,844-Speed 5021.97 samples/sec Loss 1.9142 Epoch: 15 Global Step: 258350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:01:06,967-Speed 
5057.89 samples/sec Loss 1.9187 Epoch: 15 Global Step: 258400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:01:17,893-Speed 4686.46 samples/sec Loss 1.8959 Epoch: 15 Global Step: 258450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:01:28,169-Speed 4982.92 samples/sec Loss 1.9110 Epoch: 15 Global Step: 258500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:01:38,149-Speed 5130.21 samples/sec Loss 1.9156 Epoch: 15 Global Step: 258550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:01:48,224-Speed 5082.16 samples/sec Loss 1.9225 Epoch: 15 Global Step: 258600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:01:58,414-Speed 5025.22 samples/sec Loss 1.8973 Epoch: 15 Global Step: 258650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:02:08,387-Speed 5133.87 samples/sec Loss 1.9244 Epoch: 15 Global Step: 258700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:02:18,715-Speed 4957.79 samples/sec Loss 1.9157 Epoch: 15 Global Step: 258750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:02:29,368-Speed 4806.61 samples/sec Loss 1.9084 Epoch: 15 Global Step: 258800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:02:39,417-Speed 5095.40 samples/sec Loss 1.9054 Epoch: 15 Global Step: 258850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:02:49,542-Speed 5057.13 samples/sec Loss 1.9392 Epoch: 15 Global Step: 258900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:02:59,775-Speed 5003.24 samples/sec Loss 1.8906 Epoch: 15 Global Step: 258950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:03:09,751-Speed 5132.85 samples/sec Loss 1.8915 Epoch: 15 Global Step: 259000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:03:19,935-Speed 5027.82 samples/sec Loss 1.9017 Epoch: 15 Global Step: 259050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 
13:03:30,194-Speed 4991.09 samples/sec Loss 1.8993 Epoch: 15 Global Step: 259100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:03:40,253-Speed 5090.47 samples/sec Loss 1.9117 Epoch: 15 Global Step: 259150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:03:50,641-Speed 4928.88 samples/sec Loss 1.9061 Epoch: 15 Global Step: 259200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:04:00,680-Speed 5100.45 samples/sec Loss 1.9006 Epoch: 15 Global Step: 259250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:04:10,916-Speed 5002.23 samples/sec Loss 1.9245 Epoch: 15 Global Step: 259300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:04:20,956-Speed 5099.88 samples/sec Loss 1.8878 Epoch: 15 Global Step: 259350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:04:31,175-Speed 5010.99 samples/sec Loss 1.9086 Epoch: 15 Global Step: 259400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:04:41,187-Speed 5113.75 samples/sec Loss 1.9181 Epoch: 15 Global Step: 259450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:04:51,255-Speed 5085.81 samples/sec Loss 1.8949 Epoch: 15 Global Step: 259500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:05:01,262-Speed 5116.50 samples/sec Loss 1.9120 Epoch: 15 Global Step: 259550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:05:11,204-Speed 5150.21 samples/sec Loss 1.9199 Epoch: 15 Global Step: 259600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:05:21,270-Speed 5086.97 samples/sec Loss 1.8947 Epoch: 15 Global Step: 259650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:05:31,551-Speed 4980.41 samples/sec Loss 1.8969 Epoch: 15 Global Step: 259700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:05:41,576-Speed 5107.38 samples/sec Loss 1.9140 Epoch: 15 Global Step: 259750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 
2021-03-19 13:05:51,715-Speed 5050.04 samples/sec Loss 1.8942 Epoch: 15 Global Step: 259800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:06:01,562-Speed 5199.69 samples/sec Loss 1.9222 Epoch: 15 Global Step: 259850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:06:11,554-Speed 5124.88 samples/sec Loss 1.8972 Epoch: 15 Global Step: 259900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:06:21,549-Speed 5122.39 samples/sec Loss 1.9270 Epoch: 15 Global Step: 259950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:06:31,423-Speed 5185.95 samples/sec Loss 1.8993 Epoch: 15 Global Step: 260000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:06:48,034-[lfw][260000]XNorm: 23.173752 Training: 2021-03-19 13:06:48,035-[lfw][260000]Accuracy-Flip: 0.99667+-0.00269 Training: 2021-03-19 13:06:48,035-[lfw][260000]Accuracy-Highest: 0.99767 Training: 2021-03-19 13:07:06,667-[cfp_fp][260000]XNorm: 19.541319 Training: 2021-03-19 13:07:06,667-[cfp_fp][260000]Accuracy-Flip: 0.97571+-0.00869 Training: 2021-03-19 13:07:06,668-[cfp_fp][260000]Accuracy-Highest: 0.97686 Training: 2021-03-19 13:07:22,825-[agedb_30][260000]XNorm: 22.492211 Training: 2021-03-19 13:07:22,825-[agedb_30][260000]Accuracy-Flip: 0.97433+-0.00807 Training: 2021-03-19 13:07:22,825-[agedb_30][260000]Accuracy-Highest: 0.97667 Training: 2021-03-19 13:07:32,711-Speed 835.40 samples/sec Loss 1.9298 Epoch: 15 Global Step: 260050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:07:42,984-Speed 4984.50 samples/sec Loss 1.9269 Epoch: 15 Global Step: 260100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:07:53,272-Speed 4977.25 samples/sec Loss 1.8906 Epoch: 15 Global Step: 260150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:08:03,262-Speed 5125.40 samples/sec Loss 1.9130 Epoch: 15 Global Step: 260200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:08:13,333-Speed 5084.23 
samples/sec Loss 1.8747 Epoch: 15 Global Step: 260250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:08:23,263-Speed 5156.32 samples/sec Loss 1.9032 Epoch: 15 Global Step: 260300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:08:33,403-Speed 5049.47 samples/sec Loss 1.9244 Epoch: 15 Global Step: 260350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:08:43,360-Speed 5142.47 samples/sec Loss 1.9185 Epoch: 15 Global Step: 260400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:08:53,290-Speed 5156.41 samples/sec Loss 1.9015 Epoch: 15 Global Step: 260450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:09:03,516-Speed 5007.16 samples/sec Loss 1.9327 Epoch: 15 Global Step: 260500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:09:13,575-Speed 5090.25 samples/sec Loss 1.8912 Epoch: 15 Global Step: 260550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:09:23,490-Speed 5164.15 samples/sec Loss 1.9102 Epoch: 15 Global Step: 260600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:09:33,402-Speed 5165.96 samples/sec Loss 1.9028 Epoch: 15 Global Step: 260650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:09:43,393-Speed 5124.53 samples/sec Loss 1.9033 Epoch: 15 Global Step: 260700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:09:54,338-Speed 4678.30 samples/sec Loss 1.9002 Epoch: 15 Global Step: 260750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:10:04,274-Speed 5153.11 samples/sec Loss 1.9047 Epoch: 15 Global Step: 260800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:10:14,418-Speed 5047.51 samples/sec Loss 1.9281 Epoch: 15 Global Step: 260850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:10:24,403-Speed 5128.55 samples/sec Loss 1.9468 Epoch: 15 Global Step: 260900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:10:35,371-Speed 
4668.25 samples/sec Loss 1.9028 Epoch: 15 Global Step: 260950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:10:45,517-Speed 5046.39 samples/sec Loss 1.8857 Epoch: 15 Global Step: 261000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:10:55,565-Speed 5096.04 samples/sec Loss 1.9129 Epoch: 15 Global Step: 261050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:11:05,635-Speed 5084.74 samples/sec Loss 1.9143 Epoch: 15 Global Step: 261100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:11:15,697-Speed 5088.85 samples/sec Loss 1.8809 Epoch: 15 Global Step: 261150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:11:26,330-Speed 4815.61 samples/sec Loss 1.9045 Epoch: 15 Global Step: 261200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:11:36,586-Speed 4992.28 samples/sec Loss 1.9081 Epoch: 15 Global Step: 261250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:11:46,626-Speed 5100.15 samples/sec Loss 1.9176 Epoch: 15 Global Step: 261300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:11:56,674-Speed 5095.60 samples/sec Loss 1.8943 Epoch: 15 Global Step: 261350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:12:07,605-Speed 4684.14 samples/sec Loss 1.9054 Epoch: 15 Global Step: 261400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:12:17,480-Speed 5185.51 samples/sec Loss 1.9065 Epoch: 15 Global Step: 261450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:12:27,826-Speed 4948.57 samples/sec Loss 1.9049 Epoch: 15 Global Step: 261500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:12:37,694-Speed 5189.17 samples/sec Loss 1.9097 Epoch: 15 Global Step: 261550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:12:48,630-Speed 4682.06 samples/sec Loss 1.9398 Epoch: 15 Global Step: 261600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 
13:12:58,886-Speed 4992.27 samples/sec Loss 1.8848 Epoch: 15 Global Step: 261650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:13:08,934-Speed 5095.95 samples/sec Loss 1.9457 Epoch: 15 Global Step: 261700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:13:19,802-Speed 4711.15 samples/sec Loss 1.9170 Epoch: 15 Global Step: 261750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:13:29,840-Speed 5101.03 samples/sec Loss 1.8822 Epoch: 15 Global Step: 261800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:13:40,563-Speed 4774.73 samples/sec Loss 1.9108 Epoch: 15 Global Step: 261850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:13:50,418-Speed 5195.45 samples/sec Loss 1.9013 Epoch: 15 Global Step: 261900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:14:00,438-Speed 5110.21 samples/sec Loss 1.8912 Epoch: 15 Global Step: 261950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:14:10,595-Speed 5041.40 samples/sec Loss 1.8968 Epoch: 15 Global Step: 262000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:14:27,475-[lfw][262000]XNorm: 23.285289 Training: 2021-03-19 13:14:27,475-[lfw][262000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 13:14:27,475-[lfw][262000]Accuracy-Highest: 0.99767 Training: 2021-03-19 13:14:46,214-[cfp_fp][262000]XNorm: 19.625423 Training: 2021-03-19 13:14:46,214-[cfp_fp][262000]Accuracy-Flip: 0.97700+-0.00841 Training: 2021-03-19 13:14:46,214-[cfp_fp][262000]Accuracy-Highest: 0.97700 Training: 2021-03-19 13:15:02,493-[agedb_30][262000]XNorm: 22.569447 Training: 2021-03-19 13:15:02,493-[agedb_30][262000]Accuracy-Flip: 0.97417+-0.00743 Training: 2021-03-19 13:15:02,494-[agedb_30][262000]Accuracy-Highest: 0.97667 Training: 2021-03-19 13:15:12,315-Speed 829.56 samples/sec Loss 1.9065 Epoch: 15 Global Step: 262050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:15:22,499-Speed 5027.71 samples/sec 
Loss 1.8946 Epoch: 15 Global Step: 262100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:15:33,124-Speed 4819.23 samples/sec Loss 1.8895 Epoch: 15 Global Step: 262150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:15:43,472-Speed 4947.73 samples/sec Loss 1.9038 Epoch: 15 Global Step: 262200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:15:53,285-Speed 5217.90 samples/sec Loss 1.8878 Epoch: 15 Global Step: 262250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:16:03,464-Speed 5030.23 samples/sec Loss 1.9081 Epoch: 15 Global Step: 262300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:16:13,552-Speed 5076.10 samples/sec Loss 1.9078 Epoch: 15 Global Step: 262350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:16:23,588-Speed 5102.00 samples/sec Loss 1.9232 Epoch: 15 Global Step: 262400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:16:33,530-Speed 5149.79 samples/sec Loss 1.9099 Epoch: 15 Global Step: 262450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:16:43,654-Speed 5057.87 samples/sec Loss 1.9153 Epoch: 15 Global Step: 262500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:16:53,599-Speed 5148.47 samples/sec Loss 1.9119 Epoch: 15 Global Step: 262550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:17:03,794-Speed 5022.10 samples/sec Loss 1.9087 Epoch: 15 Global Step: 262600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:17:13,719-Speed 5159.00 samples/sec Loss 1.9185 Epoch: 15 Global Step: 262650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:17:23,909-Speed 5024.77 samples/sec Loss 1.9119 Epoch: 15 Global Step: 262700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:17:34,005-Speed 5071.83 samples/sec Loss 1.9359 Epoch: 15 Global Step: 262750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:17:43,992-Speed 5126.73 
samples/sec Loss 1.9118 Epoch: 15 Global Step: 262800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:17:54,008-Speed 5112.30 samples/sec Loss 1.8890 Epoch: 15 Global Step: 262850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:18:03,911-Speed 5170.06 samples/sec Loss 1.9129 Epoch: 15 Global Step: 262900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:18:14,033-Speed 5058.82 samples/sec Loss 1.9101 Epoch: 15 Global Step: 262950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:18:24,038-Speed 5118.14 samples/sec Loss 1.8857 Epoch: 15 Global Step: 263000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:18:33,966-Speed 5157.69 samples/sec Loss 1.9160 Epoch: 15 Global Step: 263050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:18:44,123-Speed 5041.11 samples/sec Loss 1.9108 Epoch: 15 Global Step: 263100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:18:54,162-Speed 5100.35 samples/sec Loss 1.9266 Epoch: 15 Global Step: 263150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:19:04,494-Speed 4955.61 samples/sec Loss 1.9139 Epoch: 15 Global Step: 263200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:19:14,231-Speed 5258.49 samples/sec Loss 1.9052 Epoch: 15 Global Step: 263250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-19 13:19:24,342-Speed 5064.35 samples/sec Loss 1.9236 Epoch: 15 Global Step: 263300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:19:34,387-Speed 5096.88 samples/sec Loss 1.8936 Epoch: 15 Global Step: 263350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:19:44,645-Speed 4991.91 samples/sec Loss 1.9031 Epoch: 15 Global Step: 263400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:19:54,749-Speed 5067.38 samples/sec Loss 1.8955 Epoch: 15 Global Step: 263450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:20:04,772-Speed 
5108.80 samples/sec Loss 1.9156 Epoch: 15 Global Step: 263500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:20:14,816-Speed 5097.88 samples/sec Loss 1.9123 Epoch: 15 Global Step: 263550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:20:24,786-Speed 5135.70 samples/sec Loss 1.9034 Epoch: 15 Global Step: 263600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:20:35,091-Speed 4968.31 samples/sec Loss 1.9232 Epoch: 15 Global Step: 263650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:20:44,946-Speed 5195.83 samples/sec Loss 1.9065 Epoch: 15 Global Step: 263700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:20:55,017-Speed 5084.31 samples/sec Loss 1.9210 Epoch: 15 Global Step: 263750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:21:05,041-Speed 5108.12 samples/sec Loss 1.9153 Epoch: 15 Global Step: 263800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:21:15,149-Speed 5065.53 samples/sec Loss 1.9080 Epoch: 15 Global Step: 263850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:21:25,184-Speed 5102.26 samples/sec Loss 1.9033 Epoch: 15 Global Step: 263900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:21:35,283-Speed 5070.02 samples/sec Loss 1.8965 Epoch: 15 Global Step: 263950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:21:45,129-Speed 5200.16 samples/sec Loss 1.9126 Epoch: 15 Global Step: 264000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:22:01,967-[lfw][264000]XNorm: 23.309164 Training: 2021-03-19 13:22:01,967-[lfw][264000]Accuracy-Flip: 0.99750+-0.00271 Training: 2021-03-19 13:22:01,967-[lfw][264000]Accuracy-Highest: 0.99767 Training: 2021-03-19 13:22:20,708-[cfp_fp][264000]XNorm: 19.586296 Training: 2021-03-19 13:22:20,708-[cfp_fp][264000]Accuracy-Flip: 0.97529+-0.00801 Training: 2021-03-19 13:22:20,708-[cfp_fp][264000]Accuracy-Highest: 0.97700 Training: 2021-03-19 
13:22:36,918-[agedb_30][264000]XNorm: 22.584019 Training: 2021-03-19 13:22:36,919-[agedb_30][264000]Accuracy-Flip: 0.97417+-0.00708 Training: 2021-03-19 13:22:36,925-[agedb_30][264000]Accuracy-Highest: 0.97667 Training: 2021-03-19 13:22:46,756-Speed 830.81 samples/sec Loss 1.9008 Epoch: 15 Global Step: 264050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:22:57,685-Speed 4685.25 samples/sec Loss 1.8956 Epoch: 15 Global Step: 264100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:23:07,994-Speed 4966.69 samples/sec Loss 1.9055 Epoch: 15 Global Step: 264150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:23:17,931-Speed 5153.11 samples/sec Loss 1.8847 Epoch: 15 Global Step: 264200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:23:28,269-Speed 4952.75 samples/sec Loss 1.9229 Epoch: 15 Global Step: 264250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:23:38,212-Speed 5149.52 samples/sec Loss 1.9203 Epoch: 15 Global Step: 264300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:23:48,862-Speed 4807.82 samples/sec Loss 1.9265 Epoch: 15 Global Step: 264350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:23:58,916-Speed 5092.74 samples/sec Loss 1.8946 Epoch: 15 Global Step: 264400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:24:09,092-Speed 5031.48 samples/sec Loss 1.9067 Epoch: 15 Global Step: 264450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:24:19,147-Speed 5092.52 samples/sec Loss 1.9235 Epoch: 15 Global Step: 264500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:24:29,382-Speed 5002.75 samples/sec Loss 1.9091 Epoch: 15 Global Step: 264550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:24:40,097-Speed 4778.57 samples/sec Loss 1.8846 Epoch: 15 Global Step: 264600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:24:50,326-Speed 5005.54 samples/sec Loss 1.9045 
Epoch: 15 Global Step: 264650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:25:00,152-Speed 5210.80 samples/sec Loss 1.9011 Epoch: 15 Global Step: 264700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:25:10,051-Speed 5172.54 samples/sec Loss 1.9255 Epoch: 15 Global Step: 264750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:25:20,961-Speed 4693.21 samples/sec Loss 1.9055 Epoch: 15 Global Step: 264800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:25:31,301-Speed 4951.87 samples/sec Loss 1.8973 Epoch: 15 Global Step: 264850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:25:41,520-Speed 5011.10 samples/sec Loss 1.9330 Epoch: 15 Global Step: 264900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:25:52,439-Speed 4689.20 samples/sec Loss 1.9025 Epoch: 15 Global Step: 264950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:26:02,340-Speed 5171.52 samples/sec Loss 1.8994 Epoch: 15 Global Step: 265000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:26:12,326-Speed 5127.37 samples/sec Loss 1.9201 Epoch: 15 Global Step: 265050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:26:22,391-Speed 5087.05 samples/sec Loss 1.9016 Epoch: 15 Global Step: 265100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:26:33,153-Speed 4758.08 samples/sec Loss 1.8876 Epoch: 15 Global Step: 265150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:26:43,828-Speed 4796.03 samples/sec Loss 1.9233 Epoch: 15 Global Step: 265200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:26:53,776-Speed 5147.46 samples/sec Loss 1.9013 Epoch: 15 Global Step: 265250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:27:03,770-Speed 5123.21 samples/sec Loss 1.9277 Epoch: 15 Global Step: 265300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:27:13,955-Speed 5027.62 samples/sec Loss 
1.9289 Epoch: 15 Global Step: 265350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:27:24,343-Speed 4929.02 samples/sec Loss 1.9027 Epoch: 15 Global Step: 265400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:27:34,457-Speed 5062.60 samples/sec Loss 1.8942 Epoch: 15 Global Step: 265450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:27:45,297-Speed 4723.63 samples/sec Loss 1.8922 Epoch: 15 Global Step: 265500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:27:55,461-Speed 5037.52 samples/sec Loss 1.9072 Epoch: 15 Global Step: 265550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:28:05,570-Speed 5065.22 samples/sec Loss 1.9087 Epoch: 15 Global Step: 265600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:28:15,695-Speed 5057.37 samples/sec Loss 1.9032 Epoch: 15 Global Step: 265650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:28:25,900-Speed 5017.38 samples/sec Loss 1.9173 Epoch: 15 Global Step: 265700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:28:36,183-Speed 4979.48 samples/sec Loss 1.8922 Epoch: 15 Global Step: 265750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:28:46,203-Speed 5109.83 samples/sec Loss 1.8809 Epoch: 15 Global Step: 265800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:28:56,095-Speed 5176.22 samples/sec Loss 1.8906 Epoch: 15 Global Step: 265850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:29:06,346-Speed 4994.98 samples/sec Loss 1.9118 Epoch: 15 Global Step: 265900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:29:16,159-Speed 5217.94 samples/sec Loss 1.9050 Epoch: 15 Global Step: 265950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:29:26,014-Speed 5195.45 samples/sec Loss 1.9050 Epoch: 15 Global Step: 266000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:29:42,883-[lfw][266000]XNorm: 
23.260312 Training: 2021-03-19 13:29:42,884-[lfw][266000]Accuracy-Flip: 0.99667+-0.00289 Training: 2021-03-19 13:29:42,884-[lfw][266000]Accuracy-Highest: 0.99767 Training: 2021-03-19 13:30:01,691-[cfp_fp][266000]XNorm: 19.539561 Training: 2021-03-19 13:30:01,692-[cfp_fp][266000]Accuracy-Flip: 0.97571+-0.00960 Training: 2021-03-19 13:30:01,692-[cfp_fp][266000]Accuracy-Highest: 0.97700 Training: 2021-03-19 13:30:17,826-[agedb_30][266000]XNorm: 22.521007 Training: 2021-03-19 13:30:17,826-[agedb_30][266000]Accuracy-Flip: 0.97400+-0.00735 Training: 2021-03-19 13:30:17,826-[agedb_30][266000]Accuracy-Highest: 0.97667 Training: 2021-03-19 13:30:27,816-Speed 828.47 samples/sec Loss 1.8957 Epoch: 15 Global Step: 266050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:30:38,060-Speed 4998.16 samples/sec Loss 1.9139 Epoch: 15 Global Step: 266100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:30:48,182-Speed 5058.79 samples/sec Loss 1.8643 Epoch: 15 Global Step: 266150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:30:58,236-Speed 5092.80 samples/sec Loss 1.8577 Epoch: 15 Global Step: 266200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:31:08,312-Speed 5081.54 samples/sec Loss 1.8980 Epoch: 15 Global Step: 266250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:31:18,337-Speed 5107.15 samples/sec Loss 1.9051 Epoch: 15 Global Step: 266300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:31:28,311-Speed 5133.95 samples/sec Loss 1.8976 Epoch: 15 Global Step: 266350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:31:38,197-Speed 5179.32 samples/sec Loss 1.8909 Epoch: 15 Global Step: 266400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:31:48,031-Speed 5206.72 samples/sec Loss 1.9071 Epoch: 15 Global Step: 266450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:31:58,003-Speed 5134.73 samples/sec Loss 1.8983 Epoch: 15 Global Step: 
266500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:32:08,012-Speed 5115.77 samples/sec Loss 1.9115 Epoch: 15 Global Step: 266550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:32:18,034-Speed 5108.77 samples/sec Loss 1.9022 Epoch: 15 Global Step: 266600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:32:28,009-Speed 5133.40 samples/sec Loss 1.9288 Epoch: 15 Global Step: 266650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:32:38,029-Speed 5109.89 samples/sec Loss 1.9165 Epoch: 15 Global Step: 266700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:32:48,022-Speed 5123.85 samples/sec Loss 1.9078 Epoch: 15 Global Step: 266750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:32:58,059-Speed 5101.49 samples/sec Loss 1.9328 Epoch: 15 Global Step: 266800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:33:08,047-Speed 5126.31 samples/sec Loss 1.9117 Epoch: 15 Global Step: 266850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:33:17,700-Speed 5304.56 samples/sec Loss 1.8922 Epoch: 15 Global Step: 266900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:33:27,747-Speed 5096.18 samples/sec Loss 1.9100 Epoch: 15 Global Step: 266950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:33:37,748-Speed 5119.95 samples/sec Loss 1.9078 Epoch: 15 Global Step: 267000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:33:47,979-Speed 5004.77 samples/sec Loss 1.9107 Epoch: 15 Global Step: 267050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:34:10,242-Speed 2299.81 samples/sec Loss 1.8946 Epoch: 16 Global Step: 267100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:34:20,925-Speed 4793.15 samples/sec Loss 1.9001 Epoch: 16 Global Step: 267150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:34:31,695-Speed 4754.26 samples/sec Loss 1.8924 Epoch: 16 Global 
Step: 267200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:34:42,120-Speed 4911.45 samples/sec Loss 1.8613 Epoch: 16 Global Step: 267250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:34:52,556-Speed 4906.27 samples/sec Loss 1.8687 Epoch: 16 Global Step: 267300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:35:03,100-Speed 4856.37 samples/sec Loss 1.8944 Epoch: 16 Global Step: 267350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:35:13,458-Speed 4943.12 samples/sec Loss 1.8722 Epoch: 16 Global Step: 267400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:35:23,913-Speed 4897.59 samples/sec Loss 1.9122 Epoch: 16 Global Step: 267450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:35:34,778-Speed 4712.56 samples/sec Loss 1.8630 Epoch: 16 Global Step: 267500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:35:45,201-Speed 4912.34 samples/sec Loss 1.9103 Epoch: 16 Global Step: 267550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:35:55,362-Speed 5039.53 samples/sec Loss 1.8663 Epoch: 16 Global Step: 267600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:36:06,136-Speed 4752.39 samples/sec Loss 1.9052 Epoch: 16 Global Step: 267650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:36:16,512-Speed 4934.83 samples/sec Loss 1.9093 Epoch: 16 Global Step: 267700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:36:27,870-Speed 4508.27 samples/sec Loss 1.8599 Epoch: 16 Global Step: 267750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:36:38,411-Speed 4857.11 samples/sec Loss 1.8789 Epoch: 16 Global Step: 267800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:36:48,593-Speed 5029.19 samples/sec Loss 1.9003 Epoch: 16 Global Step: 267850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:36:59,026-Speed 4907.38 samples/sec Loss 1.8691 Epoch: 16 
Global Step: 267900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:37:09,062-Speed 5102.31 samples/sec Loss 1.8772 Epoch: 16 Global Step: 267950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:37:19,395-Speed 4954.96 samples/sec Loss 1.8943 Epoch: 16 Global Step: 268000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:37:36,405-[lfw][268000]XNorm: 23.242159 Training: 2021-03-19 13:37:36,406-[lfw][268000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 13:37:36,406-[lfw][268000]Accuracy-Highest: 0.99767 Training: 2021-03-19 13:37:55,351-[cfp_fp][268000]XNorm: 19.540777 Training: 2021-03-19 13:37:55,351-[cfp_fp][268000]Accuracy-Flip: 0.97500+-0.00888 Training: 2021-03-19 13:37:55,351-[cfp_fp][268000]Accuracy-Highest: 0.97700 Training: 2021-03-19 13:38:11,519-[agedb_30][268000]XNorm: 22.484453 Training: 2021-03-19 13:38:11,519-[agedb_30][268000]Accuracy-Flip: 0.97400+-0.00750 Training: 2021-03-19 13:38:11,519-[agedb_30][268000]Accuracy-Highest: 0.97667 Training: 2021-03-19 13:38:22,962-Speed 805.47 samples/sec Loss 1.9170 Epoch: 16 Global Step: 268050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:38:33,829-Speed 4711.86 samples/sec Loss 1.8845 Epoch: 16 Global Step: 268100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:38:44,316-Speed 4882.48 samples/sec Loss 1.8713 Epoch: 16 Global Step: 268150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:38:55,250-Speed 4682.83 samples/sec Loss 1.8808 Epoch: 16 Global Step: 268200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:39:05,551-Speed 4970.92 samples/sec Loss 1.8760 Epoch: 16 Global Step: 268250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:39:15,750-Speed 5020.54 samples/sec Loss 1.9159 Epoch: 16 Global Step: 268300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:39:26,786-Speed 4639.30 samples/sec Loss 1.8907 Epoch: 16 Global Step: 268350 Fp16 Grad Scale: 
16384 Required: 4 hours Training: 2021-03-19 13:39:36,900-Speed 5063.04 samples/sec Loss 1.8844 Epoch: 16 Global Step: 268400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:39:47,231-Speed 4956.15 samples/sec Loss 1.8668 Epoch: 16 Global Step: 268450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:39:58,316-Speed 4619.30 samples/sec Loss 1.8976 Epoch: 16 Global Step: 268500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:40:09,395-Speed 4621.25 samples/sec Loss 1.8830 Epoch: 16 Global Step: 268550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:40:19,549-Speed 5042.98 samples/sec Loss 1.8814 Epoch: 16 Global Step: 268600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:40:29,869-Speed 4961.16 samples/sec Loss 1.8806 Epoch: 16 Global Step: 268650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:40:40,147-Speed 4982.18 samples/sec Loss 1.8977 Epoch: 16 Global Step: 268700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:40:50,295-Speed 5045.57 samples/sec Loss 1.8702 Epoch: 16 Global Step: 268750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:41:00,875-Speed 4839.44 samples/sec Loss 1.8833 Epoch: 16 Global Step: 268800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:41:11,992-Speed 4605.91 samples/sec Loss 1.8729 Epoch: 16 Global Step: 268850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:41:22,202-Speed 5015.02 samples/sec Loss 1.9054 Epoch: 16 Global Step: 268900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:41:32,696-Speed 4879.79 samples/sec Loss 1.8813 Epoch: 16 Global Step: 268950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:41:43,004-Speed 4967.33 samples/sec Loss 1.8811 Epoch: 16 Global Step: 269000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:41:53,260-Speed 4992.57 samples/sec Loss 1.8788 Epoch: 16 Global Step: 269050 Fp16 Grad 
Scale: 16384 Required: 4 hours Training: 2021-03-19 13:42:03,590-Speed 4956.64 samples/sec Loss 1.9062 Epoch: 16 Global Step: 269100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:42:14,102-Speed 4870.91 samples/sec Loss 1.8893 Epoch: 16 Global Step: 269150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:42:24,357-Speed 4993.20 samples/sec Loss 1.9161 Epoch: 16 Global Step: 269200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:42:34,910-Speed 4852.17 samples/sec Loss 1.9008 Epoch: 16 Global Step: 269250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:42:45,318-Speed 4919.44 samples/sec Loss 1.8863 Epoch: 16 Global Step: 269300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:42:56,041-Speed 4775.09 samples/sec Loss 1.9045 Epoch: 16 Global Step: 269350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:43:06,452-Speed 4918.20 samples/sec Loss 1.8805 Epoch: 16 Global Step: 269400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:43:16,626-Speed 5032.93 samples/sec Loss 1.9049 Epoch: 16 Global Step: 269450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:43:27,132-Speed 4873.64 samples/sec Loss 1.9065 Epoch: 16 Global Step: 269500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:43:37,403-Speed 4985.25 samples/sec Loss 1.9020 Epoch: 16 Global Step: 269550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:43:47,948-Speed 4855.57 samples/sec Loss 1.9177 Epoch: 16 Global Step: 269600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:43:58,208-Speed 4990.42 samples/sec Loss 1.9028 Epoch: 16 Global Step: 269650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:44:08,536-Speed 4958.09 samples/sec Loss 1.9095 Epoch: 16 Global Step: 269700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:44:19,094-Speed 4849.85 samples/sec Loss 1.9118 Epoch: 16 Global Step: 269750 Fp16 
Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:44:29,235-Speed 5048.74 samples/sec Loss 1.9150 Epoch: 16 Global Step: 269800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:44:39,685-Speed 4899.90 samples/sec Loss 1.9071 Epoch: 16 Global Step: 269850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:44:50,183-Speed 4877.72 samples/sec Loss 1.8790 Epoch: 16 Global Step: 269900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:45:00,778-Speed 4832.87 samples/sec Loss 1.8919 Epoch: 16 Global Step: 269950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:45:11,389-Speed 4825.55 samples/sec Loss 1.9169 Epoch: 16 Global Step: 270000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:45:28,435-[lfw][270000]XNorm: 23.247112 Training: 2021-03-19 13:45:28,435-[lfw][270000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 13:45:28,435-[lfw][270000]Accuracy-Highest: 0.99767 Training: 2021-03-19 13:45:47,148-[cfp_fp][270000]XNorm: 19.565434 Training: 2021-03-19 13:45:47,149-[cfp_fp][270000]Accuracy-Flip: 0.97714+-0.00826 Training: 2021-03-19 13:45:47,149-[cfp_fp][270000]Accuracy-Highest: 0.97714 Training: 2021-03-19 13:46:03,302-[agedb_30][270000]XNorm: 22.504135 Training: 2021-03-19 13:46:03,302-[agedb_30][270000]Accuracy-Flip: 0.97500+-0.00778 Training: 2021-03-19 13:46:03,302-[agedb_30][270000]Accuracy-Highest: 0.97667 Training: 2021-03-19 13:46:13,394-Speed 825.75 samples/sec Loss 1.8888 Epoch: 16 Global Step: 270050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:46:23,492-Speed 5070.34 samples/sec Loss 1.8820 Epoch: 16 Global Step: 270100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:46:33,805-Speed 4965.26 samples/sec Loss 1.8705 Epoch: 16 Global Step: 270150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:46:44,386-Speed 4839.22 samples/sec Loss 1.8929 Epoch: 16 Global Step: 270200 Fp16 Grad Scale: 16384 Required: 4 hours 
Training: 2021-03-19 13:46:54,632-Speed 4997.00 samples/sec Loss 1.9205 Epoch: 16 Global Step: 270250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:47:04,882-Speed 4995.59 samples/sec Loss 1.8702 Epoch: 16 Global Step: 270300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:47:15,087-Speed 5017.46 samples/sec Loss 1.8551 Epoch: 16 Global Step: 270350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:47:25,369-Speed 4980.22 samples/sec Loss 1.9008 Epoch: 16 Global Step: 270400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:47:35,872-Speed 4875.28 samples/sec Loss 1.8886 Epoch: 16 Global Step: 270450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:47:46,277-Speed 4921.14 samples/sec Loss 1.8976 Epoch: 16 Global Step: 270500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:47:56,514-Speed 5001.62 samples/sec Loss 1.8915 Epoch: 16 Global Step: 270550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:48:06,746-Speed 5004.11 samples/sec Loss 1.8934 Epoch: 16 Global Step: 270600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:48:16,986-Speed 5000.70 samples/sec Loss 1.8801 Epoch: 16 Global Step: 270650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:48:27,136-Speed 5044.68 samples/sec Loss 1.8814 Epoch: 16 Global Step: 270700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:48:37,594-Speed 4896.00 samples/sec Loss 1.8901 Epoch: 16 Global Step: 270750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:48:48,059-Speed 4892.61 samples/sec Loss 1.8956 Epoch: 16 Global Step: 270800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:48:59,627-Speed 4426.49 samples/sec Loss 1.9037 Epoch: 16 Global Step: 270850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:49:10,024-Speed 4924.36 samples/sec Loss 1.8887 Epoch: 16 Global Step: 270900 Fp16 Grad Scale: 16384 Required: 4 
hours Training: 2021-03-19 13:49:20,274-Speed 4995.77 samples/sec Loss 1.9071 Epoch: 16 Global Step: 270950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:49:30,625-Speed 4946.38 samples/sec Loss 1.8692 Epoch: 16 Global Step: 271000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:49:40,760-Speed 5052.16 samples/sec Loss 1.9095 Epoch: 16 Global Step: 271050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:49:51,175-Speed 4916.44 samples/sec Loss 1.8964 Epoch: 16 Global Step: 271100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:50:01,917-Speed 4766.92 samples/sec Loss 1.9046 Epoch: 16 Global Step: 271150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:50:12,272-Speed 4944.75 samples/sec Loss 1.8919 Epoch: 16 Global Step: 271200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:50:22,734-Speed 4894.25 samples/sec Loss 1.9337 Epoch: 16 Global Step: 271250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:50:32,969-Speed 5002.81 samples/sec Loss 1.8781 Epoch: 16 Global Step: 271300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:50:43,586-Speed 4822.55 samples/sec Loss 1.8923 Epoch: 16 Global Step: 271350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:50:55,101-Speed 4446.49 samples/sec Loss 1.8782 Epoch: 16 Global Step: 271400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:51:05,462-Speed 4942.03 samples/sec Loss 1.9004 Epoch: 16 Global Step: 271450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:51:16,839-Speed 4500.46 samples/sec Loss 1.8845 Epoch: 16 Global Step: 271500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:51:27,184-Speed 4949.76 samples/sec Loss 1.8776 Epoch: 16 Global Step: 271550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:51:37,508-Speed 4959.86 samples/sec Loss 1.8984 Epoch: 16 Global Step: 271600 Fp16 Grad Scale: 16384 Required: 
4 hours Training: 2021-03-19 13:51:48,794-Speed 4537.05 samples/sec Loss 1.8927 Epoch: 16 Global Step: 271650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:51:59,297-Speed 4875.11 samples/sec Loss 1.8869 Epoch: 16 Global Step: 271700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:52:09,929-Speed 4816.22 samples/sec Loss 1.9149 Epoch: 16 Global Step: 271750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:52:20,228-Speed 4971.64 samples/sec Loss 1.9170 Epoch: 16 Global Step: 271800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:52:30,563-Speed 4954.26 samples/sec Loss 1.9195 Epoch: 16 Global Step: 271850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:52:41,768-Speed 4569.80 samples/sec Loss 1.8903 Epoch: 16 Global Step: 271900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:52:52,799-Speed 4641.76 samples/sec Loss 1.8906 Epoch: 16 Global Step: 271950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:53:02,864-Speed 5087.15 samples/sec Loss 1.8810 Epoch: 16 Global Step: 272000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:53:19,778-[lfw][272000]XNorm: 23.232589 Training: 2021-03-19 13:53:19,778-[lfw][272000]Accuracy-Flip: 0.99667+-0.00279 Training: 2021-03-19 13:53:19,778-[lfw][272000]Accuracy-Highest: 0.99767 Training: 2021-03-19 13:53:38,563-[cfp_fp][272000]XNorm: 19.504262 Training: 2021-03-19 13:53:38,563-[cfp_fp][272000]Accuracy-Flip: 0.97643+-0.00872 Training: 2021-03-19 13:53:38,564-[cfp_fp][272000]Accuracy-Highest: 0.97714 Training: 2021-03-19 13:53:54,851-[agedb_30][272000]XNorm: 22.494213 Training: 2021-03-19 13:53:54,851-[agedb_30][272000]Accuracy-Flip: 0.97383+-0.00749 Training: 2021-03-19 13:53:54,851-[agedb_30][272000]Accuracy-Highest: 0.97667 Training: 2021-03-19 13:54:05,132-Speed 822.26 samples/sec Loss 1.8861 Epoch: 16 Global Step: 272050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 
13:54:15,791-Speed 4803.46 samples/sec Loss 1.9020 Epoch: 16 Global Step: 272100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:54:25,894-Speed 5068.14 samples/sec Loss 1.8829 Epoch: 16 Global Step: 272150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:54:36,465-Speed 4844.08 samples/sec Loss 1.9083 Epoch: 16 Global Step: 272200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:54:46,704-Speed 5000.83 samples/sec Loss 1.9308 Epoch: 16 Global Step: 272250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:54:57,669-Speed 4669.36 samples/sec Loss 1.8724 Epoch: 16 Global Step: 272300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:55:07,736-Speed 5086.46 samples/sec Loss 1.8900 Epoch: 16 Global Step: 272350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:55:18,071-Speed 4953.99 samples/sec Loss 1.8986 Epoch: 16 Global Step: 272400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:55:28,778-Speed 4782.41 samples/sec Loss 1.8992 Epoch: 16 Global Step: 272450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:55:38,718-Speed 5151.22 samples/sec Loss 1.8540 Epoch: 16 Global Step: 272500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:55:49,429-Speed 4780.70 samples/sec Loss 1.8889 Epoch: 16 Global Step: 272550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:55:59,747-Speed 4962.45 samples/sec Loss 1.9018 Epoch: 16 Global Step: 272600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:56:10,562-Speed 4734.39 samples/sec Loss 1.9009 Epoch: 16 Global Step: 272650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:56:21,473-Speed 4693.04 samples/sec Loss 1.8993 Epoch: 16 Global Step: 272700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:56:32,089-Speed 4823.50 samples/sec Loss 1.8751 Epoch: 16 Global Step: 272750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 
2021-03-19 13:56:42,500-Speed 4917.92 samples/sec Loss 1.9012 Epoch: 16 Global Step: 272800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:56:52,859-Speed 4942.80 samples/sec Loss 1.9126 Epoch: 16 Global Step: 272850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:57:03,584-Speed 4774.63 samples/sec Loss 1.8854 Epoch: 16 Global Step: 272900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:57:13,862-Speed 4981.70 samples/sec Loss 1.8854 Epoch: 16 Global Step: 272950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:57:24,649-Speed 4746.54 samples/sec Loss 1.9132 Epoch: 16 Global Step: 273000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:57:34,895-Speed 4997.45 samples/sec Loss 1.8981 Epoch: 16 Global Step: 273050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:57:45,436-Speed 4858.08 samples/sec Loss 1.9003 Epoch: 16 Global Step: 273100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:57:55,652-Speed 5011.78 samples/sec Loss 1.8571 Epoch: 16 Global Step: 273150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:58:05,864-Speed 5014.42 samples/sec Loss 1.8982 Epoch: 16 Global Step: 273200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:58:16,643-Speed 4750.12 samples/sec Loss 1.8967 Epoch: 16 Global Step: 273250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:58:26,967-Speed 4959.84 samples/sec Loss 1.9040 Epoch: 16 Global Step: 273300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:58:37,481-Speed 4869.81 samples/sec Loss 1.8767 Epoch: 16 Global Step: 273350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:58:47,904-Speed 4912.66 samples/sec Loss 1.9030 Epoch: 16 Global Step: 273400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:58:58,563-Speed 4803.92 samples/sec Loss 1.8999 Epoch: 16 Global Step: 273450 Fp16 Grad Scale: 16384 Required: 4 hours 
Training: 2021-03-19 13:59:09,030-Speed 4891.83 samples/sec Loss 1.9109 Epoch: 16 Global Step: 273500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:59:19,486-Speed 4896.91 samples/sec Loss 1.8884 Epoch: 16 Global Step: 273550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:59:30,151-Speed 4801.21 samples/sec Loss 1.8846 Epoch: 16 Global Step: 273600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:59:40,652-Speed 4875.85 samples/sec Loss 1.8612 Epoch: 16 Global Step: 273650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 13:59:51,229-Speed 4840.95 samples/sec Loss 1.8885 Epoch: 16 Global Step: 273700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:00:01,454-Speed 5007.62 samples/sec Loss 1.9172 Epoch: 16 Global Step: 273750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:00:12,038-Speed 4837.95 samples/sec Loss 1.8850 Epoch: 16 Global Step: 273800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:00:22,372-Speed 4954.59 samples/sec Loss 1.8654 Epoch: 16 Global Step: 273850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:00:32,463-Speed 5074.18 samples/sec Loss 1.8802 Epoch: 16 Global Step: 273900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:00:42,720-Speed 4992.18 samples/sec Loss 1.8858 Epoch: 16 Global Step: 273950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:00:53,156-Speed 4906.52 samples/sec Loss 1.8976 Epoch: 16 Global Step: 274000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:01:10,122-[lfw][274000]XNorm: 23.200434 Training: 2021-03-19 14:01:10,123-[lfw][274000]Accuracy-Flip: 0.99667+-0.00279 Training: 2021-03-19 14:01:10,123-[lfw][274000]Accuracy-Highest: 0.99767 Training: 2021-03-19 14:01:29,131-[cfp_fp][274000]XNorm: 19.540240 Training: 2021-03-19 14:01:29,131-[cfp_fp][274000]Accuracy-Flip: 0.97557+-0.00886 Training: 2021-03-19 
14:01:29,131-[cfp_fp][274000]Accuracy-Highest: 0.97714 Training: 2021-03-19 14:01:45,355-[agedb_30][274000]XNorm: 22.499754 Training: 2021-03-19 14:01:45,355-[agedb_30][274000]Accuracy-Flip: 0.97517+-0.00751 Training: 2021-03-19 14:01:45,355-[agedb_30][274000]Accuracy-Highest: 0.97667 Training: 2021-03-19 14:01:55,342-Speed 823.34 samples/sec Loss 1.8890 Epoch: 16 Global Step: 274050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:02:05,530-Speed 5025.74 samples/sec Loss 1.9044 Epoch: 16 Global Step: 274100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:02:16,019-Speed 4881.53 samples/sec Loss 1.8961 Epoch: 16 Global Step: 274150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:02:26,565-Speed 4855.05 samples/sec Loss 1.8820 Epoch: 16 Global Step: 274200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:02:37,019-Speed 4898.12 samples/sec Loss 1.9187 Epoch: 16 Global Step: 274250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:02:48,303-Speed 4537.86 samples/sec Loss 1.8909 Epoch: 16 Global Step: 274300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:02:59,044-Speed 4767.36 samples/sec Loss 1.9027 Epoch: 16 Global Step: 274350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:03:09,616-Speed 4843.26 samples/sec Loss 1.9140 Epoch: 16 Global Step: 274400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:03:21,042-Speed 4481.09 samples/sec Loss 1.9087 Epoch: 16 Global Step: 274450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:03:31,102-Speed 5090.06 samples/sec Loss 1.9003 Epoch: 16 Global Step: 274500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:03:41,439-Speed 4953.22 samples/sec Loss 1.8685 Epoch: 16 Global Step: 274550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:03:51,963-Speed 4865.40 samples/sec Loss 1.8990 Epoch: 16 Global Step: 274600 Fp16 Grad Scale: 16384 Required: 4 
hours Training: 2021-03-19 14:04:01,976-Speed 5113.69 samples/sec Loss 1.8856 Epoch: 16 Global Step: 274650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:04:12,652-Speed 4796.10 samples/sec Loss 1.8812 Epoch: 16 Global Step: 274700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:04:23,054-Speed 4922.69 samples/sec Loss 1.8911 Epoch: 16 Global Step: 274750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:04:34,312-Speed 4547.97 samples/sec Loss 1.8723 Epoch: 16 Global Step: 274800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:04:45,343-Speed 4641.58 samples/sec Loss 1.8949 Epoch: 16 Global Step: 274850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:04:55,882-Speed 4858.42 samples/sec Loss 1.8669 Epoch: 16 Global Step: 274900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:05:06,195-Speed 4965.34 samples/sec Loss 1.8939 Epoch: 16 Global Step: 274950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:05:16,608-Speed 4917.66 samples/sec Loss 1.9009 Epoch: 16 Global Step: 275000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:05:27,730-Speed 4603.86 samples/sec Loss 1.9170 Epoch: 16 Global Step: 275050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:05:37,690-Speed 5140.87 samples/sec Loss 1.8946 Epoch: 16 Global Step: 275100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:05:47,860-Speed 5034.51 samples/sec Loss 1.8995 Epoch: 16 Global Step: 275150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:05:58,005-Speed 5047.48 samples/sec Loss 1.8677 Epoch: 16 Global Step: 275200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:06:08,361-Speed 4944.02 samples/sec Loss 1.8867 Epoch: 16 Global Step: 275250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:06:19,327-Speed 4669.49 samples/sec Loss 1.8737 Epoch: 16 Global Step: 275300 Fp16 Grad Scale: 16384 Required: 
4 hours Training: 2021-03-19 14:06:30,217-Speed 4702.04 samples/sec Loss 1.9044 Epoch: 16 Global Step: 275350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:06:40,643-Speed 4911.02 samples/sec Loss 1.8741 Epoch: 16 Global Step: 275400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:06:50,837-Speed 5022.80 samples/sec Loss 1.8853 Epoch: 16 Global Step: 275450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:07:01,428-Speed 4834.80 samples/sec Loss 1.8982 Epoch: 16 Global Step: 275500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:07:11,677-Speed 4995.70 samples/sec Loss 1.9125 Epoch: 16 Global Step: 275550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:07:22,006-Speed 4957.49 samples/sec Loss 1.8843 Epoch: 16 Global Step: 275600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:07:32,456-Speed 4899.55 samples/sec Loss 1.9182 Epoch: 16 Global Step: 275650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:07:42,993-Speed 4859.66 samples/sec Loss 1.9187 Epoch: 16 Global Step: 275700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:07:53,984-Speed 4658.66 samples/sec Loss 1.8828 Epoch: 16 Global Step: 275750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:08:04,470-Speed 4883.12 samples/sec Loss 1.9004 Epoch: 16 Global Step: 275800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:08:15,156-Speed 4792.27 samples/sec Loss 1.9166 Epoch: 16 Global Step: 275850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:08:25,591-Speed 4907.02 samples/sec Loss 1.8750 Epoch: 16 Global Step: 275900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:08:35,992-Speed 4923.66 samples/sec Loss 1.8853 Epoch: 16 Global Step: 275950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:08:46,436-Speed 4902.84 samples/sec Loss 1.8919 Epoch: 16 Global Step: 276000 Fp16 Grad Scale: 16384 
Required: 4 hours Training: 2021-03-19 14:09:04,581-[lfw][276000]XNorm: 23.241852 Training: 2021-03-19 14:09:04,582-[lfw][276000]Accuracy-Flip: 0.99667+-0.00279 Training: 2021-03-19 14:09:04,582-[lfw][276000]Accuracy-Highest: 0.99767 Training: 2021-03-19 14:09:25,724-[cfp_fp][276000]XNorm: 19.536028 Training: 2021-03-19 14:09:25,725-[cfp_fp][276000]Accuracy-Flip: 0.97629+-0.00890 Training: 2021-03-19 14:09:25,725-[cfp_fp][276000]Accuracy-Highest: 0.97714 Training: 2021-03-19 14:09:43,894-[agedb_30][276000]XNorm: 22.486978 Training: 2021-03-19 14:09:43,894-[agedb_30][276000]Accuracy-Flip: 0.97450+-0.00742 Training: 2021-03-19 14:09:43,894-[agedb_30][276000]Accuracy-Highest: 0.97667 Training: 2021-03-19 14:09:54,312-Speed 754.33 samples/sec Loss 1.9011 Epoch: 16 Global Step: 276050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:10:04,975-Speed 4801.98 samples/sec Loss 1.8803 Epoch: 16 Global Step: 276100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:10:15,648-Speed 4797.41 samples/sec Loss 1.8887 Epoch: 16 Global Step: 276150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:10:26,545-Speed 4698.94 samples/sec Loss 1.9064 Epoch: 16 Global Step: 276200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:10:37,108-Speed 4847.49 samples/sec Loss 1.8948 Epoch: 16 Global Step: 276250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:10:47,842-Speed 4770.09 samples/sec Loss 1.9077 Epoch: 16 Global Step: 276300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:10:58,536-Speed 4788.21 samples/sec Loss 1.8997 Epoch: 16 Global Step: 276350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:11:09,150-Speed 4823.84 samples/sec Loss 1.8940 Epoch: 16 Global Step: 276400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:11:19,770-Speed 4821.68 samples/sec Loss 1.8827 Epoch: 16 Global Step: 276450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 
14:11:30,408-Speed 4812.89 samples/sec Loss 1.8866 Epoch: 16 Global Step: 276500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:11:41,088-Speed 4794.49 samples/sec Loss 1.8834 Epoch: 16 Global Step: 276550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:11:52,128-Speed 4637.79 samples/sec Loss 1.8882 Epoch: 16 Global Step: 276600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:12:02,659-Speed 4862.31 samples/sec Loss 1.9081 Epoch: 16 Global Step: 276650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:12:13,587-Speed 4685.84 samples/sec Loss 1.8903 Epoch: 16 Global Step: 276700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:12:24,257-Speed 4798.72 samples/sec Loss 1.9173 Epoch: 16 Global Step: 276750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:12:35,079-Speed 4731.38 samples/sec Loss 1.9055 Epoch: 16 Global Step: 276800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:12:45,847-Speed 4755.07 samples/sec Loss 1.9097 Epoch: 16 Global Step: 276850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:12:56,367-Speed 4867.44 samples/sec Loss 1.9159 Epoch: 16 Global Step: 276900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:13:07,070-Speed 4783.94 samples/sec Loss 1.8866 Epoch: 16 Global Step: 276950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:13:17,837-Speed 4755.48 samples/sec Loss 1.8668 Epoch: 16 Global Step: 277000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:13:28,522-Speed 4793.19 samples/sec Loss 1.9034 Epoch: 16 Global Step: 277050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:13:39,457-Speed 4682.51 samples/sec Loss 1.8844 Epoch: 16 Global Step: 277100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:13:50,197-Speed 4767.47 samples/sec Loss 1.8839 Epoch: 16 Global Step: 277150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 
2021-03-19 14:14:01,025-Speed 4728.97 samples/sec Loss 1.8990 Epoch: 16 Global Step: 277200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:14:11,903-Speed 4706.87 samples/sec Loss 1.8844 Epoch: 16 Global Step: 277250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:14:22,674-Speed 4754.87 samples/sec Loss 1.8867 Epoch: 16 Global Step: 277300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:14:33,609-Speed 4682.37 samples/sec Loss 1.8817 Epoch: 16 Global Step: 277350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:14:44,389-Speed 4750.00 samples/sec Loss 1.9051 Epoch: 16 Global Step: 277400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:14:55,290-Speed 4696.88 samples/sec Loss 1.8899 Epoch: 16 Global Step: 277450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:15:06,122-Speed 4726.86 samples/sec Loss 1.8866 Epoch: 16 Global Step: 277500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:15:16,899-Speed 4751.16 samples/sec Loss 1.9167 Epoch: 16 Global Step: 277550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:15:27,678-Speed 4750.43 samples/sec Loss 1.8792 Epoch: 16 Global Step: 277600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:15:39,372-Speed 4378.71 samples/sec Loss 1.9126 Epoch: 16 Global Step: 277650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:15:50,317-Speed 4677.96 samples/sec Loss 1.8751 Epoch: 16 Global Step: 277700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:16:01,861-Speed 4435.75 samples/sec Loss 1.8995 Epoch: 16 Global Step: 277750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:16:12,569-Speed 4781.48 samples/sec Loss 1.9005 Epoch: 16 Global Step: 277800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:16:23,329-Speed 4758.56 samples/sec Loss 1.9049 Epoch: 16 Global Step: 277850 Fp16 Grad Scale: 16384 Required: 4 hours 
Training: 2021-03-19 14:16:34,167-Speed 4724.55 samples/sec Loss 1.8988 Epoch: 16 Global Step: 277900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:16:45,121-Speed 4674.39 samples/sec Loss 1.8729 Epoch: 16 Global Step: 277950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:16:55,939-Speed 4733.19 samples/sec Loss 1.9165 Epoch: 16 Global Step: 278000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:17:14,283-[lfw][278000]XNorm: 23.226189 Training: 2021-03-19 14:17:14,283-[lfw][278000]Accuracy-Flip: 0.99650+-0.00273 Training: 2021-03-19 14:17:14,283-[lfw][278000]Accuracy-Highest: 0.99767 Training: 2021-03-19 14:17:34,943-[cfp_fp][278000]XNorm: 19.512714 Training: 2021-03-19 14:17:34,943-[cfp_fp][278000]Accuracy-Flip: 0.97600+-0.00925 Training: 2021-03-19 14:17:34,944-[cfp_fp][278000]Accuracy-Highest: 0.97714 Training: 2021-03-19 14:17:52,611-[agedb_30][278000]XNorm: 22.465284 Training: 2021-03-19 14:17:52,611-[agedb_30][278000]Accuracy-Flip: 0.97517+-0.00790 Training: 2021-03-19 14:17:52,611-[agedb_30][278000]Accuracy-Highest: 0.97667 Training: 2021-03-19 14:18:03,375-Speed 759.24 samples/sec Loss 1.9006 Epoch: 16 Global Step: 278050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:18:14,351-Speed 4665.10 samples/sec Loss 1.9091 Epoch: 16 Global Step: 278100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:18:25,860-Speed 4448.77 samples/sec Loss 1.9048 Epoch: 16 Global Step: 278150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:18:37,460-Speed 4414.36 samples/sec Loss 1.8978 Epoch: 16 Global Step: 278200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:18:48,229-Speed 4755.53 samples/sec Loss 1.8815 Epoch: 16 Global Step: 278250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:18:59,205-Speed 4664.86 samples/sec Loss 1.8805 Epoch: 16 Global Step: 278300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:19:10,777-Speed 
4424.67 samples/sec Loss 1.8600 Epoch: 16 Global Step: 278350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:19:21,691-Speed 4691.95 samples/sec Loss 1.8920 Epoch: 16 Global Step: 278400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:19:32,446-Speed 4760.48 samples/sec Loss 1.8810 Epoch: 16 Global Step: 278450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:19:43,246-Speed 4741.38 samples/sec Loss 1.8901 Epoch: 16 Global Step: 278500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:19:54,864-Speed 4406.93 samples/sec Loss 1.8844 Epoch: 16 Global Step: 278550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:20:06,380-Speed 4446.29 samples/sec Loss 1.8884 Epoch: 16 Global Step: 278600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:20:17,191-Speed 4736.00 samples/sec Loss 1.8897 Epoch: 16 Global Step: 278650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:20:28,040-Speed 4719.87 samples/sec Loss 1.8993 Epoch: 16 Global Step: 278700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:20:38,916-Speed 4707.60 samples/sec Loss 1.8698 Epoch: 16 Global Step: 278750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:20:49,669-Speed 4761.75 samples/sec Loss 1.9093 Epoch: 16 Global Step: 278800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:21:00,470-Speed 4740.39 samples/sec Loss 1.9035 Epoch: 16 Global Step: 278850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:21:11,244-Speed 4753.32 samples/sec Loss 1.8756 Epoch: 16 Global Step: 278900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:21:21,958-Speed 4779.01 samples/sec Loss 1.9071 Epoch: 16 Global Step: 278950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:21:33,626-Speed 4388.48 samples/sec Loss 1.9033 Epoch: 16 Global Step: 279000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 
14:21:44,388-Speed 4758.19 samples/sec Loss 1.9117 Epoch: 16 Global Step: 279050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-19 14:21:55,080-Speed 4789.18 samples/sec Loss 1.8858 Epoch: 16 Global Step: 279100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:22:05,782-Speed 4784.18 samples/sec Loss 1.8894 Epoch: 16 Global Step: 279150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:22:16,551-Speed 4755.30 samples/sec Loss 1.8804 Epoch: 16 Global Step: 279200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:22:27,426-Speed 4708.59 samples/sec Loss 1.8857 Epoch: 16 Global Step: 279250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:22:38,223-Speed 4742.19 samples/sec Loss 1.8933 Epoch: 16 Global Step: 279300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:22:48,980-Speed 4759.86 samples/sec Loss 1.9007 Epoch: 16 Global Step: 279350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:22:59,663-Speed 4793.20 samples/sec Loss 1.8982 Epoch: 16 Global Step: 279400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:23:10,576-Speed 4692.04 samples/sec Loss 1.8830 Epoch: 16 Global Step: 279450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:23:21,310-Speed 4770.19 samples/sec Loss 1.8926 Epoch: 16 Global Step: 279500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:23:31,807-Speed 4877.96 samples/sec Loss 1.9096 Epoch: 16 Global Step: 279550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:23:42,612-Speed 4739.10 samples/sec Loss 1.8815 Epoch: 16 Global Step: 279600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:23:53,410-Speed 4742.14 samples/sec Loss 1.9044 Epoch: 16 Global Step: 279650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:24:04,343-Speed 4683.11 samples/sec Loss 1.8712 Epoch: 16 Global Step: 279700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 
2021-03-19 14:24:15,056-Speed 4779.70 samples/sec Loss 1.8891 Epoch: 16 Global Step: 279750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:24:25,982-Speed 4686.30 samples/sec Loss 1.8933 Epoch: 16 Global Step: 279800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:24:36,660-Speed 4795.28 samples/sec Loss 1.9165 Epoch: 16 Global Step: 279850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:24:47,456-Speed 4743.63 samples/sec Loss 1.9033 Epoch: 16 Global Step: 279900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:24:58,190-Speed 4771.08 samples/sec Loss 1.8994 Epoch: 16 Global Step: 279950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:25:08,929-Speed 4767.96 samples/sec Loss 1.8814 Epoch: 16 Global Step: 280000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:25:27,747-[lfw][280000]XNorm: 23.141174 Training: 2021-03-19 14:25:27,747-[lfw][280000]Accuracy-Flip: 0.99667+-0.00269 Training: 2021-03-19 14:25:27,747-[lfw][280000]Accuracy-Highest: 0.99767 Training: 2021-03-19 14:25:48,094-[cfp_fp][280000]XNorm: 19.481838 Training: 2021-03-19 14:25:48,095-[cfp_fp][280000]Accuracy-Flip: 0.97686+-0.00971 Training: 2021-03-19 14:25:48,095-[cfp_fp][280000]Accuracy-Highest: 0.97714 Training: 2021-03-19 14:26:05,697-[agedb_30][280000]XNorm: 22.427401 Training: 2021-03-19 14:26:05,698-[agedb_30][280000]Accuracy-Flip: 0.97417+-0.00750 Training: 2021-03-19 14:26:05,698-[agedb_30][280000]Accuracy-Highest: 0.97667 Training: 2021-03-19 14:26:16,388-Speed 759.00 samples/sec Loss 1.9199 Epoch: 16 Global Step: 280050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:26:27,146-Speed 4759.59 samples/sec Loss 1.8788 Epoch: 16 Global Step: 280100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:26:38,022-Speed 4708.00 samples/sec Loss 1.8806 Epoch: 16 Global Step: 280150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:26:48,841-Speed 4732.51 
samples/sec Loss 1.8977 Epoch: 16 Global Step: 280200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:26:59,346-Speed 4873.99 samples/sec Loss 1.8745 Epoch: 16 Global Step: 280250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:27:10,039-Speed 4788.46 samples/sec Loss 1.8944 Epoch: 16 Global Step: 280300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:27:20,744-Speed 4783.41 samples/sec Loss 1.8745 Epoch: 16 Global Step: 280350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:27:31,577-Speed 4726.53 samples/sec Loss 1.9190 Epoch: 16 Global Step: 280400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:27:42,123-Speed 4854.85 samples/sec Loss 1.8850 Epoch: 16 Global Step: 280450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:27:52,945-Speed 4731.93 samples/sec Loss 1.8753 Epoch: 16 Global Step: 280500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:28:03,733-Speed 4746.26 samples/sec Loss 1.8986 Epoch: 16 Global Step: 280550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:28:14,697-Speed 4669.82 samples/sec Loss 1.8939 Epoch: 16 Global Step: 280600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:28:25,412-Speed 4778.86 samples/sec Loss 1.8706 Epoch: 16 Global Step: 280650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:28:36,235-Speed 4731.03 samples/sec Loss 1.8895 Epoch: 16 Global Step: 280700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:28:47,120-Speed 4704.13 samples/sec Loss 1.8808 Epoch: 16 Global Step: 280750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:28:57,861-Speed 4766.89 samples/sec Loss 1.8998 Epoch: 16 Global Step: 280800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:29:08,567-Speed 4783.13 samples/sec Loss 1.8755 Epoch: 16 Global Step: 280850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:29:19,293-Speed 
4773.65 samples/sec Loss 1.8816 Epoch: 16 Global Step: 280900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:29:30,249-Speed 4673.50 samples/sec Loss 1.8998 Epoch: 16 Global Step: 280950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:29:41,779-Speed 4441.09 samples/sec Loss 1.9160 Epoch: 16 Global Step: 281000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:29:53,133-Speed 4509.61 samples/sec Loss 1.8857 Epoch: 16 Global Step: 281050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:30:04,479-Speed 4512.86 samples/sec Loss 1.9010 Epoch: 16 Global Step: 281100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:30:16,188-Speed 4373.10 samples/sec Loss 1.9153 Epoch: 16 Global Step: 281150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:30:27,460-Speed 4542.23 samples/sec Loss 1.8874 Epoch: 16 Global Step: 281200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:30:38,628-Speed 4584.86 samples/sec Loss 1.8919 Epoch: 16 Global Step: 281250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:30:50,569-Speed 4288.09 samples/sec Loss 1.9072 Epoch: 16 Global Step: 281300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:31:01,860-Speed 4534.70 samples/sec Loss 1.8867 Epoch: 16 Global Step: 281350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:31:13,005-Speed 4594.79 samples/sec Loss 1.8986 Epoch: 16 Global Step: 281400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:31:24,703-Speed 4376.84 samples/sec Loss 1.8883 Epoch: 16 Global Step: 281450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:31:36,052-Speed 4511.65 samples/sec Loss 1.8996 Epoch: 16 Global Step: 281500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:31:47,681-Speed 4403.03 samples/sec Loss 1.8690 Epoch: 16 Global Step: 281550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 
14:31:59,077-Speed 4493.19 samples/sec Loss 1.9040 Epoch: 16 Global Step: 281600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:32:10,254-Speed 4581.47 samples/sec Loss 1.8824 Epoch: 16 Global Step: 281650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:32:22,144-Speed 4306.24 samples/sec Loss 1.9119 Epoch: 16 Global Step: 281700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:32:32,997-Speed 4717.59 samples/sec Loss 1.9042 Epoch: 16 Global Step: 281750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:32:44,976-Speed 4274.74 samples/sec Loss 1.8869 Epoch: 16 Global Step: 281800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:32:55,750-Speed 4752.41 samples/sec Loss 1.8968 Epoch: 16 Global Step: 281850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:33:06,174-Speed 4911.94 samples/sec Loss 1.8799 Epoch: 16 Global Step: 281900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:33:17,271-Speed 4614.23 samples/sec Loss 1.8929 Epoch: 16 Global Step: 281950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:33:27,738-Speed 4891.86 samples/sec Loss 1.8902 Epoch: 16 Global Step: 282000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:33:44,465-[lfw][282000]XNorm: 23.249178 Training: 2021-03-19 14:33:44,466-[lfw][282000]Accuracy-Flip: 0.99667+-0.00289 Training: 2021-03-19 14:33:44,466-[lfw][282000]Accuracy-Highest: 0.99767 Training: 2021-03-19 14:34:03,226-[cfp_fp][282000]XNorm: 19.566843 Training: 2021-03-19 14:34:03,227-[cfp_fp][282000]Accuracy-Flip: 0.97714+-0.00798 Training: 2021-03-19 14:34:03,227-[cfp_fp][282000]Accuracy-Highest: 0.97714 Training: 2021-03-19 14:34:19,507-[agedb_30][282000]XNorm: 22.517329 Training: 2021-03-19 14:34:19,507-[agedb_30][282000]Accuracy-Flip: 0.97550+-0.00778 Training: 2021-03-19 14:34:19,507-[agedb_30][282000]Accuracy-Highest: 0.97667 Training: 2021-03-19 14:34:29,788-Speed 825.16 samples/sec 
Loss 1.8735 Epoch: 16 Global Step: 282050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:34:40,256-Speed 4891.29 samples/sec Loss 1.9024 Epoch: 16 Global Step: 282100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:34:51,484-Speed 4560.12 samples/sec Loss 1.8857 Epoch: 16 Global Step: 282150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:35:01,985-Speed 4875.92 samples/sec Loss 1.8817 Epoch: 16 Global Step: 282200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:35:12,301-Speed 4963.66 samples/sec Loss 1.9170 Epoch: 16 Global Step: 282250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:35:22,747-Speed 4901.62 samples/sec Loss 1.8782 Epoch: 16 Global Step: 282300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:35:33,107-Speed 4942.62 samples/sec Loss 1.9162 Epoch: 16 Global Step: 282350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:35:43,506-Speed 4923.52 samples/sec Loss 1.9097 Epoch: 16 Global Step: 282400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:35:53,928-Speed 4913.15 samples/sec Loss 1.8818 Epoch: 16 Global Step: 282450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:36:04,217-Speed 4976.17 samples/sec Loss 1.8795 Epoch: 16 Global Step: 282500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:36:14,808-Speed 4834.56 samples/sec Loss 1.8935 Epoch: 16 Global Step: 282550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:36:25,360-Speed 4852.38 samples/sec Loss 1.9090 Epoch: 16 Global Step: 282600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:36:35,709-Speed 4947.70 samples/sec Loss 1.8881 Epoch: 16 Global Step: 282650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:36:46,001-Speed 4974.95 samples/sec Loss 1.9046 Epoch: 16 Global Step: 282700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:36:56,483-Speed 4884.79 
samples/sec Loss 1.8712 Epoch: 16 Global Step: 282750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:37:06,913-Speed 4909.08 samples/sec Loss 1.9076 Epoch: 16 Global Step: 282800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:37:17,331-Speed 4915.05 samples/sec Loss 1.8740 Epoch: 16 Global Step: 282850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:37:27,950-Speed 4821.86 samples/sec Loss 1.8985 Epoch: 16 Global Step: 282900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:37:38,554-Speed 4828.88 samples/sec Loss 1.8830 Epoch: 16 Global Step: 282950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:37:49,197-Speed 4810.88 samples/sec Loss 1.8938 Epoch: 16 Global Step: 283000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:37:59,725-Speed 4863.62 samples/sec Loss 1.8948 Epoch: 16 Global Step: 283050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:38:10,191-Speed 4892.52 samples/sec Loss 1.8871 Epoch: 16 Global Step: 283100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:38:20,632-Speed 4903.86 samples/sec Loss 1.8934 Epoch: 16 Global Step: 283150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:38:30,875-Speed 4998.65 samples/sec Loss 1.9077 Epoch: 16 Global Step: 283200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:38:41,388-Speed 4870.61 samples/sec Loss 1.8898 Epoch: 16 Global Step: 283250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:38:52,024-Speed 4814.36 samples/sec Loss 1.8963 Epoch: 16 Global Step: 283300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:39:02,448-Speed 4912.12 samples/sec Loss 1.8944 Epoch: 16 Global Step: 283350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:39:12,867-Speed 4914.23 samples/sec Loss 1.8983 Epoch: 16 Global Step: 283400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:39:23,418-Speed 
4852.74 samples/sec Loss 1.8921 Epoch: 16 Global Step: 283450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:39:33,978-Speed 4848.86 samples/sec Loss 1.9055 Epoch: 16 Global Step: 283500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:39:44,418-Speed 4904.74 samples/sec Loss 1.9039 Epoch: 16 Global Step: 283550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:39:54,637-Speed 5010.59 samples/sec Loss 1.8892 Epoch: 16 Global Step: 283600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:40:05,254-Speed 4822.91 samples/sec Loss 1.8752 Epoch: 16 Global Step: 283650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:40:15,522-Speed 4986.81 samples/sec Loss 1.8936 Epoch: 16 Global Step: 283700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:40:37,649-Speed 2313.96 samples/sec Loss 1.8840 Epoch: 17 Global Step: 283750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:40:48,190-Speed 4857.76 samples/sec Loss 1.8631 Epoch: 17 Global Step: 283800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:40:58,629-Speed 4904.82 samples/sec Loss 1.8719 Epoch: 17 Global Step: 283850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:41:09,287-Speed 4804.04 samples/sec Loss 1.8883 Epoch: 17 Global Step: 283900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:41:19,591-Speed 4969.48 samples/sec Loss 1.9036 Epoch: 17 Global Step: 283950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:41:29,822-Speed 5004.58 samples/sec Loss 1.8393 Epoch: 17 Global Step: 284000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:41:46,572-[lfw][284000]XNorm: 23.362445 Training: 2021-03-19 14:41:46,572-[lfw][284000]Accuracy-Flip: 0.99700+-0.00314 Training: 2021-03-19 14:41:46,573-[lfw][284000]Accuracy-Highest: 0.99767 Training: 2021-03-19 14:42:05,272-[cfp_fp][284000]XNorm: 19.677985 Training: 2021-03-19 
14:42:05,272-[cfp_fp][284000]Accuracy-Flip: 0.97529+-0.00860 Training: 2021-03-19 14:42:05,272-[cfp_fp][284000]Accuracy-Highest: 0.97714 Training: 2021-03-19 14:42:21,500-[agedb_30][284000]XNorm: 22.636560 Training: 2021-03-19 14:42:21,500-[agedb_30][284000]Accuracy-Flip: 0.97550+-0.00742 Training: 2021-03-19 14:42:21,500-[agedb_30][284000]Accuracy-Highest: 0.97667 Training: 2021-03-19 14:42:31,917-Speed 824.56 samples/sec Loss 1.8626 Epoch: 17 Global Step: 284050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:42:41,983-Speed 5086.58 samples/sec Loss 1.8885 Epoch: 17 Global Step: 284100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:42:52,312-Speed 4957.35 samples/sec Loss 1.8746 Epoch: 17 Global Step: 284150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:43:02,606-Speed 4974.30 samples/sec Loss 1.8693 Epoch: 17 Global Step: 284200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:43:12,598-Speed 5124.02 samples/sec Loss 1.8794 Epoch: 17 Global Step: 284250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:43:22,622-Speed 5108.30 samples/sec Loss 1.8687 Epoch: 17 Global Step: 284300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:43:32,633-Speed 5114.80 samples/sec Loss 1.8861 Epoch: 17 Global Step: 284350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:43:42,729-Speed 5071.24 samples/sec Loss 1.8670 Epoch: 17 Global Step: 284400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:43:53,746-Speed 4648.01 samples/sec Loss 1.8765 Epoch: 17 Global Step: 284450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:44:03,771-Speed 5107.08 samples/sec Loss 1.8491 Epoch: 17 Global Step: 284500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:44:15,285-Speed 4447.21 samples/sec Loss 1.8761 Epoch: 17 Global Step: 284550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:44:25,461-Speed 5031.76 samples/sec 
Loss 1.8655 Epoch: 17 Global Step: 284600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:44:35,752-Speed 4975.55 samples/sec Loss 1.8778 Epoch: 17 Global Step: 284650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:44:46,902-Speed 4592.22 samples/sec Loss 1.8646 Epoch: 17 Global Step: 284700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:44:57,185-Speed 4979.22 samples/sec Loss 1.8835 Epoch: 17 Global Step: 284750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:45:07,316-Speed 5054.48 samples/sec Loss 1.8682 Epoch: 17 Global Step: 284800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:45:18,384-Speed 4626.05 samples/sec Loss 1.8517 Epoch: 17 Global Step: 284850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:45:28,546-Speed 5038.73 samples/sec Loss 1.8400 Epoch: 17 Global Step: 284900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:45:38,634-Speed 5075.67 samples/sec Loss 1.8701 Epoch: 17 Global Step: 284950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:45:48,559-Speed 5159.01 samples/sec Loss 1.8907 Epoch: 17 Global Step: 285000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:45:59,253-Speed 4787.97 samples/sec Loss 1.8605 Epoch: 17 Global Step: 285050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:46:09,394-Speed 5048.82 samples/sec Loss 1.8593 Epoch: 17 Global Step: 285100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:46:19,422-Speed 5106.13 samples/sec Loss 1.8842 Epoch: 17 Global Step: 285150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:46:29,585-Speed 5038.29 samples/sec Loss 1.8845 Epoch: 17 Global Step: 285200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:46:40,585-Speed 4654.52 samples/sec Loss 1.8668 Epoch: 17 Global Step: 285250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:46:51,346-Speed 4758.42 
samples/sec Loss 1.8838 Epoch: 17 Global Step: 285300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:47:01,656-Speed 4966.42 samples/sec Loss 1.9055 Epoch: 17 Global Step: 285350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:47:11,653-Speed 5121.63 samples/sec Loss 1.8364 Epoch: 17 Global Step: 285400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:47:22,236-Speed 4838.23 samples/sec Loss 1.8680 Epoch: 17 Global Step: 285450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:47:32,185-Speed 5146.46 samples/sec Loss 1.8705 Epoch: 17 Global Step: 285500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:47:42,321-Speed 5051.98 samples/sec Loss 1.8694 Epoch: 17 Global Step: 285550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:47:53,164-Speed 4722.02 samples/sec Loss 1.8628 Epoch: 17 Global Step: 285600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:48:03,264-Speed 5069.42 samples/sec Loss 1.8558 Epoch: 17 Global Step: 285650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:48:13,551-Speed 4977.48 samples/sec Loss 1.8565 Epoch: 17 Global Step: 285700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:48:23,513-Speed 5140.01 samples/sec Loss 1.8760 Epoch: 17 Global Step: 285750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:48:33,581-Speed 5085.73 samples/sec Loss 1.8874 Epoch: 17 Global Step: 285800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:48:43,666-Speed 5077.15 samples/sec Loss 1.8394 Epoch: 17 Global Step: 285850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:48:53,639-Speed 5133.78 samples/sec Loss 1.8850 Epoch: 17 Global Step: 285900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:49:03,847-Speed 5015.95 samples/sec Loss 1.8935 Epoch: 17 Global Step: 285950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:49:14,075-Speed 
5006.30 samples/sec Loss 1.8575 Epoch: 17 Global Step: 286000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:49:30,949-[lfw][286000]XNorm: 23.345192 Training: 2021-03-19 14:49:30,949-[lfw][286000]Accuracy-Flip: 0.99700+-0.00277 Training: 2021-03-19 14:49:30,949-[lfw][286000]Accuracy-Highest: 0.99767 Training: 2021-03-19 14:49:49,707-[cfp_fp][286000]XNorm: 19.658560 Training: 2021-03-19 14:49:49,707-[cfp_fp][286000]Accuracy-Flip: 0.97543+-0.00800 Training: 2021-03-19 14:49:49,707-[cfp_fp][286000]Accuracy-Highest: 0.97714 Training: 2021-03-19 14:50:05,860-[agedb_30][286000]XNorm: 22.638798 Training: 2021-03-19 14:50:05,860-[agedb_30][286000]Accuracy-Flip: 0.97450+-0.00719 Training: 2021-03-19 14:50:05,860-[agedb_30][286000]Accuracy-Highest: 0.97667 Training: 2021-03-19 14:50:16,049-Speed 826.15 samples/sec Loss 1.8806 Epoch: 17 Global Step: 286050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:50:25,988-Speed 5151.99 samples/sec Loss 1.8608 Epoch: 17 Global Step: 286100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:50:36,370-Speed 4932.06 samples/sec Loss 1.8585 Epoch: 17 Global Step: 286150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:50:46,712-Speed 4950.78 samples/sec Loss 1.8686 Epoch: 17 Global Step: 286200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:50:56,707-Speed 5123.12 samples/sec Loss 1.8764 Epoch: 17 Global Step: 286250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:51:06,573-Speed 5189.78 samples/sec Loss 1.8770 Epoch: 17 Global Step: 286300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:51:16,604-Speed 5104.48 samples/sec Loss 1.8736 Epoch: 17 Global Step: 286350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:51:26,595-Speed 5124.56 samples/sec Loss 1.8544 Epoch: 17 Global Step: 286400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:51:36,993-Speed 4924.50 samples/sec Loss 1.8977 Epoch: 17 
Global Step: 286450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:51:47,023-Speed 5104.82 samples/sec Loss 1.8759 Epoch: 17 Global Step: 286500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:51:57,268-Speed 4998.20 samples/sec Loss 1.8545 Epoch: 17 Global Step: 286550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:52:07,388-Speed 5059.11 samples/sec Loss 1.8744 Epoch: 17 Global Step: 286600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:52:17,561-Speed 5033.30 samples/sec Loss 1.8589 Epoch: 17 Global Step: 286650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:52:27,677-Speed 5061.52 samples/sec Loss 1.8570 Epoch: 17 Global Step: 286700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:52:37,820-Speed 5048.41 samples/sec Loss 1.8667 Epoch: 17 Global Step: 286750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:52:47,800-Speed 5130.38 samples/sec Loss 1.8931 Epoch: 17 Global Step: 286800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:52:58,126-Speed 4958.53 samples/sec Loss 1.8786 Epoch: 17 Global Step: 286850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:53:08,359-Speed 5003.71 samples/sec Loss 1.8582 Epoch: 17 Global Step: 286900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:53:18,532-Speed 5033.30 samples/sec Loss 1.8381 Epoch: 17 Global Step: 286950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:53:28,735-Speed 5018.41 samples/sec Loss 1.8853 Epoch: 17 Global Step: 287000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:53:38,865-Speed 5054.73 samples/sec Loss 1.8730 Epoch: 17 Global Step: 287050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:53:49,120-Speed 4993.07 samples/sec Loss 1.8864 Epoch: 17 Global Step: 287100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:53:59,063-Speed 5149.33 samples/sec Loss 1.8588 Epoch: 
17 Global Step: 287150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:54:09,254-Speed 5024.60 samples/sec Loss 1.8750 Epoch: 17 Global Step: 287200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:54:19,280-Speed 5106.79 samples/sec Loss 1.8912 Epoch: 17 Global Step: 287250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:54:29,632-Speed 4946.20 samples/sec Loss 1.8858 Epoch: 17 Global Step: 287300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:54:39,840-Speed 5015.86 samples/sec Loss 1.8475 Epoch: 17 Global Step: 287350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:54:49,939-Speed 5070.52 samples/sec Loss 1.8676 Epoch: 17 Global Step: 287400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:55:00,354-Speed 4916.12 samples/sec Loss 1.8560 Epoch: 17 Global Step: 287450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:55:10,459-Speed 5067.24 samples/sec Loss 1.8690 Epoch: 17 Global Step: 287500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:55:20,527-Speed 5085.21 samples/sec Loss 1.8653 Epoch: 17 Global Step: 287550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:55:30,690-Speed 5038.45 samples/sec Loss 1.8524 Epoch: 17 Global Step: 287600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:55:40,792-Speed 5068.38 samples/sec Loss 1.8554 Epoch: 17 Global Step: 287650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:55:50,814-Speed 5108.93 samples/sec Loss 1.8791 Epoch: 17 Global Step: 287700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:56:00,736-Speed 5161.14 samples/sec Loss 1.8765 Epoch: 17 Global Step: 287750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:56:11,737-Speed 4654.43 samples/sec Loss 1.8934 Epoch: 17 Global Step: 287800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:56:21,777-Speed 5099.73 samples/sec Loss 1.8853 
Epoch: 17 Global Step: 287850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:56:31,894-Speed 5061.04 samples/sec Loss 1.8664 Epoch: 17 Global Step: 287900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:56:43,138-Speed 4553.98 samples/sec Loss 1.8610 Epoch: 17 Global Step: 287950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:56:53,102-Speed 5138.67 samples/sec Loss 1.8635 Epoch: 17 Global Step: 288000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:57:11,001-[lfw][288000]XNorm: 23.225608 Training: 2021-03-19 14:57:11,001-[lfw][288000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 14:57:11,001-[lfw][288000]Accuracy-Highest: 0.99767 Training: 2021-03-19 14:57:29,786-[cfp_fp][288000]XNorm: 19.574388 Training: 2021-03-19 14:57:29,786-[cfp_fp][288000]Accuracy-Flip: 0.97586+-0.00942 Training: 2021-03-19 14:57:29,786-[cfp_fp][288000]Accuracy-Highest: 0.97714 Training: 2021-03-19 14:57:45,960-[agedb_30][288000]XNorm: 22.547904 Training: 2021-03-19 14:57:45,960-[agedb_30][288000]Accuracy-Flip: 0.97533+-0.00788 Training: 2021-03-19 14:57:45,960-[agedb_30][288000]Accuracy-Highest: 0.97667 Training: 2021-03-19 14:57:56,666-Speed 805.49 samples/sec Loss 1.8484 Epoch: 17 Global Step: 288050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:58:06,707-Speed 5099.27 samples/sec Loss 1.8656 Epoch: 17 Global Step: 288100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:58:16,843-Speed 5051.88 samples/sec Loss 1.8696 Epoch: 17 Global Step: 288150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:58:26,953-Speed 5064.68 samples/sec Loss 1.8673 Epoch: 17 Global Step: 288200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:58:37,039-Speed 5076.97 samples/sec Loss 1.8685 Epoch: 17 Global Step: 288250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:58:47,156-Speed 5060.92 samples/sec Loss 1.8812 Epoch: 17 Global Step: 288300 Fp16 Grad 
Scale: 16384 Required: 3 hours Training: 2021-03-19 14:58:58,027-Speed 4710.32 samples/sec Loss 1.8336 Epoch: 17 Global Step: 288350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:59:08,176-Speed 5044.79 samples/sec Loss 1.8736 Epoch: 17 Global Step: 288400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:59:18,331-Speed 5042.14 samples/sec Loss 1.8796 Epoch: 17 Global Step: 288450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:59:29,171-Speed 4723.65 samples/sec Loss 1.8583 Epoch: 17 Global Step: 288500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:59:39,002-Speed 5208.43 samples/sec Loss 1.8680 Epoch: 17 Global Step: 288550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 14:59:49,940-Speed 4681.36 samples/sec Loss 1.8834 Epoch: 17 Global Step: 288600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:00:00,794-Speed 4717.26 samples/sec Loss 1.8636 Epoch: 17 Global Step: 288650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:00:10,971-Speed 5031.10 samples/sec Loss 1.8592 Epoch: 17 Global Step: 288700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:00:21,045-Speed 5082.89 samples/sec Loss 1.8596 Epoch: 17 Global Step: 288750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:00:31,096-Speed 5094.17 samples/sec Loss 1.8693 Epoch: 17 Global Step: 288800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:00:41,212-Speed 5061.52 samples/sec Loss 1.8612 Epoch: 17 Global Step: 288850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:00:51,495-Speed 4979.16 samples/sec Loss 1.8593 Epoch: 17 Global Step: 288900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:01:02,290-Speed 4743.20 samples/sec Loss 1.8553 Epoch: 17 Global Step: 288950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:01:12,376-Speed 5076.68 samples/sec Loss 1.8806 Epoch: 17 Global Step: 289000 Fp16 
Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:01:22,855-Speed 4886.27 samples/sec Loss 1.8505 Epoch: 17 Global Step: 289050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:01:33,059-Speed 5018.08 samples/sec Loss 1.8891 Epoch: 17 Global Step: 289100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:01:43,253-Speed 5022.67 samples/sec Loss 1.8727 Epoch: 17 Global Step: 289150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:01:53,479-Speed 5007.23 samples/sec Loss 1.8424 Epoch: 17 Global Step: 289200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:02:03,815-Speed 4953.67 samples/sec Loss 1.8633 Epoch: 17 Global Step: 289250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:02:13,911-Speed 5071.78 samples/sec Loss 1.8562 Epoch: 17 Global Step: 289300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:02:24,048-Speed 5050.71 samples/sec Loss 1.8674 Epoch: 17 Global Step: 289350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:02:34,343-Speed 4973.86 samples/sec Loss 1.8778 Epoch: 17 Global Step: 289400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:02:44,344-Speed 5119.51 samples/sec Loss 1.8913 Epoch: 17 Global Step: 289450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:02:54,273-Speed 5156.95 samples/sec Loss 1.8438 Epoch: 17 Global Step: 289500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:03:04,397-Speed 5057.46 samples/sec Loss 1.8789 Epoch: 17 Global Step: 289550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:03:14,233-Speed 5205.79 samples/sec Loss 1.8628 Epoch: 17 Global Step: 289600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:03:24,578-Speed 4949.59 samples/sec Loss 1.8665 Epoch: 17 Global Step: 289650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:03:34,464-Speed 5179.28 samples/sec Loss 1.8755 Epoch: 17 Global Step: 289700 
Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:03:44,685-Speed 5009.59 samples/sec Loss 1.8753 Epoch: 17 Global Step: 289750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:03:54,734-Speed 5095.25 samples/sec Loss 1.8795 Epoch: 17 Global Step: 289800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:04:04,999-Speed 4988.12 samples/sec Loss 1.8711 Epoch: 17 Global Step: 289850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:04:15,032-Speed 5103.94 samples/sec Loss 1.8457 Epoch: 17 Global Step: 289900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:04:25,199-Speed 5035.83 samples/sec Loss 1.8676 Epoch: 17 Global Step: 289950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:04:35,711-Speed 4871.28 samples/sec Loss 1.8703 Epoch: 17 Global Step: 290000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:04:52,475-[lfw][290000]XNorm: 23.168515 Training: 2021-03-19 15:04:52,475-[lfw][290000]Accuracy-Flip: 0.99700+-0.00277 Training: 2021-03-19 15:04:52,475-[lfw][290000]Accuracy-Highest: 0.99767 Training: 2021-03-19 15:05:11,163-[cfp_fp][290000]XNorm: 19.492281 Training: 2021-03-19 15:05:11,164-[cfp_fp][290000]Accuracy-Flip: 0.97686+-0.00868 Training: 2021-03-19 15:05:11,164-[cfp_fp][290000]Accuracy-Highest: 0.97714 Training: 2021-03-19 15:05:27,500-[agedb_30][290000]XNorm: 22.409575 Training: 2021-03-19 15:05:27,501-[agedb_30][290000]Accuracy-Flip: 0.97533+-0.00785 Training: 2021-03-19 15:05:27,501-[agedb_30][290000]Accuracy-Highest: 0.97667 Training: 2021-03-19 15:05:37,689-Speed 826.10 samples/sec Loss 1.8623 Epoch: 17 Global Step: 290050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:05:47,864-Speed 5032.25 samples/sec Loss 1.8578 Epoch: 17 Global Step: 290100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:05:57,988-Speed 5057.24 samples/sec Loss 1.8827 Epoch: 17 Global Step: 290150 Fp16 Grad Scale: 16384 Required: 3 hours 
Training: 2021-03-19 15:06:08,094-Speed 5066.85 samples/sec Loss 1.8623 Epoch: 17 Global Step: 290200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:06:18,233-Speed 5050.02 samples/sec Loss 1.9108 Epoch: 17 Global Step: 290250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:06:28,411-Speed 5031.13 samples/sec Loss 1.8595 Epoch: 17 Global Step: 290300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:06:38,540-Speed 5054.60 samples/sec Loss 1.8643 Epoch: 17 Global Step: 290350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:06:48,516-Speed 5133.06 samples/sec Loss 1.8571 Epoch: 17 Global Step: 290400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:06:58,806-Speed 4975.56 samples/sec Loss 1.8805 Epoch: 17 Global Step: 290450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:07:08,709-Speed 5170.83 samples/sec Loss 1.8669 Epoch: 17 Global Step: 290500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:07:18,744-Speed 5102.41 samples/sec Loss 1.8817 Epoch: 17 Global Step: 290550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:07:28,616-Speed 5186.67 samples/sec Loss 1.8552 Epoch: 17 Global Step: 290600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:07:38,485-Speed 5187.97 samples/sec Loss 1.8670 Epoch: 17 Global Step: 290650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:07:49,211-Speed 4773.90 samples/sec Loss 1.8395 Epoch: 17 Global Step: 290700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:07:59,110-Speed 5172.31 samples/sec Loss 1.8757 Epoch: 17 Global Step: 290750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:08:09,355-Speed 4998.10 samples/sec Loss 1.8525 Epoch: 17 Global Step: 290800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:08:19,539-Speed 5028.18 samples/sec Loss 1.8824 Epoch: 17 Global Step: 290850 Fp16 Grad Scale: 16384 Required: 3 
hours Training: 2021-03-19 15:08:29,542-Speed 5118.74 samples/sec Loss 1.8452 Epoch: 17 Global Step: 290900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:08:39,841-Speed 4971.41 samples/sec Loss 1.8751 Epoch: 17 Global Step: 290950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:08:50,031-Speed 5025.02 samples/sec Loss 1.8547 Epoch: 17 Global Step: 291000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:09:00,360-Speed 4957.51 samples/sec Loss 1.8547 Epoch: 17 Global Step: 291050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:09:10,420-Speed 5089.45 samples/sec Loss 1.8557 Epoch: 17 Global Step: 291100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:09:20,331-Speed 5166.06 samples/sec Loss 1.8423 Epoch: 17 Global Step: 291150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:09:31,363-Speed 4641.61 samples/sec Loss 1.8576 Epoch: 17 Global Step: 291200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:09:41,467-Speed 5067.43 samples/sec Loss 1.8586 Epoch: 17 Global Step: 291250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:09:51,634-Speed 5036.33 samples/sec Loss 1.8691 Epoch: 17 Global Step: 291300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:10:02,450-Speed 4733.61 samples/sec Loss 1.8553 Epoch: 17 Global Step: 291350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:10:13,468-Speed 4647.37 samples/sec Loss 1.8569 Epoch: 17 Global Step: 291400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:10:23,455-Speed 5126.78 samples/sec Loss 1.8533 Epoch: 17 Global Step: 291450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:10:33,581-Speed 5056.67 samples/sec Loss 1.8796 Epoch: 17 Global Step: 291500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:10:43,666-Speed 5077.27 samples/sec Loss 1.8622 Epoch: 17 Global Step: 291550 Fp16 Grad Scale: 16384 Required: 
3 hours Training: 2021-03-19 15:10:53,518-Speed 5197.01 samples/sec Loss 1.8781 Epoch: 17 Global Step: 291600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:11:03,536-Speed 5111.16 samples/sec Loss 1.8610 Epoch: 17 Global Step: 291650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:11:14,228-Speed 4788.66 samples/sec Loss 1.8502 Epoch: 17 Global Step: 291700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:11:25,187-Speed 4672.19 samples/sec Loss 1.8940 Epoch: 17 Global Step: 291750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:11:35,213-Speed 5107.10 samples/sec Loss 1.8380 Epoch: 17 Global Step: 291800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:11:45,411-Speed 5020.86 samples/sec Loss 1.8569 Epoch: 17 Global Step: 291850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:11:56,256-Speed 4721.67 samples/sec Loss 1.8906 Epoch: 17 Global Step: 291900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:12:06,053-Speed 5226.09 samples/sec Loss 1.8414 Epoch: 17 Global Step: 291950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:12:17,026-Speed 4666.56 samples/sec Loss 1.8464 Epoch: 17 Global Step: 292000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:12:33,604-[lfw][292000]XNorm: 23.287820 Training: 2021-03-19 15:12:33,605-[lfw][292000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 15:12:33,605-[lfw][292000]Accuracy-Highest: 0.99767 Training: 2021-03-19 15:12:52,205-[cfp_fp][292000]XNorm: 19.590940 Training: 2021-03-19 15:12:52,206-[cfp_fp][292000]Accuracy-Flip: 0.97529+-0.00833 Training: 2021-03-19 15:12:52,206-[cfp_fp][292000]Accuracy-Highest: 0.97714 Training: 2021-03-19 15:13:08,295-[agedb_30][292000]XNorm: 22.580161 Training: 2021-03-19 15:13:08,295-[agedb_30][292000]Accuracy-Flip: 0.97517+-0.00787 Training: 2021-03-19 15:13:08,295-[agedb_30][292000]Accuracy-Highest: 0.97667 Training: 2021-03-19 
15:13:18,249-Speed 836.29 samples/sec Loss 1.8652 Epoch: 17 Global Step: 292050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:13:28,300-Speed 5094.38 samples/sec Loss 1.8976 Epoch: 17 Global Step: 292100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:13:38,448-Speed 5045.87 samples/sec Loss 1.8691 Epoch: 17 Global Step: 292150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:13:48,612-Speed 5037.69 samples/sec Loss 1.8625 Epoch: 17 Global Step: 292200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:13:58,736-Speed 5057.45 samples/sec Loss 1.8738 Epoch: 17 Global Step: 292250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:14:08,848-Speed 5063.74 samples/sec Loss 1.8878 Epoch: 17 Global Step: 292300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:14:19,572-Speed 4774.25 samples/sec Loss 1.8628 Epoch: 17 Global Step: 292350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:14:29,473-Speed 5171.95 samples/sec Loss 1.8697 Epoch: 17 Global Step: 292400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:14:39,606-Speed 5052.84 samples/sec Loss 1.8413 Epoch: 17 Global Step: 292450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:14:49,679-Speed 5083.32 samples/sec Loss 1.8777 Epoch: 17 Global Step: 292500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:15:00,067-Speed 4928.65 samples/sec Loss 1.8602 Epoch: 17 Global Step: 292550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:15:10,088-Speed 5109.60 samples/sec Loss 1.8969 Epoch: 17 Global Step: 292600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:15:20,048-Speed 5141.26 samples/sec Loss 1.8761 Epoch: 17 Global Step: 292650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:15:30,287-Speed 5000.50 samples/sec Loss 1.8578 Epoch: 17 Global Step: 292700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 
2021-03-19 15:15:40,150-Speed 5191.52 samples/sec Loss 1.8739 Epoch: 17 Global Step: 292750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:15:50,346-Speed 5021.93 samples/sec Loss 1.8499 Epoch: 17 Global Step: 292800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:16:00,469-Speed 5057.92 samples/sec Loss 1.8821 Epoch: 17 Global Step: 292850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:16:10,674-Speed 5017.37 samples/sec Loss 1.8579 Epoch: 17 Global Step: 292900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:16:20,808-Speed 5052.89 samples/sec Loss 1.8639 Epoch: 17 Global Step: 292950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:16:30,984-Speed 5031.50 samples/sec Loss 1.8820 Epoch: 17 Global Step: 293000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:16:41,198-Speed 5013.26 samples/sec Loss 1.8582 Epoch: 17 Global Step: 293050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:16:51,417-Speed 5010.50 samples/sec Loss 1.8690 Epoch: 17 Global Step: 293100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:17:01,502-Speed 5077.12 samples/sec Loss 1.8705 Epoch: 17 Global Step: 293150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:17:11,387-Speed 5179.78 samples/sec Loss 1.8682 Epoch: 17 Global Step: 293200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:17:21,517-Speed 5055.02 samples/sec Loss 1.8516 Epoch: 17 Global Step: 293250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:17:31,607-Speed 5074.43 samples/sec Loss 1.8681 Epoch: 17 Global Step: 293300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:17:41,568-Speed 5140.10 samples/sec Loss 1.8628 Epoch: 17 Global Step: 293350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:17:51,604-Speed 5102.16 samples/sec Loss 1.8701 Epoch: 17 Global Step: 293400 Fp16 Grad Scale: 16384 Required: 3 hours 
Training: 2021-03-19 15:18:01,496-Speed 5176.30 samples/sec Loss 1.8641 Epoch: 17 Global Step: 293450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:18:11,615-Speed 5060.12 samples/sec Loss 1.8616 Epoch: 17 Global Step: 293500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:18:21,592-Speed 5131.95 samples/sec Loss 1.8949 Epoch: 17 Global Step: 293550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:18:31,837-Speed 4997.63 samples/sec Loss 1.8632 Epoch: 17 Global Step: 293600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:18:41,946-Speed 5064.97 samples/sec Loss 1.8861 Epoch: 17 Global Step: 293650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:18:51,841-Speed 5174.69 samples/sec Loss 1.8374 Epoch: 17 Global Step: 293700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:19:02,071-Speed 5005.32 samples/sec Loss 1.8728 Epoch: 17 Global Step: 293750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:19:12,213-Speed 5048.47 samples/sec Loss 1.8617 Epoch: 17 Global Step: 293800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:19:22,507-Speed 4974.27 samples/sec Loss 1.8554 Epoch: 17 Global Step: 293850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:19:32,668-Speed 5038.69 samples/sec Loss 1.8662 Epoch: 17 Global Step: 293900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:19:42,693-Speed 5107.63 samples/sec Loss 1.8662 Epoch: 17 Global Step: 293950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:19:52,906-Speed 5013.47 samples/sec Loss 1.8548 Epoch: 17 Global Step: 294000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:20:09,655-[lfw][294000]XNorm: 23.216606 Training: 2021-03-19 15:20:09,655-[lfw][294000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 15:20:09,655-[lfw][294000]Accuracy-Highest: 0.99767 Training: 2021-03-19 15:20:28,392-[cfp_fp][294000]XNorm: 19.534904 
Training: 2021-03-19 15:20:28,392-[cfp_fp][294000]Accuracy-Flip: 0.97571+-0.00840 Training: 2021-03-19 15:20:28,392-[cfp_fp][294000]Accuracy-Highest: 0.97714 Training: 2021-03-19 15:20:44,547-[agedb_30][294000]XNorm: 22.483434 Training: 2021-03-19 15:20:44,548-[agedb_30][294000]Accuracy-Flip: 0.97650+-0.00762 Training: 2021-03-19 15:20:44,548-[agedb_30][294000]Accuracy-Highest: 0.97667 Training: 2021-03-19 15:20:54,384-Speed 832.84 samples/sec Loss 1.8503 Epoch: 17 Global Step: 294050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:21:04,343-Speed 5140.97 samples/sec Loss 1.8703 Epoch: 17 Global Step: 294100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:21:14,412-Speed 5085.40 samples/sec Loss 1.8568 Epoch: 17 Global Step: 294150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:21:24,844-Speed 4908.47 samples/sec Loss 1.8800 Epoch: 17 Global Step: 294200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:21:35,146-Speed 4970.19 samples/sec Loss 1.8921 Epoch: 17 Global Step: 294250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:21:45,296-Speed 5044.92 samples/sec Loss 1.8736 Epoch: 17 Global Step: 294300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:21:55,360-Speed 5087.75 samples/sec Loss 1.8570 Epoch: 17 Global Step: 294350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:22:05,366-Speed 5117.34 samples/sec Loss 1.8959 Epoch: 17 Global Step: 294400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:22:15,422-Speed 5091.84 samples/sec Loss 1.8497 Epoch: 17 Global Step: 294450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:22:25,591-Speed 5034.81 samples/sec Loss 1.8625 Epoch: 17 Global Step: 294500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:22:35,717-Speed 5056.62 samples/sec Loss 1.8664 Epoch: 17 Global Step: 294550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 
15:22:46,606-Speed 4702.46 samples/sec Loss 1.8457 Epoch: 17 Global Step: 294600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:22:56,616-Speed 5115.57 samples/sec Loss 1.8438 Epoch: 17 Global Step: 294650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:23:06,734-Speed 5060.64 samples/sec Loss 1.8657 Epoch: 17 Global Step: 294700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:23:18,499-Speed 4352.18 samples/sec Loss 1.8600 Epoch: 17 Global Step: 294750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-19 15:23:28,575-Speed 5081.38 samples/sec Loss 1.8729 Epoch: 17 Global Step: 294800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:23:38,525-Speed 5146.46 samples/sec Loss 1.8508 Epoch: 17 Global Step: 294850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:23:48,575-Speed 5094.48 samples/sec Loss 1.8531 Epoch: 17 Global Step: 294900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:23:58,756-Speed 5029.19 samples/sec Loss 1.8634 Epoch: 17 Global Step: 294950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:24:09,039-Speed 4979.47 samples/sec Loss 1.8514 Epoch: 17 Global Step: 295000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:24:19,317-Speed 4982.38 samples/sec Loss 1.8851 Epoch: 17 Global Step: 295050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:24:29,966-Speed 4807.91 samples/sec Loss 1.8680 Epoch: 17 Global Step: 295100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:24:40,185-Speed 5010.79 samples/sec Loss 1.8459 Epoch: 17 Global Step: 295150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:24:50,909-Speed 4774.48 samples/sec Loss 1.8701 Epoch: 17 Global Step: 295200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:25:01,084-Speed 5032.42 samples/sec Loss 1.8709 Epoch: 17 Global Step: 295250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-19 15:25:11,179-Speed 5071.85 samples/sec Loss 1.8676 Epoch: 17 Global Step: 295300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:25:21,933-Speed 4761.18 samples/sec Loss 1.8559 Epoch: 17 Global Step: 295350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:25:32,803-Speed 4710.76 samples/sec Loss 1.8620 Epoch: 17 Global Step: 295400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:25:42,904-Speed 5069.11 samples/sec Loss 1.8790 Epoch: 17 Global Step: 295450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:25:53,143-Speed 5000.42 samples/sec Loss 1.8503 Epoch: 17 Global Step: 295500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:26:02,988-Speed 5201.11 samples/sec Loss 1.8504 Epoch: 17 Global Step: 295550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:26:12,920-Speed 5155.52 samples/sec Loss 1.8613 Epoch: 17 Global Step: 295600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:26:22,949-Speed 5105.50 samples/sec Loss 1.8543 Epoch: 17 Global Step: 295650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:26:33,049-Speed 5069.62 samples/sec Loss 1.8708 Epoch: 17 Global Step: 295700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:26:43,819-Speed 4753.85 samples/sec Loss 1.8664 Epoch: 17 Global Step: 295750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:26:53,774-Speed 5143.50 samples/sec Loss 1.8517 Epoch: 17 Global Step: 295800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:27:03,808-Speed 5102.85 samples/sec Loss 1.8525 Epoch: 17 Global Step: 295850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:27:13,870-Speed 5089.04 samples/sec Loss 1.8670 Epoch: 17 Global Step: 295900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:27:24,179-Speed 4966.71 samples/sec Loss 1.8566 Epoch: 17 Global Step: 295950 Fp16 Grad Scale: 16384 Required: 2 hours 
Training: 2021-03-19 15:27:34,223-Speed 5097.64 samples/sec Loss 1.8509 Epoch: 17 Global Step: 296000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:27:50,920-[lfw][296000]XNorm: 23.321465 Training: 2021-03-19 15:27:50,921-[lfw][296000]Accuracy-Flip: 0.99667+-0.00279 Training: 2021-03-19 15:27:50,921-[lfw][296000]Accuracy-Highest: 0.99767 Training: 2021-03-19 15:28:09,574-[cfp_fp][296000]XNorm: 19.640723 Training: 2021-03-19 15:28:09,574-[cfp_fp][296000]Accuracy-Flip: 0.97671+-0.00848 Training: 2021-03-19 15:28:09,574-[cfp_fp][296000]Accuracy-Highest: 0.97714 Training: 2021-03-19 15:28:25,768-[agedb_30][296000]XNorm: 22.583808 Training: 2021-03-19 15:28:25,769-[agedb_30][296000]Accuracy-Flip: 0.97467+-0.00756 Training: 2021-03-19 15:28:25,770-[agedb_30][296000]Accuracy-Highest: 0.97667 Training: 2021-03-19 15:28:35,701-Speed 832.83 samples/sec Loss 1.8840 Epoch: 17 Global Step: 296050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:28:45,909-Speed 5015.77 samples/sec Loss 1.8696 Epoch: 17 Global Step: 296100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:28:56,035-Speed 5056.79 samples/sec Loss 1.8769 Epoch: 17 Global Step: 296150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:29:06,001-Speed 5137.68 samples/sec Loss 1.8323 Epoch: 17 Global Step: 296200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:29:16,142-Speed 5049.49 samples/sec Loss 1.8548 Epoch: 17 Global Step: 296250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:29:26,204-Speed 5088.46 samples/sec Loss 1.8855 Epoch: 17 Global Step: 296300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:29:36,442-Speed 5001.22 samples/sec Loss 1.8762 Epoch: 17 Global Step: 296350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:29:46,499-Speed 5091.81 samples/sec Loss 1.8647 Epoch: 17 Global Step: 296400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:29:56,748-Speed 
4995.36 samples/sec Loss 1.8518 Epoch: 17 Global Step: 296450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:30:06,744-Speed 5122.61 samples/sec Loss 1.8748 Epoch: 17 Global Step: 296500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:30:16,920-Speed 5031.59 samples/sec Loss 1.8655 Epoch: 17 Global Step: 296550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:30:26,773-Speed 5197.02 samples/sec Loss 1.8471 Epoch: 17 Global Step: 296600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:30:37,005-Speed 5004.29 samples/sec Loss 1.8676 Epoch: 17 Global Step: 296650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:30:47,045-Speed 5099.95 samples/sec Loss 1.8876 Epoch: 17 Global Step: 296700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:30:57,326-Speed 4980.13 samples/sec Loss 1.8919 Epoch: 17 Global Step: 296750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:31:07,366-Speed 5100.18 samples/sec Loss 1.8738 Epoch: 17 Global Step: 296800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:31:17,583-Speed 5011.18 samples/sec Loss 1.8680 Epoch: 17 Global Step: 296850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:31:27,410-Speed 5210.71 samples/sec Loss 1.8719 Epoch: 17 Global Step: 296900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:31:37,459-Speed 5095.61 samples/sec Loss 1.8591 Epoch: 17 Global Step: 296950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:31:47,514-Speed 5091.98 samples/sec Loss 1.8774 Epoch: 17 Global Step: 297000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:31:57,732-Speed 5011.01 samples/sec Loss 1.8589 Epoch: 17 Global Step: 297050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:32:07,990-Speed 4991.75 samples/sec Loss 1.8673 Epoch: 17 Global Step: 297100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 
15:32:18,111-Speed 5059.04 samples/sec Loss 1.8676 Epoch: 17 Global Step: 297150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:32:28,081-Speed 5135.35 samples/sec Loss 1.8780 Epoch: 17 Global Step: 297200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:32:38,291-Speed 5014.91 samples/sec Loss 1.8704 Epoch: 17 Global Step: 297250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:32:48,334-Speed 5098.41 samples/sec Loss 1.8743 Epoch: 17 Global Step: 297300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:32:58,622-Speed 4977.39 samples/sec Loss 1.8611 Epoch: 17 Global Step: 297350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:33:08,773-Speed 5044.25 samples/sec Loss 1.8795 Epoch: 17 Global Step: 297400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:33:18,873-Speed 5069.12 samples/sec Loss 1.8611 Epoch: 17 Global Step: 297450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:33:28,825-Speed 5145.26 samples/sec Loss 1.8531 Epoch: 17 Global Step: 297500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:33:38,729-Speed 5169.77 samples/sec Loss 1.8643 Epoch: 17 Global Step: 297550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:33:48,979-Speed 4995.76 samples/sec Loss 1.8440 Epoch: 17 Global Step: 297600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:33:58,948-Speed 5136.12 samples/sec Loss 1.8921 Epoch: 17 Global Step: 297650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:34:09,165-Speed 5011.82 samples/sec Loss 1.8836 Epoch: 17 Global Step: 297700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:34:19,354-Speed 5025.10 samples/sec Loss 1.8355 Epoch: 17 Global Step: 297750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:34:29,464-Speed 5064.47 samples/sec Loss 1.8872 Epoch: 17 Global Step: 297800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-19 15:34:39,673-Speed 5015.81 samples/sec Loss 1.8643 Epoch: 17 Global Step: 297850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:34:49,706-Speed 5103.10 samples/sec Loss 1.8719 Epoch: 17 Global Step: 297900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:35:01,116-Speed 4487.71 samples/sec Loss 1.8547 Epoch: 17 Global Step: 297950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:35:11,098-Speed 5129.54 samples/sec Loss 1.9071 Epoch: 17 Global Step: 298000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:35:27,699-[lfw][298000]XNorm: 23.216622 Training: 2021-03-19 15:35:27,699-[lfw][298000]Accuracy-Flip: 0.99667+-0.00279 Training: 2021-03-19 15:35:27,699-[lfw][298000]Accuracy-Highest: 0.99767 Training: 2021-03-19 15:35:46,399-[cfp_fp][298000]XNorm: 19.559791 Training: 2021-03-19 15:35:46,400-[cfp_fp][298000]Accuracy-Flip: 0.97729+-0.00898 Training: 2021-03-19 15:35:46,400-[cfp_fp][298000]Accuracy-Highest: 0.97729 Training: 2021-03-19 15:36:02,584-[agedb_30][298000]XNorm: 22.511905 Training: 2021-03-19 15:36:02,585-[agedb_30][298000]Accuracy-Flip: 0.97517+-0.00751 Training: 2021-03-19 15:36:02,585-[agedb_30][298000]Accuracy-Highest: 0.97667 Training: 2021-03-19 15:36:12,741-Speed 830.61 samples/sec Loss 1.8664 Epoch: 17 Global Step: 298050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:36:23,530-Speed 4745.78 samples/sec Loss 1.8578 Epoch: 17 Global Step: 298100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:36:34,508-Speed 4663.89 samples/sec Loss 1.8572 Epoch: 17 Global Step: 298150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:36:44,667-Speed 5040.19 samples/sec Loss 1.8689 Epoch: 17 Global Step: 298200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:36:54,691-Speed 5108.09 samples/sec Loss 1.8665 Epoch: 17 Global Step: 298250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:37:04,881-Speed 5024.88 
samples/sec Loss 1.8523 Epoch: 17 Global Step: 298300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:37:14,847-Speed 5138.10 samples/sec Loss 1.8974 Epoch: 17 Global Step: 298350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:37:24,812-Speed 5138.27 samples/sec Loss 1.8621 Epoch: 17 Global Step: 298400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:37:35,843-Speed 4641.59 samples/sec Loss 1.8560 Epoch: 17 Global Step: 298450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:37:46,228-Speed 4930.49 samples/sec Loss 1.8424 Epoch: 17 Global Step: 298500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:37:56,121-Speed 5175.80 samples/sec Loss 1.8704 Epoch: 17 Global Step: 298550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:38:06,776-Speed 4805.76 samples/sec Loss 1.8624 Epoch: 17 Global Step: 298600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:38:16,967-Speed 5024.06 samples/sec Loss 1.8748 Epoch: 17 Global Step: 298650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:38:27,039-Speed 5083.76 samples/sec Loss 1.8573 Epoch: 17 Global Step: 298700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:38:37,932-Speed 4700.25 samples/sec Loss 1.8561 Epoch: 17 Global Step: 298750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:38:48,912-Speed 4663.24 samples/sec Loss 1.8987 Epoch: 17 Global Step: 298800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:38:58,951-Speed 5100.75 samples/sec Loss 1.8576 Epoch: 17 Global Step: 298850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:39:08,938-Speed 5126.61 samples/sec Loss 1.8577 Epoch: 17 Global Step: 298900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:39:18,916-Speed 5131.82 samples/sec Loss 1.8782 Epoch: 17 Global Step: 298950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:39:29,036-Speed 
5059.48 samples/sec Loss 1.8603 Epoch: 17 Global Step: 299000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:39:38,963-Speed 5158.00 samples/sec Loss 1.8709 Epoch: 17 Global Step: 299050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:39:49,167-Speed 5018.00 samples/sec Loss 1.8454 Epoch: 17 Global Step: 299100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:40:00,055-Speed 4702.50 samples/sec Loss 1.8782 Epoch: 17 Global Step: 299150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:40:10,105-Speed 5095.36 samples/sec Loss 1.8633 Epoch: 17 Global Step: 299200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:40:20,088-Speed 5128.73 samples/sec Loss 1.8640 Epoch: 17 Global Step: 299250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:40:30,544-Speed 4896.80 samples/sec Loss 1.8553 Epoch: 17 Global Step: 299300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:40:40,619-Speed 5082.33 samples/sec Loss 1.8701 Epoch: 17 Global Step: 299350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:40:50,623-Speed 5118.33 samples/sec Loss 1.8684 Epoch: 17 Global Step: 299400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:41:00,776-Speed 5043.08 samples/sec Loss 1.8819 Epoch: 17 Global Step: 299450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:41:11,056-Speed 4980.98 samples/sec Loss 1.8742 Epoch: 17 Global Step: 299500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:41:21,118-Speed 5088.61 samples/sec Loss 1.8844 Epoch: 17 Global Step: 299550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:41:31,464-Speed 4949.13 samples/sec Loss 1.8458 Epoch: 17 Global Step: 299600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:41:41,600-Speed 5051.77 samples/sec Loss 1.8802 Epoch: 17 Global Step: 299650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 
15:41:51,771-Speed 5034.08 samples/sec Loss 1.8790 Epoch: 17 Global Step: 299700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:42:01,894-Speed 5058.17 samples/sec Loss 1.8691 Epoch: 17 Global Step: 299750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:42:11,945-Speed 5094.11 samples/sec Loss 1.8550 Epoch: 17 Global Step: 299800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:42:22,122-Speed 5031.48 samples/sec Loss 1.8698 Epoch: 17 Global Step: 299850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:42:32,043-Speed 5160.96 samples/sec Loss 1.8788 Epoch: 17 Global Step: 299900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:42:42,115-Speed 5083.44 samples/sec Loss 1.8490 Epoch: 17 Global Step: 299950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:42:52,431-Speed 4963.68 samples/sec Loss 1.8424 Epoch: 17 Global Step: 300000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:43:09,268-[lfw][300000]XNorm: 23.215698 Training: 2021-03-19 15:43:09,269-[lfw][300000]Accuracy-Flip: 0.99700+-0.00277 Training: 2021-03-19 15:43:09,269-[lfw][300000]Accuracy-Highest: 0.99767 Training: 2021-03-19 15:43:27,969-[cfp_fp][300000]XNorm: 19.547843 Training: 2021-03-19 15:43:27,970-[cfp_fp][300000]Accuracy-Flip: 0.97586+-0.00875 Training: 2021-03-19 15:43:27,970-[cfp_fp][300000]Accuracy-Highest: 0.97729 Training: 2021-03-19 15:43:44,283-[agedb_30][300000]XNorm: 22.494424 Training: 2021-03-19 15:43:44,283-[agedb_30][300000]Accuracy-Flip: 0.97467+-0.00792 Training: 2021-03-19 15:43:44,283-[agedb_30][300000]Accuracy-Highest: 0.97667 Training: 2021-03-19 15:43:53,989-Speed 831.75 samples/sec Loss 1.8779 Epoch: 17 Global Step: 300050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:44:04,014-Speed 5107.60 samples/sec Loss 1.8946 Epoch: 17 Global Step: 300100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:44:14,277-Speed 4988.93 samples/sec 
Loss 1.8732 Epoch: 17 Global Step: 300150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:44:24,168-Speed 5176.64 samples/sec Loss 1.8681 Epoch: 17 Global Step: 300200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:44:34,576-Speed 4919.52 samples/sec Loss 1.8826 Epoch: 17 Global Step: 300250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:44:44,801-Speed 5008.12 samples/sec Loss 1.8765 Epoch: 17 Global Step: 300300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:44:55,199-Speed 4924.47 samples/sec Loss 1.8750 Epoch: 17 Global Step: 300350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:45:05,081-Speed 5181.34 samples/sec Loss 1.8233 Epoch: 17 Global Step: 300400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:45:27,304-Speed 2303.99 samples/sec Loss 1.8712 Epoch: 18 Global Step: 300450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:45:38,349-Speed 4635.77 samples/sec Loss 1.8664 Epoch: 18 Global Step: 300500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:45:49,018-Speed 4798.96 samples/sec Loss 1.8867 Epoch: 18 Global Step: 300550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:45:59,203-Speed 5027.19 samples/sec Loss 1.8610 Epoch: 18 Global Step: 300600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:46:09,454-Speed 4995.44 samples/sec Loss 1.8741 Epoch: 18 Global Step: 300650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:46:19,598-Speed 5047.27 samples/sec Loss 1.8488 Epoch: 18 Global Step: 300700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:46:29,685-Speed 5076.07 samples/sec Loss 1.8656 Epoch: 18 Global Step: 300750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:46:39,777-Speed 5073.77 samples/sec Loss 1.8747 Epoch: 18 Global Step: 300800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:46:50,031-Speed 4993.84 
samples/sec Loss 1.8689 Epoch: 18 Global Step: 300850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:47:00,113-Speed 5078.42 samples/sec Loss 1.8724 Epoch: 18 Global Step: 300900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:47:10,408-Speed 4973.59 samples/sec Loss 1.8721 Epoch: 18 Global Step: 300950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:47:20,625-Speed 5011.91 samples/sec Loss 1.8461 Epoch: 18 Global Step: 301000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:47:30,786-Speed 5038.85 samples/sec Loss 1.8477 Epoch: 18 Global Step: 301050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:47:40,990-Speed 5018.15 samples/sec Loss 1.8722 Epoch: 18 Global Step: 301100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:47:51,231-Speed 4999.86 samples/sec Loss 1.8567 Epoch: 18 Global Step: 301150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:48:01,492-Speed 4990.10 samples/sec Loss 1.8487 Epoch: 18 Global Step: 301200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:48:11,855-Speed 4940.87 samples/sec Loss 1.8700 Epoch: 18 Global Step: 301250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:48:22,918-Speed 4628.06 samples/sec Loss 1.8753 Epoch: 18 Global Step: 301300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:48:33,028-Speed 5064.96 samples/sec Loss 1.8762 Epoch: 18 Global Step: 301350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:48:43,102-Speed 5082.52 samples/sec Loss 1.8564 Epoch: 18 Global Step: 301400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:48:53,208-Speed 5066.91 samples/sec Loss 1.8780 Epoch: 18 Global Step: 301450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:49:04,070-Speed 4713.74 samples/sec Loss 1.8416 Epoch: 18 Global Step: 301500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:49:14,972-Speed 
4696.83 samples/sec Loss 1.8631 Epoch: 18 Global Step: 301550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:49:24,906-Speed 5154.22 samples/sec Loss 1.8557 Epoch: 18 Global Step: 301600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:49:35,215-Speed 4966.95 samples/sec Loss 1.8801 Epoch: 18 Global Step: 301650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:49:45,456-Speed 4999.98 samples/sec Loss 1.8553 Epoch: 18 Global Step: 301700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:49:55,490-Speed 5102.84 samples/sec Loss 1.8516 Epoch: 18 Global Step: 301750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:50:05,430-Speed 5150.98 samples/sec Loss 1.8746 Epoch: 18 Global Step: 301800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:50:16,472-Speed 4637.24 samples/sec Loss 1.8747 Epoch: 18 Global Step: 301850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:50:26,333-Speed 5192.20 samples/sec Loss 1.8591 Epoch: 18 Global Step: 301900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:50:36,546-Speed 5013.61 samples/sec Loss 1.8693 Epoch: 18 Global Step: 301950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:50:47,570-Speed 4644.76 samples/sec Loss 1.8598 Epoch: 18 Global Step: 302000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:51:04,465-[lfw][302000]XNorm: 23.189564 Training: 2021-03-19 15:51:04,466-[lfw][302000]Accuracy-Flip: 0.99667+-0.00279 Training: 2021-03-19 15:51:04,466-[lfw][302000]Accuracy-Highest: 0.99767 Training: 2021-03-19 15:51:23,222-[cfp_fp][302000]XNorm: 19.539194 Training: 2021-03-19 15:51:23,222-[cfp_fp][302000]Accuracy-Flip: 0.97643+-0.00858 Training: 2021-03-19 15:51:23,222-[cfp_fp][302000]Accuracy-Highest: 0.97729 Training: 2021-03-19 15:51:39,368-[agedb_30][302000]XNorm: 22.498300 Training: 2021-03-19 15:51:39,368-[agedb_30][302000]Accuracy-Flip: 0.97717+-0.00711 Training: 
2021-03-19 15:51:39,369-[agedb_30][302000]Accuracy-Highest: 0.97717 Training: 2021-03-19 15:51:49,538-Speed 826.24 samples/sec Loss 1.8568 Epoch: 18 Global Step: 302050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:52:01,269-Speed 4364.52 samples/sec Loss 1.8620 Epoch: 18 Global Step: 302100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:52:11,477-Speed 5016.43 samples/sec Loss 1.8518 Epoch: 18 Global Step: 302150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:52:21,610-Speed 5052.77 samples/sec Loss 1.8783 Epoch: 18 Global Step: 302200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:52:31,709-Speed 5070.31 samples/sec Loss 1.8398 Epoch: 18 Global Step: 302250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:52:42,086-Speed 4934.47 samples/sec Loss 1.8764 Epoch: 18 Global Step: 302300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:52:52,131-Speed 5097.28 samples/sec Loss 1.8524 Epoch: 18 Global Step: 302350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:53:02,267-Speed 5051.46 samples/sec Loss 1.8589 Epoch: 18 Global Step: 302400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:53:12,465-Speed 5020.86 samples/sec Loss 1.8899 Epoch: 18 Global Step: 302450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:53:22,582-Speed 5061.38 samples/sec Loss 1.8758 Epoch: 18 Global Step: 302500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:53:33,623-Speed 4637.44 samples/sec Loss 1.8622 Epoch: 18 Global Step: 302550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:53:43,701-Speed 5080.45 samples/sec Loss 1.8633 Epoch: 18 Global Step: 302600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:53:53,787-Speed 5076.59 samples/sec Loss 1.8550 Epoch: 18 Global Step: 302650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:54:03,702-Speed 5164.14 samples/sec Loss 1.8521 
Epoch: 18 Global Step: 302700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:54:13,923-Speed 5009.67 samples/sec Loss 1.8574 Epoch: 18 Global Step: 302750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:54:23,979-Speed 5091.62 samples/sec Loss 1.8583 Epoch: 18 Global Step: 302800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:54:33,925-Speed 5148.32 samples/sec Loss 1.8605 Epoch: 18 Global Step: 302850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:54:43,980-Speed 5092.55 samples/sec Loss 1.8646 Epoch: 18 Global Step: 302900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:54:54,077-Speed 5071.15 samples/sec Loss 1.8547 Epoch: 18 Global Step: 302950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:55:04,190-Speed 5062.69 samples/sec Loss 1.8234 Epoch: 18 Global Step: 303000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:55:14,277-Speed 5076.18 samples/sec Loss 1.8472 Epoch: 18 Global Step: 303050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:55:24,511-Speed 5003.06 samples/sec Loss 1.8682 Epoch: 18 Global Step: 303100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:55:34,753-Speed 4999.29 samples/sec Loss 1.8454 Epoch: 18 Global Step: 303150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:55:45,249-Speed 4878.63 samples/sec Loss 1.8444 Epoch: 18 Global Step: 303200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:55:55,076-Speed 5210.26 samples/sec Loss 1.8355 Epoch: 18 Global Step: 303250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:56:05,084-Speed 5116.22 samples/sec Loss 1.8809 Epoch: 18 Global Step: 303300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:56:15,184-Speed 5069.36 samples/sec Loss 1.8531 Epoch: 18 Global Step: 303350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:56:25,182-Speed 5121.76 samples/sec Loss 
1.8467 Epoch: 18 Global Step: 303400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:56:35,307-Speed 5056.98 samples/sec Loss 1.8552 Epoch: 18 Global Step: 303450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:56:45,350-Speed 5098.39 samples/sec Loss 1.8622 Epoch: 18 Global Step: 303500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:56:55,372-Speed 5108.95 samples/sec Loss 1.8675 Epoch: 18 Global Step: 303550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:57:05,466-Speed 5072.73 samples/sec Loss 1.8606 Epoch: 18 Global Step: 303600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:57:15,680-Speed 5012.98 samples/sec Loss 1.8278 Epoch: 18 Global Step: 303650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:57:26,347-Speed 4800.07 samples/sec Loss 1.8736 Epoch: 18 Global Step: 303700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:57:36,585-Speed 5001.24 samples/sec Loss 1.8836 Epoch: 18 Global Step: 303750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:57:46,825-Speed 5000.20 samples/sec Loss 1.8664 Epoch: 18 Global Step: 303800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:57:56,895-Speed 5084.64 samples/sec Loss 1.8732 Epoch: 18 Global Step: 303850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:58:07,208-Speed 4964.93 samples/sec Loss 1.8845 Epoch: 18 Global Step: 303900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:58:17,552-Speed 4950.12 samples/sec Loss 1.8523 Epoch: 18 Global Step: 303950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:58:27,972-Speed 4913.93 samples/sec Loss 1.8525 Epoch: 18 Global Step: 304000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:58:44,576-[lfw][304000]XNorm: 23.190796 Training: 2021-03-19 15:58:44,576-[lfw][304000]Accuracy-Flip: 0.99700+-0.00277 Training: 2021-03-19 
15:58:44,576-[lfw][304000]Accuracy-Highest: 0.99767 Training: 2021-03-19 15:59:03,262-[cfp_fp][304000]XNorm: 19.522608 Training: 2021-03-19 15:59:03,262-[cfp_fp][304000]Accuracy-Flip: 0.97429+-0.00838 Training: 2021-03-19 15:59:03,262-[cfp_fp][304000]Accuracy-Highest: 0.97729 Training: 2021-03-19 15:59:19,457-[agedb_30][304000]XNorm: 22.475612 Training: 2021-03-19 15:59:19,457-[agedb_30][304000]Accuracy-Flip: 0.97483+-0.00797 Training: 2021-03-19 15:59:19,457-[agedb_30][304000]Accuracy-Highest: 0.97717 Training: 2021-03-19 15:59:29,325-Speed 834.52 samples/sec Loss 1.8476 Epoch: 18 Global Step: 304050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:59:39,290-Speed 5138.00 samples/sec Loss 1.8629 Epoch: 18 Global Step: 304100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:59:49,430-Speed 5049.79 samples/sec Loss 1.8797 Epoch: 18 Global Step: 304150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 15:59:59,670-Speed 5000.21 samples/sec Loss 1.8546 Epoch: 18 Global Step: 304200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:00:09,934-Speed 4988.82 samples/sec Loss 1.8332 Epoch: 18 Global Step: 304250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:00:19,944-Speed 5115.01 samples/sec Loss 1.8776 Epoch: 18 Global Step: 304300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:00:30,156-Speed 5013.85 samples/sec Loss 1.8577 Epoch: 18 Global Step: 304350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:00:40,242-Speed 5076.53 samples/sec Loss 1.8550 Epoch: 18 Global Step: 304400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:00:50,069-Speed 5210.49 samples/sec Loss 1.8530 Epoch: 18 Global Step: 304450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:01:00,032-Speed 5139.60 samples/sec Loss 1.8643 Epoch: 18 Global Step: 304500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:01:10,113-Speed 5079.50 samples/sec 
Loss 1.8728 Epoch: 18 Global Step: 304550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:01:20,235-Speed 5058.49 samples/sec Loss 1.8803 Epoch: 18 Global Step: 304600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:01:31,170-Speed 4682.59 samples/sec Loss 1.8648 Epoch: 18 Global Step: 304650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:01:41,081-Speed 5166.01 samples/sec Loss 1.8585 Epoch: 18 Global Step: 304700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:01:51,198-Speed 5061.41 samples/sec Loss 1.8633 Epoch: 18 Global Step: 304750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:02:01,339-Speed 5048.90 samples/sec Loss 1.8787 Epoch: 18 Global Step: 304800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:02:12,214-Speed 4708.47 samples/sec Loss 1.8931 Epoch: 18 Global Step: 304850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:02:22,405-Speed 5024.06 samples/sec Loss 1.8698 Epoch: 18 Global Step: 304900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:02:32,569-Speed 5037.80 samples/sec Loss 1.8502 Epoch: 18 Global Step: 304950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:02:43,654-Speed 4619.16 samples/sec Loss 1.8833 Epoch: 18 Global Step: 305000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:02:53,750-Speed 5071.24 samples/sec Loss 1.8503 Epoch: 18 Global Step: 305050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:03:03,583-Speed 5207.54 samples/sec Loss 1.8696 Epoch: 18 Global Step: 305100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:03:13,561-Speed 5131.41 samples/sec Loss 1.8742 Epoch: 18 Global Step: 305150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:03:24,433-Speed 4709.57 samples/sec Loss 1.8629 Epoch: 18 Global Step: 305200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:03:34,563-Speed 5054.41 
samples/sec Loss 1.8583 Epoch: 18 Global Step: 305250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:03:44,621-Speed 5090.96 samples/sec Loss 1.8628 Epoch: 18 Global Step: 305300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:03:55,606-Speed 4661.18 samples/sec Loss 1.8132 Epoch: 18 Global Step: 305350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:04:05,622-Speed 5111.78 samples/sec Loss 1.8633 Epoch: 18 Global Step: 305400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:04:15,877-Speed 4993.26 samples/sec Loss 1.8495 Epoch: 18 Global Step: 305450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:04:27,485-Speed 4410.98 samples/sec Loss 1.8601 Epoch: 18 Global Step: 305500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:04:37,416-Speed 5155.62 samples/sec Loss 1.8781 Epoch: 18 Global Step: 305550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:04:47,406-Speed 5125.70 samples/sec Loss 1.8660 Epoch: 18 Global Step: 305600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:04:57,526-Speed 5059.77 samples/sec Loss 1.8634 Epoch: 18 Global Step: 305650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:05:07,608-Speed 5078.33 samples/sec Loss 1.8495 Epoch: 18 Global Step: 305700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:05:17,539-Speed 5155.93 samples/sec Loss 1.8714 Epoch: 18 Global Step: 305750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:05:27,683-Speed 5047.53 samples/sec Loss 1.8899 Epoch: 18 Global Step: 305800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:05:37,658-Speed 5133.41 samples/sec Loss 1.8606 Epoch: 18 Global Step: 305850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:05:47,812-Speed 5042.49 samples/sec Loss 1.8405 Epoch: 18 Global Step: 305900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:05:58,619-Speed 
4738.04 samples/sec Loss 1.8820 Epoch: 18 Global Step: 305950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:06:08,738-Speed 5060.10 samples/sec Loss 1.8568 Epoch: 18 Global Step: 306000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:06:25,637-[lfw][306000]XNorm: 23.179854 Training: 2021-03-19 16:06:25,638-[lfw][306000]Accuracy-Flip: 0.99667+-0.00269 Training: 2021-03-19 16:06:25,638-[lfw][306000]Accuracy-Highest: 0.99767 Training: 2021-03-19 16:06:44,383-[cfp_fp][306000]XNorm: 19.508332 Training: 2021-03-19 16:06:44,383-[cfp_fp][306000]Accuracy-Flip: 0.97529+-0.00828 Training: 2021-03-19 16:06:44,383-[cfp_fp][306000]Accuracy-Highest: 0.97729 Training: 2021-03-19 16:07:00,520-[agedb_30][306000]XNorm: 22.455646 Training: 2021-03-19 16:07:00,521-[agedb_30][306000]Accuracy-Flip: 0.97567+-0.00775 Training: 2021-03-19 16:07:00,521-[agedb_30][306000]Accuracy-Highest: 0.97717 Training: 2021-03-19 16:07:10,476-Speed 829.31 samples/sec Loss 1.8780 Epoch: 18 Global Step: 306050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:07:20,661-Speed 5027.57 samples/sec Loss 1.8613 Epoch: 18 Global Step: 306100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:07:30,813-Speed 5043.60 samples/sec Loss 1.8640 Epoch: 18 Global Step: 306150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:07:40,856-Speed 5098.26 samples/sec Loss 1.8700 Epoch: 18 Global Step: 306200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:07:50,854-Speed 5121.44 samples/sec Loss 1.8620 Epoch: 18 Global Step: 306250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:08:00,923-Speed 5084.97 samples/sec Loss 1.8663 Epoch: 18 Global Step: 306300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:08:11,205-Speed 4980.28 samples/sec Loss 1.8597 Epoch: 18 Global Step: 306350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:08:21,226-Speed 5109.41 samples/sec Loss 1.8786 Epoch: 18 
Global Step: 306400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:08:31,378-Speed 5043.83 samples/sec Loss 1.8748 Epoch: 18 Global Step: 306450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:08:41,382-Speed 5118.14 samples/sec Loss 1.8483 Epoch: 18 Global Step: 306500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:08:51,606-Speed 5007.89 samples/sec Loss 1.8591 Epoch: 18 Global Step: 306550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:09:01,801-Speed 5022.35 samples/sec Loss 1.8643 Epoch: 18 Global Step: 306600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:09:11,740-Speed 5152.10 samples/sec Loss 1.8609 Epoch: 18 Global Step: 306650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:09:21,966-Speed 5006.69 samples/sec Loss 1.8378 Epoch: 18 Global Step: 306700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:09:32,061-Speed 5072.03 samples/sec Loss 1.8685 Epoch: 18 Global Step: 306750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:09:42,048-Speed 5127.43 samples/sec Loss 1.8701 Epoch: 18 Global Step: 306800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:09:52,059-Speed 5114.54 samples/sec Loss 1.8305 Epoch: 18 Global Step: 306850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:10:02,259-Speed 5019.98 samples/sec Loss 1.8983 Epoch: 18 Global Step: 306900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:10:12,515-Speed 4992.09 samples/sec Loss 1.8613 Epoch: 18 Global Step: 306950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:10:22,623-Speed 5065.95 samples/sec Loss 1.8795 Epoch: 18 Global Step: 307000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:10:32,627-Speed 5118.02 samples/sec Loss 1.8662 Epoch: 18 Global Step: 307050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:10:42,620-Speed 5124.18 samples/sec Loss 1.8663 Epoch: 
18 Global Step: 307100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:10:52,570-Speed 5146.03 samples/sec Loss 1.8654 Epoch: 18 Global Step: 307150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:11:02,759-Speed 5025.38 samples/sec Loss 1.8751 Epoch: 18 Global Step: 307200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:11:12,858-Speed 5070.08 samples/sec Loss 1.8691 Epoch: 18 Global Step: 307250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:11:22,950-Speed 5073.47 samples/sec Loss 1.8776 Epoch: 18 Global Step: 307300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:11:33,253-Speed 4969.40 samples/sec Loss 1.8843 Epoch: 18 Global Step: 307350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:11:43,310-Speed 5091.29 samples/sec Loss 1.8573 Epoch: 18 Global Step: 307400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:11:53,268-Speed 5142.00 samples/sec Loss 1.8362 Epoch: 18 Global Step: 307450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:12:03,449-Speed 5029.19 samples/sec Loss 1.8521 Epoch: 18 Global Step: 307500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:12:13,326-Speed 5184.07 samples/sec Loss 1.8680 Epoch: 18 Global Step: 307550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:12:23,458-Speed 5053.63 samples/sec Loss 1.8855 Epoch: 18 Global Step: 307600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:12:33,593-Speed 5051.79 samples/sec Loss 1.8759 Epoch: 18 Global Step: 307650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:12:43,601-Speed 5116.40 samples/sec Loss 1.8611 Epoch: 18 Global Step: 307700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:12:53,445-Speed 5201.32 samples/sec Loss 1.8501 Epoch: 18 Global Step: 307750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:13:03,586-Speed 5049.03 samples/sec Loss 1.8590 
Epoch: 18 Global Step: 307800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:13:13,630-Speed 5098.28 samples/sec Loss 1.8599 Epoch: 18 Global Step: 307850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:13:23,703-Speed 5083.01 samples/sec Loss 1.8803 Epoch: 18 Global Step: 307900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:13:34,511-Speed 4737.28 samples/sec Loss 1.8822 Epoch: 18 Global Step: 307950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:13:44,433-Speed 5160.56 samples/sec Loss 1.8763 Epoch: 18 Global Step: 308000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:14:01,277-[lfw][308000]XNorm: 23.265717 Training: 2021-03-19 16:14:01,277-[lfw][308000]Accuracy-Flip: 0.99667+-0.00269 Training: 2021-03-19 16:14:01,277-[lfw][308000]Accuracy-Highest: 0.99767 Training: 2021-03-19 16:14:19,939-[cfp_fp][308000]XNorm: 19.590426 Training: 2021-03-19 16:14:19,939-[cfp_fp][308000]Accuracy-Flip: 0.97529+-0.00831 Training: 2021-03-19 16:14:19,939-[cfp_fp][308000]Accuracy-Highest: 0.97729 Training: 2021-03-19 16:14:36,040-[agedb_30][308000]XNorm: 22.544177 Training: 2021-03-19 16:14:36,040-[agedb_30][308000]Accuracy-Flip: 0.97533+-0.00770 Training: 2021-03-19 16:14:36,040-[agedb_30][308000]Accuracy-Highest: 0.97717 Training: 2021-03-19 16:14:46,132-Speed 829.84 samples/sec Loss 1.8547 Epoch: 18 Global Step: 308050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:14:56,227-Speed 5072.36 samples/sec Loss 1.8489 Epoch: 18 Global Step: 308100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:15:06,570-Speed 4950.44 samples/sec Loss 1.8477 Epoch: 18 Global Step: 308150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:15:16,781-Speed 5014.65 samples/sec Loss 1.8537 Epoch: 18 Global Step: 308200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:15:27,628-Speed 4720.51 samples/sec Loss 1.8655 Epoch: 18 Global Step: 308250 Fp16 Grad 
Scale: 16384 Required: 2 hours Training: 2021-03-19 16:15:37,664-Speed 5101.94 samples/sec Loss 1.8643 Epoch: 18 Global Step: 308300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:15:48,196-Speed 4861.34 samples/sec Loss 1.8688 Epoch: 18 Global Step: 308350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:15:58,487-Speed 4975.78 samples/sec Loss 1.8596 Epoch: 18 Global Step: 308400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:16:09,520-Speed 4640.77 samples/sec Loss 1.8736 Epoch: 18 Global Step: 308450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:16:19,773-Speed 4994.05 samples/sec Loss 1.8575 Epoch: 18 Global Step: 308500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:16:30,120-Speed 4948.75 samples/sec Loss 1.8592 Epoch: 18 Global Step: 308550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:16:41,092-Speed 4666.96 samples/sec Loss 1.8902 Epoch: 18 Global Step: 308600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:16:51,435-Speed 4950.11 samples/sec Loss 1.8518 Epoch: 18 Global Step: 308650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:17:02,489-Speed 4631.96 samples/sec Loss 1.8678 Epoch: 18 Global Step: 308700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:17:12,388-Speed 5172.91 samples/sec Loss 1.8378 Epoch: 18 Global Step: 308750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:17:23,262-Speed 4708.64 samples/sec Loss 1.8440 Epoch: 18 Global Step: 308800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:17:33,459-Speed 5021.35 samples/sec Loss 1.8353 Epoch: 18 Global Step: 308850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:17:44,354-Speed 4699.63 samples/sec Loss 1.8536 Epoch: 18 Global Step: 308900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:17:54,536-Speed 5028.53 samples/sec Loss 1.8269 Epoch: 18 Global Step: 308950 Fp16 
Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:18:04,721-Speed 5027.83 samples/sec Loss 1.8641 Epoch: 18 Global Step: 309000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:18:14,697-Speed 5132.18 samples/sec Loss 1.8603 Epoch: 18 Global Step: 309050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:18:24,985-Speed 4977.19 samples/sec Loss 1.8507 Epoch: 18 Global Step: 309100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:18:35,224-Speed 5000.58 samples/sec Loss 1.8873 Epoch: 18 Global Step: 309150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:18:45,350-Speed 5056.52 samples/sec Loss 1.8400 Epoch: 18 Global Step: 309200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:18:55,537-Speed 5026.54 samples/sec Loss 1.8669 Epoch: 18 Global Step: 309250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:19:06,224-Speed 4791.07 samples/sec Loss 1.8581 Epoch: 18 Global Step: 309300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:19:16,135-Speed 5166.02 samples/sec Loss 1.8989 Epoch: 18 Global Step: 309350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:19:26,093-Speed 5141.97 samples/sec Loss 1.8740 Epoch: 18 Global Step: 309400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:19:36,092-Speed 5120.80 samples/sec Loss 1.8458 Epoch: 18 Global Step: 309450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:19:46,338-Speed 4997.13 samples/sec Loss 1.8600 Epoch: 18 Global Step: 309500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:19:56,395-Speed 5091.71 samples/sec Loss 1.8604 Epoch: 18 Global Step: 309550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:20:06,660-Speed 4987.71 samples/sec Loss 1.8660 Epoch: 18 Global Step: 309600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:20:16,987-Speed 4958.26 samples/sec Loss 1.8584 Epoch: 18 Global Step: 309650 
Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:20:27,068-Speed 5079.08 samples/sec Loss 1.8465 Epoch: 18 Global Step: 309700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:20:37,550-Speed 4884.83 samples/sec Loss 1.8625 Epoch: 18 Global Step: 309750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:20:47,775-Speed 5007.67 samples/sec Loss 1.8718 Epoch: 18 Global Step: 309800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:20:57,869-Speed 5072.71 samples/sec Loss 1.8453 Epoch: 18 Global Step: 309850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:21:07,804-Speed 5153.53 samples/sec Loss 1.8435 Epoch: 18 Global Step: 309900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:21:17,845-Speed 5099.30 samples/sec Loss 1.8630 Epoch: 18 Global Step: 309950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:21:28,063-Speed 5011.15 samples/sec Loss 1.8588 Epoch: 18 Global Step: 310000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:21:44,862-[lfw][310000]XNorm: 23.243383 Training: 2021-03-19 16:21:44,862-[lfw][310000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 16:21:44,862-[lfw][310000]Accuracy-Highest: 0.99767 Training: 2021-03-19 16:22:03,653-[cfp_fp][310000]XNorm: 19.566049 Training: 2021-03-19 16:22:03,653-[cfp_fp][310000]Accuracy-Flip: 0.97543+-0.00837 Training: 2021-03-19 16:22:03,654-[cfp_fp][310000]Accuracy-Highest: 0.97729 Training: 2021-03-19 16:22:19,768-[agedb_30][310000]XNorm: 22.534353 Training: 2021-03-19 16:22:19,769-[agedb_30][310000]Accuracy-Flip: 0.97467+-0.00752 Training: 2021-03-19 16:22:19,769-[agedb_30][310000]Accuracy-Highest: 0.97717 Training: 2021-03-19 16:22:29,869-Speed 828.41 samples/sec Loss 1.8633 Epoch: 18 Global Step: 310050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:22:39,737-Speed 5188.82 samples/sec Loss 1.8706 Epoch: 18 Global Step: 310100 Fp16 Grad Scale: 16384 Required: 2 hours 
Training: 2021-03-19 16:22:49,869-Speed 5053.48 samples/sec Loss 1.8626 Epoch: 18 Global Step: 310150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:23:00,138-Speed 4986.14 samples/sec Loss 1.8595 Epoch: 18 Global Step: 310200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:23:10,223-Speed 5077.09 samples/sec Loss 1.8838 Epoch: 18 Global Step: 310250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:23:20,166-Speed 5149.60 samples/sec Loss 1.8804 Epoch: 18 Global Step: 310300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:23:30,316-Speed 5044.58 samples/sec Loss 1.8240 Epoch: 18 Global Step: 310350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-19 16:23:40,514-Speed 5021.39 samples/sec Loss 1.8492 Epoch: 18 Global Step: 310400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:23:50,791-Speed 4982.02 samples/sec Loss 1.8610 Epoch: 18 Global Step: 310450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:24:01,127-Speed 4954.06 samples/sec Loss 1.8625 Epoch: 18 Global Step: 310500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:24:11,305-Speed 5030.60 samples/sec Loss 1.8813 Epoch: 18 Global Step: 310550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:24:21,439-Speed 5052.65 samples/sec Loss 1.8944 Epoch: 18 Global Step: 310600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:24:31,936-Speed 4878.05 samples/sec Loss 1.8675 Epoch: 18 Global Step: 310650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:24:42,219-Speed 4979.11 samples/sec Loss 1.8626 Epoch: 18 Global Step: 310700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:24:52,420-Speed 5019.35 samples/sec Loss 1.8792 Epoch: 18 Global Step: 310750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:25:02,570-Speed 5044.75 samples/sec Loss 1.8868 Epoch: 18 Global Step: 310800 Fp16 Grad Scale: 16384 Required: 1 
hours Training: 2021-03-19 16:25:12,676-Speed 5066.53 samples/sec Loss 1.8694 Epoch: 18 Global Step: 310850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:25:23,038-Speed 4941.63 samples/sec Loss 1.8568 Epoch: 18 Global Step: 310900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:25:33,443-Speed 4920.79 samples/sec Loss 1.8772 Epoch: 18 Global Step: 310950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:25:43,659-Speed 5012.11 samples/sec Loss 1.8575 Epoch: 18 Global Step: 311000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:25:53,795-Speed 5051.58 samples/sec Loss 1.8761 Epoch: 18 Global Step: 311050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:26:03,806-Speed 5114.76 samples/sec Loss 1.8674 Epoch: 18 Global Step: 311100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:26:13,805-Speed 5120.75 samples/sec Loss 1.8786 Epoch: 18 Global Step: 311150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:26:23,823-Speed 5111.19 samples/sec Loss 1.8696 Epoch: 18 Global Step: 311200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:26:33,929-Speed 5066.81 samples/sec Loss 1.8451 Epoch: 18 Global Step: 311250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:26:43,934-Speed 5117.65 samples/sec Loss 1.8651 Epoch: 18 Global Step: 311300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:26:54,869-Speed 4682.29 samples/sec Loss 1.8479 Epoch: 18 Global Step: 311350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:27:05,066-Speed 5021.56 samples/sec Loss 1.8750 Epoch: 18 Global Step: 311400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:27:15,270-Speed 5017.81 samples/sec Loss 1.8762 Epoch: 18 Global Step: 311450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:27:25,551-Speed 4980.35 samples/sec Loss 1.8622 Epoch: 18 Global Step: 311500 Fp16 Grad Scale: 16384 Required: 
1 hours Training: 2021-03-19 16:27:36,438-Speed 4703.29 samples/sec Loss 1.8587 Epoch: 18 Global Step: 311550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:27:46,485-Speed 5096.38 samples/sec Loss 1.8788 Epoch: 18 Global Step: 311600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:27:56,481-Speed 5122.10 samples/sec Loss 1.8316 Epoch: 18 Global Step: 311650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:28:06,731-Speed 4995.48 samples/sec Loss 1.8672 Epoch: 18 Global Step: 311700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:28:16,762-Speed 5104.47 samples/sec Loss 1.8461 Epoch: 18 Global Step: 311750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:28:26,951-Speed 5025.09 samples/sec Loss 1.8598 Epoch: 18 Global Step: 311800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:28:38,197-Speed 4553.31 samples/sec Loss 1.8662 Epoch: 18 Global Step: 311850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:28:48,189-Speed 5124.31 samples/sec Loss 1.8717 Epoch: 18 Global Step: 311900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:28:59,162-Speed 4666.42 samples/sec Loss 1.8652 Epoch: 18 Global Step: 311950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:29:09,409-Speed 4996.59 samples/sec Loss 1.8678 Epoch: 18 Global Step: 312000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:29:26,257-[lfw][312000]XNorm: 23.245253 Training: 2021-03-19 16:29:26,258-[lfw][312000]Accuracy-Flip: 0.99667+-0.00269 Training: 2021-03-19 16:29:26,258-[lfw][312000]Accuracy-Highest: 0.99767 Training: 2021-03-19 16:29:44,945-[cfp_fp][312000]XNorm: 19.556946 Training: 2021-03-19 16:29:44,946-[cfp_fp][312000]Accuracy-Flip: 0.97614+-0.00919 Training: 2021-03-19 16:29:44,946-[cfp_fp][312000]Accuracy-Highest: 0.97729 Training: 2021-03-19 16:30:01,088-[agedb_30][312000]XNorm: 22.528726 Training: 2021-03-19 
16:30:01,088-[agedb_30][312000]Accuracy-Flip: 0.97417+-0.00765 Training: 2021-03-19 16:30:01,088-[agedb_30][312000]Accuracy-Highest: 0.97717 Training: 2021-03-19 16:30:10,982-Speed 831.54 samples/sec Loss 1.8651 Epoch: 18 Global Step: 312050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:30:21,308-Speed 4958.65 samples/sec Loss 1.8580 Epoch: 18 Global Step: 312100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:30:32,139-Speed 4727.72 samples/sec Loss 1.8477 Epoch: 18 Global Step: 312150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:30:42,941-Speed 4740.05 samples/sec Loss 1.8596 Epoch: 18 Global Step: 312200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:30:53,160-Speed 5010.94 samples/sec Loss 1.8480 Epoch: 18 Global Step: 312250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:31:04,082-Speed 4687.95 samples/sec Loss 1.8473 Epoch: 18 Global Step: 312300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:31:14,196-Speed 5062.75 samples/sec Loss 1.8698 Epoch: 18 Global Step: 312350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:31:24,408-Speed 5013.56 samples/sec Loss 1.8660 Epoch: 18 Global Step: 312400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:31:34,604-Speed 5022.49 samples/sec Loss 1.8498 Epoch: 18 Global Step: 312450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:31:44,588-Speed 5128.29 samples/sec Loss 1.8647 Epoch: 18 Global Step: 312500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:31:54,657-Speed 5085.10 samples/sec Loss 1.8777 Epoch: 18 Global Step: 312550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:32:04,814-Speed 5041.49 samples/sec Loss 1.8447 Epoch: 18 Global Step: 312600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:32:15,657-Speed 4722.16 samples/sec Loss 1.8472 Epoch: 18 Global Step: 312650 Fp16 Grad Scale: 16384 Required: 1 hours 
Training: 2021-03-19 16:32:25,942-Speed 4978.62 samples/sec Loss 1.8659 Epoch: 18 Global Step: 312700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:32:35,955-Speed 5113.40 samples/sec Loss 1.8789 Epoch: 18 Global Step: 312750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:32:46,296-Speed 4951.71 samples/sec Loss 1.8680 Epoch: 18 Global Step: 312800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:32:56,390-Speed 5072.44 samples/sec Loss 1.8737 Epoch: 18 Global Step: 312850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:33:06,701-Speed 4966.21 samples/sec Loss 1.8464 Epoch: 18 Global Step: 312900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:33:16,772-Speed 5083.93 samples/sec Loss 1.8578 Epoch: 18 Global Step: 312950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:33:26,847-Speed 5082.26 samples/sec Loss 1.8754 Epoch: 18 Global Step: 313000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:33:37,085-Speed 5000.96 samples/sec Loss 1.8589 Epoch: 18 Global Step: 313050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:33:47,191-Speed 5066.92 samples/sec Loss 1.8844 Epoch: 18 Global Step: 313100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:33:57,250-Speed 5090.29 samples/sec Loss 1.8313 Epoch: 18 Global Step: 313150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:34:07,368-Speed 5060.59 samples/sec Loss 1.8600 Epoch: 18 Global Step: 313200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:34:17,549-Speed 5029.00 samples/sec Loss 1.8676 Epoch: 18 Global Step: 313250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:34:27,560-Speed 5114.73 samples/sec Loss 1.8662 Epoch: 18 Global Step: 313300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:34:37,767-Speed 5016.49 samples/sec Loss 1.8593 Epoch: 18 Global Step: 313350 Fp16 Grad Scale: 16384 Required: 1 
hours Training: 2021-03-19 16:34:47,952-Speed 5027.07 samples/sec Loss 1.8555 Epoch: 18 Global Step: 313400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:34:58,194-Speed 4999.61 samples/sec Loss 1.8667 Epoch: 18 Global Step: 313450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:35:08,453-Speed 4990.89 samples/sec Loss 1.8545 Epoch: 18 Global Step: 313500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:35:18,354-Speed 5171.67 samples/sec Loss 1.8421 Epoch: 18 Global Step: 313550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:35:28,339-Speed 5127.69 samples/sec Loss 1.8808 Epoch: 18 Global Step: 313600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:35:38,511-Speed 5033.78 samples/sec Loss 1.8715 Epoch: 18 Global Step: 313650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:35:48,638-Speed 5056.26 samples/sec Loss 1.8837 Epoch: 18 Global Step: 313700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:35:58,931-Speed 4974.47 samples/sec Loss 1.8586 Epoch: 18 Global Step: 313750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:36:09,076-Speed 5047.13 samples/sec Loss 1.8884 Epoch: 18 Global Step: 313800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:36:19,155-Speed 5080.04 samples/sec Loss 1.8867 Epoch: 18 Global Step: 313850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:36:29,345-Speed 5024.37 samples/sec Loss 1.8568 Epoch: 18 Global Step: 313900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:36:39,517-Speed 5033.99 samples/sec Loss 1.8439 Epoch: 18 Global Step: 313950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:36:49,657-Speed 5049.65 samples/sec Loss 1.8681 Epoch: 18 Global Step: 314000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:37:06,401-[lfw][314000]XNorm: 23.280405 Training: 2021-03-19 16:37:06,402-[lfw][314000]Accuracy-Flip: 
0.99700+-0.00277 Training: 2021-03-19 16:37:06,402-[lfw][314000]Accuracy-Highest: 0.99767 Training: 2021-03-19 16:37:25,077-[cfp_fp][314000]XNorm: 19.611576 Training: 2021-03-19 16:37:25,077-[cfp_fp][314000]Accuracy-Flip: 0.97714+-0.00867 Training: 2021-03-19 16:37:25,077-[cfp_fp][314000]Accuracy-Highest: 0.97729 Training: 2021-03-19 16:37:41,214-[agedb_30][314000]XNorm: 22.553787 Training: 2021-03-19 16:37:41,214-[agedb_30][314000]Accuracy-Flip: 0.97600+-0.00680 Training: 2021-03-19 16:37:41,214-[agedb_30][314000]Accuracy-Highest: 0.97717 Training: 2021-03-19 16:37:51,011-Speed 834.51 samples/sec Loss 1.8542 Epoch: 18 Global Step: 314050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:38:00,754-Speed 5255.09 samples/sec Loss 1.8609 Epoch: 18 Global Step: 314100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:38:10,924-Speed 5034.89 samples/sec Loss 1.8541 Epoch: 18 Global Step: 314150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:38:21,183-Speed 4991.18 samples/sec Loss 1.8798 Epoch: 18 Global Step: 314200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:38:31,206-Speed 5108.44 samples/sec Loss 1.8819 Epoch: 18 Global Step: 314250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:38:41,198-Speed 5124.62 samples/sec Loss 1.8655 Epoch: 18 Global Step: 314300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:38:51,235-Speed 5101.41 samples/sec Loss 1.8855 Epoch: 18 Global Step: 314350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:39:01,450-Speed 5012.41 samples/sec Loss 1.8735 Epoch: 18 Global Step: 314400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:39:11,705-Speed 4993.26 samples/sec Loss 1.8787 Epoch: 18 Global Step: 314450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:39:21,999-Speed 4973.89 samples/sec Loss 1.8749 Epoch: 18 Global Step: 314500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 
16:39:31,982-Speed 5128.96 samples/sec Loss 1.8621 Epoch: 18 Global Step: 314550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:39:42,404-Speed 4912.95 samples/sec Loss 1.8799 Epoch: 18 Global Step: 314600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:39:52,689-Speed 4978.24 samples/sec Loss 1.8514 Epoch: 18 Global Step: 314650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:40:03,485-Speed 4742.76 samples/sec Loss 1.8601 Epoch: 18 Global Step: 314700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:40:13,447-Speed 5140.19 samples/sec Loss 1.8673 Epoch: 18 Global Step: 314750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:40:23,416-Speed 5135.87 samples/sec Loss 1.8721 Epoch: 18 Global Step: 314800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:40:33,671-Speed 4993.18 samples/sec Loss 1.8724 Epoch: 18 Global Step: 314850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:40:44,538-Speed 4711.78 samples/sec Loss 1.8647 Epoch: 18 Global Step: 314900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:40:54,717-Speed 5030.18 samples/sec Loss 1.8636 Epoch: 18 Global Step: 314950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:41:04,973-Speed 4992.26 samples/sec Loss 1.8657 Epoch: 18 Global Step: 315000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:41:15,051-Speed 5080.92 samples/sec Loss 1.8648 Epoch: 18 Global Step: 315050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:41:25,082-Speed 5104.23 samples/sec Loss 1.8376 Epoch: 18 Global Step: 315100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:41:35,946-Speed 4713.18 samples/sec Loss 1.8760 Epoch: 18 Global Step: 315150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:41:46,147-Speed 5019.44 samples/sec Loss 1.8786 Epoch: 18 Global Step: 315200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 
2021-03-19 16:41:56,363-Speed 5012.35 samples/sec Loss 1.8773 Epoch: 18 Global Step: 315250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:42:06,402-Speed 5100.35 samples/sec Loss 1.8702 Epoch: 18 Global Step: 315300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:42:17,106-Speed 4783.73 samples/sec Loss 1.8702 Epoch: 18 Global Step: 315350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:42:27,124-Speed 5111.26 samples/sec Loss 1.8468 Epoch: 18 Global Step: 315400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:42:37,243-Speed 5060.06 samples/sec Loss 1.8724 Epoch: 18 Global Step: 315450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:42:48,430-Speed 4576.63 samples/sec Loss 1.8538 Epoch: 18 Global Step: 315500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:42:58,512-Speed 5078.71 samples/sec Loss 1.8726 Epoch: 18 Global Step: 315550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:43:09,519-Speed 4652.09 samples/sec Loss 1.8763 Epoch: 18 Global Step: 315600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:43:20,291-Speed 4753.09 samples/sec Loss 1.8659 Epoch: 18 Global Step: 315650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:43:30,613-Speed 4960.75 samples/sec Loss 1.8894 Epoch: 18 Global Step: 315700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:43:40,707-Speed 5072.63 samples/sec Loss 1.8588 Epoch: 18 Global Step: 315750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:43:50,673-Speed 5137.31 samples/sec Loss 1.8790 Epoch: 18 Global Step: 315800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:44:00,802-Speed 5055.37 samples/sec Loss 1.8657 Epoch: 18 Global Step: 315850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:44:10,988-Speed 5027.05 samples/sec Loss 1.8466 Epoch: 18 Global Step: 315900 Fp16 Grad Scale: 16384 Required: 1 hours 
Training: 2021-03-19 16:44:21,333-Speed 4949.60 samples/sec Loss 1.8530 Epoch: 18 Global Step: 315950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:44:31,612-Speed 4981.32 samples/sec Loss 1.8734 Epoch: 18 Global Step: 316000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:44:48,309-[lfw][316000]XNorm: 23.325958 Training: 2021-03-19 16:44:48,310-[lfw][316000]Accuracy-Flip: 0.99733+-0.00271 Training: 2021-03-19 16:44:48,310-[lfw][316000]Accuracy-Highest: 0.99767 Training: 2021-03-19 16:45:06,925-[cfp_fp][316000]XNorm: 19.624735 Training: 2021-03-19 16:45:06,926-[cfp_fp][316000]Accuracy-Flip: 0.97600+-0.00901 Training: 2021-03-19 16:45:06,926-[cfp_fp][316000]Accuracy-Highest: 0.97729 Training: 2021-03-19 16:45:23,027-[agedb_30][316000]XNorm: 22.583288 Training: 2021-03-19 16:45:23,027-[agedb_30][316000]Accuracy-Flip: 0.97583+-0.00731 Training: 2021-03-19 16:45:23,027-[agedb_30][316000]Accuracy-Highest: 0.97717 Training: 2021-03-19 16:45:33,049-Speed 833.37 samples/sec Loss 1.8382 Epoch: 18 Global Step: 316050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:45:42,978-Speed 5157.00 samples/sec Loss 1.8899 Epoch: 18 Global Step: 316100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:45:53,250-Speed 4984.75 samples/sec Loss 1.8612 Epoch: 18 Global Step: 316150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:46:03,439-Speed 5025.30 samples/sec Loss 1.8944 Epoch: 18 Global Step: 316200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:46:13,547-Speed 5065.59 samples/sec Loss 1.8382 Epoch: 18 Global Step: 316250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:46:23,762-Speed 5012.94 samples/sec Loss 1.8462 Epoch: 18 Global Step: 316300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:46:33,998-Speed 5001.99 samples/sec Loss 1.8497 Epoch: 18 Global Step: 316350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:46:44,173-Speed 
5032.38 samples/sec Loss 1.8690 Epoch: 18 Global Step: 316400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:46:54,403-Speed 5005.19 samples/sec Loss 1.8782 Epoch: 18 Global Step: 316450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:47:04,679-Speed 4982.54 samples/sec Loss 1.8854 Epoch: 18 Global Step: 316500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:47:14,921-Speed 4999.25 samples/sec Loss 1.8689 Epoch: 18 Global Step: 316550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:47:25,107-Speed 5026.74 samples/sec Loss 1.8573 Epoch: 18 Global Step: 316600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:47:35,218-Speed 5064.27 samples/sec Loss 1.8974 Epoch: 18 Global Step: 316650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:47:45,375-Speed 5041.47 samples/sec Loss 1.8341 Epoch: 18 Global Step: 316700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:47:55,293-Speed 5162.85 samples/sec Loss 1.8402 Epoch: 18 Global Step: 316750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:48:05,350-Speed 5091.20 samples/sec Loss 1.8483 Epoch: 18 Global Step: 316800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:48:15,441-Speed 5074.12 samples/sec Loss 1.8511 Epoch: 18 Global Step: 316850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:48:25,573-Speed 5053.25 samples/sec Loss 1.8875 Epoch: 18 Global Step: 316900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:48:35,524-Speed 5146.05 samples/sec Loss 1.8716 Epoch: 18 Global Step: 316950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:48:45,718-Speed 5022.75 samples/sec Loss 1.8557 Epoch: 18 Global Step: 317000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:48:56,269-Speed 4852.95 samples/sec Loss 1.8669 Epoch: 18 Global Step: 317050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 
16:49:06,464-Speed 5022.43 samples/sec Loss 1.8637 Epoch: 18 Global Step: 317100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:49:28,730-Speed 2299.52 samples/sec Loss 1.8652 Epoch: 19 Global Step: 317150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:49:39,195-Speed 4892.90 samples/sec Loss 1.8631 Epoch: 19 Global Step: 317200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:49:49,596-Speed 4922.84 samples/sec Loss 1.8528 Epoch: 19 Global Step: 317250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:49:59,948-Speed 4946.61 samples/sec Loss 1.8563 Epoch: 19 Global Step: 317300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:50:10,089-Speed 5049.05 samples/sec Loss 1.8608 Epoch: 19 Global Step: 317350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:50:20,218-Speed 5055.31 samples/sec Loss 1.8561 Epoch: 19 Global Step: 317400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:50:30,746-Speed 4863.29 samples/sec Loss 1.8622 Epoch: 19 Global Step: 317450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:50:40,862-Speed 5061.66 samples/sec Loss 1.8629 Epoch: 19 Global Step: 317500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:50:51,283-Speed 4913.64 samples/sec Loss 1.8578 Epoch: 19 Global Step: 317550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:51:01,690-Speed 4919.99 samples/sec Loss 1.8828 Epoch: 19 Global Step: 317600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:51:11,860-Speed 5035.03 samples/sec Loss 1.8688 Epoch: 19 Global Step: 317650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:51:22,220-Speed 4941.96 samples/sec Loss 1.8405 Epoch: 19 Global Step: 317700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:51:32,583-Speed 4941.24 samples/sec Loss 1.8510 Epoch: 19 Global Step: 317750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 
2021-03-19 16:51:42,722-Speed 5050.09 samples/sec Loss 1.8796 Epoch: 19 Global Step: 317800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:51:52,812-Speed 5074.41 samples/sec Loss 1.8868 Epoch: 19 Global Step: 317850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:52:02,951-Speed 5050.23 samples/sec Loss 1.8747 Epoch: 19 Global Step: 317900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:52:13,398-Speed 4901.31 samples/sec Loss 1.8521 Epoch: 19 Global Step: 317950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:52:23,585-Speed 5026.59 samples/sec Loss 1.8731 Epoch: 19 Global Step: 318000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:52:40,481-[lfw][318000]XNorm: 23.236754 Training: 2021-03-19 16:52:40,482-[lfw][318000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 16:52:40,482-[lfw][318000]Accuracy-Highest: 0.99767 Training: 2021-03-19 16:52:59,408-[cfp_fp][318000]XNorm: 19.552952 Training: 2021-03-19 16:52:59,408-[cfp_fp][318000]Accuracy-Flip: 0.97543+-0.00859 Training: 2021-03-19 16:52:59,408-[cfp_fp][318000]Accuracy-Highest: 0.97729 Training: 2021-03-19 16:53:15,578-[agedb_30][318000]XNorm: 22.500122 Training: 2021-03-19 16:53:15,578-[agedb_30][318000]Accuracy-Flip: 0.97650+-0.00747 Training: 2021-03-19 16:53:15,578-[agedb_30][318000]Accuracy-Highest: 0.97717 Training: 2021-03-19 16:53:26,298-Speed 816.42 samples/sec Loss 1.8735 Epoch: 19 Global Step: 318050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:53:36,528-Speed 5005.08 samples/sec Loss 1.8611 Epoch: 19 Global Step: 318100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:53:46,891-Speed 4940.80 samples/sec Loss 1.8397 Epoch: 19 Global Step: 318150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:53:56,883-Speed 5124.73 samples/sec Loss 1.8440 Epoch: 19 Global Step: 318200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:54:07,018-Speed 5051.96 
samples/sec Loss 1.8724 Epoch: 19 Global Step: 318250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:54:17,165-Speed 5045.96 samples/sec Loss 1.8638 Epoch: 19 Global Step: 318300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:54:27,795-Speed 4816.97 samples/sec Loss 1.8685 Epoch: 19 Global Step: 318350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:54:37,917-Speed 5058.58 samples/sec Loss 1.8722 Epoch: 19 Global Step: 318400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:54:48,202-Speed 4978.38 samples/sec Loss 1.8489 Epoch: 19 Global Step: 318450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:54:58,387-Speed 5027.12 samples/sec Loss 1.8501 Epoch: 19 Global Step: 318500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:55:09,934-Speed 4434.32 samples/sec Loss 1.9050 Epoch: 19 Global Step: 318550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:55:20,058-Speed 5057.58 samples/sec Loss 1.8500 Epoch: 19 Global Step: 318600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:55:30,118-Speed 5089.86 samples/sec Loss 1.8675 Epoch: 19 Global Step: 318650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:55:40,922-Speed 4739.23 samples/sec Loss 1.8651 Epoch: 19 Global Step: 318700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:55:51,028-Speed 5066.36 samples/sec Loss 1.8781 Epoch: 19 Global Step: 318750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:56:01,144-Speed 5061.44 samples/sec Loss 1.8688 Epoch: 19 Global Step: 318800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:56:11,451-Speed 4967.87 samples/sec Loss 1.8700 Epoch: 19 Global Step: 318850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:56:21,525-Speed 5082.59 samples/sec Loss 1.8542 Epoch: 19 Global Step: 318900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:56:32,440-Speed 
4691.07 samples/sec Loss 1.8575 Epoch: 19 Global Step: 318950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:56:44,262-Speed 4331.00 samples/sec Loss 1.8560 Epoch: 19 Global Step: 319000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:56:54,592-Speed 4957.09 samples/sec Loss 1.8550 Epoch: 19 Global Step: 319050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:57:04,676-Speed 5077.40 samples/sec Loss 1.8706 Epoch: 19 Global Step: 319100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:57:14,723-Speed 5096.33 samples/sec Loss 1.8357 Epoch: 19 Global Step: 319150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:57:24,763-Speed 5099.93 samples/sec Loss 1.8630 Epoch: 19 Global Step: 319200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:57:35,065-Speed 4970.52 samples/sec Loss 1.8499 Epoch: 19 Global Step: 319250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:57:45,067-Speed 5119.33 samples/sec Loss 1.8724 Epoch: 19 Global Step: 319300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:57:56,118-Speed 4633.48 samples/sec Loss 1.8607 Epoch: 19 Global Step: 319350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:58:06,047-Speed 5156.45 samples/sec Loss 1.8672 Epoch: 19 Global Step: 319400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:58:16,188-Speed 5049.13 samples/sec Loss 1.8581 Epoch: 19 Global Step: 319450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:58:26,314-Speed 5056.63 samples/sec Loss 1.8699 Epoch: 19 Global Step: 319500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:58:36,459-Speed 5047.11 samples/sec Loss 1.8751 Epoch: 19 Global Step: 319550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:58:46,688-Speed 5005.67 samples/sec Loss 1.8618 Epoch: 19 Global Step: 319600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 
16:58:56,721-Speed 5103.66 samples/sec Loss 1.8500 Epoch: 19 Global Step: 319650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:59:06,918-Speed 5021.19 samples/sec Loss 1.8961 Epoch: 19 Global Step: 319700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:59:17,048-Speed 5054.21 samples/sec Loss 1.8690 Epoch: 19 Global Step: 319750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:59:27,485-Speed 4906.23 samples/sec Loss 1.8520 Epoch: 19 Global Step: 319800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:59:37,512-Speed 5106.41 samples/sec Loss 1.8629 Epoch: 19 Global Step: 319850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:59:47,600-Speed 5075.62 samples/sec Loss 1.8591 Epoch: 19 Global Step: 319900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 16:59:57,667-Speed 5086.54 samples/sec Loss 1.8479 Epoch: 19 Global Step: 319950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:00:07,896-Speed 5005.48 samples/sec Loss 1.8581 Epoch: 19 Global Step: 320000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:00:24,689-[lfw][320000]XNorm: 23.293376 Training: 2021-03-19 17:00:24,689-[lfw][320000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 17:00:24,689-[lfw][320000]Accuracy-Highest: 0.99767 Training: 2021-03-19 17:00:43,489-[cfp_fp][320000]XNorm: 19.617996 Training: 2021-03-19 17:00:43,489-[cfp_fp][320000]Accuracy-Flip: 0.97571+-0.00899 Training: 2021-03-19 17:00:43,489-[cfp_fp][320000]Accuracy-Highest: 0.97729 Training: 2021-03-19 17:00:59,768-[agedb_30][320000]XNorm: 22.540793 Training: 2021-03-19 17:00:59,768-[agedb_30][320000]Accuracy-Flip: 0.97633+-0.00733 Training: 2021-03-19 17:00:59,768-[agedb_30][320000]Accuracy-Highest: 0.97717 Training: 2021-03-19 17:01:09,762-Speed 827.60 samples/sec Loss 1.8689 Epoch: 19 Global Step: 320050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:01:19,844-Speed 5078.91 samples/sec 
Loss 1.8538 Epoch: 19 Global Step: 320100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:01:30,108-Speed 4988.66 samples/sec Loss 1.8347 Epoch: 19 Global Step: 320150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:01:40,133-Speed 5107.46 samples/sec Loss 1.8759 Epoch: 19 Global Step: 320200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:01:50,237-Speed 5067.23 samples/sec Loss 1.8814 Epoch: 19 Global Step: 320250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:02:00,414-Speed 5031.62 samples/sec Loss 1.8520 Epoch: 19 Global Step: 320300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:02:10,653-Speed 5000.67 samples/sec Loss 1.8554 Epoch: 19 Global Step: 320350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:02:21,125-Speed 4889.64 samples/sec Loss 1.8594 Epoch: 19 Global Step: 320400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:02:31,551-Speed 4910.71 samples/sec Loss 1.8405 Epoch: 19 Global Step: 320450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:02:41,942-Speed 4927.54 samples/sec Loss 1.8866 Epoch: 19 Global Step: 320500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:02:51,934-Speed 5124.74 samples/sec Loss 1.8264 Epoch: 19 Global Step: 320550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:03:02,224-Speed 4975.67 samples/sec Loss 1.8388 Epoch: 19 Global Step: 320600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:03:12,323-Speed 5070.43 samples/sec Loss 1.8686 Epoch: 19 Global Step: 320650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:03:22,563-Speed 5000.23 samples/sec Loss 1.8649 Epoch: 19 Global Step: 320700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:03:32,559-Speed 5122.47 samples/sec Loss 1.8506 Epoch: 19 Global Step: 320750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:03:42,669-Speed 5064.26 
samples/sec Loss 1.8501 Epoch: 19 Global Step: 320800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:03:52,883-Speed 5013.27 samples/sec Loss 1.8415 Epoch: 19 Global Step: 320850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:04:03,038-Speed 5041.97 samples/sec Loss 1.8421 Epoch: 19 Global Step: 320900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:04:13,080-Speed 5098.91 samples/sec Loss 1.8728 Epoch: 19 Global Step: 320950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:04:23,268-Speed 5025.73 samples/sec Loss 1.8712 Epoch: 19 Global Step: 321000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:04:33,370-Speed 5068.64 samples/sec Loss 1.8692 Epoch: 19 Global Step: 321050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:04:43,483-Speed 5063.36 samples/sec Loss 1.8685 Epoch: 19 Global Step: 321100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:04:53,663-Speed 5029.70 samples/sec Loss 1.8585 Epoch: 19 Global Step: 321150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:05:03,742-Speed 5080.48 samples/sec Loss 1.8319 Epoch: 19 Global Step: 321200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:05:14,083-Speed 4951.26 samples/sec Loss 1.8744 Epoch: 19 Global Step: 321250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:05:24,223-Speed 5049.63 samples/sec Loss 1.8630 Epoch: 19 Global Step: 321300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:05:34,203-Speed 5130.90 samples/sec Loss 1.8409 Epoch: 19 Global Step: 321350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:05:44,438-Speed 5002.80 samples/sec Loss 1.8527 Epoch: 19 Global Step: 321400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:05:55,341-Speed 4696.14 samples/sec Loss 1.8508 Epoch: 19 Global Step: 321450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:06:05,291-Speed 
5146.28 samples/sec Loss 1.8566 Epoch: 19 Global Step: 321500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:06:15,551-Speed 4990.19 samples/sec Loss 1.8475 Epoch: 19 Global Step: 321550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:06:26,452-Speed 4697.49 samples/sec Loss 1.8521 Epoch: 19 Global Step: 321600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:06:36,707-Speed 4992.75 samples/sec Loss 1.8285 Epoch: 19 Global Step: 321650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:06:46,649-Speed 5150.47 samples/sec Loss 1.8595 Epoch: 19 Global Step: 321700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:06:56,786-Speed 5050.85 samples/sec Loss 1.8334 Epoch: 19 Global Step: 321750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:07:07,006-Speed 5009.83 samples/sec Loss 1.8715 Epoch: 19 Global Step: 321800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:07:17,177-Speed 5034.28 samples/sec Loss 1.8799 Epoch: 19 Global Step: 321850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:07:27,204-Speed 5106.67 samples/sec Loss 1.8705 Epoch: 19 Global Step: 321900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:07:38,207-Speed 4653.31 samples/sec Loss 1.8622 Epoch: 19 Global Step: 321950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:07:48,462-Speed 4993.20 samples/sec Loss 1.8631 Epoch: 19 Global Step: 322000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:08:04,746-[lfw][322000]XNorm: 23.234600 Training: 2021-03-19 17:08:04,747-[lfw][322000]Accuracy-Flip: 0.99700+-0.00277 Training: 2021-03-19 17:08:04,747-[lfw][322000]Accuracy-Highest: 0.99767 Training: 2021-03-19 17:08:23,320-[cfp_fp][322000]XNorm: 19.564163 Training: 2021-03-19 17:08:23,321-[cfp_fp][322000]Accuracy-Flip: 0.97600+-0.00868 Training: 2021-03-19 17:08:23,321-[cfp_fp][322000]Accuracy-Highest: 0.97729 Training: 2021-03-19 
17:08:39,411-[agedb_30][322000]XNorm: 22.508955 Training: 2021-03-19 17:08:39,411-[agedb_30][322000]Accuracy-Flip: 0.97450+-0.00778 Training: 2021-03-19 17:08:39,411-[agedb_30][322000]Accuracy-Highest: 0.97717 Training: 2021-03-19 17:08:49,559-Speed 838.02 samples/sec Loss 1.8587 Epoch: 19 Global Step: 322050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:09:00,552-Speed 4658.02 samples/sec Loss 1.8724 Epoch: 19 Global Step: 322100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:09:10,601-Speed 5095.38 samples/sec Loss 1.8576 Epoch: 19 Global Step: 322150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:09:21,109-Speed 4872.48 samples/sec Loss 1.8580 Epoch: 19 Global Step: 322200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:09:31,345-Speed 5002.45 samples/sec Loss 1.8612 Epoch: 19 Global Step: 322250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:09:42,793-Speed 4472.60 samples/sec Loss 1.8823 Epoch: 19 Global Step: 322300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:09:53,932-Speed 4596.71 samples/sec Loss 1.8584 Epoch: 19 Global Step: 322350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:10:03,930-Speed 5121.21 samples/sec Loss 1.8380 Epoch: 19 Global Step: 322400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:10:14,300-Speed 4937.69 samples/sec Loss 1.8579 Epoch: 19 Global Step: 322450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:10:24,369-Speed 5085.42 samples/sec Loss 1.8602 Epoch: 19 Global Step: 322500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:10:34,583-Speed 5012.98 samples/sec Loss 1.8557 Epoch: 19 Global Step: 322550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:10:44,631-Speed 5095.97 samples/sec Loss 1.8584 Epoch: 19 Global Step: 322600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:10:54,792-Speed 5039.24 samples/sec Loss 1.8751 
Epoch: 19 Global Step: 322650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:11:05,107-Speed 4963.52 samples/sec Loss 1.8433 Epoch: 19 Global Step: 322700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:11:15,717-Speed 4826.02 samples/sec Loss 1.8624 Epoch: 19 Global Step: 322750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:11:26,077-Speed 4942.57 samples/sec Loss 1.8673 Epoch: 19 Global Step: 322800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:11:36,271-Speed 5023.00 samples/sec Loss 1.8970 Epoch: 19 Global Step: 322850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:11:46,496-Speed 5007.85 samples/sec Loss 1.8605 Epoch: 19 Global Step: 322900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:11:56,723-Speed 5006.64 samples/sec Loss 1.8458 Epoch: 19 Global Step: 322950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:12:07,007-Speed 4978.81 samples/sec Loss 1.8815 Epoch: 19 Global Step: 323000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:12:17,132-Speed 5056.92 samples/sec Loss 1.8355 Epoch: 19 Global Step: 323050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:12:27,200-Speed 5085.58 samples/sec Loss 1.8333 Epoch: 19 Global Step: 323100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:12:37,210-Speed 5115.11 samples/sec Loss 1.8772 Epoch: 19 Global Step: 323150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:12:47,280-Speed 5084.80 samples/sec Loss 1.8707 Epoch: 19 Global Step: 323200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:12:57,440-Speed 5039.66 samples/sec Loss 1.8592 Epoch: 19 Global Step: 323250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:13:07,596-Speed 5041.85 samples/sec Loss 1.8598 Epoch: 19 Global Step: 323300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:13:17,749-Speed 5043.18 samples/sec Loss 
1.8577 Epoch: 19 Global Step: 323350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:13:27,875-Speed 5056.25 samples/sec Loss 1.8611 Epoch: 19 Global Step: 323400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:13:38,043-Speed 5035.75 samples/sec Loss 1.8437 Epoch: 19 Global Step: 323450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:13:48,057-Speed 5113.08 samples/sec Loss 1.8610 Epoch: 19 Global Step: 323500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:13:57,962-Speed 5169.59 samples/sec Loss 1.8608 Epoch: 19 Global Step: 323550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:14:07,937-Speed 5133.48 samples/sec Loss 1.8580 Epoch: 19 Global Step: 323600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:14:17,889-Speed 5144.65 samples/sec Loss 1.8763 Epoch: 19 Global Step: 323650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:14:27,923-Speed 5103.28 samples/sec Loss 1.8499 Epoch: 19 Global Step: 323700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:14:38,498-Speed 4841.83 samples/sec Loss 1.8728 Epoch: 19 Global Step: 323750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:14:48,439-Speed 5150.54 samples/sec Loss 1.8321 Epoch: 19 Global Step: 323800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:14:58,517-Speed 5080.67 samples/sec Loss 1.8621 Epoch: 19 Global Step: 323850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:15:08,778-Speed 4990.07 samples/sec Loss 1.8417 Epoch: 19 Global Step: 323900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:15:18,966-Speed 5026.07 samples/sec Loss 1.8640 Epoch: 19 Global Step: 323950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:15:29,282-Speed 4963.59 samples/sec Loss 1.8576 Epoch: 19 Global Step: 324000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:15:46,052-[lfw][324000]XNorm: 
23.269836 Training: 2021-03-19 17:15:46,052-[lfw][324000]Accuracy-Flip: 0.99683+-0.00283 Training: 2021-03-19 17:15:46,052-[lfw][324000]Accuracy-Highest: 0.99767 Training: 2021-03-19 17:16:04,743-[cfp_fp][324000]XNorm: 19.587607 Training: 2021-03-19 17:16:04,744-[cfp_fp][324000]Accuracy-Flip: 0.97686+-0.00882 Training: 2021-03-19 17:16:04,744-[cfp_fp][324000]Accuracy-Highest: 0.97729 Training: 2021-03-19 17:16:20,892-[agedb_30][324000]XNorm: 22.526044 Training: 2021-03-19 17:16:20,892-[agedb_30][324000]Accuracy-Flip: 0.97433+-0.00750 Training: 2021-03-19 17:16:20,892-[agedb_30][324000]Accuracy-Highest: 0.97717 Training: 2021-03-19 17:16:31,102-Speed 828.21 samples/sec Loss 1.8706 Epoch: 19 Global Step: 324050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:16:41,165-Speed 5088.54 samples/sec Loss 1.8747 Epoch: 19 Global Step: 324100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:16:51,077-Speed 5165.52 samples/sec Loss 1.8748 Epoch: 19 Global Step: 324150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:17:01,286-Speed 5015.65 samples/sec Loss 1.8826 Epoch: 19 Global Step: 324200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:17:11,212-Speed 5158.66 samples/sec Loss 1.8585 Epoch: 19 Global Step: 324250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:17:21,693-Speed 4885.12 samples/sec Loss 1.8552 Epoch: 19 Global Step: 324300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:17:31,798-Speed 5067.43 samples/sec Loss 1.8725 Epoch: 19 Global Step: 324350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:17:41,714-Speed 5163.78 samples/sec Loss 1.8472 Epoch: 19 Global Step: 324400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:17:51,670-Speed 5142.67 samples/sec Loss 1.8484 Epoch: 19 Global Step: 324450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:18:01,970-Speed 4971.18 samples/sec Loss 1.8823 Epoch: 19 Global Step: 
324500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:18:12,329-Speed 4942.92 samples/sec Loss 1.8517 Epoch: 19 Global Step: 324550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:18:22,475-Speed 5046.62 samples/sec Loss 1.8583 Epoch: 19 Global Step: 324600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:18:32,674-Speed 5020.38 samples/sec Loss 1.8869 Epoch: 19 Global Step: 324650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:18:42,763-Speed 5075.00 samples/sec Loss 1.8707 Epoch: 19 Global Step: 324700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:18:52,813-Speed 5095.17 samples/sec Loss 1.8700 Epoch: 19 Global Step: 324750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:19:02,751-Speed 5151.83 samples/sec Loss 1.8600 Epoch: 19 Global Step: 324800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:19:13,936-Speed 4577.70 samples/sec Loss 1.8715 Epoch: 19 Global Step: 324850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:19:24,232-Speed 4973.47 samples/sec Loss 1.8517 Epoch: 19 Global Step: 324900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:19:34,251-Speed 5110.50 samples/sec Loss 1.8770 Epoch: 19 Global Step: 324950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:19:45,217-Speed 4669.14 samples/sec Loss 1.8786 Epoch: 19 Global Step: 325000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:19:55,411-Speed 5022.84 samples/sec Loss 1.8751 Epoch: 19 Global Step: 325050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:20:05,559-Speed 5045.74 samples/sec Loss 1.8665 Epoch: 19 Global Step: 325100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:20:15,545-Speed 5127.90 samples/sec Loss 1.8635 Epoch: 19 Global Step: 325150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:20:25,753-Speed 5015.86 samples/sec Loss 1.8539 Epoch: 19 Global 
Step: 325200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:20:35,629-Speed 5184.45 samples/sec Loss 1.8736 Epoch: 19 Global Step: 325250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:20:45,783-Speed 5042.80 samples/sec Loss 1.8708 Epoch: 19 Global Step: 325300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:20:56,820-Speed 4639.21 samples/sec Loss 1.8637 Epoch: 19 Global Step: 325350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:21:06,919-Speed 5070.28 samples/sec Loss 1.8750 Epoch: 19 Global Step: 325400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:21:17,770-Speed 4718.76 samples/sec Loss 1.8250 Epoch: 19 Global Step: 325450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:21:27,889-Speed 5060.24 samples/sec Loss 1.8813 Epoch: 19 Global Step: 325500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:21:38,006-Speed 5060.75 samples/sec Loss 1.8475 Epoch: 19 Global Step: 325550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:21:48,167-Speed 5039.63 samples/sec Loss 1.8701 Epoch: 19 Global Step: 325600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:21:58,276-Speed 5065.11 samples/sec Loss 1.8569 Epoch: 19 Global Step: 325650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:22:09,717-Speed 4475.41 samples/sec Loss 1.8688 Epoch: 19 Global Step: 325700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:22:19,807-Speed 5074.34 samples/sec Loss 1.8446 Epoch: 19 Global Step: 325750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:22:30,641-Speed 4726.06 samples/sec Loss 1.8657 Epoch: 19 Global Step: 325800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:22:40,868-Speed 5007.10 samples/sec Loss 1.8575 Epoch: 19 Global Step: 325850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:22:50,784-Speed 5163.73 samples/sec Loss 1.8574 Epoch: 19 
Global Step: 325900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:23:00,853-Speed 5085.08 samples/sec Loss 1.8715 Epoch: 19 Global Step: 325950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:23:10,813-Speed 5141.07 samples/sec Loss 1.8574 Epoch: 19 Global Step: 326000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-19 17:23:27,342-[lfw][326000]XNorm: 23.250917 Training: 2021-03-19 17:23:27,343-[lfw][326000]Accuracy-Flip: 0.99667+-0.00279 Training: 2021-03-19 17:23:27,343-[lfw][326000]Accuracy-Highest: 0.99767 Training: 2021-03-19 17:23:45,929-[cfp_fp][326000]XNorm: 19.578478 Training: 2021-03-19 17:23:45,930-[cfp_fp][326000]Accuracy-Flip: 0.97529+-0.00910 Training: 2021-03-19 17:23:45,930-[cfp_fp][326000]Accuracy-Highest: 0.97729 Training: 2021-03-19 17:24:02,019-[agedb_30][326000]XNorm: 22.523973 Training: 2021-03-19 17:24:02,019-[agedb_30][326000]Accuracy-Flip: 0.97483+-0.00769 Training: 2021-03-19 17:24:02,019-[agedb_30][326000]Accuracy-Highest: 0.97717 Training: 2021-03-19 17:24:12,018-Speed 836.53 samples/sec Loss 1.8874 Epoch: 19 Global Step: 326050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:24:22,138-Speed 5059.53 samples/sec Loss 1.8820 Epoch: 19 Global Step: 326100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:24:32,943-Speed 4739.13 samples/sec Loss 1.8564 Epoch: 19 Global Step: 326150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:24:42,859-Speed 5163.67 samples/sec Loss 1.8609 Epoch: 19 Global Step: 326200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:24:52,823-Speed 5138.93 samples/sec Loss 1.8534 Epoch: 19 Global Step: 326250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:25:02,878-Speed 5092.15 samples/sec Loss 1.8671 Epoch: 19 Global Step: 326300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:25:13,052-Speed 5032.84 samples/sec Loss 1.8495 Epoch: 19 Global Step: 326350 Fp16 Grad Scale: 
16384 Required: 0 hours Training: 2021-03-19 17:25:23,420-Speed 4938.75 samples/sec Loss 1.8449 Epoch: 19 Global Step: 326400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:25:33,520-Speed 5069.61 samples/sec Loss 1.8741 Epoch: 19 Global Step: 326450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:25:43,640-Speed 5059.18 samples/sec Loss 1.8168 Epoch: 19 Global Step: 326500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:25:53,743-Speed 5068.22 samples/sec Loss 1.8599 Epoch: 19 Global Step: 326550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:26:04,084-Speed 4951.69 samples/sec Loss 1.8748 Epoch: 19 Global Step: 326600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:26:14,138-Speed 5092.94 samples/sec Loss 1.8678 Epoch: 19 Global Step: 326650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:26:24,378-Speed 5000.32 samples/sec Loss 1.8460 Epoch: 19 Global Step: 326700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:26:34,501-Speed 5057.72 samples/sec Loss 1.8486 Epoch: 19 Global Step: 326750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:26:44,670-Speed 5035.33 samples/sec Loss 1.8679 Epoch: 19 Global Step: 326800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:26:54,946-Speed 4982.50 samples/sec Loss 1.8596 Epoch: 19 Global Step: 326850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:27:05,108-Speed 5038.76 samples/sec Loss 1.8571 Epoch: 19 Global Step: 326900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:27:15,145-Speed 5101.70 samples/sec Loss 1.8560 Epoch: 19 Global Step: 326950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:27:25,270-Speed 5057.14 samples/sec Loss 1.8619 Epoch: 19 Global Step: 327000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:27:35,212-Speed 5150.03 samples/sec Loss 1.8583 Epoch: 19 Global Step: 327050 Fp16 Grad 
Scale: 16384 Required: 0 hours Training: 2021-03-19 17:27:45,406-Speed 5023.07 samples/sec Loss 1.8331 Epoch: 19 Global Step: 327100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:27:55,323-Speed 5163.12 samples/sec Loss 1.8895 Epoch: 19 Global Step: 327150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:28:05,345-Speed 5109.02 samples/sec Loss 1.8946 Epoch: 19 Global Step: 327200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:28:15,586-Speed 4999.35 samples/sec Loss 1.8674 Epoch: 19 Global Step: 327250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:28:25,555-Speed 5136.48 samples/sec Loss 1.8603 Epoch: 19 Global Step: 327300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:28:35,480-Speed 5159.09 samples/sec Loss 1.8712 Epoch: 19 Global Step: 327350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:28:45,764-Speed 4978.93 samples/sec Loss 1.8497 Epoch: 19 Global Step: 327400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:28:55,963-Speed 5020.11 samples/sec Loss 1.8831 Epoch: 19 Global Step: 327450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:29:06,232-Speed 4986.03 samples/sec Loss 1.8699 Epoch: 19 Global Step: 327500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:29:16,339-Speed 5066.66 samples/sec Loss 1.8683 Epoch: 19 Global Step: 327550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:29:26,718-Speed 4933.41 samples/sec Loss 1.8848 Epoch: 19 Global Step: 327600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:29:36,783-Speed 5087.19 samples/sec Loss 1.8550 Epoch: 19 Global Step: 327650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:29:46,899-Speed 5061.90 samples/sec Loss 1.8663 Epoch: 19 Global Step: 327700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:29:57,015-Speed 5061.29 samples/sec Loss 1.8645 Epoch: 19 Global Step: 327750 Fp16 
Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:30:07,074-Speed 5090.34 samples/sec Loss 1.8754 Epoch: 19 Global Step: 327800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:30:17,160-Speed 5076.56 samples/sec Loss 1.8517 Epoch: 19 Global Step: 327850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:30:27,210-Speed 5094.99 samples/sec Loss 1.8796 Epoch: 19 Global Step: 327900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:30:37,192-Speed 5129.51 samples/sec Loss 1.8599 Epoch: 19 Global Step: 327950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:30:47,457-Speed 4988.36 samples/sec Loss 1.8772 Epoch: 19 Global Step: 328000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:31:04,118-[lfw][328000]XNorm: 23.133665 Training: 2021-03-19 17:31:04,118-[lfw][328000]Accuracy-Flip: 0.99683+-0.00263 Training: 2021-03-19 17:31:04,118-[lfw][328000]Accuracy-Highest: 0.99767 Training: 2021-03-19 17:31:22,713-[cfp_fp][328000]XNorm: 19.463052 Training: 2021-03-19 17:31:22,714-[cfp_fp][328000]Accuracy-Flip: 0.97714+-0.00876 Training: 2021-03-19 17:31:22,715-[cfp_fp][328000]Accuracy-Highest: 0.97729 Training: 2021-03-19 17:31:38,859-[agedb_30][328000]XNorm: 22.410166 Training: 2021-03-19 17:31:38,859-[agedb_30][328000]Accuracy-Flip: 0.97433+-0.00800 Training: 2021-03-19 17:31:38,859-[agedb_30][328000]Accuracy-Highest: 0.97717 Training: 2021-03-19 17:31:48,840-Speed 834.11 samples/sec Loss 1.8610 Epoch: 19 Global Step: 328050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:31:58,951-Speed 5064.08 samples/sec Loss 1.8502 Epoch: 19 Global Step: 328100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:32:10,093-Speed 4595.26 samples/sec Loss 1.8544 Epoch: 19 Global Step: 328150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:32:20,276-Speed 5028.19 samples/sec Loss 1.8443 Epoch: 19 Global Step: 328200 Fp16 Grad Scale: 16384 Required: 0 hours 
Training: 2021-03-19 17:32:30,416-Speed 5049.95 samples/sec Loss 1.8388 Epoch: 19 Global Step: 328250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:32:40,467-Speed 5094.38 samples/sec Loss 1.8467 Epoch: 19 Global Step: 328300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:32:51,165-Speed 4785.74 samples/sec Loss 1.8735 Epoch: 19 Global Step: 328350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:33:01,286-Speed 5059.61 samples/sec Loss 1.8787 Epoch: 19 Global Step: 328400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:33:11,435-Speed 5045.11 samples/sec Loss 1.8646 Epoch: 19 Global Step: 328450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:33:21,707-Speed 4984.33 samples/sec Loss 1.8568 Epoch: 19 Global Step: 328500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:33:31,750-Speed 5099.16 samples/sec Loss 1.8617 Epoch: 19 Global Step: 328550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:33:41,830-Speed 5079.28 samples/sec Loss 1.8701 Epoch: 19 Global Step: 328600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:33:52,038-Speed 5015.94 samples/sec Loss 1.8569 Epoch: 19 Global Step: 328650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:34:02,069-Speed 5104.42 samples/sec Loss 1.8576 Epoch: 19 Global Step: 328700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:34:12,069-Speed 5120.53 samples/sec Loss 1.8675 Epoch: 19 Global Step: 328750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:34:22,980-Speed 4692.97 samples/sec Loss 1.8546 Epoch: 19 Global Step: 328800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:34:33,886-Speed 4694.88 samples/sec Loss 1.8562 Epoch: 19 Global Step: 328850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:34:43,984-Speed 5070.75 samples/sec Loss 1.8537 Epoch: 19 Global Step: 328900 Fp16 Grad Scale: 16384 Required: 0 
hours Training: 2021-03-19 17:34:54,134-Speed 5044.50 samples/sec Loss 1.8427 Epoch: 19 Global Step: 328950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:35:04,401-Speed 4986.97 samples/sec Loss 1.8437 Epoch: 19 Global Step: 329000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:35:14,497-Speed 5071.87 samples/sec Loss 1.8604 Epoch: 19 Global Step: 329050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:35:26,813-Speed 4157.13 samples/sec Loss 1.8630 Epoch: 19 Global Step: 329100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:35:37,137-Speed 4959.71 samples/sec Loss 1.8545 Epoch: 19 Global Step: 329150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:35:47,322-Speed 5027.60 samples/sec Loss 1.8522 Epoch: 19 Global Step: 329200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:35:57,300-Speed 5131.53 samples/sec Loss 1.8583 Epoch: 19 Global Step: 329250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:36:07,583-Speed 4979.27 samples/sec Loss 1.8677 Epoch: 19 Global Step: 329300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:36:17,855-Speed 4984.55 samples/sec Loss 1.8562 Epoch: 19 Global Step: 329350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:36:27,919-Speed 5087.77 samples/sec Loss 1.8525 Epoch: 19 Global Step: 329400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:36:38,268-Speed 4947.88 samples/sec Loss 1.8763 Epoch: 19 Global Step: 329450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:36:48,237-Speed 5136.33 samples/sec Loss 1.8560 Epoch: 19 Global Step: 329500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:36:58,396-Speed 5040.26 samples/sec Loss 1.8729 Epoch: 19 Global Step: 329550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:37:09,291-Speed 4699.70 samples/sec Loss 1.8689 Epoch: 19 Global Step: 329600 Fp16 Grad Scale: 16384 Required: 
0 hours Training: 2021-03-19 17:37:19,509-Speed 5010.67 samples/sec Loss 1.8575 Epoch: 19 Global Step: 329650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:37:29,951-Speed 4903.65 samples/sec Loss 1.8501 Epoch: 19 Global Step: 329700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:37:39,997-Speed 5096.69 samples/sec Loss 1.8483 Epoch: 19 Global Step: 329750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:37:50,066-Speed 5085.28 samples/sec Loss 1.8612 Epoch: 19 Global Step: 329800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:38:00,095-Speed 5106.17 samples/sec Loss 1.8465 Epoch: 19 Global Step: 329850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:38:10,354-Speed 4990.95 samples/sec Loss 1.8798 Epoch: 19 Global Step: 329900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:38:20,495-Speed 5049.12 samples/sec Loss 1.8707 Epoch: 19 Global Step: 329950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:38:30,663-Speed 5035.44 samples/sec Loss 1.8524 Epoch: 19 Global Step: 330000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:38:47,862-[lfw][330000]XNorm: 23.202223 Training: 2021-03-19 17:38:47,863-[lfw][330000]Accuracy-Flip: 0.99667+-0.00279 Training: 2021-03-19 17:38:47,863-[lfw][330000]Accuracy-Highest: 0.99767 Training: 2021-03-19 17:39:06,662-[cfp_fp][330000]XNorm: 19.517568 Training: 2021-03-19 17:39:06,662-[cfp_fp][330000]Accuracy-Flip: 0.97529+-0.00892 Training: 2021-03-19 17:39:06,662-[cfp_fp][330000]Accuracy-Highest: 0.97729 Training: 2021-03-19 17:39:22,871-[agedb_30][330000]XNorm: 22.465357 Training: 2021-03-19 17:39:22,872-[agedb_30][330000]Accuracy-Flip: 0.97533+-0.00770 Training: 2021-03-19 17:39:22,872-[agedb_30][330000]Accuracy-Highest: 0.97717 Training: 2021-03-19 17:39:33,062-Speed 820.53 samples/sec Loss 1.8575 Epoch: 19 Global Step: 330050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 
17:39:43,186-Speed 5057.54 samples/sec Loss 1.8420 Epoch: 19 Global Step: 330100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:39:53,387-Speed 5019.89 samples/sec Loss 1.8694 Epoch: 19 Global Step: 330150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:40:03,478-Speed 5073.99 samples/sec Loss 1.8492 Epoch: 19 Global Step: 330200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:40:13,530-Speed 5093.95 samples/sec Loss 1.8462 Epoch: 19 Global Step: 330250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:40:23,765-Speed 5002.74 samples/sec Loss 1.8484 Epoch: 19 Global Step: 330300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:40:33,971-Speed 5017.37 samples/sec Loss 1.8726 Epoch: 19 Global Step: 330350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:40:44,055-Speed 5077.25 samples/sec Loss 1.8492 Epoch: 19 Global Step: 330400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:40:54,200-Speed 5047.51 samples/sec Loss 1.8653 Epoch: 19 Global Step: 330450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:41:04,224-Speed 5107.76 samples/sec Loss 1.8735 Epoch: 19 Global Step: 330500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:41:14,556-Speed 4955.77 samples/sec Loss 1.8744 Epoch: 19 Global Step: 330550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:41:24,637-Speed 5079.44 samples/sec Loss 1.8723 Epoch: 19 Global Step: 330600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:41:34,636-Speed 5120.65 samples/sec Loss 1.8574 Epoch: 19 Global Step: 330650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:41:44,755-Speed 5060.01 samples/sec Loss 1.8521 Epoch: 19 Global Step: 330700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:41:54,915-Speed 5039.66 samples/sec Loss 1.8703 Epoch: 19 Global Step: 330750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 
2021-03-19 17:42:05,042-Speed 5056.17 samples/sec Loss 1.8421 Epoch: 19 Global Step: 330800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:42:15,233-Speed 5024.01 samples/sec Loss 1.8587 Epoch: 19 Global Step: 330850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:42:25,416-Speed 5028.56 samples/sec Loss 1.8716 Epoch: 19 Global Step: 330900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:42:35,756-Speed 4951.89 samples/sec Loss 1.8735 Epoch: 19 Global Step: 330950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:42:45,845-Speed 5075.34 samples/sec Loss 1.8772 Epoch: 19 Global Step: 331000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:42:55,916-Speed 5083.99 samples/sec Loss 1.8616 Epoch: 19 Global Step: 331050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:43:06,370-Speed 4897.85 samples/sec Loss 1.8721 Epoch: 19 Global Step: 331100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:43:16,680-Speed 4966.32 samples/sec Loss 1.8582 Epoch: 19 Global Step: 331150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:43:27,026-Speed 4948.93 samples/sec Loss 1.8559 Epoch: 19 Global Step: 331200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:43:37,144-Speed 5060.44 samples/sec Loss 1.8439 Epoch: 19 Global Step: 331250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:43:47,066-Speed 5160.62 samples/sec Loss 1.8455 Epoch: 19 Global Step: 331300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:43:57,114-Speed 5095.96 samples/sec Loss 1.8636 Epoch: 19 Global Step: 331350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:44:07,272-Speed 5040.48 samples/sec Loss 1.8668 Epoch: 19 Global Step: 331400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:44:17,457-Speed 5027.09 samples/sec Loss 1.8709 Epoch: 19 Global Step: 331450 Fp16 Grad Scale: 16384 Required: 0 hours 
Training: 2021-03-19 17:44:27,703-Speed 4997.70 samples/sec Loss 1.8670 Epoch: 19 Global Step: 331500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:44:38,511-Speed 4737.42 samples/sec Loss 1.8402 Epoch: 19 Global Step: 331550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:44:48,656-Speed 5046.99 samples/sec Loss 1.8644 Epoch: 19 Global Step: 331600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:44:58,977-Speed 4961.14 samples/sec Loss 1.8344 Epoch: 19 Global Step: 331650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:45:09,996-Speed 4646.97 samples/sec Loss 1.8913 Epoch: 19 Global Step: 331700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:45:20,264-Speed 4986.67 samples/sec Loss 1.8404 Epoch: 19 Global Step: 331750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:45:30,504-Speed 4999.85 samples/sec Loss 1.8765 Epoch: 19 Global Step: 331800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:45:40,526-Speed 5109.05 samples/sec Loss 1.8632 Epoch: 19 Global Step: 331850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:45:50,769-Speed 4998.74 samples/sec Loss 1.8579 Epoch: 19 Global Step: 331900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:46:00,788-Speed 5110.63 samples/sec Loss 1.8678 Epoch: 19 Global Step: 331950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:46:10,682-Speed 5175.23 samples/sec Loss 1.8441 Epoch: 19 Global Step: 332000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:46:27,418-[lfw][332000]XNorm: 23.117767 Training: 2021-03-19 17:46:27,418-[lfw][332000]Accuracy-Flip: 0.99700+-0.00277 Training: 2021-03-19 17:46:27,418-[lfw][332000]Accuracy-Highest: 0.99767 Training: 2021-03-19 17:46:46,164-[cfp_fp][332000]XNorm: 19.472920 Training: 2021-03-19 17:46:46,165-[cfp_fp][332000]Accuracy-Flip: 0.97600+-0.00828 Training: 2021-03-19 
17:46:46,165-[cfp_fp][332000]Accuracy-Highest: 0.97729 Training: 2021-03-19 17:47:02,324-[agedb_30][332000]XNorm: 22.399361 Training: 2021-03-19 17:47:02,324-[agedb_30][332000]Accuracy-Flip: 0.97550+-0.00753 Training: 2021-03-19 17:47:02,324-[agedb_30][332000]Accuracy-Highest: 0.97717 Training: 2021-03-19 17:47:12,282-Speed 831.18 samples/sec Loss 1.8597 Epoch: 19 Global Step: 332050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:47:22,433-Speed 5044.13 samples/sec Loss 1.8551 Epoch: 19 Global Step: 332100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:47:33,440-Speed 4652.06 samples/sec Loss 1.8583 Epoch: 19 Global Step: 332150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:47:43,380-Speed 5151.45 samples/sec Loss 1.8982 Epoch: 19 Global Step: 332200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:47:54,124-Speed 4765.73 samples/sec Loss 1.8557 Epoch: 19 Global Step: 332250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:48:04,386-Speed 4989.29 samples/sec Loss 1.8436 Epoch: 19 Global Step: 332300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:48:14,575-Speed 5025.58 samples/sec Loss 1.8716 Epoch: 19 Global Step: 332350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:48:25,323-Speed 4764.18 samples/sec Loss 1.8788 Epoch: 19 Global Step: 332400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:48:35,607-Speed 4978.81 samples/sec Loss 1.8531 Epoch: 19 Global Step: 332450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:48:46,594-Speed 4660.14 samples/sec Loss 1.8500 Epoch: 19 Global Step: 332500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:48:57,937-Speed 4514.08 samples/sec Loss 1.8503 Epoch: 19 Global Step: 332550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:49:08,112-Speed 5032.27 samples/sec Loss 1.8427 Epoch: 19 Global Step: 332600 Fp16 Grad Scale: 16384 Required: 0 
hours Training: 2021-03-19 17:49:18,328-Speed 5012.14 samples/sec Loss 1.8481 Epoch: 19 Global Step: 332650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:49:28,416-Speed 5075.38 samples/sec Loss 1.8840 Epoch: 19 Global Step: 332700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:49:38,595-Speed 5030.14 samples/sec Loss 1.8487 Epoch: 19 Global Step: 332750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:49:48,630-Speed 5102.57 samples/sec Loss 1.8673 Epoch: 19 Global Step: 332800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:49:58,699-Speed 5085.73 samples/sec Loss 1.8635 Epoch: 19 Global Step: 332850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:50:08,855-Speed 5041.61 samples/sec Loss 1.8515 Epoch: 19 Global Step: 332900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:50:18,938-Speed 5078.03 samples/sec Loss 1.8664 Epoch: 19 Global Step: 332950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:50:29,041-Speed 5068.06 samples/sec Loss 1.8544 Epoch: 19 Global Step: 333000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:50:39,942-Speed 4697.24 samples/sec Loss 1.8647 Epoch: 19 Global Step: 333050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:50:50,030-Speed 5075.28 samples/sec Loss 1.8689 Epoch: 19 Global Step: 333100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:51:00,087-Speed 5091.18 samples/sec Loss 1.8320 Epoch: 19 Global Step: 333150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:51:10,295-Speed 5016.07 samples/sec Loss 1.8725 Epoch: 19 Global Step: 333200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:51:20,444-Speed 5045.41 samples/sec Loss 1.8967 Epoch: 19 Global Step: 333250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:51:30,630-Speed 5026.89 samples/sec Loss 1.8462 Epoch: 19 Global Step: 333300 Fp16 Grad Scale: 16384 Required: 
0 hours Training: 2021-03-19 17:51:41,006-Speed 4934.51 samples/sec Loss 1.8766 Epoch: 19 Global Step: 333350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:51:51,123-Speed 5061.04 samples/sec Loss 1.8440 Epoch: 19 Global Step: 333400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:52:01,166-Speed 5098.50 samples/sec Loss 1.8744 Epoch: 19 Global Step: 333450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:52:11,382-Speed 5012.10 samples/sec Loss 1.8657 Epoch: 19 Global Step: 333500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:52:21,355-Speed 5134.07 samples/sec Loss 1.8900 Epoch: 19 Global Step: 333550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:52:31,411-Speed 5091.62 samples/sec Loss 1.8716 Epoch: 19 Global Step: 333600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:52:41,507-Speed 5071.87 samples/sec Loss 1.8898 Epoch: 19 Global Step: 333650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:52:51,402-Speed 5174.57 samples/sec Loss 1.8418 Epoch: 19 Global Step: 333700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:53:01,650-Speed 4996.62 samples/sec Loss 1.8723 Epoch: 19 Global Step: 333750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-19 17:53:11,876-Speed 5007.38 samples/sec Loss 1.8651 Epoch: 19 Global Step: 333800 Fp16 Grad Scale: 16384 Required: 0 hours