Training: 2021-03-16 00:08:05,448-rank_id: 0 Training: 2021-03-16 00:08:30,528-softmax weight init successfully! Training: 2021-03-16 00:08:30,528-softmax weight mom init successfully! Training: 2021-03-16 00:08:30,533-Total Step is: 333821 Training: 2021-03-16 00:09:12,062-Reducer buckets have been rebuilt in this iteration. Training: 2021-03-16 00:09:49,515-Speed 2833.31 samples/sec Loss 45.8920 Epoch: 0 Global Step: 100 Fp16 Grad Scale: 256 Required: 39 hours Training: 2021-03-16 00:10:07,412-Speed 2860.86 samples/sec Loss 44.1739 Epoch: 0 Global Step: 150 Fp16 Grad Scale: 256 Required: 37 hours Training: 2021-03-16 00:10:23,327-Speed 3217.33 samples/sec Loss 42.9473 Epoch: 0 Global Step: 200 Fp16 Grad Scale: 512 Required: 35 hours Training: 2021-03-16 00:10:39,353-Speed 3194.82 samples/sec Loss 41.7955 Epoch: 0 Global Step: 250 Fp16 Grad Scale: 512 Required: 34 hours Training: 2021-03-16 00:10:55,288-Speed 3213.14 samples/sec Loss 40.7426 Epoch: 0 Global Step: 300 Fp16 Grad Scale: 1024 Required: 33 hours Training: 2021-03-16 00:11:15,528-Speed 2529.76 samples/sec Loss 39.9706 Epoch: 0 Global Step: 350 Fp16 Grad Scale: 1024 Required: 34 hours Training: 2021-03-16 00:11:32,363-Speed 3041.33 samples/sec Loss 39.6410 Epoch: 0 Global Step: 400 Fp16 Grad Scale: 2048 Required: 34 hours Training: 2021-03-16 00:11:48,226-Speed 3227.82 samples/sec Loss 39.2024 Epoch: 0 Global Step: 450 Fp16 Grad Scale: 2048 Required: 33 hours Training: 2021-03-16 00:12:04,006-Speed 3244.81 samples/sec Loss 38.8278 Epoch: 0 Global Step: 500 Fp16 Grad Scale: 4096 Required: 33 hours Training: 2021-03-16 00:12:20,058-Speed 3189.72 samples/sec Loss 38.4068 Epoch: 0 Global Step: 550 Fp16 Grad Scale: 4096 Required: 32 hours Training: 2021-03-16 00:12:36,766-Speed 3064.39 samples/sec Loss 38.0667 Epoch: 0 Global Step: 600 Fp16 Grad Scale: 8192 Required: 32 hours Training: 2021-03-16 00:12:52,557-Speed 3242.63 samples/sec Loss 37.7404 Epoch: 0 Global Step: 650 Fp16 Grad Scale: 8192 Required: 32 hours Training: 2021-03-16 00:13:08,386-Speed 3234.59 samples/sec Loss 37.5098 Epoch: 0 Global Step: 700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 00:13:24,297-Speed 3218.14 samples/sec Loss 37.1702 Epoch: 0 Global Step: 750 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 00:13:40,231-Speed 3213.36 samples/sec Loss 36.8667 Epoch: 0 Global Step: 800 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 00:13:56,438-Speed 3159.20 samples/sec Loss 36.5524 Epoch: 0 Global Step: 850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:14:12,461-Speed 3195.58 samples/sec Loss 36.3227 Epoch: 0 Global Step: 900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:14:28,260-Speed 3240.70 samples/sec Loss 36.0139 Epoch: 0 Global Step: 950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:14:44,078-Speed 3237.03 samples/sec Loss 35.7665 Epoch: 0 Global Step: 1000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:14:59,977-Speed 3220.48 samples/sec Loss 35.5076 Epoch: 0 Global Step: 1050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:15:15,723-Speed 3251.56 samples/sec Loss 35.1560 Epoch: 0 Global Step: 1100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:15:31,745-Speed 3195.76 samples/sec Loss 34.8668 Epoch: 0 Global Step: 1150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:15:47,723-Speed 3204.63 samples/sec Loss 34.5919 Epoch: 0 Global Step: 1200 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:16:03,415-Speed 3262.77 samples/sec Loss 34.2626 Epoch: 0 Global Step: 1250 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:16:19,780-Speed 3128.82 samples/sec Loss 33.9931 Epoch: 0 Global Step: 1300 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:16:35,901-Speed 3176.06 samples/sec Loss 33.6923 Epoch: 0 Global Step: 1350 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:16:53,313-Speed 2940.70 samples/sec Loss 33.3387 Epoch: 0 Global Step: 1400 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:17:12,419-Speed 2679.78 samples/sec Loss 33.0541 Epoch: 0 Global Step: 1450 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:17:29,129-Speed 3064.18 samples/sec Loss 32.6634 Epoch: 0 Global Step: 1500 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:17:45,230-Speed 3179.93 samples/sec Loss 32.4335 Epoch: 0 Global Step: 1550 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:18:01,016-Speed 3243.57 samples/sec Loss 32.0923 Epoch: 0 Global Step: 1600 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:18:16,899-Speed 3223.66 samples/sec Loss 31.7564 Epoch: 0 Global Step: 1650 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:18:33,028-Speed 3174.48 samples/sec Loss 31.4021 Epoch: 0 Global Step: 1700 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:18:48,854-Speed 3235.33 samples/sec Loss 31.1092 Epoch: 0 Global Step: 1750 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:19:04,634-Speed 3244.78 samples/sec Loss 30.7287 Epoch: 0 Global Step: 1800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:19:20,688-Speed 3189.25 samples/sec Loss 30.4005 Epoch: 0 Global Step: 1850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:19:37,478-Speed 3049.57 samples/sec Loss 30.0752 Epoch: 0 Global Step: 1900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:19:53,631-Speed 3169.92 samples/sec Loss 29.7154 Epoch: 0 Global Step: 1950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:20:09,726-Speed 3181.20 samples/sec Loss 29.4090 Epoch: 0 Global Step: 2000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 00:21:03,260-[lfw][2000]XNorm: 23.232140 Training: 2021-03-16 00:21:03,261-[lfw][2000]Accuracy-Flip: 0.95417+-0.00917 Training: 2021-03-16 00:21:03,261-[lfw][2000]Accuracy-Highest: 0.95417 Training: 2021-03-16 00:22:05,393-[cfp_fp][2000]XNorm: 21.411464 Training: 2021-03-16 00:22:05,393-[cfp_fp][2000]Accuracy-Flip: 0.76514+-0.01732 Training: 2021-03-16 00:22:05,393-[cfp_fp][2000]Accuracy-Highest: 0.76514 Training: 2021-03-16 00:22:59,183-[agedb_30][2000]XNorm: 22.823843 Training: 2021-03-16 00:22:59,184-[agedb_30][2000]Accuracy-Flip: 0.77117+-0.02701 Training: 2021-03-16 00:22:59,184-[agedb_30][2000]Accuracy-Highest: 0.77117 Training: 2021-03-16 00:23:15,110-Speed 276.18 samples/sec Loss 29.1051 Epoch: 0 Global Step: 2050 Fp16 Grad Scale: 16384 Required: 38 hours Training: 2021-03-16 00:23:31,009-Speed 3220.34 samples/sec Loss 28.7186 Epoch: 0 Global Step: 2100 Fp16 Grad Scale: 16384 Required: 38 hours Training: 2021-03-16 00:23:46,821-Speed 3238.25 samples/sec Loss 28.4021 Epoch: 0 Global Step: 2150 Fp16 Grad Scale: 16384 Required: 38 hours Training: 2021-03-16 00:24:02,827-Speed 3199.01 samples/sec Loss 28.0169 Epoch: 0 Global Step: 2200 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:24:18,925-Speed 3180.52 samples/sec Loss 27.6825 Epoch: 0 Global Step: 2250 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:24:34,856-Speed 3213.93 samples/sec Loss 27.3129 Epoch: 0 Global Step: 2300 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:24:50,678-Speed 3236.08 samples/sec Loss 26.9717 Epoch: 0 Global Step: 2350 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:25:06,713-Speed 3193.21 samples/sec Loss 26.6203 Epoch: 0 Global Step: 2400 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:25:22,615-Speed 3219.80 samples/sec Loss 26.3049 Epoch: 0 Global Step: 2450 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:25:38,382-Speed 3247.51 samples/sec Loss 26.0031 Epoch: 0 Global Step: 2500 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:25:54,085-Speed 3260.41 samples/sec Loss 25.6681 Epoch: 0 Global Step: 2550 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:26:10,099-Speed 3197.33 samples/sec Loss 25.5233 Epoch: 0 Global Step: 2600 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:26:25,911-Speed 3238.21 samples/sec Loss 24.9659 Epoch: 0 Global Step: 2650 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:26:41,778-Speed 3226.98 samples/sec Loss 24.7618 Epoch: 0 Global Step: 2700 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:26:57,430-Speed 3271.23 samples/sec Loss 24.4828 Epoch: 0 Global Step: 2750 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:27:13,306-Speed 3225.09 samples/sec Loss 24.1737 Epoch: 0 Global Step: 2800 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:27:29,046-Speed 3252.95 samples/sec Loss 23.8620 Epoch: 0 Global Step: 2850 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:27:44,730-Speed 3264.68 samples/sec Loss 23.5277 Epoch: 0 Global Step: 2900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:28:00,530-Speed 3240.58 samples/sec Loss 23.1968 Epoch: 0 Global Step: 2950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:28:18,349-Speed 2873.39 samples/sec Loss 22.8674 Epoch: 0 Global Step: 3000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:28:37,398-Speed 2687.86 samples/sec Loss 22.6567 Epoch: 0 Global Step: 3050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:28:54,207-Speed 3046.02 samples/sec Loss 22.3398 Epoch: 0 Global Step: 3100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:29:10,170-Speed 3207.59 samples/sec Loss 22.1332 Epoch: 0 Global Step: 3150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:29:26,130-Speed 3208.26 samples/sec Loss 21.8000 Epoch: 0 Global Step: 3200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:29:41,926-Speed 3241.32 samples/sec Loss 21.6180 Epoch: 0 Global Step: 3250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:29:57,966-Speed 3192.17 samples/sec Loss 21.3058 Epoch: 0 Global Step: 3300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:30:14,140-Speed 3165.52 samples/sec Loss 21.0425 Epoch: 0 Global Step: 3350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:30:29,951-Speed 3238.35 samples/sec Loss 20.8757 Epoch: 0 Global Step: 3400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:30:45,902-Speed 3210.11 samples/sec Loss 20.4826 Epoch: 0 Global Step: 3450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:31:02,144-Speed 3152.24 samples/sec Loss 20.2963 Epoch: 0 Global Step: 3500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:31:18,938-Speed 3048.88 samples/sec Loss 20.0434 Epoch: 0 Global Step: 3550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:31:34,814-Speed 3224.99 samples/sec Loss 19.8205 Epoch: 0 Global Step: 3600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:31:50,914-Speed 3181.69 samples/sec Loss 19.6246 Epoch: 0 Global Step: 3650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:32:06,983-Speed 3186.35 samples/sec Loss 19.3697 Epoch: 0 Global Step: 3700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:32:22,973-Speed 3202.10 samples/sec Loss 19.2625 Epoch: 0 Global Step: 3750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:32:38,896-Speed 3215.45 samples/sec Loss 19.1025 Epoch: 0 Global Step: 3800 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:32:54,785-Speed 3222.46 samples/sec Loss 18.8601 Epoch: 0 Global Step: 3850 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:33:10,683-Speed 3220.63 samples/sec Loss 18.5849 Epoch: 0 Global Step: 3900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:33:26,571-Speed 3222.71 samples/sec Loss 18.3224 Epoch: 0 Global Step: 3950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:33:42,676-Speed 3179.21 samples/sec Loss 18.2185 Epoch: 0 Global Step: 4000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 00:34:35,768-[lfw][4000]XNorm: 23.377428 Training: 2021-03-16 00:34:35,768-[lfw][4000]Accuracy-Flip: 0.98550+-0.00563 Training: 2021-03-16 00:34:35,768-[lfw][4000]Accuracy-Highest: 0.98550 Training: 2021-03-16 00:35:37,576-[cfp_fp][4000]XNorm: 20.649463 Training: 2021-03-16 00:35:37,576-[cfp_fp][4000]Accuracy-Flip: 0.87486+-0.01444 Training: 2021-03-16 00:35:37,576-[cfp_fp][4000]Accuracy-Highest: 0.87486 Training: 2021-03-16 00:36:30,889-[agedb_30][4000]XNorm: 22.759454 Training: 2021-03-16 00:36:30,889-[agedb_30][4000]Accuracy-Flip: 0.89167+-0.01682 Training: 2021-03-16 00:36:30,889-[agedb_30][4000]Accuracy-Highest: 0.89167 Training: 2021-03-16 00:36:46,804-Speed 278.07 samples/sec Loss 17.9946 Epoch: 0 Global Step: 4050 Fp16 Grad Scale: 16384 Required: 38 hours Training: 2021-03-16 00:37:02,761-Speed 3208.88 samples/sec Loss 17.6663 Epoch: 0 Global Step: 4100 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:37:18,731-Speed 3205.94 samples/sec Loss 17.6048 Epoch: 0 Global Step: 4150 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:37:34,614-Speed 3223.78 samples/sec Loss 17.3775 Epoch: 0 Global Step: 4200 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:37:50,333-Speed 3257.25 samples/sec Loss 17.2652 Epoch: 0 Global Step: 4250 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:38:06,174-Speed 3232.30 samples/sec Loss 17.1281 Epoch: 0 Global Step: 4300 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:38:22,283-Speed 3178.42 samples/sec Loss 16.8643 Epoch: 0 Global Step: 4350 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:38:38,318-Speed 3193.10 samples/sec Loss 16.7081 Epoch: 0 Global Step: 4400 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:38:54,328-Speed 3198.03 samples/sec Loss 16.5626 Epoch: 0 Global Step: 4450 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:39:10,370-Speed 3191.77 samples/sec Loss 16.4076 Epoch: 0 Global Step: 4500 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:39:26,423-Speed 3189.59 samples/sec Loss 16.2842 Epoch: 0 Global Step: 4550 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:39:42,422-Speed 3200.15 samples/sec Loss 16.0152 Epoch: 0 Global Step: 4600 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:39:58,162-Speed 3253.15 samples/sec Loss 15.8192 Epoch: 0 Global Step: 4650 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:40:14,947-Speed 3050.29 samples/sec Loss 15.8236 Epoch: 0 Global Step: 4700 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:40:31,873-Speed 3025.13 samples/sec Loss 15.6750 Epoch: 0 Global Step: 4750 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:40:50,436-Speed 2758.27 samples/sec Loss 15.5105 Epoch: 0 Global Step: 4800 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:41:07,344-Speed 3028.20 samples/sec Loss 15.3014 Epoch: 0 Global Step: 4850 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:41:24,022-Speed 3069.99 samples/sec Loss 15.2387 Epoch: 0 Global Step: 4900 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:41:39,937-Speed 3217.16 samples/sec Loss 15.1517 Epoch: 0 Global Step: 4950 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:41:56,058-Speed 3176.16 samples/sec Loss 14.9455 Epoch: 0 Global Step: 5000 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:42:12,047-Speed 3202.38 samples/sec Loss 14.8795 Epoch: 0 Global Step: 5050 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:42:27,907-Speed 3228.30 samples/sec Loss 14.7632 Epoch: 0 Global Step: 5100 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:42:43,819-Speed 3217.77 samples/sec Loss 14.5452 Epoch: 0 Global Step: 5150 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:42:59,862-Speed 3191.62 samples/sec Loss 14.5063 Epoch: 0 Global Step: 5200 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:43:15,905-Speed 3191.46 samples/sec Loss 14.3381 Epoch: 0 Global Step: 5250 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:43:31,717-Speed 3238.22 samples/sec Loss 14.1871 Epoch: 0 Global Step: 5300 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:43:47,609-Speed 3221.83 samples/sec Loss 14.1136 Epoch: 0 Global Step: 5350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:44:04,332-Speed 3061.75 samples/sec Loss 13.9645 Epoch: 0 Global Step: 5400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:44:20,294-Speed 3207.74 samples/sec Loss 13.8608 Epoch: 0 Global Step: 5450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:44:36,312-Speed 3196.47 samples/sec Loss 13.8293 Epoch: 0 Global Step: 5500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:44:52,269-Speed 3208.69 samples/sec Loss 13.7091 Epoch: 0 Global Step: 5550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:45:08,324-Speed 3189.11 samples/sec Loss 13.5512 Epoch: 0 Global Step: 5600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:45:24,227-Speed 3219.69 samples/sec Loss 13.4331 Epoch: 0 Global Step: 5650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:45:40,056-Speed 3234.60 samples/sec Loss 13.4330 Epoch: 0 Global Step: 5700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:45:56,160-Speed 3179.51 samples/sec Loss 13.2596 Epoch: 0 Global Step: 5750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:46:12,104-Speed 3211.34 samples/sec Loss 13.2687 Epoch: 0 Global Step: 5800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:46:28,001-Speed 3220.76 samples/sec Loss 13.0856 Epoch: 0 Global Step: 5850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:46:43,766-Speed 3247.89 samples/sec Loss 13.1248 Epoch: 0 Global Step: 5900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:46:59,711-Speed 3211.09 samples/sec Loss 12.8758 Epoch: 0 Global Step: 5950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:47:15,713-Speed 3199.62 samples/sec Loss 12.8385 Epoch: 0 Global Step: 6000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:48:08,906-[lfw][6000]XNorm: 23.771280 Training: 2021-03-16 00:48:08,907-[lfw][6000]Accuracy-Flip: 0.99200+-0.00393 Training: 2021-03-16 00:48:08,907-[lfw][6000]Accuracy-Highest: 0.99200 Training: 2021-03-16 00:49:11,056-[cfp_fp][6000]XNorm: 20.678736 Training: 2021-03-16 00:49:11,057-[cfp_fp][6000]Accuracy-Flip: 0.90643+-0.01224 Training: 2021-03-16 00:49:11,057-[cfp_fp][6000]Accuracy-Highest: 0.90643 Training: 2021-03-16 00:50:04,315-[agedb_30][6000]XNorm: 23.145711 Training: 2021-03-16 00:50:04,316-[agedb_30][6000]Accuracy-Flip: 0.91633+-0.01310 Training: 2021-03-16 00:50:04,316-[agedb_30][6000]Accuracy-Highest: 0.91633 Training: 2021-03-16 00:50:20,077-Speed 277.71 samples/sec Loss 12.7304 Epoch: 0 Global Step: 6050 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:50:36,013-Speed 3212.86 samples/sec Loss 12.6652 Epoch: 0 Global Step: 6100 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:50:51,771-Speed 3249.38 samples/sec Loss 12.5627 Epoch: 0 Global Step: 6150 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:51:07,724-Speed 3209.47 samples/sec Loss 12.4282 Epoch: 0 Global Step: 6200 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:51:23,626-Speed 3219.72 samples/sec Loss 12.4276 Epoch: 0 Global Step: 6250 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:51:39,655-Speed 3194.44 samples/sec Loss 12.2571 Epoch: 0 Global Step: 6300 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:51:55,596-Speed 3211.87 samples/sec Loss 12.2612 Epoch: 0 Global Step: 6350 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:52:11,462-Speed 3227.08 samples/sec Loss 12.1860 Epoch: 0 Global Step: 6400 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:52:27,330-Speed 3226.69 samples/sec Loss 12.0485 Epoch: 0 Global Step: 6450 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:52:43,336-Speed 3199.03 samples/sec Loss 12.0635 Epoch: 0 Global Step: 6500 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:52:59,447-Speed 3177.88 samples/sec Loss 11.9361 Epoch: 0 Global Step: 6550 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:53:17,827-Speed 2785.84 samples/sec Loss 11.8419 Epoch: 0 Global Step: 6600 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 00:53:34,439-Speed 3082.18 samples/sec Loss 11.8191 Epoch: 0 Global Step: 6650 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:53:50,443-Speed 3199.30 samples/sec Loss 11.7973 Epoch: 0 Global Step: 6700 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:54:08,067-Speed 2905.13 samples/sec Loss 11.6999 Epoch: 0 Global Step: 6750 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:54:23,879-Speed 3238.20 samples/sec Loss 11.5450 Epoch: 0 Global Step: 6800 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:54:39,768-Speed 3222.42 samples/sec Loss 11.5234 Epoch: 0 Global Step: 6850 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:54:56,129-Speed 3129.50 samples/sec Loss 11.4409 Epoch: 0 Global Step: 6900 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:55:12,034-Speed 3219.25 samples/sec Loss 11.3351 Epoch: 0 Global Step: 6950 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:55:27,873-Speed 3232.50 samples/sec Loss 11.4635 Epoch: 0 Global Step: 7000 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:55:43,698-Speed 3235.55 samples/sec Loss 11.2835 Epoch: 0 Global Step: 7050 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:55:59,660-Speed 3207.70 samples/sec Loss 11.2416 Epoch: 0 Global Step: 7100 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:56:15,531-Speed 3226.17 samples/sec Loss 11.2017 Epoch: 0 Global Step: 7150 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:56:31,470-Speed 3212.35 samples/sec Loss 11.0155 Epoch: 0 Global Step: 7200 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:56:48,127-Speed 3073.79 samples/sec Loss 11.0165 Epoch: 0 Global Step: 7250 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:57:04,248-Speed 3176.20 samples/sec Loss 10.9701 Epoch: 0 Global Step: 7300 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:57:20,229-Speed 3203.88 samples/sec Loss 10.9204 Epoch: 0 Global Step: 7350 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:57:36,025-Speed 3241.33 samples/sec Loss 10.8124 Epoch: 0 Global Step: 7400 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:57:51,800-Speed 3245.84 samples/sec Loss 10.7536 Epoch: 0 Global Step: 7450 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:58:07,588-Speed 3242.98 samples/sec Loss 10.7498 Epoch: 0 Global Step: 7500 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 00:58:23,415-Speed 3235.24 samples/sec Loss 10.6579 Epoch: 0 Global Step: 7550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:58:39,236-Speed 3236.32 samples/sec Loss 10.6621 Epoch: 0 Global Step: 7600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:58:55,056-Speed 3236.40 samples/sec Loss 10.5036 Epoch: 0 Global Step: 7650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:59:10,887-Speed 3234.37 samples/sec Loss 10.5607 Epoch: 0 Global Step: 7700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:59:26,868-Speed 3203.77 samples/sec Loss 10.5177 Epoch: 0 Global Step: 7750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:59:42,729-Speed 3228.23 samples/sec Loss 10.4728 Epoch: 0 Global Step: 7800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 00:59:58,623-Speed 3221.38 samples/sec Loss 10.3867 Epoch: 0 Global Step: 7850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:00:14,446-Speed 3235.83 samples/sec Loss 10.3780 Epoch: 0 Global Step: 7900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:00:30,291-Speed 3231.41 samples/sec Loss 10.2292 Epoch: 0 Global Step: 7950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:00:45,961-Speed 3267.66 samples/sec Loss 10.1965 Epoch: 0 Global Step: 8000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:01:39,046-[lfw][8000]XNorm: 21.997562 Training: 2021-03-16 01:01:39,047-[lfw][8000]Accuracy-Flip: 0.99333+-0.00415 Training: 2021-03-16 01:01:39,047-[lfw][8000]Accuracy-Highest: 0.99333 Training: 2021-03-16 01:02:40,724-[cfp_fp][8000]XNorm: 19.846066 Training: 2021-03-16 01:02:40,724-[cfp_fp][8000]Accuracy-Flip: 0.91943+-0.01118 Training: 2021-03-16 01:02:40,724-[cfp_fp][8000]Accuracy-Highest: 0.91943 Training: 2021-03-16 01:03:34,010-[agedb_30][8000]XNorm: 21.054768 Training: 2021-03-16 01:03:34,010-[agedb_30][8000]Accuracy-Flip: 0.93450+-0.01225 Training: 2021-03-16 01:03:34,010-[agedb_30][8000]Accuracy-Highest: 0.93450 Training: 2021-03-16 01:03:49,996-Speed 278.21 samples/sec Loss 10.1357 Epoch: 0 Global Step: 8050 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:04:05,836-Speed 3232.29 samples/sec Loss 10.1766 Epoch: 0 Global Step: 8100 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:04:21,687-Speed 3230.24 samples/sec Loss 10.1236 Epoch: 0 Global Step: 8150 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:04:37,508-Speed 3236.31 samples/sec Loss 10.0983 Epoch: 0 Global Step: 8200 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:04:53,558-Speed 3190.19 samples/sec Loss 10.0858 Epoch: 0 Global Step: 8250 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:05:09,429-Speed 3226.14 samples/sec Loss 9.9801 Epoch: 0 Global Step: 8300 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:05:25,328-Speed 3220.36 samples/sec Loss 9.8726 Epoch: 0 Global Step: 8350 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:05:41,563-Speed 3153.87 samples/sec Loss 9.9068 Epoch: 0 Global Step: 8400 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:05:58,453-Speed 3031.48 samples/sec Loss 9.8572 Epoch: 0 Global Step: 8450 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:06:16,701-Speed 2805.84 samples/sec Loss 9.8541 Epoch: 0 Global Step: 8500 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:06:32,534-Speed 3233.79 samples/sec Loss 9.7684 Epoch: 0 Global Step: 8550 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:06:49,142-Speed 3082.95 samples/sec Loss 9.7793 Epoch: 0 Global Step: 8600 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:07:05,077-Speed 3213.18 samples/sec Loss 9.7208 Epoch: 0 Global Step: 8650 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:07:21,810-Speed 3059.88 samples/sec Loss 9.6644 Epoch: 0 Global Step: 8700 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:07:38,574-Speed 3054.23 samples/sec Loss 9.6438 Epoch: 0 Global Step: 8750 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:07:54,466-Speed 3221.97 samples/sec Loss 9.6083 Epoch: 0 Global Step: 8800 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:08:10,498-Speed 3193.66 samples/sec Loss 9.5559 Epoch: 0 Global Step: 8850 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:08:26,290-Speed 3242.22 samples/sec Loss 9.5050 Epoch: 0 Global Step: 8900 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:08:42,473-Speed 3163.94 samples/sec Loss 9.5729 Epoch: 0 Global Step: 8950 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:08:58,464-Speed 3201.90 samples/sec Loss 9.5281 Epoch: 0 Global Step: 9000 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:09:14,227-Speed 3248.13 samples/sec Loss 9.4230 Epoch: 0 Global Step: 9050 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:09:30,308-Speed 3184.05 samples/sec Loss 9.4917 Epoch: 0 Global Step: 9100 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:09:47,199-Speed 3031.18 samples/sec Loss 9.3551 Epoch: 0 Global Step: 9150 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:10:02,875-Speed 3266.43 samples/sec Loss 9.4223 Epoch: 0 Global Step: 9200 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:10:19,182-Speed 3139.69 samples/sec Loss 9.3332 Epoch: 0 Global Step: 9250 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:10:35,024-Speed 3232.05 samples/sec Loss 9.3387 Epoch: 0 Global Step: 9300 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:10:50,979-Speed 3209.20 samples/sec Loss 9.2794 Epoch: 0 Global Step: 9350 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:11:06,942-Speed 3207.65 samples/sec Loss 9.2772 Epoch: 0 Global Step: 9400 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:11:22,883-Speed 3211.86 samples/sec Loss 9.1266 Epoch: 0 Global Step: 9450 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:11:38,757-Speed 3225.43 samples/sec Loss 9.1668 Epoch: 0 Global Step: 9500 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:11:54,856-Speed 3180.46 samples/sec Loss 9.1524 Epoch: 0 Global Step: 9550 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:12:10,923-Speed 3186.87 samples/sec Loss 9.1408 Epoch: 0 Global Step: 9600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:12:26,698-Speed 3245.71 samples/sec Loss 9.0591 Epoch: 0 Global Step: 9650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:12:42,551-Speed 3229.76 samples/sec Loss 9.0620 Epoch: 0 Global Step: 9700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:12:58,448-Speed 3220.85 samples/sec Loss 9.1030 Epoch: 0 Global Step: 9750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:13:14,245-Speed 3241.28 samples/sec Loss 9.0191 Epoch: 0 Global Step: 9800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:13:30,046-Speed 3240.33 samples/sec Loss 8.9680 Epoch: 0 Global Step: 9850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:13:46,213-Speed 3167.01 samples/sec Loss 8.9662 Epoch: 0 Global Step: 9900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:14:01,964-Speed 3250.70 samples/sec Loss 8.9323 Epoch: 0 Global Step: 9950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:14:17,841-Speed 3224.86 samples/sec Loss 8.8731 Epoch: 0 Global Step: 10000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:15:11,071-[lfw][10000]XNorm: 22.130288 Training: 2021-03-16 01:15:11,071-[lfw][10000]Accuracy-Flip: 0.99450+-0.00308 Training: 2021-03-16 01:15:11,071-[lfw][10000]Accuracy-Highest: 0.99450 Training: 2021-03-16 01:16:12,721-[cfp_fp][10000]XNorm: 19.658108 Training: 2021-03-16 01:16:12,721-[cfp_fp][10000]Accuracy-Flip: 0.92800+-0.01600 Training: 2021-03-16 01:16:12,721-[cfp_fp][10000]Accuracy-Highest: 0.92800 Training: 2021-03-16 01:17:05,711-[agedb_30][10000]XNorm: 21.433548 Training: 2021-03-16 01:17:05,711-[agedb_30][10000]Accuracy-Flip: 0.94233+-0.01141 Training: 2021-03-16 01:17:05,711-[agedb_30][10000]Accuracy-Highest: 0.94233 Training: 2021-03-16 01:17:21,479-Speed 278.81 samples/sec Loss 8.8333 Epoch: 0 Global Step: 10050 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:17:37,208-Speed 3255.24 samples/sec Loss 8.8963 Epoch: 0 Global Step: 10100 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:17:53,009-Speed 3240.52 samples/sec Loss 8.8987 Epoch: 0 Global Step: 10150 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:18:09,106-Speed 3180.75 samples/sec Loss 8.8057 Epoch: 0 Global Step: 10200 Fp16 Grad Scale: 16384 Required: 37 hours Training: 2021-03-16 01:18:24,958-Speed 3230.10 samples/sec Loss 8.8432 Epoch: 0 Global Step: 10250 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:18:40,821-Speed 3227.59 samples/sec Loss 8.7731 Epoch: 0 Global Step: 10300 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:18:57,334-Speed 3100.72 samples/sec Loss 8.7760 Epoch: 0 Global Step: 10350 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:19:13,931-Speed 3085.07 samples/sec Loss 8.7258 Epoch: 0 Global Step: 10400 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:19:31,553-Speed 2905.45 samples/sec Loss 8.6360 Epoch: 0 Global Step: 10450 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:19:49,070-Speed 2923.07 samples/sec Loss 8.6693 Epoch: 0 Global Step: 10500 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:20:04,889-Speed 3236.61 samples/sec Loss 8.6306 Epoch: 0 Global Step: 10550 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:20:21,744-Speed 3037.80 samples/sec Loss 8.6267 Epoch: 0 Global Step: 10600 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:20:38,468-Speed 3061.52 samples/sec Loss 8.5541 Epoch: 0 Global Step: 10650 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:20:54,659-Speed 3162.26 samples/sec Loss 8.6036 Epoch: 0 Global Step: 10700 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:21:10,510-Speed 3230.24 samples/sec Loss 8.5907 Epoch: 0 Global Step: 10750 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:21:26,337-Speed 3235.04 samples/sec Loss 8.5170 Epoch: 0 Global Step: 10800 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:21:42,195-Speed 3228.92 samples/sec Loss 8.5189 Epoch: 0 Global Step: 10850 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:21:58,080-Speed 3223.10 samples/sec Loss 8.4658 Epoch: 0 Global Step: 10900 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:22:14,062-Speed 3203.73 samples/sec Loss 8.5103 Epoch: 0 Global Step: 10950 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:22:30,076-Speed 3197.43 samples/sec Loss 8.4405 Epoch: 0 Global Step: 11000 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:22:45,843-Speed 3247.25 samples/sec Loss 8.5404 Epoch: 0 Global Step: 11050 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:23:01,704-Speed 3228.18 samples/sec Loss 8.3766 Epoch: 0 Global Step: 11100 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:23:18,450-Speed 3057.48 samples/sec Loss 8.4369 Epoch: 0 Global Step: 11150 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:23:34,563-Speed 3177.75 samples/sec Loss 8.4331 Epoch: 0 Global Step: 11200 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:23:50,850-Speed 3143.70 samples/sec Loss 8.4142 Epoch: 0 Global Step: 11250 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:24:06,765-Speed 3217.21 samples/sec Loss 8.3812 Epoch: 0 Global Step: 11300 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:24:22,790-Speed 3195.05 samples/sec Loss 8.3636 Epoch: 0 Global Step: 11350 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:24:38,760-Speed 3206.11 samples/sec Loss 8.3998 Epoch: 0 Global Step: 11400 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:24:54,798-Speed 3192.53 samples/sec Loss 8.3126 Epoch: 0 Global Step: 11450 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:25:10,848-Speed 3190.06 samples/sec Loss 8.3674 Epoch: 0 Global Step: 11500 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:25:26,726-Speed 3224.69 samples/sec Loss 8.3149 Epoch: 0 Global Step: 11550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:25:42,678-Speed 3209.76 samples/sec Loss 8.3302 Epoch: 0 Global Step: 11600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:25:58,785-Speed 3178.87 samples/sec Loss 8.2792 Epoch: 0 Global Step: 11650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:26:14,713-Speed 3214.53 samples/sec Loss 8.2680 Epoch: 0 Global Step: 11700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:26:30,617-Speed 3219.48 samples/sec Loss 8.2491 Epoch: 0 Global Step: 11750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:26:46,634-Speed 3196.74 samples/sec Loss 8.2450 Epoch: 0 Global Step: 11800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:27:02,438-Speed 3239.68 samples/sec Loss 8.1581 Epoch: 0 Global Step: 11850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:27:18,308-Speed 3226.33 samples/sec Loss 8.1559 Epoch: 0 Global Step: 11900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:27:34,281-Speed 3205.69 samples/sec Loss 8.1789 Epoch: 0 Global Step: 11950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:27:50,025-Speed 3252.02 samples/sec Loss 8.1649 Epoch: 0 Global Step: 12000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:28:43,183-[lfw][12000]XNorm: 22.441605 Training: 2021-03-16 01:28:43,184-[lfw][12000]Accuracy-Flip: 0.99367+-0.00371 Training: 2021-03-16 01:28:43,184-[lfw][12000]Accuracy-Highest: 0.99450 Training: 2021-03-16 01:29:45,725-[cfp_fp][12000]XNorm: 19.261255 Training: 2021-03-16 01:29:45,726-[cfp_fp][12000]Accuracy-Flip: 0.93371+-0.01159 Training: 2021-03-16 01:29:45,726-[cfp_fp][12000]Accuracy-Highest: 0.93371 Training: 2021-03-16 01:30:39,460-[agedb_30][12000]XNorm: 21.511459 Training: 2021-03-16 01:30:39,460-[agedb_30][12000]Accuracy-Flip: 0.95167+-0.01035 Training: 2021-03-16 01:30:39,461-[agedb_30][12000]Accuracy-Highest: 0.95167 Training: 2021-03-16 01:30:55,145-Speed 276.58 samples/sec Loss 8.1250 Epoch: 0 Global Step: 12050 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:31:10,895-Speed 3250.88 samples/sec Loss 8.0979 Epoch: 0 Global Step: 12100 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:31:26,865-Speed 3206.02 samples/sec Loss 8.0873 Epoch: 0 Global Step: 12150 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:31:42,656-Speed 3242.44 samples/sec Loss 8.0890 Epoch: 0 Global Step: 12200 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:31:58,513-Speed 3228.91 samples/sec Loss 8.1072 Epoch: 0 Global Step: 12250 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:32:14,474-Speed 3207.90 samples/sec Loss 8.1110 Epoch: 0 Global Step: 12300 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:32:31,226-Speed 3056.48 samples/sec Loss 8.0553 Epoch: 0 Global Step: 12350 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:32:48,105-Speed 3033.59 samples/sec Loss 7.9990 Epoch: 0 Global Step: 12400 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:33:05,803-Speed 2892.92 samples/sec Loss 8.0227 Epoch: 0 Global Step: 12450 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:33:21,485-Speed 3265.09 samples/sec Loss 8.0363 Epoch: 0 Global Step: 12500 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:33:38,124-Speed 3077.11 samples/sec Loss 7.9842 Epoch: 0 Global Step: 12550 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:33:54,690-Speed 3090.77 samples/sec Loss 7.9461 Epoch: 0 Global Step: 12600 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:34:11,416-Speed 3061.16 samples/sec Loss 7.9523 Epoch: 0 Global Step: 12650 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:34:27,387-Speed 3206.03 samples/sec Loss 7.9639 Epoch: 0 Global Step: 12700 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:34:43,227-Speed 3232.42 samples/sec Loss 7.9703 Epoch: 0 Global Step: 12750 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:34:59,012-Speed 3243.77 samples/sec Loss 7.9629 Epoch: 0 Global Step: 12800 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:35:14,868-Speed 3229.17 samples/sec Loss 7.9366 Epoch: 0 Global Step: 12850 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:35:31,000-Speed 3173.80 samples/sec Loss 7.9149 Epoch: 0 Global Step: 12900 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:35:47,045-Speed 3191.20 samples/sec Loss 7.9019 Epoch: 0 Global Step: 12950 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:36:03,120-Speed 3185.19 samples/sec Loss 7.9053 Epoch: 0 Global Step: 13000 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:36:19,254-Speed 3173.52 samples/sec Loss 7.8516 Epoch: 0 Global Step: 13050 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:36:35,941-Speed 3068.30 samples/sec Loss 7.7804 Epoch: 0 Global Step: 13100 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:36:51,760-Speed 3236.73 samples/sec Loss 7.8120 Epoch: 0 Global Step: 13150 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:37:07,706-Speed 3210.91 samples/sec Loss 7.7834 Epoch: 0 Global Step: 13200 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:37:23,457-Speed 3250.76 samples/sec Loss 7.8394 Epoch: 0 Global Step: 13250 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:37:39,637-Speed 3164.37 samples/sec Loss 7.8503 Epoch: 0 Global Step: 13300 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:37:55,595-Speed 3208.60 samples/sec Loss 7.8199 Epoch: 0 Global Step: 13350 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:38:11,529-Speed 3213.36 samples/sec Loss 7.7701 Epoch: 0 Global Step: 13400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:38:27,555-Speed 3194.88 samples/sec Loss 7.7258 Epoch: 0 Global Step: 13450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:38:43,326-Speed 3246.59 samples/sec Loss 7.6942 Epoch: 0 Global Step: 13500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:38:59,333-Speed 3198.76 samples/sec Loss 7.7214 Epoch: 0 Global Step: 13550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:39:15,185-Speed 3229.94 samples/sec Loss 7.7702 Epoch: 0 Global Step: 13600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:39:31,445-Speed 3148.80 samples/sec Loss 7.6898 Epoch: 0 Global Step: 13650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:39:47,365-Speed 3216.21 samples/sec Loss 7.7445 Epoch: 0 Global Step: 13700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:40:03,270-Speed 3219.35 samples/sec Loss 7.7344 Epoch: 0 Global Step: 13750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:40:18,984-Speed 3258.20 samples/sec Loss 7.6912 Epoch: 0 Global Step: 13800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:40:35,175-Speed 3162.49 samples/sec Loss 7.7260 Epoch: 0 Global Step: 13850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:40:50,938-Speed 3248.23 samples/sec Loss 7.7378 Epoch: 0 Global Step: 13900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:41:06,918-Speed 3204.01 samples/sec Loss 7.6746 Epoch: 0 Global Step: 13950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:41:22,971-Speed 3189.48 samples/sec Loss 7.5710 Epoch: 0 Global Step: 14000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:42:16,130-[lfw][14000]XNorm: 21.768731 Training: 2021-03-16 01:42:16,131-[lfw][14000]Accuracy-Flip: 0.99400+-0.00335 Training: 2021-03-16 01:42:16,131-[lfw][14000]Accuracy-Highest: 0.99450 Training: 2021-03-16 01:43:17,805-[cfp_fp][14000]XNorm: 18.772501 Training: 2021-03-16 01:43:17,805-[cfp_fp][14000]Accuracy-Flip: 0.93614+-0.01132 Training: 2021-03-16 01:43:17,805-[cfp_fp][14000]Accuracy-Highest: 0.93614 Training: 2021-03-16 01:44:10,956-[agedb_30][14000]XNorm: 21.179125 Training: 2021-03-16 01:44:10,956-[agedb_30][14000]Accuracy-Flip: 0.94650+-0.01163 Training: 2021-03-16 01:44:10,956-[agedb_30][14000]Accuracy-Highest: 0.95167 Training: 2021-03-16 01:44:26,887-Speed 278.39 samples/sec Loss 7.6633 Epoch: 0 Global Step: 14050 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:44:42,789-Speed 3219.74 samples/sec Loss 7.6349 Epoch: 0 Global Step: 14100 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:44:58,679-Speed 3222.34 samples/sec Loss 7.6615 Epoch: 0 Global Step: 14150 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:45:14,501-Speed 3235.96 samples/sec Loss 7.6370 Epoch: 0 Global Step: 14200 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:45:30,388-Speed 3222.89 samples/sec Loss 7.6080 Epoch: 0 Global Step: 14250 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:45:46,310-Speed 3215.79 samples/sec Loss 7.5595 Epoch: 0 Global Step: 14300 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:46:02,921-Speed 3082.37 samples/sec Loss 7.5621 Epoch: 0 Global Step: 14350 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:46:19,557-Speed 3077.82 samples/sec Loss 7.6263 Epoch: 0 Global Step: 14400 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:46:36,339-Speed 3050.99 samples/sec Loss 7.5916 Epoch: 0 Global Step: 14450 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:46:53,002-Speed 3072.75 samples/sec Loss 7.6573 Epoch: 0 Global Step: 14500 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:47:09,039-Speed 3192.65 samples/sec Loss 7.5656 Epoch: 0 Global Step: 14550 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:47:26,480-Speed 2935.75 samples/sec Loss 7.5422 Epoch: 0 Global Step: 14600 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:47:42,969-Speed 3105.28 samples/sec Loss 7.5463 Epoch: 0 Global Step: 14650 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:47:58,807-Speed 3232.81 samples/sec Loss 7.5305 Epoch: 0 Global Step: 14700 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:48:14,688-Speed 3224.06 samples/sec Loss 7.5418 Epoch: 0 Global Step: 14750 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:48:30,716-Speed 3194.54 samples/sec Loss 7.5302 Epoch: 0 Global Step: 14800 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:48:46,687-Speed 3205.87 samples/sec Loss 7.5396 Epoch: 0 Global Step: 14850 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:49:02,593-Speed 3218.97 samples/sec Loss 7.5288 Epoch: 0 Global Step: 14900 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:49:18,616-Speed 3195.50 samples/sec Loss 7.5052 Epoch: 0 Global Step: 14950 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:49:34,711-Speed 3181.19 samples/sec Loss 7.4911 Epoch: 0 Global Step: 15000 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:49:51,007-Speed 3142.12 samples/sec Loss 7.4941 Epoch: 0 Global Step: 15050 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:50:07,645-Speed 3077.25 samples/sec Loss 7.4725 Epoch: 0 Global Step: 15100 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:50:23,432-Speed 3243.38 samples/sec Loss 7.3996 Epoch: 0 Global Step: 15150 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:50:39,230-Speed 3240.97 samples/sec Loss 7.3655 Epoch: 0 Global Step: 15200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:50:55,172-Speed 3211.67 samples/sec Loss 7.4341 Epoch: 0 Global Step: 15250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:51:10,934-Speed 3248.54 samples/sec Loss 7.4344 Epoch: 0 Global Step: 15300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:51:26,714-Speed 3244.83 samples/sec Loss 7.4830 Epoch: 0 Global Step: 15350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:51:42,583-Speed 3226.40 samples/sec Loss 7.3773 Epoch: 0 Global Step: 15400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:51:58,657-Speed 3185.33 samples/sec Loss 7.4073 Epoch: 0 Global Step: 15450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:52:14,698-Speed 3191.88 samples/sec Loss 7.3770 Epoch: 0 Global Step: 15500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:52:30,801-Speed 3179.78 samples/sec Loss 7.3635 Epoch: 0 Global Step: 15550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:52:46,629-Speed 3234.87 samples/sec Loss 7.3961 Epoch: 0 Global Step: 15600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:53:02,566-Speed 3212.79 samples/sec Loss 7.3624 Epoch: 0 Global Step: 15650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:53:18,419-Speed 3229.68 samples/sec Loss 7.3828 Epoch: 0 Global Step: 15700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:53:34,223-Speed 3239.92 samples/sec Loss 7.3470 Epoch: 0 Global Step: 15750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:53:50,007-Speed 3243.82 samples/sec Loss 7.3307 Epoch: 0 Global Step: 15800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:54:05,888-Speed 3224.04 samples/sec Loss 7.3012 Epoch: 0 Global Step: 15850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:54:21,904-Speed 3197.00 samples/sec Loss 7.3365 Epoch: 0 Global Step: 15900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:54:37,866-Speed 3207.70 samples/sec Loss 7.2969 Epoch: 0 Global Step: 15950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:54:53,644-Speed 3245.18 samples/sec Loss 7.2855 Epoch: 0 Global Step: 16000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 01:55:47,089-[lfw][16000]XNorm: 21.144383 Training: 2021-03-16 01:55:47,090-[lfw][16000]Accuracy-Flip: 0.99500+-0.00415 Training: 2021-03-16 01:55:47,090-[lfw][16000]Accuracy-Highest: 0.99500 Training: 2021-03-16 01:56:48,804-[cfp_fp][16000]XNorm: 18.295517 Training: 2021-03-16 01:56:48,805-[cfp_fp][16000]Accuracy-Flip: 0.93914+-0.01112 Training: 2021-03-16 01:56:48,805-[cfp_fp][16000]Accuracy-Highest: 0.93914 Training: 2021-03-16 01:57:41,949-[agedb_30][16000]XNorm: 20.409432 Training: 2021-03-16 01:57:41,950-[agedb_30][16000]Accuracy-Flip: 0.95250+-0.01012 Training: 2021-03-16 01:57:41,950-[agedb_30][16000]Accuracy-Highest: 0.95250 Training: 2021-03-16 01:57:57,822-Speed 277.99 samples/sec Loss 7.3002 Epoch: 0 Global Step: 16050 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:58:13,694-Speed 3225.76 samples/sec Loss 7.2901 Epoch: 0 Global Step: 16100 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:58:29,401-Speed 3259.87 samples/sec Loss 7.3333 Epoch: 0 Global Step: 16150 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:58:45,154-Speed 3250.35 samples/sec Loss 7.2823 Epoch: 0 Global Step: 16200 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:59:01,127-Speed 3205.50 samples/sec Loss 7.2797 Epoch: 0 Global Step: 16250 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:59:17,001-Speed 3225.48 samples/sec Loss 7.2761 Epoch: 0 Global Step: 16300 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:59:33,666-Speed 3072.31 samples/sec Loss 7.2430 Epoch: 0 Global Step: 16350 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 01:59:50,381-Speed 3063.19 samples/sec Loss 7.1891 Epoch: 0 Global Step: 16400 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:00:06,395-Speed 3197.45 samples/sec Loss 7.2683 Epoch: 0 Global Step: 16450 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:00:23,231-Speed 3041.12 samples/sec Loss 7.2619 Epoch: 0 Global Step: 16500 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:00:40,805-Speed 2913.46 samples/sec Loss 7.2903 Epoch: 0 Global Step: 16550 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:00:56,714-Speed 3218.48 samples/sec Loss 7.2239 Epoch: 0 Global Step: 16600 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:01:13,321-Speed 3083.06 samples/sec Loss 7.2443 Epoch: 0 Global Step: 16650 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:01:35,203-Speed 2339.94 samples/sec Loss 7.0999 Epoch: 1 Global Step: 16700 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:01:51,408-Speed 3159.47 samples/sec Loss 6.5424 Epoch: 1 Global Step: 16750 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:02:07,447-Speed 3192.42 samples/sec Loss 6.5192 Epoch: 1 Global Step: 16800 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:02:23,369-Speed 3215.71 samples/sec Loss 6.5570 Epoch: 1 Global Step: 16850 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:02:39,441-Speed 3185.93 samples/sec Loss 6.5974 Epoch: 1 Global Step: 16900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:02:55,440-Speed 3200.27 samples/sec Loss 6.6037 Epoch: 1 Global Step: 16950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:03:11,453-Speed 3197.51 samples/sec Loss 6.6214 Epoch: 1 Global Step: 17000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:03:27,590-Speed 3172.99 samples/sec Loss 6.5683 Epoch: 1 Global Step: 17050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:03:44,557-Speed 3017.74 samples/sec Loss 6.6497 Epoch: 1 Global Step: 17100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:04:00,472-Speed 3217.15 samples/sec Loss 6.6437 Epoch: 1 Global Step: 17150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:04:16,390-Speed 3216.66 samples/sec Loss 6.6588 Epoch: 1 Global Step: 17200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:04:32,268-Speed 3224.52 samples/sec Loss 6.6265 Epoch: 1 Global Step: 17250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:04:48,192-Speed 3215.38 samples/sec Loss 6.6681 Epoch: 1 Global Step: 17300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:05:04,384-Speed 3162.13 samples/sec Loss 6.7410 Epoch: 1 Global Step: 17350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:05:20,402-Speed 3196.58 samples/sec Loss 6.6919 Epoch: 1 Global Step: 17400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:05:36,344-Speed 3211.74 samples/sec Loss 6.6523 Epoch: 1 Global Step: 17450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:05:52,347-Speed 3199.55 samples/sec Loss 6.7403 Epoch: 1 Global Step: 17500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:06:08,379-Speed 3193.64 samples/sec Loss 6.7308 Epoch: 1 Global Step: 17550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:06:24,379-Speed 3200.21 samples/sec Loss 6.6830 Epoch: 1 Global Step: 17600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:06:40,606-Speed 3155.31 samples/sec Loss 6.7243 Epoch: 1 Global Step: 17650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:06:56,652-Speed 3190.78 samples/sec Loss 6.7435 Epoch: 1 Global Step: 17700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:07:12,562-Speed 3218.27 samples/sec Loss 6.6831 Epoch: 1 Global Step: 17750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:07:28,562-Speed 3200.12 samples/sec Loss 6.7127 Epoch: 1 Global Step: 17800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:07:44,476-Speed 3217.45 samples/sec Loss 6.7624 Epoch: 1 Global Step: 17850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:08:00,268-Speed 3242.15 samples/sec Loss 6.7823 Epoch: 1 Global Step: 17900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:08:16,143-Speed 3225.36 samples/sec Loss 6.6722 Epoch: 1 Global Step: 17950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:08:32,200-Speed 3188.70 samples/sec Loss 6.6972 Epoch: 1 Global Step: 18000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:09:25,290-[lfw][18000]XNorm: 23.056547 Training: 2021-03-16 02:09:25,291-[lfw][18000]Accuracy-Flip: 0.99617+-0.00259 Training: 2021-03-16 02:09:25,291-[lfw][18000]Accuracy-Highest: 0.99617 Training: 2021-03-16 02:10:26,882-[cfp_fp][18000]XNorm: 19.578185 Training: 2021-03-16 02:10:26,883-[cfp_fp][18000]Accuracy-Flip: 0.93857+-0.01006 Training: 2021-03-16 02:10:26,883-[cfp_fp][18000]Accuracy-Highest: 0.93914 Training: 2021-03-16 02:11:19,882-[agedb_30][18000]XNorm: 22.291912 Training: 2021-03-16 02:11:19,883-[agedb_30][18000]Accuracy-Flip: 0.95250+-0.01241 Training: 2021-03-16 02:11:19,883-[agedb_30][18000]Accuracy-Highest: 0.95250 Training: 2021-03-16 02:11:35,755-Speed 278.94 samples/sec Loss 6.7697 Epoch: 1 Global Step: 18050 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:11:51,935-Speed 3164.57 samples/sec Loss 6.7808 Epoch: 1 Global Step: 18100 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:12:07,904-Speed 3206.27 samples/sec Loss 6.7668 Epoch: 1 Global Step: 18150 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:12:23,735-Speed 3234.37 samples/sec Loss 6.6813 Epoch: 1 Global Step: 18200 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:12:39,913-Speed 3164.94 samples/sec Loss 6.7304 Epoch: 1 Global Step: 18250 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:12:55,978-Speed 3186.97 samples/sec Loss 6.7415 Epoch: 1 Global Step: 18300 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:13:13,180-Speed 2976.61 samples/sec Loss 6.7707 Epoch: 1 Global Step: 18350 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:13:30,058-Speed 3033.52 samples/sec Loss 6.7805 Epoch: 1 Global Step: 18400 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:13:46,221-Speed 3167.95 samples/sec Loss 6.7047 Epoch: 1 Global Step: 18450 Fp16 Grad Scale: 16384 Required: 36 hours Training: 2021-03-16 02:14:03,374-Speed 2985.00 samples/sec Loss 6.7524 Epoch: 1 Global Step: 18500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:14:20,255-Speed 3032.96 samples/sec Loss 6.7251 Epoch: 1 Global Step: 18550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:14:37,141-Speed 3032.24 samples/sec Loss 6.8119 Epoch: 1 Global Step: 18600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:14:53,782-Speed 3076.78 samples/sec Loss 6.7685 Epoch: 1 Global Step: 18650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:15:09,591-Speed 3238.87 samples/sec Loss 6.7963 Epoch: 1 Global Step: 18700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:15:26,691-Speed 2994.13 samples/sec Loss 6.7101 Epoch: 1 Global Step: 18750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:15:42,571-Speed 3224.28 samples/sec Loss 6.7668 Epoch: 1 Global Step: 18800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:15:58,319-Speed 3251.38 samples/sec Loss 6.7555 Epoch: 1 Global Step: 18850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:16:14,601-Speed 3144.69 samples/sec Loss 6.8297 Epoch: 1 Global Step: 18900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:16:30,685-Speed 3183.44 samples/sec Loss 6.7959 Epoch: 1 Global Step: 18950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:16:46,787-Speed 3179.85 samples/sec Loss 6.7561 Epoch: 1 Global Step: 19000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:17:02,663-Speed 3225.08 samples/sec Loss 6.7491 Epoch: 1 Global Step: 19050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:17:18,533-Speed 3226.16 samples/sec Loss 6.7643 Epoch: 1 Global Step: 19100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:17:35,303-Speed 3053.26 samples/sec Loss 6.7446 Epoch: 1 Global Step: 19150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:17:51,388-Speed 3183.02 samples/sec Loss 6.7600 Epoch: 1 Global Step: 19200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:18:07,165-Speed 3245.40 samples/sec Loss 6.8078 Epoch: 1 Global Step: 19250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:18:23,011-Speed 3231.18 samples/sec Loss 6.8087 Epoch: 1 Global Step: 19300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:18:38,960-Speed 3210.40 samples/sec Loss 6.7724 Epoch: 1 Global Step: 19350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:18:54,978-Speed 3196.40 samples/sec Loss 6.7774 Epoch: 1 Global Step: 19400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:19:10,938-Speed 3208.23 samples/sec Loss 6.7407 Epoch: 1 Global Step: 19450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:19:27,048-Speed 3178.07 samples/sec Loss 6.7471 Epoch: 1 Global Step: 19500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:19:42,902-Speed 3229.64 samples/sec Loss 6.7578 Epoch: 1 Global Step: 19550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:19:58,912-Speed 3198.17 samples/sec Loss 6.7421 Epoch: 1 Global Step: 19600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:20:14,927-Speed 3197.04 samples/sec Loss 6.7715 Epoch: 1 Global Step: 19650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:20:30,900-Speed 3205.59 samples/sec Loss 6.7975 Epoch: 1 Global Step: 19700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:20:46,794-Speed 3221.34 samples/sec Loss 6.7795 Epoch: 1 Global Step: 19750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:21:02,909-Speed 3177.40 samples/sec Loss 6.7877 Epoch: 1 Global Step: 19800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:21:18,876-Speed 3206.71 samples/sec Loss 6.7708 Epoch: 1 Global Step: 19850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:21:34,680-Speed 3239.77 samples/sec Loss 6.7579 Epoch: 1 Global Step: 19900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:21:50,554-Speed 3225.39 samples/sec Loss 6.7489 Epoch: 1 Global Step: 19950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:22:06,506-Speed 3209.80 samples/sec Loss 6.7187 Epoch: 1 Global Step: 20000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:22:59,312-[lfw][20000]XNorm: 21.180537 Training: 2021-03-16 02:22:59,313-[lfw][20000]Accuracy-Flip: 0.99517+-0.00320 Training: 2021-03-16 02:22:59,313-[lfw][20000]Accuracy-Highest: 0.99617 Training: 2021-03-16 02:24:01,290-[cfp_fp][20000]XNorm: 18.060109 Training: 2021-03-16 02:24:01,290-[cfp_fp][20000]Accuracy-Flip: 0.93357+-0.01272 Training: 2021-03-16 02:24:01,290-[cfp_fp][20000]Accuracy-Highest: 0.93914 Training: 2021-03-16 02:24:54,575-[agedb_30][20000]XNorm: 20.914667 Training: 2021-03-16 02:24:54,575-[agedb_30][20000]Accuracy-Flip: 0.95367+-0.00954 Training: 2021-03-16 02:24:54,575-[agedb_30][20000]Accuracy-Highest: 0.95367 Training: 2021-03-16 02:25:10,601-Speed 278.12 samples/sec Loss 6.7547 Epoch: 1 Global Step: 20050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:25:26,644-Speed 3191.42 samples/sec Loss 6.7541 Epoch: 1 Global Step: 20100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:25:42,611-Speed 3206.65 samples/sec Loss 6.7908 Epoch: 1 Global Step: 20150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:25:58,581-Speed 3206.22 samples/sec Loss 6.8111 Epoch: 1 Global Step: 20200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:26:14,705-Speed 3175.47 samples/sec Loss 6.7739 Epoch: 1 Global Step: 20250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:26:30,854-Speed 3170.49 samples/sec Loss 6.7940 Epoch: 1 Global Step: 20300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:26:47,966-Speed 2992.24 samples/sec Loss 6.7365 Epoch: 1 Global Step: 20350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:27:03,903-Speed 3212.75 samples/sec Loss 6.7550 Epoch: 1 Global Step: 20400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:27:20,945-Speed 3004.43 samples/sec Loss 6.7811 Epoch: 1 Global Step: 20450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:27:36,796-Speed 3230.18 samples/sec Loss 6.8094 Epoch: 1 Global Step: 20500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:27:52,875-Speed 3184.35 samples/sec Loss 6.7540 Epoch: 1 Global Step: 20550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:28:10,516-Speed 2902.42 samples/sec Loss 6.7704 Epoch: 1 Global Step: 20600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:28:26,334-Speed 3236.86 samples/sec Loss 6.7731 Epoch: 1 Global Step: 20650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:28:44,078-Speed 2885.64 samples/sec Loss 6.7232 Epoch: 1 Global Step: 20700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:29:00,851-Speed 3052.53 samples/sec Loss 6.7413 Epoch: 1 Global Step: 20750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:29:16,758-Speed 3218.78 samples/sec Loss 6.7783 Epoch: 1 Global Step: 20800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:29:32,755-Speed 3200.71 samples/sec Loss 6.7634 Epoch: 1 Global Step: 20850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:29:48,618-Speed 3227.80 samples/sec Loss 6.7793 Epoch: 1 Global Step: 20900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:30:04,582-Speed 3207.27 samples/sec Loss 6.7477 Epoch: 1 Global Step: 20950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:30:20,806-Speed 3155.88 samples/sec Loss 6.7354 Epoch: 1 Global Step: 21000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:30:36,896-Speed 3182.31 samples/sec Loss 6.7458 Epoch: 1 Global Step: 21050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:30:52,760-Speed 3227.56 samples/sec Loss 6.6597 Epoch: 1 Global Step: 21100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:31:08,780-Speed 3196.09 samples/sec Loss 6.7273 Epoch: 1 Global Step: 21150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:31:24,671-Speed 3222.05 samples/sec Loss 6.7189 Epoch: 1 Global Step: 21200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:31:40,522-Speed 3230.13 samples/sec Loss 6.7353 Epoch: 1 Global Step: 21250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:31:57,486-Speed 3018.26 samples/sec Loss 6.7518 Epoch: 1 Global Step: 21300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:32:13,553-Speed 3186.72 samples/sec Loss 6.7153 Epoch: 1 Global Step: 21350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:32:29,277-Speed 3256.31 samples/sec Loss 6.6816 Epoch: 1 Global Step: 21400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:32:45,283-Speed 3198.96 samples/sec Loss 6.6691 Epoch: 1 Global Step: 21450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:33:01,426-Speed 3171.63 samples/sec Loss 6.7417 Epoch: 1 Global Step: 21500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:33:17,502-Speed 3185.05 samples/sec Loss 6.6980 Epoch: 1 Global Step: 21550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:33:33,583-Speed 3183.88 samples/sec Loss 6.6594 Epoch: 1 Global Step: 21600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:33:49,565-Speed 3203.74 samples/sec Loss 6.7606 Epoch: 1 Global Step: 21650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:34:05,352-Speed 3243.34 samples/sec Loss 6.7863 Epoch: 1 Global Step: 21700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:34:21,338-Speed 3202.87 samples/sec Loss 6.7291 Epoch: 1 Global Step: 21750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:34:37,354-Speed 3196.83 samples/sec Loss 6.7379 Epoch: 1 Global Step: 21800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:34:53,067-Speed 3258.57 samples/sec Loss 6.6858 Epoch: 1 Global Step: 21850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:35:08,911-Speed 3231.56 samples/sec Loss 6.7711 Epoch: 1 Global Step: 21900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:35:24,972-Speed 3188.02 samples/sec Loss 6.7068 Epoch: 1 Global Step: 21950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:35:40,776-Speed 3239.84 samples/sec Loss 6.6655 Epoch: 1 Global Step: 22000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:36:33,922-[lfw][22000]XNorm: 22.759189 Training: 2021-03-16 02:36:33,922-[lfw][22000]Accuracy-Flip: 0.99667+-0.00258 Training: 2021-03-16 02:36:33,922-[lfw][22000]Accuracy-Highest: 0.99667 Training: 2021-03-16 02:37:36,133-[cfp_fp][22000]XNorm: 20.155992 Training: 2021-03-16 02:37:36,133-[cfp_fp][22000]Accuracy-Flip: 0.94386+-0.01339 Training: 2021-03-16 02:37:36,133-[cfp_fp][22000]Accuracy-Highest: 0.94386 Training: 2021-03-16 02:38:29,335-[agedb_30][22000]XNorm: 22.429609 Training: 2021-03-16 02:38:29,335-[agedb_30][22000]Accuracy-Flip: 0.95400+-0.00901 Training: 2021-03-16 02:38:29,335-[agedb_30][22000]Accuracy-Highest: 0.95400 Training: 2021-03-16 02:38:45,188-Speed 277.64 samples/sec Loss 6.7053 Epoch: 1 Global Step: 22050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:39:01,252-Speed 3187.37 samples/sec Loss 6.7511 Epoch: 1 Global Step: 22100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:39:17,067-Speed 3237.60 samples/sec Loss 6.7566 Epoch: 1 Global Step: 22150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:39:33,277-Speed 3158.62 samples/sec Loss 6.6766 Epoch: 1 Global Step: 22200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:39:49,373-Speed 3181.12 samples/sec Loss 6.6766 Epoch: 1 Global Step: 22250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:40:05,342-Speed 3206.22 samples/sec Loss 6.6670 Epoch: 1 Global Step: 22300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:40:21,321-Speed 3204.40 samples/sec Loss 6.6849 Epoch: 1 Global Step: 22350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:40:38,113-Speed 3049.17 samples/sec Loss 6.6551 Epoch: 1 Global Step: 22400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:40:54,040-Speed 3214.59 samples/sec Loss 6.6552 Epoch: 1 Global Step: 22450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:41:10,886-Speed 3039.37 samples/sec Loss 6.6935 Epoch: 1 Global Step: 22500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:41:26,909-Speed 3195.56 samples/sec Loss 6.6972 Epoch: 1 Global Step: 22550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:41:42,803-Speed 3221.56 samples/sec Loss 6.6657 Epoch: 1 Global Step: 22600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:42:00,467-Speed 2898.62 samples/sec Loss 6.7099 Epoch: 1 Global Step: 22650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:42:16,703-Speed 3153.48 samples/sec Loss 6.6435 Epoch: 1 Global Step: 22700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:42:33,543-Speed 3040.43 samples/sec Loss 6.6021 Epoch: 1 Global Step: 22750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:42:51,054-Speed 2924.07 samples/sec Loss 6.6215 Epoch: 1 Global Step: 22800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:43:07,359-Speed 3140.28 samples/sec Loss 6.6820 Epoch: 1 Global Step: 22850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:43:23,362-Speed 3199.34 samples/sec Loss 6.6221 Epoch: 1 Global Step: 22900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:43:39,179-Speed 3237.27 samples/sec Loss 6.6885 Epoch: 1 Global Step: 22950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:43:55,111-Speed 3213.61 samples/sec Loss 6.6501 Epoch: 1 Global Step: 23000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:44:11,223-Speed 3177.99 samples/sec Loss 6.6711 Epoch: 1 Global Step: 23050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:44:26,912-Speed 3263.52 samples/sec Loss 6.6346 Epoch: 1 Global Step: 23100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:44:43,140-Speed 3155.19 samples/sec Loss 6.6324 Epoch: 1 Global Step: 23150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:44:59,244-Speed 3179.40 samples/sec Loss 6.6492 Epoch: 1 Global Step: 23200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:45:15,043-Speed 3240.78 samples/sec Loss 6.5683 Epoch: 1 Global Step: 23250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:45:30,978-Speed 3213.05 samples/sec Loss 6.6196 Epoch: 1 Global Step: 23300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:45:47,062-Speed 3183.51 samples/sec Loss 6.6494 Epoch: 1 Global Step: 23350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:46:02,796-Speed 3254.22 samples/sec Loss 6.6908 Epoch: 1 Global Step: 23400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:46:19,847-Speed 3002.85 samples/sec Loss 6.6233 Epoch: 1 Global Step: 23450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:46:35,805-Speed 3208.49 samples/sec Loss 6.6568 Epoch: 1 Global Step: 23500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:46:51,945-Speed 3172.18 samples/sec Loss 6.6451 Epoch: 1 Global Step: 23550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:47:08,056-Speed 3178.20 samples/sec Loss 6.6393 Epoch: 1 Global Step: 23600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:47:24,223-Speed 3166.92 samples/sec Loss 6.6185 Epoch: 1 Global Step: 23650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:47:40,028-Speed 3239.58 samples/sec Loss 6.5762 Epoch: 1 Global Step: 23700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:47:56,084-Speed 3188.95 samples/sec Loss 6.5994 Epoch: 1 Global Step: 23750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:48:12,272-Speed 3162.91 samples/sec Loss 6.6279 Epoch: 1 Global Step: 23800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:48:28,219-Speed 3210.77 samples/sec Loss 6.6171 Epoch: 1 Global Step: 23850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:48:44,287-Speed 3186.73 samples/sec Loss 6.6356 Epoch: 1 Global Step: 23900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 02:49:00,420-Speed 3173.65 samples/sec Loss 6.5956 Epoch: 1 Global Step: 23950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 02:49:16,374-Speed 3209.29 samples/sec Loss 6.6015 Epoch: 1 Global Step: 24000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 02:50:09,747-[lfw][24000]XNorm: 22.912782 Training: 2021-03-16 02:50:09,747-[lfw][24000]Accuracy-Flip: 0.99517+-0.00273 Training: 2021-03-16 02:50:09,747-[lfw][24000]Accuracy-Highest: 0.99667 Training: 2021-03-16 02:51:11,668-[cfp_fp][24000]XNorm: 20.241518 Training: 2021-03-16 02:51:11,669-[cfp_fp][24000]Accuracy-Flip: 0.94414+-0.01489 Training: 2021-03-16 02:51:11,669-[cfp_fp][24000]Accuracy-Highest: 0.94414 Training: 2021-03-16 02:52:04,787-[agedb_30][24000]XNorm: 22.626190 Training: 2021-03-16 02:52:04,787-[agedb_30][24000]Accuracy-Flip: 0.95583+-0.01049 Training: 2021-03-16 02:52:04,787-[agedb_30][24000]Accuracy-Highest: 0.95583 Training: 2021-03-16 02:52:20,677-Speed 277.80 samples/sec Loss 6.5658 Epoch: 1 Global Step: 24050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:52:36,852-Speed 3165.58 samples/sec Loss 6.6058 Epoch: 1 Global Step: 24100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:52:52,744-Speed 3221.88 samples/sec Loss 6.6066 Epoch: 1 Global Step: 24150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:53:08,697-Speed 3209.36 samples/sec Loss 6.5591 Epoch: 1 Global Step: 24200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:53:24,939-Speed 3152.57 samples/sec Loss 6.5409 Epoch: 1 Global Step: 24250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:53:40,976-Speed 3192.57 samples/sec Loss 6.5544 Epoch: 1 Global Step: 24300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:53:57,081-Speed 3179.34 samples/sec Loss 6.6171 Epoch: 1 Global Step: 24350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:54:13,101-Speed 3196.00 samples/sec Loss 6.5777 Epoch: 1 Global Step: 24400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:54:29,944-Speed 3039.98 samples/sec Loss 6.5852 Epoch: 1 Global Step: 24450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:54:46,707-Speed 3054.50 samples/sec Loss 6.5848 Epoch: 1 Global Step: 24500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:55:02,737-Speed 3194.12 samples/sec Loss 6.5893 Epoch: 1 Global Step: 24550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:55:18,637-Speed 3220.17 samples/sec Loss 6.6172 Epoch: 1 Global Step: 24600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:55:34,616-Speed 3204.26 samples/sec Loss 6.5641 Epoch: 1 Global Step: 24650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:55:51,397-Speed 3051.27 samples/sec Loss 6.5802 Epoch: 1 Global Step: 24700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:56:08,903-Speed 2924.76 samples/sec Loss 6.5831 Epoch: 1 Global Step: 24750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:56:24,802-Speed 3220.38 samples/sec Loss 6.5189 Epoch: 1 Global Step: 24800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:56:41,621-Speed 3044.28 samples/sec Loss 6.5730 Epoch: 1 Global Step: 24850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:56:57,607-Speed 3202.82 samples/sec Loss 6.5583 Epoch: 1 Global Step: 24900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:57:14,335-Speed 3060.88 samples/sec Loss 6.5595 Epoch: 1 Global Step: 24950 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:57:30,575-Speed 3152.75 samples/sec Loss 6.6299 Epoch: 1 Global Step: 25000 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:57:46,689-Speed 3177.56 samples/sec Loss 6.5674 Epoch: 1 Global Step: 25050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:58:02,437-Speed 3251.28 samples/sec Loss 6.5503 Epoch: 1 Global Step: 25100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:58:18,291-Speed 3229.59 samples/sec Loss 6.4845 Epoch: 1 Global Step: 25150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:58:34,579-Speed 3143.38 samples/sec Loss 6.5491 Epoch: 1 Global Step: 25200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:58:50,416-Speed 3233.20 samples/sec Loss 6.5528 Epoch: 1 Global Step: 25250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:59:06,713-Speed 3141.79 samples/sec Loss 6.5256 Epoch: 1 Global Step: 25300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:59:22,577-Speed 3227.52 samples/sec Loss 6.5267 Epoch: 1 Global Step: 25350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:59:38,342-Speed 3247.76 samples/sec Loss 6.5565 Epoch: 1 Global Step: 25400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 02:59:54,528-Speed 3163.21 samples/sec Loss 6.5947 Epoch: 1 Global Step: 25450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:00:11,571-Speed 3004.29 samples/sec Loss 6.6129 Epoch: 1 Global Step: 25500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:00:27,447-Speed 3225.19 samples/sec Loss 6.5155 Epoch: 1 Global Step: 25550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:00:43,264-Speed 3237.08 samples/sec Loss 6.5112 Epoch: 1 Global Step: 25600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:00:59,474-Speed 3158.62 samples/sec Loss 6.5416 Epoch: 1 Global Step: 25650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:01:15,257-Speed 3244.02 samples/sec Loss 6.5251 Epoch: 1 Global Step: 25700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:01:31,272-Speed 3197.21 samples/sec Loss 6.5351 Epoch: 1 Global Step: 25750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:01:47,368-Speed 3181.03 samples/sec Loss 6.5082 Epoch: 1 Global Step: 25800 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:02:03,270-Speed 3219.66 samples/sec Loss 6.5102 Epoch: 1 Global Step: 25850 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:02:19,428-Speed 3168.97 samples/sec Loss 6.5274 Epoch: 1 Global Step: 25900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:02:35,467-Speed 3192.20 samples/sec Loss 6.5461 Epoch: 1 Global Step: 25950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:02:51,336-Speed 3226.52 samples/sec Loss 6.4964 Epoch: 1 Global Step: 26000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:03:44,844-[lfw][26000]XNorm: 22.217155 Training: 2021-03-16 03:03:44,845-[lfw][26000]Accuracy-Flip: 0.99450+-0.00373 Training: 2021-03-16 03:03:44,845-[lfw][26000]Accuracy-Highest: 0.99667 Training: 2021-03-16 03:04:46,799-[cfp_fp][26000]XNorm: 18.687555 Training: 2021-03-16 03:04:46,800-[cfp_fp][26000]Accuracy-Flip: 0.94229+-0.01206 Training: 2021-03-16 03:04:46,800-[cfp_fp][26000]Accuracy-Highest: 0.94414 Training: 2021-03-16 03:05:40,020-[agedb_30][26000]XNorm: 21.667832 Training: 2021-03-16 03:05:40,020-[agedb_30][26000]Accuracy-Flip: 0.95367+-0.01069 Training: 2021-03-16 03:05:40,020-[agedb_30][26000]Accuracy-Highest: 0.95583 Training: 2021-03-16 03:05:56,013-Speed 277.24 samples/sec Loss 6.5041 Epoch: 1 Global Step: 26050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:06:11,913-Speed 3220.20 samples/sec Loss 6.5078 Epoch: 1 Global Step: 26100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:06:27,696-Speed 3244.19 samples/sec Loss 6.4873 Epoch: 1 Global Step: 26150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:06:43,825-Speed 3174.47 samples/sec Loss 6.5478 Epoch: 1 Global Step: 26200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:06:59,919-Speed 3181.35 samples/sec Loss 6.5186 Epoch: 1 Global Step: 26250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:07:15,724-Speed 3239.63 samples/sec Loss 6.5433 Epoch: 1 Global Step: 26300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:07:31,791-Speed 3186.86 samples/sec Loss 6.4691 Epoch: 1 Global Step: 26350 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:07:47,884-Speed 3181.42 samples/sec Loss 6.4794 Epoch: 1 Global Step: 26400 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:08:03,551-Speed 3268.12 samples/sec Loss 6.5051 Epoch: 1 Global Step: 26450 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:08:20,420-Speed 3035.27 samples/sec Loss 6.5317 Epoch: 1 Global Step: 26500 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:08:37,966-Speed 2918.13 samples/sec Loss 6.4912 Epoch: 1 Global Step: 26550 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:08:54,125-Speed 3168.56 samples/sec Loss 6.4885 Epoch: 1 Global Step: 26600 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:09:10,043-Speed 3216.66 samples/sec Loss 6.4602 Epoch: 1 Global Step: 26650 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:09:26,216-Speed 3165.96 samples/sec Loss 6.5236 Epoch: 1 Global Step: 26700 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:09:42,909-Speed 3067.07 samples/sec Loss 6.4560 Epoch: 1 Global Step: 26750 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:09:58,814-Speed 3219.38 samples/sec Loss 6.5379 Epoch: 1 Global Step: 26800 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:10:16,555-Speed 2886.03 samples/sec Loss 6.5252 Epoch: 1 Global Step: 26850 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:10:33,307-Speed 3056.47 samples/sec Loss 6.4973 Epoch: 1 Global Step: 26900 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:10:49,170-Speed 3227.66 samples/sec Loss 6.4390 Epoch: 1 Global Step: 26950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:11:05,377-Speed 3159.25 samples/sec Loss 6.5360 Epoch: 1 Global Step: 27000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:11:22,504-Speed 2989.49 samples/sec Loss 6.4556 Epoch: 1 Global Step: 27050 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:11:38,277-Speed 3246.21 samples/sec Loss 6.4658 Epoch: 1 Global Step: 27100 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:11:54,365-Speed 3182.51 samples/sec Loss 6.4761 Epoch: 1 Global Step: 27150 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:12:10,454-Speed 3182.47 samples/sec Loss 6.4802 Epoch: 1 Global Step: 27200 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:12:26,387-Speed 3213.60 samples/sec Loss 6.4836 Epoch: 1 Global Step: 27250 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:12:42,748-Speed 3129.46 samples/sec Loss 6.4963 Epoch: 1 Global Step: 27300 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:12:58,770-Speed 3195.77 samples/sec Loss 6.4760 Epoch: 1 Global Step: 27350 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:13:14,599-Speed 3234.59 samples/sec Loss 6.4645 Epoch: 1 Global Step: 27400 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:13:30,620-Speed 3195.88 samples/sec Loss 6.4298 Epoch: 1 Global Step: 27450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:13:46,586-Speed 3207.01 samples/sec Loss 6.5095 Epoch: 1 Global Step: 27500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:14:02,614-Speed 3194.38 samples/sec Loss 6.5106 Epoch: 1 Global Step: 27550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:14:18,675-Speed 3188.08 samples/sec Loss 6.5132 Epoch: 1 Global Step: 27600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:14:35,710-Speed 3005.59 samples/sec Loss 6.4582 Epoch: 1 Global Step: 27650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:14:51,649-Speed 3212.39 samples/sec Loss 6.4485 Epoch: 1 Global Step: 27700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:15:07,812-Speed 3167.78 samples/sec Loss 6.3701 Epoch: 1 Global Step: 27750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:15:24,033-Speed 3156.58 samples/sec Loss 6.4599 Epoch: 1 Global Step: 27800 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:15:39,975-Speed 3211.68 samples/sec Loss 6.4579 Epoch: 1 Global Step: 27850 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:15:56,085-Speed 3178.25 samples/sec Loss 6.4541 Epoch: 1 Global Step: 27900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:16:12,366-Speed 3144.75 samples/sec Loss 6.4503 Epoch: 1 Global Step: 27950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:16:28,256-Speed 3222.27 samples/sec Loss 6.4202 Epoch: 1 Global Step: 28000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:17:21,260-[lfw][28000]XNorm: 20.545132 Training: 2021-03-16 03:17:21,261-[lfw][28000]Accuracy-Flip: 0.99567+-0.00309 Training: 2021-03-16 03:17:21,261-[lfw][28000]Accuracy-Highest: 0.99667 Training: 2021-03-16 03:18:23,034-[cfp_fp][28000]XNorm: 18.351558 Training: 2021-03-16 03:18:23,034-[cfp_fp][28000]Accuracy-Flip: 0.93386+-0.01410 Training: 2021-03-16 03:18:23,034-[cfp_fp][28000]Accuracy-Highest: 0.94414 Training: 2021-03-16 03:19:16,179-[agedb_30][28000]XNorm: 19.824036 Training: 2021-03-16 03:19:16,179-[agedb_30][28000]Accuracy-Flip: 0.94667+-0.01062 Training: 2021-03-16 03:19:16,179-[agedb_30][28000]Accuracy-Highest: 0.95583 Training: 2021-03-16 03:19:32,134-Speed 278.45 samples/sec Loss 6.4130 Epoch: 1 Global Step: 28050 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:19:48,197-Speed 3187.48 samples/sec Loss 6.4290 Epoch: 1 Global Step: 28100 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:20:04,084-Speed 3222.97 samples/sec Loss 6.3885 Epoch: 1 Global Step: 28150 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:20:20,271-Speed 3163.13 samples/sec Loss 6.3798 Epoch: 1 Global Step: 28200 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:20:36,317-Speed 3190.93 samples/sec Loss 6.4338 Epoch: 1 Global Step: 28250 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:20:52,072-Speed 3249.69 samples/sec Loss 6.4191 Epoch: 1 Global Step: 28300 Fp16 Grad Scale: 16384 Required: 35 hours Training: 2021-03-16 03:21:08,254-Speed 3164.09 samples/sec Loss 6.4166 Epoch: 1 Global Step: 28350 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:21:24,297-Speed 3191.62 samples/sec Loss 6.3918 Epoch: 1 Global Step: 28400 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:21:40,041-Speed 3252.22 samples/sec Loss 6.4445 Epoch: 1 Global Step: 28450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:21:56,334-Speed 3142.43 samples/sec Loss 6.3671 Epoch: 1 Global Step: 28500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:22:13,387-Speed 3002.63 samples/sec Loss 6.4154 Epoch: 1 Global Step: 28550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:22:29,584-Speed 3160.98 samples/sec Loss 6.4092 Epoch: 1 Global Step: 28600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:22:46,597-Speed 3009.66 samples/sec Loss 6.4932 Epoch: 1 Global Step: 28650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:23:02,545-Speed 3210.51 samples/sec Loss 6.3937 Epoch: 1 Global Step: 28700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:23:18,286-Speed 3252.73 samples/sec Loss 6.4062 Epoch: 1 Global Step: 28750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:23:35,266-Speed 3015.32 samples/sec Loss 6.4223 Epoch: 1 Global Step: 28800 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:23:51,319-Speed 3189.64 samples/sec Loss 6.3351 Epoch: 1 Global Step: 28850 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:24:08,163-Speed 3039.68 samples/sec Loss 6.4222 Epoch: 1 Global Step: 28900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:24:25,004-Speed 3040.29 samples/sec Loss 6.3763 Epoch: 1 Global Step: 28950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:24:41,860-Speed 3037.67 samples/sec Loss 6.3760 Epoch: 1 Global Step: 29000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:24:57,664-Speed 3239.82 samples/sec Loss 6.4246 Epoch: 1 Global Step: 29050 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:25:13,430-Speed 3247.60 samples/sec Loss 6.3690 Epoch: 1 Global Step: 29100 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:25:30,449-Speed 3008.49 samples/sec Loss 6.3693 Epoch: 1 Global Step: 29150 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:25:46,676-Speed 3155.25 samples/sec Loss 6.4133 Epoch: 1 Global Step: 29200 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:26:02,487-Speed 3238.25 samples/sec Loss 6.4008 Epoch: 1 Global Step: 29250 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:26:18,527-Speed 3192.21 samples/sec Loss 6.4361 Epoch: 1 Global Step: 29300 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:26:34,272-Speed 3251.94 samples/sec Loss 6.4369 Epoch: 1 Global Step: 29350 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:26:50,238-Speed 3206.87 samples/sec Loss 6.4079 Epoch: 1 Global Step: 29400 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:27:06,525-Speed 3143.73 samples/sec Loss 6.3258 Epoch: 1 Global Step: 29450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:27:22,618-Speed 3181.66 samples/sec Loss 6.4304 Epoch: 1 Global Step: 29500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:27:38,436-Speed 3236.88 samples/sec Loss 6.3648 Epoch: 1 Global Step: 29550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:27:54,430-Speed 3201.31 samples/sec Loss 6.3655 Epoch: 1 Global Step: 29600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:28:10,540-Speed 3178.15 samples/sec Loss 6.3884 Epoch: 1 Global Step: 29650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:28:26,281-Speed 3252.76 samples/sec Loss 6.3223 Epoch: 1 Global Step: 29700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:28:42,372-Speed 3182.10 samples/sec Loss 6.3702 Epoch: 1 Global Step: 29750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:28:59,001-Speed 3079.08 samples/sec Loss 6.3570 Epoch: 1 Global Step: 29800 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:29:14,795-Speed 3241.69 samples/sec Loss 6.3728 Epoch: 1 Global Step: 29850 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:29:30,928-Speed 3173.84 samples/sec Loss 6.3636 Epoch: 1 Global Step: 29900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:29:46,938-Speed 3198.11 samples/sec Loss 6.3399 Epoch: 1 Global Step: 29950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:30:02,733-Speed 3241.58 samples/sec Loss 6.3897 Epoch: 1 Global Step: 30000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:30:56,100-[lfw][30000]XNorm: 22.249390 Training: 2021-03-16 03:30:56,101-[lfw][30000]Accuracy-Flip: 0.99567+-0.00327 Training: 2021-03-16 03:30:56,101-[lfw][30000]Accuracy-Highest: 0.99667 Training: 2021-03-16 03:31:58,189-[cfp_fp][30000]XNorm: 19.422705 Training: 2021-03-16 03:31:58,190-[cfp_fp][30000]Accuracy-Flip: 0.94229+-0.00990 Training: 2021-03-16 03:31:58,190-[cfp_fp][30000]Accuracy-Highest: 0.94414 Training: 2021-03-16 03:32:51,724-[agedb_30][30000]XNorm: 21.806323 Training: 2021-03-16 03:32:51,724-[agedb_30][30000]Accuracy-Flip: 0.96133+-0.00618 Training: 2021-03-16 03:32:51,724-[agedb_30][30000]Accuracy-Highest: 0.96133 Training: 2021-03-16 03:33:07,662-Speed 276.86 samples/sec Loss 6.3458 Epoch: 1 Global Step: 30050 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:33:23,710-Speed 3190.53 samples/sec Loss 6.3633 Epoch: 1 Global Step: 30100 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:33:39,498-Speed 3243.08 samples/sec Loss 6.3750 Epoch: 1 Global Step: 30150 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:33:55,549-Speed 3189.91 samples/sec Loss 6.3064 Epoch: 1 Global Step: 30200 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:34:11,484-Speed 3213.24 samples/sec Loss 6.3541 Epoch: 1 Global Step: 30250 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:34:27,219-Speed 3253.94 samples/sec Loss 6.3584 Epoch: 1 Global Step: 30300 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:34:43,228-Speed 3198.34 samples/sec Loss 6.3686 Epoch: 1 Global Step: 30350 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:34:59,339-Speed 3177.96 samples/sec Loss 6.3022 Epoch: 1 Global Step: 30400 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:35:15,059-Speed 3257.17 samples/sec Loss 6.3614 Epoch: 1 Global Step: 30450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:35:31,046-Speed 3202.72 samples/sec Loss 6.3684 Epoch: 1 Global Step: 30500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:35:47,033-Speed 3202.58 samples/sec Loss 6.3111 Epoch: 1 Global Step: 30550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:36:03,587-Speed 3093.07 samples/sec Loss 6.3317 Epoch: 1 Global Step: 30600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:36:19,630-Speed 3191.48 samples/sec Loss 6.2788 Epoch: 1 Global Step: 30650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:36:35,646-Speed 3196.99 samples/sec Loss 6.3297 Epoch: 1 Global Step: 30700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:36:52,235-Speed 3086.34 samples/sec Loss 6.3875 Epoch: 1 Global Step: 30750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:37:08,299-Speed 3187.36 samples/sec Loss 6.3755 Epoch: 1 Global Step: 30800 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:37:24,385-Speed 3183.15 samples/sec Loss 6.3229 Epoch: 1 Global Step: 30850 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:37:41,001-Speed 3081.27 samples/sec Loss 6.3399 Epoch: 1 Global Step: 30900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:37:56,931-Speed 3214.26 samples/sec Loss 6.3549 Epoch: 1 Global Step: 30950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:38:14,773-Speed 2869.72 samples/sec Loss 6.3316 Epoch: 1 Global Step: 31000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:38:30,737-Speed 3207.25 samples/sec Loss 6.3363 Epoch: 1 Global Step: 31050 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:38:46,623-Speed 3223.18 samples/sec Loss 6.2805 Epoch: 1 Global Step: 31100 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:39:02,737-Speed 3177.41 samples/sec Loss 6.2849 Epoch: 1 Global Step: 31150 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:39:19,664-Speed 3024.93 samples/sec Loss 6.3076 Epoch: 1 Global Step: 31200 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:39:36,298-Speed 3078.07 samples/sec Loss 6.3473 Epoch: 1 Global Step: 31250 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:39:52,293-Speed 3201.03 samples/sec Loss 6.3632 Epoch: 1 Global Step: 31300 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:40:08,187-Speed 3221.49 samples/sec Loss 6.3290 Epoch: 1 Global Step: 31350 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:40:24,031-Speed 3231.59 samples/sec Loss 6.3135 Epoch: 1 Global Step: 31400 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:40:40,017-Speed 3202.89 samples/sec Loss 6.3154 Epoch: 1 Global Step: 31450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:40:56,049-Speed 3193.77 samples/sec Loss 6.3308 Epoch: 1 Global Step: 31500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:41:11,896-Speed 3231.05 samples/sec Loss 6.3122 Epoch: 1 Global Step: 31550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:41:27,878-Speed 3203.60 samples/sec Loss 6.3389 Epoch: 1 Global Step: 31600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:41:43,967-Speed 3182.40 samples/sec Loss 6.3010 Epoch: 1 Global Step: 31650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:41:59,788-Speed 3236.38 samples/sec Loss 6.2833 Epoch: 1 Global Step: 31700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:42:15,912-Speed 3175.38 samples/sec Loss 6.3127 Epoch: 1 Global Step: 31750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:42:31,743-Speed 3234.38 samples/sec Loss 6.2837 Epoch: 1 Global Step: 31800 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:42:47,678-Speed 3213.12 samples/sec Loss 6.3057 Epoch: 1 Global Step: 31850 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:43:03,873-Speed 3161.59 samples/sec Loss 6.3216 Epoch: 1 Global Step: 31900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:43:20,813-Speed 3022.50 samples/sec Loss 6.3385 Epoch: 1 Global Step: 31950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:43:36,704-Speed 3221.88 samples/sec Loss 6.2604 Epoch: 1 Global Step: 32000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:44:29,605-[lfw][32000]XNorm: 22.030529 Training: 2021-03-16 03:44:29,605-[lfw][32000]Accuracy-Flip: 0.99583+-0.00261 Training: 2021-03-16 03:44:29,605-[lfw][32000]Accuracy-Highest: 0.99667 Training: 2021-03-16 03:45:31,401-[cfp_fp][32000]XNorm: 19.502031 Training: 2021-03-16 03:45:31,401-[cfp_fp][32000]Accuracy-Flip: 0.95086+-0.00869 Training: 2021-03-16 03:45:31,401-[cfp_fp][32000]Accuracy-Highest: 0.95086 Training: 2021-03-16 03:46:24,307-[agedb_30][32000]XNorm: 21.771178 Training: 2021-03-16 03:46:24,308-[agedb_30][32000]Accuracy-Flip: 0.95800+-0.00930 Training: 2021-03-16 03:46:24,308-[agedb_30][32000]Accuracy-Highest: 0.96133 Training: 2021-03-16 03:46:40,499-Speed 278.57 samples/sec Loss 6.2816 Epoch: 1 Global Step: 32050 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:46:56,489-Speed 3202.02 samples/sec Loss 6.3169 Epoch: 1 Global Step: 32100 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:47:12,341-Speed 3230.06 samples/sec Loss 6.2866 Epoch: 1 Global Step: 32150 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:47:28,530-Speed 3162.71 samples/sec Loss 6.2395 Epoch: 1 Global Step: 32200 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:47:44,491-Speed 3207.73 samples/sec Loss 6.3241 Epoch: 1 Global Step: 32250 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:48:00,381-Speed 3222.27 samples/sec Loss 6.2846 Epoch: 1 Global Step: 32300 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:48:16,334-Speed 3209.48 samples/sec Loss 6.2349 Epoch: 1 Global Step: 32350 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:48:32,474-Speed 3172.46 samples/sec Loss 6.2989 Epoch: 1 Global Step: 32400 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:48:48,362-Speed 3222.63 samples/sec Loss 6.2624 Epoch: 1 Global Step: 32450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:49:04,386-Speed 3195.34 samples/sec Loss 6.3284 Epoch: 1 Global Step: 32500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:49:20,268-Speed 3223.80 samples/sec Loss 6.3205 Epoch: 1 Global Step: 32550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:49:36,116-Speed 3230.89 samples/sec Loss 6.2356 Epoch: 1 Global Step: 32600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:49:52,107-Speed 3201.92 samples/sec Loss 6.2426 Epoch: 1 Global Step: 32650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:50:08,917-Speed 3045.86 samples/sec Loss 6.3265 Epoch: 1 Global Step: 32700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:50:24,887-Speed 3206.06 samples/sec Loss 6.2971 Epoch: 1 Global Step: 32750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:50:41,233-Speed 3132.24 samples/sec Loss 6.2383 Epoch: 1 Global Step: 32800 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:50:58,189-Speed 3019.80 samples/sec Loss 6.2283 Epoch: 1 Global Step: 32850 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:51:14,017-Speed 3234.83 samples/sec Loss 6.2454 Epoch: 1 Global Step: 32900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:51:30,138-Speed 3176.10 samples/sec Loss 6.2337 Epoch: 1 Global Step: 32950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:51:47,155-Speed 3008.75 samples/sec Loss 6.2400 Epoch: 1 Global Step: 33000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:52:04,815-Speed 2899.41 samples/sec Loss 6.2555 Epoch: 1 Global Step: 33050 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:52:20,717-Speed 3219.78 samples/sec Loss 6.2044 Epoch: 1 Global Step: 33100 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:52:37,049-Speed 3135.13 samples/sec Loss 6.2273 Epoch: 1 Global Step: 33150 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:52:52,852-Speed 3239.92 samples/sec Loss 6.2727 Epoch: 1 Global Step: 33200 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:53:08,735-Speed 3223.63 samples/sec Loss 6.2468 Epoch: 1 Global Step: 33250 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:53:24,824-Speed 3182.43 samples/sec Loss 6.2872 Epoch: 1 Global Step: 33300 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:53:41,436-Speed 3082.14 samples/sec Loss 6.2503 Epoch: 1 Global Step: 33350 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:54:12,168-Speed 1666.07 samples/sec Loss 5.9899 Epoch: 2 Global Step: 33400 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:54:28,055-Speed 3222.89 samples/sec Loss 5.5830 Epoch: 2 Global Step: 33450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:54:44,187-Speed 3173.92 samples/sec Loss 5.6527 Epoch: 2 Global Step: 33500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:55:00,181-Speed 3201.20 samples/sec Loss 5.6691 Epoch: 2 Global Step: 33550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:55:15,997-Speed 3237.42 samples/sec Loss 5.6666 Epoch: 2 Global Step: 33600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:55:32,129-Speed 3173.93 samples/sec Loss 5.6900 Epoch: 2 Global Step: 33650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:55:48,292-Speed 3167.82 samples/sec Loss 5.6361 Epoch: 2 Global Step: 33700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:56:04,159-Speed 3226.88 samples/sec Loss 5.7036 Epoch: 2 Global Step: 33750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:56:20,155-Speed 3200.91 samples/sec Loss 5.7277 Epoch: 2 Global Step: 33800 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:56:36,370-Speed 3157.62 samples/sec Loss 5.7469 Epoch: 2 Global Step: 33850 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:56:52,146-Speed 3245.62 samples/sec Loss 5.7964 Epoch: 2 Global Step: 33900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:57:08,141-Speed 3201.13 samples/sec Loss 5.7767 Epoch: 2 Global Step: 33950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:57:24,284-Speed 3171.73 samples/sec Loss 5.7435 Epoch: 2 Global Step: 34000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 03:58:17,497-[lfw][34000]XNorm: 21.312522 Training: 2021-03-16 03:58:17,498-[lfw][34000]Accuracy-Flip: 0.99550+-0.00402 Training: 2021-03-16 03:58:17,498-[lfw][34000]Accuracy-Highest: 0.99667 Training: 2021-03-16 03:59:19,599-[cfp_fp][34000]XNorm: 18.107964 Training: 2021-03-16 03:59:19,599-[cfp_fp][34000]Accuracy-Flip: 0.94743+-0.01134 Training: 2021-03-16 03:59:19,599-[cfp_fp][34000]Accuracy-Highest: 0.95086 Training: 2021-03-16 04:00:13,031-[agedb_30][34000]XNorm: 21.006216 Training: 2021-03-16 04:00:13,031-[agedb_30][34000]Accuracy-Flip: 0.96233+-0.00834 Training: 2021-03-16 04:00:13,031-[agedb_30][34000]Accuracy-Highest: 0.96233 Training: 2021-03-16 04:00:29,995-Speed 275.70 samples/sec Loss 5.8136 Epoch: 2 Global Step: 34050 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:00:45,932-Speed 3212.63 samples/sec Loss 5.7981 Epoch: 2 Global Step: 34100 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:01:01,963-Speed 3194.15 samples/sec Loss 5.7180 Epoch: 2 Global Step: 34150 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:01:17,841-Speed 3224.71 samples/sec Loss 5.8053 Epoch: 2 Global Step: 34200 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:01:33,911-Speed 3186.17 samples/sec Loss 5.8132 Epoch: 2 Global Step: 34250 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:01:49,991-Speed 3184.18 samples/sec Loss 5.8498 Epoch: 2 Global Step: 34300 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:02:05,924-Speed 3213.48 samples/sec Loss 5.8215 Epoch: 2 Global Step: 34350 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:02:21,867-Speed 3211.43 samples/sec Loss 5.8209 Epoch: 2 Global Step: 34400 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:02:38,009-Speed 3172.13 samples/sec Loss 5.9298 Epoch: 2 Global Step: 34450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:02:53,822-Speed 3237.90 samples/sec Loss 5.9172 Epoch: 2 Global Step: 34500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:03:09,834-Speed 3197.55 samples/sec Loss 5.9302 Epoch: 2 Global Step: 34550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:03:26,070-Speed 3153.69 samples/sec Loss 5.9339 Epoch: 2 Global Step: 34600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:03:42,003-Speed 3213.65 samples/sec Loss 5.8805 Epoch: 2 Global Step: 34650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:03:58,190-Speed 3162.96 samples/sec Loss 5.8511 Epoch: 2 Global Step: 34700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:04:14,444-Speed 3150.26 samples/sec Loss 5.9542 Epoch: 2 Global Step: 34750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:04:31,415-Speed 3016.92 samples/sec Loss 5.9115 Epoch: 2 Global Step: 34800 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:04:47,234-Speed 3236.64 samples/sec Loss 5.9424 Epoch: 2 Global Step: 34850 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:05:03,315-Speed 3184.12 samples/sec Loss 5.9620 Epoch: 2 Global Step: 34900 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:05:20,267-Speed 3020.38 samples/sec Loss 5.8668 Epoch: 2 Global Step: 34950 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:05:36,216-Speed 3210.31 samples/sec Loss 5.9752 Epoch: 2 Global Step: 35000 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:05:52,181-Speed 3207.04 samples/sec Loss 5.9019 Epoch: 2 Global Step: 35050 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:06:09,782-Speed 2909.03 samples/sec Loss 5.9714 Epoch: 2 Global Step: 35100 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:06:26,587-Speed 3046.74 samples/sec Loss 5.9763 Epoch: 2 Global Step: 35150 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:06:42,587-Speed 3200.16 samples/sec Loss 5.9115 Epoch: 2 Global Step: 35200 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:06:58,641-Speed 3189.35 samples/sec Loss 5.8979 Epoch: 2 Global Step: 35250 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:07:14,386-Speed 3251.83 samples/sec Loss 5.9449 Epoch: 2 Global Step: 35300 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:07:30,620-Speed 3154.12 samples/sec Loss 5.9770 Epoch: 2 Global Step: 35350 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:07:46,822-Speed 3160.16 samples/sec Loss 6.0003 Epoch: 2 Global Step: 35400 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:08:04,647-Speed 2872.43 samples/sec Loss 5.9860 Epoch: 2 Global Step: 35450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:08:20,680-Speed 3193.59 samples/sec Loss 6.0125 Epoch: 2 Global Step: 35500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:08:36,623-Speed 3211.54 samples/sec Loss 6.0490 Epoch: 2 Global Step: 35550 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:08:52,526-Speed 3219.55 samples/sec Loss 6.0207 Epoch: 2 Global Step: 35600 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:09:08,618-Speed 3181.71 samples/sec Loss 5.9819 Epoch: 2 Global Step: 35650 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:09:24,589-Speed 3205.91 samples/sec Loss 6.0150 Epoch: 2 Global Step: 35700 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:09:40,433-Speed 3231.56 samples/sec Loss 5.9940 Epoch: 2 Global Step: 35750 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:09:56,539-Speed 3179.10 samples/sec Loss 5.9609 Epoch: 2 Global Step: 35800 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:10:12,855-Speed 3138.20 samples/sec Loss 5.9734 Epoch: 2 Global Step: 35850 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:10:28,682-Speed 3234.91 samples/sec Loss 6.0329 Epoch: 2 Global Step: 35900 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:10:45,071-Speed 3124.28 samples/sec Loss 6.0303 Epoch: 2 Global Step: 35950 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:11:01,163-Speed 3181.84 samples/sec Loss 6.0253 Epoch: 2 Global Step: 36000 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:11:54,537-[lfw][36000]XNorm: 22.433038 Training: 2021-03-16 04:11:54,538-[lfw][36000]Accuracy-Flip: 0.99633+-0.00233 Training: 2021-03-16 04:11:54,538-[lfw][36000]Accuracy-Highest: 0.99667 Training: 2021-03-16 04:12:56,605-[cfp_fp][36000]XNorm: 19.160465 Training: 2021-03-16 04:12:56,605-[cfp_fp][36000]Accuracy-Flip: 0.94971+-0.01167 Training: 2021-03-16 04:12:56,605-[cfp_fp][36000]Accuracy-Highest: 0.95086 Training: 2021-03-16 04:13:50,004-[agedb_30][36000]XNorm: 21.654902 Training: 2021-03-16 04:13:50,005-[agedb_30][36000]Accuracy-Flip: 0.96283+-0.00989 Training: 2021-03-16 04:13:50,005-[agedb_30][36000]Accuracy-Highest: 0.96283 Training: 2021-03-16 04:14:05,805-Speed 277.29 samples/sec Loss 5.9872 Epoch: 2 Global Step: 36050 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:14:21,600-Speed 3241.64 samples/sec Loss 6.0176 Epoch: 2 Global Step: 36100 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:14:37,936-Speed 3134.32 samples/sec Loss 6.0114 Epoch: 2 Global Step: 36150 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:14:54,777-Speed 3040.24 samples/sec Loss 6.0955 Epoch: 2 Global Step: 36200 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:15:10,787-Speed 3198.16 samples/sec Loss 6.0585 Epoch: 2 Global Step: 36250 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:15:26,994-Speed 3159.15 samples/sec Loss 6.0803 Epoch: 2 Global Step: 36300 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:15:42,800-Speed 3239.47 samples/sec Loss 6.0290 Epoch: 2 Global Step: 36350 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:15:58,764-Speed 3207.19 samples/sec Loss 6.0102 Epoch: 2 Global Step: 36400 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:16:14,759-Speed 3201.24 samples/sec Loss 6.0586 Epoch: 2 Global Step: 36450 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:16:30,616-Speed 3228.96 samples/sec Loss 6.0652 Epoch: 2 Global Step: 36500 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:16:46,477-Speed 3228.03 samples/sec Loss 5.9971 Epoch: 2 Global Step: 36550 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:17:02,515-Speed 3192.59 samples/sec Loss 6.0112 Epoch: 2 Global Step: 36600 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:17:18,401-Speed 3223.03 samples/sec Loss 6.0789 Epoch: 2 Global Step: 36650 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:17:34,229-Speed 3234.82 samples/sec Loss 6.0840 Epoch: 2 Global Step: 36700 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:17:50,067-Speed 3232.77 samples/sec Loss 6.1055 Epoch: 2 Global Step: 36750 Fp16 Grad Scale: 16384 Required: 34 hours Training: 2021-03-16 04:18:06,104-Speed 3192.89 samples/sec Loss 6.0690 Epoch: 2 Global Step: 36800 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:18:21,890-Speed 3243.34 samples/sec Loss 6.0248 Epoch: 2 Global Step: 36850 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:18:37,828-Speed 3212.67 samples/sec Loss 6.0837 Epoch: 2 Global Step: 36900 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:18:53,934-Speed 3179.03 samples/sec Loss 6.0945 Epoch: 2 Global Step: 36950 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:19:11,699-Speed 2882.18 samples/sec Loss 6.0325 Epoch: 2 Global Step: 37000 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:19:27,751-Speed 3189.59 samples/sec Loss 6.0570 Epoch: 2 Global Step: 37050 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:19:43,607-Speed 3229.32 samples/sec Loss 6.0498 Epoch: 2 Global Step: 37100 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:20:00,261-Speed 3074.35 samples/sec Loss 6.0524 Epoch: 2 Global Step: 37150 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:20:17,247-Speed 3014.39 samples/sec Loss 6.1479 Epoch: 2 Global Step: 37200 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:20:34,079-Speed 3041.78 samples/sec Loss 6.0325 Epoch: 2 Global Step: 37250 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:20:50,123-Speed 3191.48 samples/sec Loss 6.0700 Epoch: 2 Global Step: 37300 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:21:05,869-Speed 3251.59 samples/sec Loss 6.0412 Epoch: 2 Global Step: 37350 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:21:21,880-Speed 3197.87 samples/sec Loss 6.0614 Epoch: 2 Global Step: 37400 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:21:37,956-Speed 3184.97 samples/sec Loss 6.0958 Epoch: 2 Global Step: 37450 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:21:53,915-Speed 3208.38 samples/sec Loss 6.1657 Epoch: 2 Global Step: 37500 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:22:10,848-Speed 3023.81 samples/sec Loss 6.1031 Epoch: 2 Global Step: 37550 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:22:26,554-Speed 3260.00 samples/sec Loss 6.0909 Epoch: 2 Global Step: 37600 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:22:43,190-Speed 3077.75 samples/sec Loss 6.0393 Epoch: 2 Global Step: 37650 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:22:59,493-Speed 3140.67 samples/sec Loss 6.0564 Epoch: 2 Global Step: 37700 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:23:15,332-Speed 3232.57 samples/sec Loss 6.0900 Epoch: 2 Global Step: 37750 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:23:31,252-Speed 3216.21 samples/sec Loss 6.1030 Epoch: 2 Global Step: 37800 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:23:47,491-Speed 3153.09 samples/sec Loss 5.9934 Epoch: 2 Global Step: 37850 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:24:03,180-Speed 3263.52 samples/sec Loss 6.0970 Epoch: 2 Global Step: 37900 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:24:19,516-Speed 3134.13 samples/sec Loss 6.0468 Epoch: 2 Global Step: 37950 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:24:35,484-Speed 3206.65 samples/sec Loss 6.0591 Epoch: 2 Global Step: 38000 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:25:28,831-[lfw][38000]XNorm: 22.576313 Training: 2021-03-16 04:25:28,831-[lfw][38000]Accuracy-Flip: 0.99650+-0.00217 Training: 2021-03-16 04:25:28,831-[lfw][38000]Accuracy-Highest: 0.99667 Training: 2021-03-16 04:26:30,647-[cfp_fp][38000]XNorm: 19.365241 Training: 2021-03-16 04:26:30,647-[cfp_fp][38000]Accuracy-Flip: 0.94743+-0.00932 Training: 2021-03-16 04:26:30,647-[cfp_fp][38000]Accuracy-Highest: 0.95086 Training: 2021-03-16 04:27:23,788-[agedb_30][38000]XNorm: 22.059764 Training: 2021-03-16 04:27:23,788-[agedb_30][38000]Accuracy-Flip: 0.96083+-0.00638 Training: 2021-03-16 04:27:23,788-[agedb_30][38000]Accuracy-Highest: 0.96283 Training: 2021-03-16 04:27:39,493-Speed 278.25 samples/sec Loss 6.0341 Epoch: 2 Global Step: 38050 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:27:55,507-Speed 3197.14 samples/sec Loss 6.0968 Epoch: 2 Global Step: 38100 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:28:11,637-Speed 3174.40 samples/sec Loss 6.0639 Epoch: 2 Global Step: 38150 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:28:27,421-Speed 3243.92 samples/sec Loss 6.0827 Epoch: 2 Global Step: 38200 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:28:43,439-Speed 3196.55 samples/sec Loss 6.0599 Epoch: 2 Global Step: 38250 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:29:00,302-Speed 3036.16 samples/sec Loss 6.1112 Epoch: 2 Global Step: 38300 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:29:16,172-Speed 3226.45 samples/sec Loss 6.1195 Epoch: 2 Global Step: 38350 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:29:32,264-Speed 3181.79 samples/sec Loss 6.1188 Epoch: 2 Global Step: 38400 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:29:48,237-Speed 3205.39 samples/sec Loss 6.1179 Epoch: 2 Global Step: 38450 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:30:04,118-Speed 3224.13 samples/sec Loss 6.0883 Epoch: 2 Global Step: 38500 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:30:19,925-Speed 3239.29 samples/sec Loss 6.0711 Epoch: 2 Global Step: 38550 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:30:36,308-Speed 3125.24 samples/sec Loss 6.0774 Epoch: 2 Global Step: 38600 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:30:52,181-Speed 3225.62 samples/sec Loss 6.0663 Epoch: 2 Global Step: 38650 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:31:08,309-Speed 3174.68 samples/sec Loss 6.1399 Epoch: 2 Global Step: 38700 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:31:24,455-Speed 3171.23 samples/sec Loss 6.1335 Epoch: 2 Global Step: 38750 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:31:40,294-Speed 3232.65 samples/sec Loss 6.0914 Epoch: 2 Global Step: 38800 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:31:56,303-Speed 3198.28 samples/sec Loss 6.1101 Epoch: 2 Global Step: 38850 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:32:12,535-Speed 3154.42 samples/sec Loss 6.0437 Epoch: 2 Global Step: 38900 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:32:28,519-Speed 3203.21 samples/sec Loss 6.0522 Epoch: 2 Global Step: 38950 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:32:44,330-Speed 3238.28 samples/sec Loss 6.1114 Epoch: 2 Global Step: 39000 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:33:00,308-Speed 3204.55 samples/sec Loss 6.1038 Epoch: 2 Global Step: 39050 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:33:18,631-Speed 2794.34 samples/sec Loss 6.1129 Epoch: 2 Global Step: 39100 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:33:34,740-Speed 3178.50 samples/sec Loss 6.0432 Epoch: 2 Global Step: 39150 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:33:50,918-Speed 3164.97 samples/sec Loss 6.0836 Epoch: 2 Global Step: 39200 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:34:07,584-Speed 3072.03 samples/sec Loss 6.1452 Epoch: 2 Global Step: 39250 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:34:25,281-Speed 2893.30 samples/sec Loss 6.1016 Epoch: 2 Global Step: 39300 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:34:41,194-Speed 3217.50 samples/sec Loss 6.1498 Epoch: 2 Global Step: 39350 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:34:57,235-Speed 3191.94 samples/sec Loss 6.0960 Epoch: 2 Global Step: 39400 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:35:13,085-Speed 3230.39 samples/sec Loss 6.1293 Epoch: 2 Global Step: 39450 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:35:29,069-Speed 3203.32 samples/sec Loss 6.0726 Epoch: 2 Global Step: 39500 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:35:45,199-Speed 3174.33 samples/sec Loss 6.0556 Epoch: 2 Global Step: 39550 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:36:01,143-Speed 3211.35 samples/sec Loss 6.0785 Epoch: 2 Global Step: 39600 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:36:17,971-Speed 3042.55 samples/sec Loss 6.1200 Epoch: 2 Global Step: 39650 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:36:33,920-Speed 3210.35 samples/sec Loss 6.0159 Epoch: 2 Global Step: 39700 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:36:50,624-Speed 3065.18 samples/sec Loss 6.0825 Epoch: 2 Global Step: 39750 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:37:06,847-Speed 3156.30 samples/sec Loss 6.0446 Epoch: 2 Global Step: 39800 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:37:22,756-Speed 3218.34 samples/sec Loss 6.0934 Epoch: 2 Global Step: 39850 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:37:38,574-Speed 3236.93 samples/sec Loss 6.1019 Epoch: 2 Global Step: 39900 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:37:54,649-Speed 3185.10 samples/sec Loss 6.1085 Epoch: 2 Global Step: 39950 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:38:10,734-Speed 3183.13 samples/sec Loss 6.0814 Epoch: 2 Global Step: 40000 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:39:03,889-[lfw][40000]XNorm: 22.461830 Training: 2021-03-16 04:39:03,889-[lfw][40000]Accuracy-Flip: 0.99633+-0.00306 Training: 2021-03-16 04:39:03,889-[lfw][40000]Accuracy-Highest: 0.99667 Training: 2021-03-16 04:40:05,498-[cfp_fp][40000]XNorm: 19.264059 Training: 2021-03-16 04:40:05,498-[cfp_fp][40000]Accuracy-Flip: 0.93757+-0.01343 Training: 2021-03-16 04:40:05,498-[cfp_fp][40000]Accuracy-Highest: 0.95086 Training: 2021-03-16 04:40:58,496-[agedb_30][40000]XNorm: 21.627791 Training: 2021-03-16 04:40:58,497-[agedb_30][40000]Accuracy-Flip: 0.95933+-0.01236 Training: 2021-03-16 04:40:58,497-[agedb_30][40000]Accuracy-Highest: 0.96283 Training: 2021-03-16 04:41:14,266-Speed 278.97 samples/sec Loss 6.1020 Epoch: 2 Global Step: 40050 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:41:30,397-Speed 3174.14 samples/sec Loss 6.1330 Epoch: 2 Global Step: 40100 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:41:46,262-Speed 3227.43 samples/sec Loss 6.0654 Epoch: 2 Global Step: 40150 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:42:02,162-Speed 3220.14 samples/sec Loss 6.0585 Epoch: 2 Global Step: 40200 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:42:18,076-Speed 3217.36 samples/sec Loss 6.0948 Epoch: 2 Global Step: 40250 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:42:34,315-Speed 3153.05 samples/sec Loss 6.0372 Epoch: 2 Global Step: 40300 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:42:49,969-Speed 3270.77 samples/sec Loss 6.0636 Epoch: 2 Global Step: 40350 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:43:06,090-Speed 3176.19 samples/sec Loss 6.0935 Epoch: 2 Global Step: 40400 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:43:22,067-Speed 3204.70 samples/sec Loss 6.0628 Epoch: 2 Global Step: 40450 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:43:38,716-Speed 3075.27 samples/sec Loss 6.1125 Epoch: 2 Global Step: 40500 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:43:55,048-Speed 3135.12 samples/sec Loss 6.0740 Epoch: 2 Global Step: 40550 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:44:11,245-Speed 3161.16 samples/sec Loss 6.0846 Epoch: 2 Global Step: 40600 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:44:27,093-Speed 3230.78 samples/sec Loss 6.1368 Epoch: 2 Global Step: 40650 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:44:43,299-Speed 3159.34 samples/sec Loss 6.0801 Epoch: 2 Global Step: 40700 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:44:59,337-Speed 3192.48 samples/sec Loss 6.1127 Epoch: 2 Global Step: 40750 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:45:15,362-Speed 3195.31 samples/sec Loss 6.0700 Epoch: 2 Global Step: 40800 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:45:31,486-Speed 3175.37 samples/sec Loss 6.1191 Epoch: 2 Global Step: 40850 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:45:47,598-Speed 3177.91 samples/sec Loss 6.0827 Epoch: 2 Global Step: 40900 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:46:03,387-Speed 3242.89 samples/sec Loss 6.1063 Epoch: 2 Global Step: 40950 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:46:19,397-Speed 3198.04 samples/sec Loss 6.0812 Epoch: 2 Global Step: 41000 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:46:35,486-Speed 3182.30 samples/sec Loss 6.0569 Epoch: 2 Global Step: 41050 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:46:51,490-Speed 3199.36 samples/sec Loss 6.0847 Epoch: 2 Global Step: 41100 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:47:07,601-Speed 3178.13 samples/sec Loss 6.0412 Epoch: 2 Global Step: 41150 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:47:25,901-Speed 2797.88 samples/sec Loss 6.0915 Epoch: 2 Global Step: 41200 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:47:41,799-Speed 3220.55 samples/sec Loss 6.1282 Epoch: 2 Global Step: 41250 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:47:57,744-Speed 3211.12 samples/sec Loss 6.1243 Epoch: 2 Global Step: 41300 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:48:14,480-Speed 3059.41 samples/sec Loss 6.0520 Epoch: 2 Global Step: 41350 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:48:32,104-Speed 2905.13 samples/sec Loss 6.1187 Epoch: 2 Global Step: 41400 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:48:48,115-Speed 3198.00 samples/sec Loss 6.0404 Epoch: 2 Global Step: 41450 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:49:04,356-Speed 3152.50 samples/sec Loss 6.0703 Epoch: 2 Global Step: 41500 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:49:20,315-Speed 3208.47 samples/sec Loss 6.0955 Epoch: 2 Global Step: 41550 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:49:36,214-Speed 3220.30 samples/sec Loss 6.1095 Epoch: 2 Global Step: 41600 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:49:52,280-Speed 3187.01 samples/sec Loss 6.0734 Epoch: 2 Global Step: 41650 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:50:08,477-Speed 3161.20 samples/sec Loss 6.1275 Epoch: 2 Global Step: 41700 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:50:25,286-Speed 3046.08 samples/sec Loss 6.1036 Epoch: 2 Global Step: 41750 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:50:41,437-Speed 3170.20 samples/sec Loss 6.0580 Epoch: 2 Global Step: 41800 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:50:57,600-Speed 3167.83 samples/sec Loss 6.0252 Epoch: 2 Global Step: 41850 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:51:13,483-Speed 3223.63 samples/sec Loss 6.1081 Epoch: 2 Global Step: 41900 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:51:30,398-Speed 3026.92 samples/sec Loss 6.1138 Epoch: 2 Global Step: 41950 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:51:46,419-Speed 3195.84 samples/sec Loss 6.0545 Epoch: 2 Global Step: 42000 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:52:39,351-[lfw][42000]XNorm: 21.930508 Training: 2021-03-16 04:52:39,352-[lfw][42000]Accuracy-Flip: 0.99633+-0.00245 Training: 2021-03-16 04:52:39,352-[lfw][42000]Accuracy-Highest: 0.99667 Training: 2021-03-16 04:53:41,235-[cfp_fp][42000]XNorm: 18.486817 Training: 2021-03-16 04:53:41,235-[cfp_fp][42000]Accuracy-Flip: 0.92886+-0.01165 Training: 2021-03-16 04:53:41,235-[cfp_fp][42000]Accuracy-Highest: 0.95086 Training: 2021-03-16 04:54:34,864-[agedb_30][42000]XNorm: 21.090036 Training: 2021-03-16 04:54:34,865-[agedb_30][42000]Accuracy-Flip: 0.96200+-0.00829 Training: 2021-03-16 04:54:34,865-[agedb_30][42000]Accuracy-Highest: 0.96283 Training: 2021-03-16 04:54:50,536-Speed 278.09 samples/sec Loss 6.0469 Epoch: 2 Global Step: 42050 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:55:06,497-Speed 3207.84 samples/sec Loss 6.1158 Epoch: 2 Global Step: 42100 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:55:22,480-Speed 3203.58 samples/sec Loss 6.1286 Epoch: 2 Global Step: 42150 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:55:38,311-Speed 3234.33 samples/sec Loss 6.0749 Epoch: 2 Global Step: 42200 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:55:54,414-Speed 3179.56 samples/sec Loss 6.0511 Epoch: 2 Global Step: 42250 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:56:10,248-Speed 3233.61 samples/sec Loss 6.0963 Epoch: 2 Global Step: 42300 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:56:25,999-Speed 3250.73 samples/sec Loss 6.0411 Epoch: 2 Global Step: 42350 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:56:42,473-Speed 3107.93 samples/sec Loss 6.1182 Epoch: 2 Global Step: 42400 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:56:58,373-Speed 3220.22 samples/sec Loss 6.0702 Epoch: 2 Global Step: 42450 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:57:14,213-Speed 3232.42 samples/sec Loss 5.9896 Epoch: 2 Global Step: 42500 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:57:30,442-Speed 3155.10 samples/sec Loss 6.0592 Epoch: 2 Global Step: 42550 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:57:46,282-Speed 3232.43 samples/sec Loss 6.0646 Epoch: 2 Global Step: 42600 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:58:02,047-Speed 3247.64 samples/sec Loss 6.1031 Epoch: 2 Global Step: 42650 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:58:18,865-Speed 3044.63 samples/sec Loss 6.0810 Epoch: 2 Global Step: 42700 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:58:34,828-Speed 3207.44 samples/sec Loss 6.0433 Epoch: 2 Global Step: 42750 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:58:50,509-Speed 3265.16 samples/sec Loss 6.0574 Epoch: 2 Global Step: 42800 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:59:06,488-Speed 3204.37 samples/sec Loss 6.0508 Epoch: 2 Global Step: 42850 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:59:22,608-Speed 3176.15 samples/sec Loss 6.0641 Epoch: 2 Global Step: 42900 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:59:38,466-Speed 3228.76 samples/sec Loss 6.0345 Epoch: 2 Global Step: 42950 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 04:59:54,535-Speed 3186.47 samples/sec Loss 6.0887 Epoch: 2 Global Step: 43000 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:00:10,310-Speed 3245.76 samples/sec Loss 6.1255 Epoch: 2 Global Step: 43050 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:00:26,041-Speed 3254.66 samples/sec Loss 6.0370 Epoch: 2 Global Step: 43100 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:00:42,154-Speed 3177.71 samples/sec Loss 6.0662 Epoch: 2 Global Step: 43150 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:00:58,124-Speed 3206.22 samples/sec Loss 6.0560 Epoch: 2 Global Step: 43200 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:01:13,972-Speed 3230.77 samples/sec Loss 6.0910 Epoch: 2 Global Step: 43250 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:01:29,942-Speed 3206.04 samples/sec Loss 6.0335 Epoch: 2 Global Step: 43300 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:01:46,859-Speed 3026.54 samples/sec Loss 6.1131 Epoch: 2 Global Step: 43350 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:02:03,516-Speed 3073.93 samples/sec Loss 6.0851 Epoch: 2 Global Step: 43400 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:02:20,462-Speed 3021.48 samples/sec Loss 6.0593 Epoch: 2 Global Step: 43450 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:02:38,177-Speed 2890.28 samples/sec Loss 6.0635 Epoch: 2 Global Step: 43500 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:02:54,113-Speed 3212.99 samples/sec Loss 6.0830 Epoch: 2 Global Step: 43550 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:03:10,414-Speed 3140.90 samples/sec Loss 6.0507 Epoch: 2 Global Step: 43600 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:03:26,619-Speed 3159.77 samples/sec Loss 6.0637 Epoch: 2 Global Step: 43650 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:03:42,584-Speed 3207.00 samples/sec Loss 6.0977 Epoch: 2 Global Step: 43700 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:03:58,523-Speed 3212.27 samples/sec Loss 6.0053 Epoch: 2 Global Step: 43750 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:04:14,360-Speed 3233.05 samples/sec Loss 6.0519 Epoch: 2 Global Step: 43800 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:04:30,332-Speed 3205.79 samples/sec Loss 6.0564 Epoch: 2 Global Step: 43850 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:04:46,399-Speed 3186.71 samples/sec Loss 6.0978 Epoch: 2 Global Step: 43900 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:05:03,037-Speed 3077.44 samples/sec Loss 6.0556 Epoch: 2 Global Step: 43950 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:05:18,975-Speed 3212.51 samples/sec Loss 5.9966 Epoch: 2 Global Step: 44000 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:06:12,246-[lfw][44000]XNorm: 22.313184 Training: 2021-03-16 05:06:12,247-[lfw][44000]Accuracy-Flip: 0.99567+-0.00327 Training: 2021-03-16 05:06:12,247-[lfw][44000]Accuracy-Highest: 0.99667 Training: 2021-03-16 05:07:14,178-[cfp_fp][44000]XNorm: 18.868175 Training: 2021-03-16 05:07:14,178-[cfp_fp][44000]Accuracy-Flip: 0.95329+-0.01069 Training: 2021-03-16 05:07:14,178-[cfp_fp][44000]Accuracy-Highest: 0.95329 Training: 2021-03-16 05:08:07,488-[agedb_30][44000]XNorm: 21.454803 Training: 2021-03-16 05:08:07,488-[agedb_30][44000]Accuracy-Flip: 0.95850+-0.01253 Training: 2021-03-16 05:08:07,489-[agedb_30][44000]Accuracy-Highest: 0.96283 Training: 2021-03-16 05:08:23,266-Speed 277.82 samples/sec Loss 6.0266 Epoch: 2 Global Step: 44050 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:08:40,005-Speed 3058.88 samples/sec Loss 6.0525 Epoch: 2 Global Step: 44100 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:08:55,943-Speed 3212.40 samples/sec Loss 6.0718 Epoch: 2 Global Step: 44150 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:09:12,003-Speed 3188.26 samples/sec Loss 6.0098 Epoch: 2 Global Step: 44200 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:09:27,931-Speed 3214.62 samples/sec Loss 6.0247 Epoch: 2 Global Step: 44250 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:09:43,783-Speed 3229.84 samples/sec Loss 6.0871 Epoch: 2 Global Step: 44300 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:09:59,629-Speed 3231.27 samples/sec Loss 6.0816 Epoch: 2 Global Step: 44350 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:10:15,839-Speed 3158.71 samples/sec Loss 6.0298 Epoch: 2 Global Step: 44400 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:10:31,774-Speed 3213.09 samples/sec Loss 6.0717 Epoch: 2 Global Step: 44450 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:10:47,656-Speed 3223.86 samples/sec Loss 6.0632 Epoch: 2 Global Step: 44500 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:11:03,739-Speed 3183.59 samples/sec Loss 6.0795 Epoch: 2 Global Step: 44550 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:11:19,845-Speed 3179.09 samples/sec Loss 6.0669 Epoch: 2 Global Step: 44600 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:11:35,750-Speed 3219.20 samples/sec Loss 6.0334 Epoch: 2 Global Step: 44650 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:11:51,654-Speed 3219.39 samples/sec Loss 6.0402 Epoch: 2 Global Step: 44700 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:12:07,497-Speed 3231.75 samples/sec Loss 6.0300 Epoch: 2 Global Step: 44750 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:12:23,603-Speed 3179.00 samples/sec Loss 6.0042 Epoch: 2 Global Step: 44800 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:12:40,518-Speed 3027.12 samples/sec Loss 6.0486 Epoch: 2 Global Step: 44850 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:12:56,375-Speed 3228.80 samples/sec Loss 6.0187 Epoch: 2 Global Step: 44900 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:13:12,243-Speed 3226.66 samples/sec Loss 6.0195 Epoch: 2 Global Step: 44950 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:13:28,387-Speed 3171.57 samples/sec Loss 6.0703 Epoch: 2 Global Step: 45000 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:13:44,359-Speed 3205.85 samples/sec Loss 6.0097 Epoch: 2 Global Step: 45050 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:14:00,516-Speed 3168.92 samples/sec Loss 5.9745 Epoch: 2 Global Step: 45100 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:14:16,588-Speed 3185.72 samples/sec Loss 6.0241 Epoch: 2 Global Step: 45150 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:14:32,362-Speed 3245.97 samples/sec Loss 6.0439 Epoch: 2 Global Step: 45200 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:14:48,190-Speed 3234.93 samples/sec Loss 6.0412 Epoch: 2 Global Step: 45250 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:15:04,433-Speed 3152.16 samples/sec Loss 5.9925 Epoch: 2 Global Step: 45300 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:15:20,497-Speed 3187.31 samples/sec Loss 6.0075 Epoch: 2 Global Step: 45350 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:15:36,410-Speed 3217.82 samples/sec Loss 6.0553 Epoch: 2 Global Step: 45400 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:15:53,467-Speed 3001.71 samples/sec Loss 6.0208 Epoch: 2 Global Step: 45450 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:16:10,074-Speed 3083.19 samples/sec Loss 6.0106 Epoch: 2 Global Step: 45500 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:16:26,791-Speed 3062.76 samples/sec Loss 6.0729 Epoch: 2 Global Step: 45550 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:16:43,864-Speed 2998.94 samples/sec Loss 6.0393 Epoch: 2 Global Step: 45600 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:17:00,120-Speed 3149.74 samples/sec Loss 5.9921 Epoch: 2 Global Step: 45650 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:17:15,831-Speed 3258.90 samples/sec Loss 6.0499 Epoch: 2 Global Step: 45700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:17:31,762-Speed 3213.94 samples/sec Loss 5.9807 Epoch: 2 Global Step: 45750 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:17:47,892-Speed 3174.32 samples/sec Loss 6.0039 Epoch: 2 Global Step: 45800 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:18:03,942-Speed 3190.26 samples/sec Loss 6.0456 Epoch: 2 Global Step: 45850 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:18:20,026-Speed 3183.39 samples/sec Loss 6.0150 Epoch: 2 Global Step: 45900 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:18:36,200-Speed 3165.68 samples/sec Loss 6.1005 Epoch: 2 Global Step: 45950 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:18:52,113-Speed 3217.42 samples/sec Loss 6.0818 Epoch: 2 Global Step: 46000 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:19:45,140-[lfw][46000]XNorm: 23.994245 Training: 2021-03-16 05:19:45,141-[lfw][46000]Accuracy-Flip: 0.99700+-0.00306 Training: 2021-03-16 05:19:45,141-[lfw][46000]Accuracy-Highest: 0.99700 Training: 2021-03-16 05:20:47,094-[cfp_fp][46000]XNorm: 20.412917 Training: 2021-03-16 05:20:47,095-[cfp_fp][46000]Accuracy-Flip: 0.93857+-0.01181 Training: 2021-03-16 05:20:47,095-[cfp_fp][46000]Accuracy-Highest: 0.95329 Training: 2021-03-16 05:21:40,024-[agedb_30][46000]XNorm: 23.182585 Training: 2021-03-16 05:21:40,025-[agedb_30][46000]Accuracy-Flip: 0.95667+-0.01167 Training: 2021-03-16 05:21:40,025-[agedb_30][46000]Accuracy-Highest: 0.96283 Training: 2021-03-16 05:21:56,032-Speed 278.38 samples/sec Loss 6.0026 Epoch: 2 Global Step: 46050 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:22:12,050-Speed 3196.49 samples/sec Loss 5.9834 Epoch: 2 Global Step: 46100 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:22:28,849-Speed 3048.00 samples/sec Loss 5.9662 Epoch: 2 Global Step: 46150 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:22:45,940-Speed 2995.73 samples/sec Loss 6.0442 Epoch: 2 Global Step: 46200 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:23:01,722-Speed 3244.46 samples/sec Loss 6.0220 Epoch: 2 Global Step: 46250 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:23:17,678-Speed 3208.87 samples/sec Loss 6.0023 Epoch: 2 Global Step: 46300 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:23:33,891-Speed 3157.99 samples/sec Loss 5.9919 Epoch: 2 Global Step: 46350 Fp16 Grad Scale: 16384 Required: 33 hours Training: 2021-03-16 05:23:49,793-Speed 3219.89 samples/sec Loss 5.9461 Epoch: 2 Global Step: 46400 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:24:05,731-Speed 3212.44 samples/sec Loss 6.0086 Epoch: 2 Global Step: 46450 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:24:22,004-Speed 3146.49 samples/sec Loss 5.9915 Epoch: 2 Global Step: 46500 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:24:38,056-Speed 3189.72 samples/sec Loss 6.0322 Epoch: 2 Global Step: 46550 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:24:53,845-Speed 3242.91 samples/sec Loss 6.0209 Epoch: 2 Global Step: 46600 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:25:09,557-Speed 3258.80 samples/sec Loss 6.0377 Epoch: 2 Global Step: 46650 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:25:25,532-Speed 3205.03 samples/sec Loss 6.0035 Epoch: 2 Global Step: 46700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:25:41,440-Speed 3218.66 samples/sec Loss 5.9526 Epoch: 2 Global Step: 46750 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:25:57,432-Speed 3201.67 samples/sec Loss 6.0592 Epoch: 2 Global Step: 46800 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:26:13,627-Speed 3161.60 samples/sec Loss 5.9939 Epoch: 2 Global Step: 46850 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:26:29,407-Speed 3244.58 samples/sec Loss 5.9389 Epoch: 2 Global Step: 46900 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:26:45,418-Speed 3197.83 samples/sec Loss 5.9946 Epoch: 2 Global Step: 46950 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:27:02,507-Speed 2996.15 samples/sec Loss 5.9968 Epoch: 2 Global Step: 47000 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:27:18,580-Speed 3185.74 samples/sec Loss 5.9962 Epoch: 2 Global Step: 47050 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:27:34,532-Speed 3209.79 samples/sec Loss 6.0055 Epoch: 2 Global Step: 47100 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:27:50,258-Speed 3255.74 samples/sec Loss 5.9858 Epoch: 2 Global Step: 47150 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:28:06,178-Speed 3216.22 samples/sec Loss 5.9881 Epoch: 2 Global Step: 47200 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:28:22,232-Speed 3189.29 samples/sec Loss 6.0426 Epoch: 2 Global Step: 47250 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:28:38,272-Speed 3192.17 samples/sec Loss 6.0483 Epoch: 2 Global Step: 47300 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:28:54,267-Speed 3201.09 samples/sec Loss 6.0301 Epoch: 2 Global Step: 47350 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:29:10,184-Speed 3216.82 samples/sec Loss 6.0260 Epoch: 2 Global Step: 47400 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:29:26,386-Speed 3160.12 samples/sec Loss 5.9894 Epoch: 2 Global Step: 47450 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:29:42,363-Speed 3204.65 samples/sec Loss 6.0004 Epoch: 2 Global Step: 47500 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:29:59,224-Speed 3036.77 samples/sec Loss 6.0076 Epoch: 2 Global Step: 47550 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:30:16,331-Speed 2993.07 samples/sec Loss 5.9987 Epoch: 2 Global Step: 47600 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:30:32,209-Speed 3224.62 samples/sec Loss 6.0201 Epoch: 2 Global Step: 47650 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:30:49,033-Speed 3043.48 samples/sec Loss 6.0063 Epoch: 2 Global Step: 47700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:31:04,874-Speed 3232.07 samples/sec Loss 6.0187 Epoch: 2 Global Step: 47750 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:31:21,717-Speed 3039.99 samples/sec Loss 6.0659 Epoch: 2 Global Step: 47800 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:31:37,703-Speed 3202.99 samples/sec Loss 6.0125 Epoch: 2 Global Step: 47850 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:31:53,689-Speed 3202.88 samples/sec Loss 6.0136 Epoch: 2 Global Step: 47900 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:32:09,748-Speed 3188.26 samples/sec Loss 5.9983 Epoch: 2 Global Step: 47950 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:32:25,612-Speed 3227.61 samples/sec Loss 5.9157 Epoch: 2 Global Step: 48000 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:33:18,713-[lfw][48000]XNorm: 22.468912 Training: 2021-03-16 05:33:18,714-[lfw][48000]Accuracy-Flip: 0.99450+-0.00366 Training: 2021-03-16 05:33:18,714-[lfw][48000]Accuracy-Highest: 0.99700 Training: 2021-03-16 05:34:20,725-[cfp_fp][48000]XNorm: 19.894665 Training: 2021-03-16 05:34:20,726-[cfp_fp][48000]Accuracy-Flip: 0.95014+-0.00984 Training: 2021-03-16 05:34:20,726-[cfp_fp][48000]Accuracy-Highest: 0.95329 Training: 2021-03-16 05:35:13,663-[agedb_30][48000]XNorm: 21.823299 Training: 2021-03-16 05:35:13,663-[agedb_30][48000]Accuracy-Flip: 0.95950+-0.00978 Training: 2021-03-16 05:35:13,663-[agedb_30][48000]Accuracy-Highest: 0.96283 Training: 2021-03-16 05:35:29,995-Speed 277.68 samples/sec Loss 5.9465 Epoch: 2 Global Step: 48050 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:35:46,075-Speed 3184.11 samples/sec Loss 5.9979 Epoch: 2 Global Step: 48100 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:36:01,900-Speed 3235.51 samples/sec Loss 6.0115 Epoch: 2 Global Step: 48150 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:36:17,806-Speed 3219.11 samples/sec Loss 6.0463 Epoch: 2 Global Step: 48200 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:36:33,790-Speed 3203.33 samples/sec Loss 5.9674 Epoch: 2 Global Step: 48250 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:36:50,628-Speed 3040.77 samples/sec Loss 6.0128 Epoch: 2 Global Step: 48300 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:37:07,627-Speed 3012.10 samples/sec Loss 5.9854 Epoch: 2 Global Step: 48350 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:37:23,774-Speed 3170.93 samples/sec Loss 5.9613 Epoch: 2 Global Step: 48400 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:37:39,559-Speed 3243.70 samples/sec Loss 5.9711 Epoch: 2 Global Step: 48450 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:37:55,436-Speed 3224.90 samples/sec Loss 5.9992 Epoch: 2 Global Step: 48500 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:38:11,301-Speed 3227.34 samples/sec Loss 5.9726 Epoch: 2 Global Step: 48550 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:38:27,001-Speed 3261.25 samples/sec Loss 5.9868 Epoch: 2 Global Step: 48600 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:38:42,837-Speed 3233.14 samples/sec Loss 6.0351 Epoch: 2 Global Step: 48650 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:38:58,761-Speed 3215.43 samples/sec Loss 5.9869 Epoch: 2 Global Step: 48700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:39:14,577-Speed 3237.30 samples/sec Loss 5.9758 Epoch: 2 Global Step: 48750 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:39:30,509-Speed 3213.77 samples/sec Loss 5.9530 Epoch: 2 Global Step: 48800 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:39:46,718-Speed 3158.82 samples/sec Loss 5.9807 Epoch: 2 Global Step: 48850 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:40:02,528-Speed 3238.68 samples/sec Loss 5.9857 Epoch: 2 Global Step: 48900 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:40:18,639-Speed 3178.08 samples/sec Loss 6.0032 Epoch: 2 Global Step: 48950 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:40:34,526-Speed 3222.76 samples/sec Loss 5.9913 Epoch: 2 Global Step: 49000 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:40:50,237-Speed 3259.02 samples/sec Loss 5.9836 Epoch: 2 Global Step: 49050 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:41:06,274-Speed 3192.68 samples/sec Loss 5.9596 Epoch: 2 Global Step: 49100 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:41:22,167-Speed 3221.69 samples/sec Loss 6.0226 Epoch: 2 Global Step: 49150 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:41:38,846-Speed 3069.67 samples/sec Loss 6.0126 Epoch: 2 Global Step: 49200 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:41:54,762-Speed 3217.06 samples/sec Loss 5.9855 Epoch: 2 Global Step: 49250 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:42:10,488-Speed 3255.93 samples/sec Loss 6.0093 Epoch: 2 Global Step: 49300 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:42:26,350-Speed 3227.97 samples/sec Loss 5.9554 Epoch: 2 Global Step: 49350 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:42:42,417-Speed 3186.60 samples/sec Loss 6.0084 Epoch: 2 Global Step: 49400 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:42:58,549-Speed 3174.00 samples/sec Loss 5.9944 Epoch: 2 Global Step: 49450 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:43:14,280-Speed 3254.81 samples/sec Loss 5.9473 Epoch: 2 Global Step: 49500 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:43:30,402-Speed 3175.84 samples/sec Loss 5.9978 Epoch: 2 Global Step: 49550 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:43:46,559-Speed 3168.97 samples/sec Loss 5.9779 Epoch: 2 Global Step: 49600 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:44:02,350-Speed 3242.45 samples/sec Loss 5.9736 Epoch: 2 Global Step: 49650 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:44:19,506-Speed 2984.51 samples/sec Loss 5.9896 Epoch: 2 Global Step: 49700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:44:36,511-Speed 3010.99 samples/sec Loss 6.0087 Epoch: 2 Global Step: 49750 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:44:54,139-Speed 2904.58 samples/sec Loss 6.0143 Epoch: 2 Global Step: 49800 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:45:09,984-Speed 3231.31 samples/sec Loss 5.9687 Epoch: 2 Global Step: 49850 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:45:26,700-Speed 3063.04 samples/sec Loss 5.9488 Epoch: 2 Global Step: 49900 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:45:42,594-Speed 3221.50 samples/sec Loss 5.9532 Epoch: 2 Global Step: 49950 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:45:58,440-Speed 3231.08 samples/sec Loss 5.9777 Epoch: 2 Global Step: 50000 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:46:51,555-[lfw][50000]XNorm: 24.003118 Training: 2021-03-16 05:46:51,555-[lfw][50000]Accuracy-Flip: 0.99650+-0.00283 Training: 2021-03-16 05:46:51,555-[lfw][50000]Accuracy-Highest: 0.99700 Training: 2021-03-16 05:47:53,316-[cfp_fp][50000]XNorm: 20.279611 Training: 2021-03-16 05:47:53,316-[cfp_fp][50000]Accuracy-Flip: 0.95043+-0.00919 Training: 2021-03-16 05:47:53,316-[cfp_fp][50000]Accuracy-Highest: 0.95329 Training: 2021-03-16 05:48:46,532-[agedb_30][50000]XNorm: 22.636172 Training: 2021-03-16 05:48:46,533-[agedb_30][50000]Accuracy-Flip: 0.96100+-0.01086 Training: 2021-03-16 05:48:46,533-[agedb_30][50000]Accuracy-Highest: 0.96283 Training: 2021-03-16 05:49:02,727-Speed 277.83 samples/sec Loss 6.0024 Epoch: 2 Global Step: 50050 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:49:32,731-Speed 1706.45 samples/sec Loss 5.6041 Epoch: 3 Global Step: 50100 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:49:48,929-Speed 3160.96 samples/sec Loss 5.3239 Epoch: 3 Global Step: 50150 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:50:04,790-Speed 3228.31 samples/sec Loss 5.3672 Epoch: 3 Global Step: 50200 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:50:20,794-Speed 3199.23 samples/sec Loss 5.3519 Epoch: 3 Global Step: 50250 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:50:36,829-Speed 3193.14 samples/sec Loss 5.3720 Epoch: 3 Global Step: 50300 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:50:52,768-Speed 3212.34 samples/sec Loss 5.3703 Epoch: 3 Global Step: 50350 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:51:08,461-Speed 3262.81 samples/sec Loss 5.4832 Epoch: 3 Global Step: 50400 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:51:24,895-Speed 3115.61 samples/sec Loss 5.4362 Epoch: 3 Global Step: 50450 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:51:40,799-Speed 3219.28 samples/sec Loss 5.4171 Epoch: 3 Global Step: 50500 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:51:56,695-Speed 3221.11 samples/sec Loss 5.4639 Epoch: 3 Global Step: 50550 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:52:13,182-Speed 3105.55 samples/sec Loss 5.5063 Epoch: 3 Global Step: 50600 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:52:28,894-Speed 3258.78 samples/sec Loss 5.5085 Epoch: 3 Global Step: 50650 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:52:44,838-Speed 3211.38 samples/sec Loss 5.4376 Epoch: 3 Global Step: 50700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:53:00,706-Speed 3226.68 samples/sec Loss 5.4853 Epoch: 3 Global Step: 50750 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:53:16,518-Speed 3238.23 samples/sec Loss 5.4949 Epoch: 3 Global Step: 50800 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:53:32,445-Speed 3214.70 samples/sec Loss 5.5654 Epoch: 3 Global Step: 50850 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:53:48,461-Speed 3196.96 samples/sec Loss 5.5666 Epoch: 3 Global Step: 50900 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:54:04,170-Speed 3259.33 samples/sec Loss 5.5600 Epoch: 3 Global Step: 50950 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:54:20,324-Speed 3169.58 samples/sec Loss 5.5939 Epoch: 3 Global Step: 51000 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:54:36,228-Speed 3219.42 samples/sec Loss 5.5728 Epoch: 3 Global Step: 51050 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:54:52,148-Speed 3216.24 samples/sec Loss 5.5895 Epoch: 3 Global Step: 51100 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:55:08,233-Speed 3183.14 samples/sec Loss 5.6169 Epoch: 3 Global Step: 51150 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:55:24,021-Speed 3243.15 samples/sec Loss 5.6246 Epoch: 3 Global Step: 51200 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:55:39,963-Speed 3211.68 samples/sec Loss 5.6256 Epoch: 3 Global Step: 51250 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:55:57,067-Speed 2993.61 samples/sec Loss 5.5915 Epoch: 3 Global Step: 51300 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:56:13,121-Speed 3189.23 samples/sec Loss 5.6186 Epoch: 3 Global Step: 51350 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:56:28,948-Speed 3235.01 samples/sec Loss 5.6721 Epoch: 3 Global Step: 51400 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:56:44,838-Speed 3222.40 samples/sec Loss 5.6560 Epoch: 3 Global Step: 51450 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:57:00,896-Speed 3188.55 samples/sec Loss 5.7000 Epoch: 3 Global Step: 51500 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:57:16,573-Speed 3265.92 samples/sec Loss 5.6783 Epoch: 3 Global Step: 51550 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:57:32,279-Speed 3259.96 samples/sec Loss 5.6978 Epoch: 3 Global Step: 51600 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:57:47,917-Speed 3274.27 samples/sec Loss 5.6735 Epoch: 3 Global Step: 51650 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:58:03,813-Speed 3221.08 samples/sec Loss 5.6586 Epoch: 3 Global Step: 51700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:58:19,819-Speed 3198.96 samples/sec Loss 5.6281 Epoch: 3 Global Step: 51750 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:58:36,426-Speed 3083.02 samples/sec Loss 5.6566 Epoch: 3 Global Step: 51800 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:58:53,292-Speed 3035.93 samples/sec Loss 5.7328 Epoch: 3 Global Step: 51850 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:59:10,338-Speed 3003.59 samples/sec Loss 5.7271 Epoch: 3 Global Step: 51900 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:59:26,219-Speed 3224.07 samples/sec Loss 5.6774 Epoch: 3 Global Step: 51950 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 05:59:42,906-Speed 3068.51 samples/sec Loss 5.6380 Epoch: 3 Global Step: 52000 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:00:35,992-[lfw][52000]XNorm: 21.498636 Training: 2021-03-16 06:00:35,992-[lfw][52000]Accuracy-Flip: 0.99633+-0.00323 Training: 2021-03-16 06:00:35,992-[lfw][52000]Accuracy-Highest: 0.99700 Training: 2021-03-16 06:01:37,580-[cfp_fp][52000]XNorm: 18.476021 Training: 2021-03-16 06:01:37,580-[cfp_fp][52000]Accuracy-Flip: 0.94129+-0.01027 Training: 2021-03-16 06:01:37,580-[cfp_fp][52000]Accuracy-Highest: 0.95329 Training: 2021-03-16 06:02:30,668-[agedb_30][52000]XNorm: 20.915132 Training: 2021-03-16 06:02:30,668-[agedb_30][52000]Accuracy-Flip: 0.96033+-0.00852 Training: 2021-03-16 06:02:30,668-[agedb_30][52000]Accuracy-Highest: 0.96283 Training: 2021-03-16 06:02:46,376-Speed 279.07 samples/sec Loss 5.7119 Epoch: 3 Global Step: 52050 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:03:03,323-Speed 3021.26 samples/sec Loss 5.7267 Epoch: 3 Global Step: 52100 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:03:19,167-Speed 3231.63 samples/sec Loss 5.6930 Epoch: 3 Global Step: 52150 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:03:35,107-Speed 3212.15 samples/sec Loss 5.7558 Epoch: 3 Global Step: 52200 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:03:51,043-Speed 3212.95 samples/sec Loss 5.6979 Epoch: 3 Global Step: 52250 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:04:07,006-Speed 3207.43 samples/sec Loss 5.7230 Epoch: 3 Global Step: 52300 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:04:22,628-Speed 3277.45 samples/sec Loss 5.7737 Epoch: 3 Global Step: 52350 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:04:38,537-Speed 3218.42 samples/sec Loss 5.7042 Epoch: 3 Global Step: 52400 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:04:54,377-Speed 3232.56 samples/sec Loss 5.7260 Epoch: 3 Global Step: 52450 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:05:10,380-Speed 3199.48 samples/sec Loss 5.7338 Epoch: 3 Global Step: 52500 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:05:26,945-Speed 3090.85 samples/sec Loss 5.7633 Epoch: 3 Global Step: 52550 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:05:42,732-Speed 3243.45 samples/sec Loss 5.7235 Epoch: 3 Global Step: 52600 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:05:58,464-Speed 3254.49 samples/sec Loss 5.7806 Epoch: 3 Global Step: 52650 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:06:15,174-Speed 3064.09 samples/sec Loss 5.7657 Epoch: 3 Global Step: 52700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:06:30,870-Speed 3262.19 samples/sec Loss 5.7555 Epoch: 3 Global Step: 52750 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:06:46,659-Speed 3242.80 samples/sec Loss 5.7513 Epoch: 3 Global Step: 52800 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:07:02,742-Speed 3183.69 samples/sec Loss 5.8119 Epoch: 3 Global Step: 52850 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:07:18,557-Speed 3237.37 samples/sec Loss 5.7310 Epoch: 3 Global Step: 52900 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:07:34,277-Speed 3257.22 samples/sec Loss 5.7957 Epoch: 3 Global Step: 52950 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:07:50,038-Speed 3248.65 samples/sec Loss 5.7262 Epoch: 3 Global Step: 53000 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:08:05,802-Speed 3247.97 samples/sec Loss 5.7557 Epoch: 3 Global Step: 53050 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:08:21,823-Speed 3195.95 samples/sec Loss 5.7478 Epoch: 3 Global Step: 53100 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:08:37,827-Speed 3199.22 samples/sec Loss 5.8052 Epoch: 3 Global Step: 53150 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:08:53,720-Speed 3221.65 samples/sec Loss 5.7560 Epoch: 3 Global Step: 53200 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:09:09,489-Speed 3246.94 samples/sec Loss 5.7651 Epoch: 3 Global Step: 53250 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:09:25,225-Speed 3253.83 samples/sec Loss 5.8091 Epoch: 3 Global Step: 53300 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:09:41,009-Speed 3243.89 samples/sec Loss 5.8400 Epoch: 3 Global Step: 53350 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:09:57,697-Speed 3068.13 samples/sec Loss 5.8736 Epoch: 3 Global Step: 53400 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:10:13,497-Speed 3240.68 samples/sec Loss 5.8337 Epoch: 3 Global Step: 53450 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:10:29,404-Speed 3218.82 samples/sec Loss 5.8191 Epoch: 3 Global Step: 53500 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:10:45,171-Speed 3247.31 samples/sec Loss 5.8263 Epoch: 3 Global Step: 53550 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:11:00,937-Speed 3247.63 samples/sec Loss 5.7730 Epoch: 3 Global Step: 53600 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:11:16,717-Speed 3244.61 samples/sec Loss 5.8570 Epoch: 3 Global Step: 53650 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:11:32,469-Speed 3250.60 samples/sec Loss 5.7898 Epoch: 3 Global Step: 53700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:11:48,140-Speed 3267.33 samples/sec Loss 5.8112 Epoch: 3 Global Step: 53750 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:12:03,936-Speed 3241.23 samples/sec Loss 5.7203 Epoch: 3 Global Step: 53800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:12:19,902-Speed 3207.00 samples/sec Loss 5.7792 Epoch: 3 Global Step: 53850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:12:35,903-Speed 3199.88 samples/sec Loss 5.7723 Epoch: 3 Global Step: 53900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:12:52,465-Speed 3091.61 samples/sec Loss 5.8026 Epoch: 3 Global Step: 53950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:13:09,520-Speed 3002.07 samples/sec Loss 5.7987 Epoch: 3 Global Step: 54000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:14:02,681-[lfw][54000]XNorm: 21.903616 Training: 2021-03-16 06:14:02,682-[lfw][54000]Accuracy-Flip: 0.99600+-0.00281 Training: 2021-03-16 06:14:02,682-[lfw][54000]Accuracy-Highest: 0.99700 Training: 2021-03-16 06:15:04,550-[cfp_fp][54000]XNorm: 18.613318 Training: 2021-03-16 06:15:04,550-[cfp_fp][54000]Accuracy-Flip: 0.94271+-0.01078 Training: 2021-03-16 06:15:04,550-[cfp_fp][54000]Accuracy-Highest: 0.95329 Training: 2021-03-16 06:15:57,804-[agedb_30][54000]XNorm: 21.084846 Training: 2021-03-16 06:15:57,804-[agedb_30][54000]Accuracy-Flip: 0.95633+-0.01152 Training: 2021-03-16 06:15:57,804-[agedb_30][54000]Accuracy-Highest: 0.96283 Training: 2021-03-16 06:16:14,446-Speed 276.87 samples/sec Loss 5.8335 Epoch: 3 Global Step: 54050 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:16:31,522-Speed 2998.48 samples/sec Loss 5.8353 Epoch: 3 Global Step: 54100 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:16:47,499-Speed 3204.60 samples/sec Loss 5.8441 Epoch: 3 Global Step: 54150 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:17:03,344-Speed 3231.51 samples/sec Loss 5.8292 Epoch: 3 Global Step: 54200 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:17:20,058-Speed 3063.39 samples/sec Loss 5.8346 Epoch: 3 Global Step: 54250 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:17:36,435-Speed 3126.41 samples/sec Loss 5.8075 Epoch: 3 Global Step: 54300 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:17:52,238-Speed 3239.96 samples/sec Loss 5.8136 Epoch: 3 Global Step: 54350 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:18:08,143-Speed 3219.13 samples/sec Loss 5.8203 Epoch: 3 Global Step: 54400 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:18:24,426-Speed 3144.50 samples/sec Loss 5.8296 Epoch: 3 Global Step: 54450 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:18:40,162-Speed 3253.97 samples/sec Loss 5.8182 Epoch: 3 Global Step: 54500 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:18:56,263-Speed 3179.98 samples/sec Loss 5.7996 Epoch: 3 Global Step: 54550 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:19:12,084-Speed 3236.37 samples/sec Loss 5.8929 Epoch: 3 Global Step: 54600 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:19:27,932-Speed 3230.84 samples/sec Loss 5.8455 Epoch: 3 Global Step: 54650 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:19:43,680-Speed 3251.22 samples/sec Loss 5.8068 Epoch: 3 Global Step: 54700 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:20:00,016-Speed 3134.21 samples/sec Loss 5.8285 Epoch: 3 Global Step: 54750 Fp16 Grad Scale: 16384 Required: 32 hours Training: 2021-03-16 06:20:16,639-Speed 3080.22 samples/sec Loss 5.8155 Epoch: 3 Global Step: 54800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:20:32,713-Speed 3185.43 samples/sec Loss 5.8375 Epoch: 3 Global Step: 54850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:20:49,673-Speed 3018.97 samples/sec Loss 5.8471 Epoch: 3 Global Step: 54900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:21:05,811-Speed 3172.70 samples/sec Loss 5.7956 Epoch: 3 Global Step: 54950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:21:21,695-Speed 3223.42 samples/sec Loss 5.8715 Epoch: 3 Global Step: 55000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:21:37,829-Speed 3173.65 samples/sec Loss 5.8225 Epoch: 3 Global Step: 55050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:21:53,894-Speed 3187.04 samples/sec Loss 5.8457 Epoch: 3 Global Step: 55100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:22:10,057-Speed 3167.85 samples/sec Loss 5.8397 Epoch: 3 Global Step: 55150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:22:26,097-Speed 3192.08 samples/sec Loss 5.8162 Epoch: 3 Global Step: 55200 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:22:42,231-Speed 3173.56 samples/sec Loss 5.8703 Epoch: 3 Global Step: 55250 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:22:58,212-Speed 3203.88 samples/sec Loss 5.8550 Epoch: 3 Global Step: 55300 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:23:14,098-Speed 3222.98 samples/sec Loss 5.8767 Epoch: 3 Global Step: 55350 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:23:30,023-Speed 3215.35 samples/sec Loss 5.8283 Epoch: 3 Global Step: 55400 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:23:45,739-Speed 3257.95 samples/sec Loss 5.9131 Epoch: 3 Global Step: 55450 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:24:01,518-Speed 3244.88 samples/sec Loss 5.8467 Epoch: 3 Global Step: 55500 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:24:17,735-Speed 3157.34 samples/sec Loss 5.8542 Epoch: 3 Global Step: 55550 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:24:34,641-Speed 3028.56 samples/sec Loss 5.8774 Epoch: 3 Global Step: 55600 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:24:50,413-Speed 3246.43 samples/sec Loss 5.8543 Epoch: 3 Global Step: 55650 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:25:06,371-Speed 3208.34 samples/sec Loss 5.7911 Epoch: 3 Global Step: 55700 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:25:22,476-Speed 3179.25 samples/sec Loss 5.8551 Epoch: 3 Global Step: 55750 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:25:38,304-Speed 3234.93 samples/sec Loss 5.8562 Epoch: 3 Global Step: 55800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:25:54,647-Speed 3132.95 samples/sec Loss 5.8616 Epoch: 3 Global Step: 55850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:26:10,560-Speed 3217.65 samples/sec Loss 5.8433 Epoch: 3 Global Step: 55900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:26:26,919-Speed 3129.77 samples/sec Loss 5.8394 Epoch: 3 Global Step: 55950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:26:43,010-Speed 3182.01 samples/sec Loss 5.8843 Epoch: 3 Global Step: 56000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:27:36,193-[lfw][56000]XNorm: 22.301102 Training: 2021-03-16 06:27:36,194-[lfw][56000]Accuracy-Flip: 0.99583+-0.00291 Training: 2021-03-16 06:27:36,194-[lfw][56000]Accuracy-Highest: 0.99700 Training: 2021-03-16 06:28:37,943-[cfp_fp][56000]XNorm: 19.125153 Training: 2021-03-16 06:28:37,943-[cfp_fp][56000]Accuracy-Flip: 0.95000+-0.00994 Training: 2021-03-16 06:28:37,943-[cfp_fp][56000]Accuracy-Highest: 0.95329 Training: 2021-03-16 06:29:31,083-[agedb_30][56000]XNorm: 21.977139 Training: 2021-03-16 06:29:31,084-[agedb_30][56000]Accuracy-Flip: 0.96117+-0.00687 Training: 2021-03-16 06:29:31,084-[agedb_30][56000]Accuracy-Highest: 0.96283 Training: 2021-03-16 06:29:47,118-Speed 278.10 samples/sec Loss 5.8753 Epoch: 3 Global Step: 56050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:30:02,867-Speed 3251.22 samples/sec Loss 5.8717 Epoch: 3 Global Step: 56100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:30:20,969-Speed 2828.44 samples/sec Loss 5.8600 Epoch: 3 Global Step: 56150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:30:36,961-Speed 3201.63 samples/sec Loss 5.8101 Epoch: 3 Global Step: 56200 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:30:53,527-Speed 3090.76 samples/sec Loss 5.8824 Epoch: 3 Global Step: 56250 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:31:09,448-Speed 3216.02 samples/sec Loss 5.8164 Epoch: 3 Global Step: 56300 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:31:25,554-Speed 3179.01 samples/sec Loss 5.8867 Epoch: 3 Global Step: 56350 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:31:41,678-Speed 3175.55 samples/sec Loss 5.8715 Epoch: 3 Global Step: 56400 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:31:57,813-Speed 3173.38 samples/sec Loss 5.8649 Epoch: 3 Global Step: 56450 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:32:14,359-Speed 3094.43 samples/sec Loss 5.8582 Epoch: 3 Global Step: 56500 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:32:30,311-Speed 3209.86 samples/sec Loss 5.8490 Epoch: 3 Global Step: 56550 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:32:46,496-Speed 3163.49 samples/sec Loss 5.8825 Epoch: 3 Global Step: 56600 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:33:02,568-Speed 3185.76 samples/sec Loss 5.8411 Epoch: 3 Global Step: 56650 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:33:18,373-Speed 3239.43 samples/sec Loss 5.8603 Epoch: 3 Global Step: 56700 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:33:34,420-Speed 3190.89 samples/sec Loss 5.8628 Epoch: 3 Global Step: 56750 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:33:50,281-Speed 3228.07 samples/sec Loss 5.8935 Epoch: 3 Global Step: 56800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:34:06,094-Speed 3237.97 samples/sec Loss 5.8790 Epoch: 3 Global Step: 56850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:34:22,303-Speed 3158.81 samples/sec Loss 5.9139 Epoch: 3 Global Step: 56900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:34:38,985-Speed 3069.23 samples/sec Loss 5.8541 Epoch: 3 Global Step: 56950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:34:55,110-Speed 3175.22 samples/sec Loss 5.8647 Epoch: 3 Global Step: 57000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:35:11,031-Speed 3215.99 samples/sec Loss 5.8843 Epoch: 3 Global Step: 57050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:35:27,740-Speed 3064.39 samples/sec Loss 5.8843 Epoch: 3 Global Step: 57100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:35:43,704-Speed 3207.28 samples/sec Loss 5.8950 Epoch: 3 Global Step: 57150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:35:59,652-Speed 3210.46 samples/sec Loss 5.8855 Epoch: 3 Global Step: 57200 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:36:15,632-Speed 3204.30 samples/sec Loss 5.8364 Epoch: 3 Global Step: 57250 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:36:31,665-Speed 3193.41 samples/sec Loss 5.8720 Epoch: 3 Global Step: 57300 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:36:47,721-Speed 3189.01 samples/sec Loss 5.9222 Epoch: 3 Global Step: 57350 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:37:03,427-Speed 3260.02 samples/sec Loss 5.8420 Epoch: 3 Global Step: 57400 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:37:19,711-Speed 3144.25 samples/sec Loss 5.8565 Epoch: 3 Global Step: 57450 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:37:35,639-Speed 3214.52 samples/sec Loss 5.8841 Epoch: 3 Global Step: 57500 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:37:51,468-Speed 3234.67 samples/sec Loss 5.7983 Epoch: 3 Global Step: 57550 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:38:07,329-Speed 3228.04 samples/sec Loss 5.9100 Epoch: 3 Global Step: 57600 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:38:23,220-Speed 3222.15 samples/sec Loss 5.8618 Epoch: 3 Global Step: 57650 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:38:39,925-Speed 3064.95 samples/sec Loss 5.9101 Epoch: 3 Global Step: 57700 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:38:55,889-Speed 3207.33 samples/sec Loss 5.8989 Epoch: 3 Global Step: 57750 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:39:11,682-Speed 3242.12 samples/sec Loss 5.8455 Epoch: 3 Global Step: 57800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:39:27,700-Speed 3196.52 samples/sec Loss 5.8732 Epoch: 3 Global Step: 57850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:39:43,562-Speed 3227.89 samples/sec Loss 5.8466 Epoch: 3 Global Step: 57900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:39:59,291-Speed 3255.26 samples/sec Loss 5.8216 Epoch: 3 Global Step: 57950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:40:15,250-Speed 3208.24 samples/sec Loss 5.7909 Epoch: 3 Global Step: 58000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:41:08,355-[lfw][58000]XNorm: 22.480267 Training: 2021-03-16 06:41:08,355-[lfw][58000]Accuracy-Flip: 0.99467+-0.00407 Training: 2021-03-16 06:41:08,355-[lfw][58000]Accuracy-Highest: 0.99700 Training: 2021-03-16 06:42:10,128-[cfp_fp][58000]XNorm: 19.721222 Training: 2021-03-16 06:42:10,129-[cfp_fp][58000]Accuracy-Flip: 0.95029+-0.01238 Training: 2021-03-16 06:42:10,129-[cfp_fp][58000]Accuracy-Highest: 0.95329 Training: 2021-03-16 06:43:03,425-[agedb_30][58000]XNorm: 22.131291 Training: 2021-03-16 06:43:03,426-[agedb_30][58000]Accuracy-Flip: 0.95733+-0.00923 Training: 2021-03-16 06:43:03,426-[agedb_30][58000]Accuracy-Highest: 0.96283 Training: 2021-03-16 06:43:19,384-Speed 278.06 samples/sec Loss 5.8717 Epoch: 3 Global Step: 58050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:43:35,180-Speed 3241.52 samples/sec Loss 5.9711 Epoch: 3 Global Step: 58100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:43:51,122-Speed 3211.73 samples/sec Loss 5.8623 Epoch: 3 Global Step: 58150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:44:06,998-Speed 3225.03 samples/sec Loss 5.8862 Epoch: 3 Global Step: 58200 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:44:23,782-Speed 3050.63 samples/sec Loss 5.8771 Epoch: 3 Global Step: 58250 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:44:40,452-Speed 3071.49 samples/sec Loss 5.8787 Epoch: 3 Global Step: 58300 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:44:57,261-Speed 3046.17 samples/sec Loss 5.9423 Epoch: 3 Global Step: 58350 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:45:14,250-Speed 3013.68 samples/sec Loss 5.8579 Epoch: 3 Global Step: 58400 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:45:30,085-Speed 3233.44 samples/sec Loss 5.8940 Epoch: 3 Global Step: 58450 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:45:46,006-Speed 3216.08 samples/sec Loss 5.8385 Epoch: 3 Global Step: 58500 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:46:01,989-Speed 3203.51 samples/sec Loss 5.8653 Epoch: 3 Global Step: 58550 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:46:18,044-Speed 3189.18 samples/sec Loss 5.9034 Epoch: 3 Global Step: 58600 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:46:34,590-Speed 3094.47 samples/sec Loss 5.8908 Epoch: 3 Global Step: 58650 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:46:50,337-Speed 3251.57 samples/sec Loss 5.9170 Epoch: 3 Global Step: 58700 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:47:06,303-Speed 3206.82 samples/sec Loss 5.8659 Epoch: 3 Global Step: 58750 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:47:22,206-Speed 3219.67 samples/sec Loss 5.8792 Epoch: 3 Global Step: 58800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:47:38,068-Speed 3227.92 samples/sec Loss 5.8644 Epoch: 3 Global Step: 58850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:47:54,006-Speed 3212.51 samples/sec Loss 5.8439 Epoch: 3 Global Step: 58900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:48:09,688-Speed 3265.11 samples/sec Loss 5.8616 Epoch: 3 Global Step: 58950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:48:25,463-Speed 3245.77 samples/sec Loss 5.8514 Epoch: 3 Global Step: 59000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:48:41,200-Speed 3253.56 samples/sec Loss 5.8713 Epoch: 3 Global Step: 59050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:48:58,004-Speed 3047.01 samples/sec Loss 5.9031 Epoch: 3 Global Step: 59100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:49:13,932-Speed 3214.45 samples/sec Loss 5.8692 Epoch: 3 Global Step: 59150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:49:29,930-Speed 3200.47 samples/sec Loss 5.8549 Epoch: 3 Global Step: 59200 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:49:46,632-Speed 3065.71 samples/sec Loss 5.8570 Epoch: 3 Global Step: 59250 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:50:02,357-Speed 3256.00 samples/sec Loss 5.8967 Epoch: 3 Global Step: 59300 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:50:18,417-Speed 3188.18 samples/sec Loss 5.8727 Epoch: 3 Global Step: 59350 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:50:34,009-Speed 3283.81 samples/sec Loss 5.8521 Epoch: 3 Global Step: 59400 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:50:50,096-Speed 3182.81 samples/sec Loss 5.8871 Epoch: 3 Global Step: 59450 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:51:06,018-Speed 3215.80 samples/sec Loss 5.8764 Epoch: 3 Global Step: 59500 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:51:21,926-Speed 3218.49 samples/sec Loss 5.8867 Epoch: 3 Global Step: 59550 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:51:37,730-Speed 3239.80 samples/sec Loss 5.9110 Epoch: 3 Global Step: 59600 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:51:53,711-Speed 3204.00 samples/sec Loss 5.8623 Epoch: 3 Global Step: 59650 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:52:09,858-Speed 3170.86 samples/sec Loss 5.8709 Epoch: 3 Global Step: 59700 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:52:25,778-Speed 3216.19 samples/sec Loss 5.8382 Epoch: 3 Global Step: 59750 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:52:41,813-Speed 3193.05 samples/sec Loss 5.8663 Epoch: 3 Global Step: 59800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:52:57,621-Speed 3239.04 samples/sec Loss 5.8706 Epoch: 3 Global Step: 59850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:53:14,184-Speed 3091.31 samples/sec Loss 5.8363 Epoch: 3 Global Step: 59900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:53:30,247-Speed 3187.50 samples/sec Loss 5.9064 Epoch: 3 Global Step: 59950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:53:46,089-Speed 3232.13 samples/sec Loss 5.9234 Epoch: 3 Global Step: 60000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:54:39,255-[lfw][60000]XNorm: 21.200575 Training: 2021-03-16 06:54:39,256-[lfw][60000]Accuracy-Flip: 0.99667+-0.00236 Training: 2021-03-16 06:54:39,256-[lfw][60000]Accuracy-Highest: 0.99700 Training: 2021-03-16 06:55:41,126-[cfp_fp][60000]XNorm: 18.114383 Training: 2021-03-16 06:55:41,126-[cfp_fp][60000]Accuracy-Flip: 0.95200+-0.00791 Training: 2021-03-16 06:55:41,126-[cfp_fp][60000]Accuracy-Highest: 0.95329 Training: 2021-03-16 06:56:34,333-[agedb_30][60000]XNorm: 20.280381 Training: 2021-03-16 06:56:34,333-[agedb_30][60000]Accuracy-Flip: 0.96750+-0.00712 Training: 2021-03-16 06:56:34,333-[agedb_30][60000]Accuracy-Highest: 0.96750 Training: 2021-03-16 06:56:50,421-Speed 277.76 samples/sec Loss 5.8435 Epoch: 3 Global Step: 60050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:57:06,506-Speed 3183.18 samples/sec Loss 5.8499 Epoch: 3 Global Step: 60100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:57:22,591-Speed 3183.32 samples/sec Loss 5.8363 Epoch: 3 Global Step: 60150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:57:38,243-Speed 3271.12 samples/sec Loss 5.8887 Epoch: 3 Global Step: 60200 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:57:54,321-Speed 3184.69 samples/sec Loss 5.8641 Epoch: 3 Global Step: 60250 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:58:10,429-Speed 3178.52 samples/sec Loss 5.8653 Epoch: 3 Global Step: 60300 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:58:26,280-Speed 3230.17 samples/sec Loss 5.8401 Epoch: 3 Global Step: 60350 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:58:43,281-Speed 3011.66 samples/sec Loss 5.8286 Epoch: 3 Global Step: 60400 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:59:00,330-Speed 3003.23 samples/sec Loss 5.8616 Epoch: 3 Global Step: 60450 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:59:17,278-Speed 3021.13 samples/sec Loss 5.8564 Epoch: 3 Global Step: 60500 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:59:34,139-Speed 3036.68 samples/sec Loss 5.8134 Epoch: 3 Global Step: 60550 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 06:59:49,912-Speed 3246.14 samples/sec Loss 5.8371 Epoch: 3 Global Step: 60600 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:00:05,783-Speed 3226.20 samples/sec Loss 5.8612 Epoch: 3 Global Step: 60650 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:00:21,509-Speed 3255.77 samples/sec Loss 5.8610 Epoch: 3 Global Step: 60700 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:00:38,164-Speed 3074.26 samples/sec Loss 5.8927 Epoch: 3 Global Step: 60750 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:00:53,925-Speed 3248.63 samples/sec Loss 5.8421 Epoch: 3 Global Step: 60800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:01:09,888-Speed 3207.64 samples/sec Loss 5.8852 Epoch: 3 Global Step: 60850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:01:25,757-Speed 3226.39 samples/sec Loss 5.8501 Epoch: 3 Global Step: 60900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:01:41,500-Speed 3252.46 samples/sec Loss 5.8476 Epoch: 3 Global Step: 60950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:01:57,335-Speed 3233.51 samples/sec Loss 5.8933 Epoch: 3 Global Step: 61000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:02:13,322-Speed 3202.53 samples/sec Loss 5.8733 Epoch: 3 Global Step: 61050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:02:29,393-Speed 3186.14 samples/sec Loss 5.8903 Epoch: 3 Global Step: 61100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:02:45,452-Speed 3188.21 samples/sec Loss 5.8384 Epoch: 3 Global Step: 61150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:03:01,454-Speed 3199.77 samples/sec Loss 5.8651 Epoch: 3 Global Step: 61200 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:03:17,498-Speed 3191.29 samples/sec Loss 5.8573 Epoch: 3 Global Step: 61250 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:03:34,331-Speed 3041.68 samples/sec Loss 5.8068 Epoch: 3 Global Step: 61300 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:03:50,298-Speed 3206.74 samples/sec Loss 5.8650 Epoch: 3 Global Step: 61350 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:04:07,224-Speed 3025.12 samples/sec Loss 5.8436 Epoch: 3 Global Step: 61400 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:04:23,059-Speed 3233.33 samples/sec Loss 5.8919 Epoch: 3 Global Step: 61450 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:04:38,979-Speed 3216.26 samples/sec Loss 5.8303 Epoch: 3 Global Step: 61500 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:04:54,917-Speed 3212.58 samples/sec Loss 5.8530 Epoch: 3 Global Step: 61550 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:05:10,785-Speed 3226.74 samples/sec Loss 5.8464 Epoch: 3 Global Step: 61600 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:05:26,941-Speed 3169.15 samples/sec Loss 5.8449 Epoch: 3 Global Step: 61650 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:05:42,968-Speed 3194.74 samples/sec Loss 5.8773 Epoch: 3 Global Step: 61700 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:05:58,841-Speed 3225.75 samples/sec Loss 5.8750 Epoch: 3 Global Step: 61750 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:06:14,777-Speed 3212.92 samples/sec Loss 5.8772 Epoch: 3 Global Step: 61800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:06:30,743-Speed 3206.81 samples/sec Loss 5.8313 Epoch: 3 Global Step: 61850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:06:47,062-Speed 3137.47 samples/sec Loss 5.8974 Epoch: 3 Global Step: 61900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:07:03,273-Speed 3158.53 samples/sec Loss 5.8360 Epoch: 3 Global Step: 61950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:07:20,214-Speed 3022.39 samples/sec Loss 5.8133 Epoch: 3 Global Step: 62000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:08:13,557-[lfw][62000]XNorm: 23.344648 Training: 2021-03-16 07:08:13,558-[lfw][62000]Accuracy-Flip: 0.99717+-0.00299 Training: 2021-03-16 07:08:13,558-[lfw][62000]Accuracy-Highest: 0.99717 Training: 2021-03-16 07:09:15,361-[cfp_fp][62000]XNorm: 20.226853 Training: 2021-03-16 07:09:15,361-[cfp_fp][62000]Accuracy-Flip: 0.94243+-0.01036 Training: 2021-03-16 07:09:15,361-[cfp_fp][62000]Accuracy-Highest: 0.95329 Training: 2021-03-16 07:10:08,657-[agedb_30][62000]XNorm: 22.604772 Training: 2021-03-16 07:10:08,657-[agedb_30][62000]Accuracy-Flip: 0.95767+-0.00987 Training: 2021-03-16 07:10:08,657-[agedb_30][62000]Accuracy-Highest: 0.96750 Training: 2021-03-16 07:10:24,591-Speed 277.69 samples/sec Loss 5.8266 Epoch: 3 Global Step: 62050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:10:40,207-Speed 3278.76 samples/sec Loss 5.8226 Epoch: 3 Global Step: 62100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:10:56,017-Speed 3238.58 samples/sec Loss 5.8535 Epoch: 3 Global Step: 62150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:11:11,941-Speed 3215.38 samples/sec Loss 5.8391 Epoch: 3 Global Step: 62200 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:11:28,111-Speed 3166.49 samples/sec Loss 5.8986 Epoch: 3 Global Step: 62250 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:11:43,959-Speed 3230.87 samples/sec Loss 5.8641 Epoch: 3 Global Step: 62300 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:11:59,845-Speed 3223.08 samples/sec Loss 5.8835 Epoch: 3 Global Step: 62350 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:12:15,542-Speed 3261.79 samples/sec Loss 5.8765 Epoch: 3 Global Step: 62400 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:12:31,482-Speed 3212.28 samples/sec Loss 5.8081 Epoch: 3 Global Step: 62450 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:12:48,088-Speed 3083.33 samples/sec Loss 5.8648 Epoch: 3 Global Step: 62500 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:13:03,702-Speed 3279.19 samples/sec Loss 5.8378 Epoch: 3 Global Step: 62550 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:13:20,666-Speed 3018.22 samples/sec Loss 5.8333 Epoch: 3 Global Step: 62600 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:13:38,335-Speed 2897.83 samples/sec Loss 5.8347 Epoch: 3 Global Step: 62650 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:13:54,305-Speed 3206.14 samples/sec Loss 5.8944 Epoch: 3 Global Step: 62700 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:14:10,242-Speed 3212.66 samples/sec Loss 5.8509 Epoch: 3 Global Step: 62750 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:14:26,256-Speed 3197.40 samples/sec Loss 5.8804 Epoch: 3 Global Step: 62800 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:14:43,004-Speed 3057.20 samples/sec Loss 5.8849 Epoch: 3 Global Step: 62850 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:14:58,801-Speed 3241.37 samples/sec Loss 5.8343 Epoch: 3 Global Step: 62900 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:15:14,947-Speed 3171.06 samples/sec Loss 5.7997 Epoch: 3 Global Step: 62950 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:15:30,783-Speed 3233.34 samples/sec Loss 5.8211 Epoch: 3 Global Step: 63000 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:15:46,593-Speed 3238.46 samples/sec Loss 5.8223 Epoch: 3 Global Step: 63050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:16:02,771-Speed 3164.88 samples/sec Loss 5.8704 Epoch: 3 Global Step: 63100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:16:18,810-Speed 3192.33 samples/sec Loss 5.8774 Epoch: 3 Global Step: 63150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:16:34,647-Speed 3233.00 samples/sec Loss 5.8454 Epoch: 3 Global Step: 63200 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:16:50,603-Speed 3208.92 samples/sec Loss 5.8157 Epoch: 3 Global Step: 63250 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:17:06,331-Speed 3255.51 samples/sec Loss 5.8146 Epoch: 3 Global Step: 63300 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:17:22,459-Speed 3174.63 samples/sec Loss 5.8955 Epoch: 3 Global Step: 63350 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:17:38,357-Speed 3220.70 samples/sec Loss 5.8266 Epoch: 3 Global Step: 63400 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:17:55,160-Speed 3047.24 samples/sec Loss 5.8642 Epoch: 3 Global Step: 63450 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:18:11,068-Speed 3218.50 samples/sec Loss 5.8153 Epoch: 3 Global Step: 63500 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:18:27,792-Speed 3061.67 samples/sec Loss 5.8439 Epoch: 3 Global Step: 63550 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:18:43,750-Speed 3208.39 samples/sec Loss 5.9078 Epoch: 3 Global Step: 63600 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:18:59,721-Speed 3205.91 samples/sec Loss 5.8923 Epoch: 3 Global Step: 63650 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:19:15,705-Speed 3203.36 samples/sec Loss 5.8114 Epoch: 3 Global Step: 63700 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:19:31,690-Speed 3203.23 samples/sec Loss 5.8538 Epoch: 3 Global Step: 63750 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:19:47,595-Speed 3219.04 samples/sec Loss 5.8576 Epoch: 3 Global Step: 63800 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:20:03,406-Speed 3238.39 samples/sec Loss 5.8532 Epoch: 3 Global Step: 63850 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:20:19,324-Speed 3216.73 samples/sec Loss 5.8203 Epoch: 3 Global Step: 63900 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:20:35,471-Speed 3170.92 samples/sec Loss 5.7836 Epoch: 3 Global Step: 63950 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:20:51,327-Speed 3229.19 samples/sec Loss 5.8085 Epoch: 3 Global Step: 64000 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:21:44,385-[lfw][64000]XNorm: 21.461468 Training: 2021-03-16 07:21:44,385-[lfw][64000]Accuracy-Flip: 0.99483+-0.00411 Training: 2021-03-16 07:21:44,386-[lfw][64000]Accuracy-Highest: 0.99717 Training: 2021-03-16 07:22:46,196-[cfp_fp][64000]XNorm: 18.723041 Training: 2021-03-16 07:22:46,196-[cfp_fp][64000]Accuracy-Flip: 0.94343+-0.00933 Training: 2021-03-16 07:22:46,196-[cfp_fp][64000]Accuracy-Highest: 0.95329 Training: 2021-03-16 07:23:39,403-[agedb_30][64000]XNorm: 20.699902 Training: 2021-03-16 07:23:39,404-[agedb_30][64000]Accuracy-Flip: 0.96350+-0.00970 Training: 2021-03-16 07:23:39,404-[agedb_30][64000]Accuracy-Highest: 0.96750 Training: 2021-03-16 07:23:55,187-Speed 278.47 samples/sec Loss 5.8278 Epoch: 3 Global Step: 64050 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:24:11,884-Speed 3066.52 samples/sec Loss 5.8214 Epoch: 3 Global Step: 64100 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:24:27,823-Speed 3212.26 samples/sec Loss 5.8363 Epoch: 3 Global Step: 64150 Fp16 Grad Scale: 16384 Required: 31 hours Training: 2021-03-16 07:24:43,931-Speed 3178.73 samples/sec Loss 5.7905 Epoch: 3 Global Step: 64200 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:24:59,840-Speed 3218.42 samples/sec Loss 5.7503 Epoch: 3 Global Step: 64250 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:25:15,900-Speed 3188.15 samples/sec Loss 5.8373 Epoch: 3 Global Step: 64300 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:25:31,705-Speed 3239.53 samples/sec Loss 5.8064 Epoch: 3 Global Step: 64350 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:25:47,685-Speed 3204.03 samples/sec Loss 5.8236 Epoch: 3 Global Step: 64400 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:26:03,423-Speed 3253.42 samples/sec Loss 5.8292 Epoch: 3 Global Step: 64450 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:26:19,706-Speed 3144.44 samples/sec Loss 5.8428 Epoch: 3 Global Step: 64500 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:26:35,426-Speed 3257.11 samples/sec Loss 5.8495 Epoch: 3 Global Step: 64550 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:26:51,324-Speed 3220.82 samples/sec Loss 5.8436 Epoch: 3 Global Step: 64600 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:27:07,466-Speed 3171.83 samples/sec Loss 5.8316 Epoch: 3 Global Step: 64650 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:27:24,220-Speed 3056.06 samples/sec Loss 5.8996 Epoch: 3 Global Step: 64700 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:27:40,803-Speed 3087.69 samples/sec Loss 5.8787 Epoch: 3 Global Step: 64750 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:27:57,444-Speed 3076.71 samples/sec Loss 5.8844 Epoch: 3 Global Step: 64800 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:28:14,220-Speed 3052.08 samples/sec Loss 5.8706 Epoch: 3 Global Step: 64850 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:28:30,018-Speed 3240.99 samples/sec Loss 5.8201 Epoch: 3 Global Step: 64900 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:28:46,772-Speed 3056.24 samples/sec Loss 5.8227 Epoch: 3 Global Step: 64950 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:29:02,634-Speed 3227.80 samples/sec Loss 5.8426 Epoch: 3 Global Step: 65000 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:29:18,619-Speed 3203.20 samples/sec Loss 5.8240 Epoch: 3 Global Step: 65050 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:29:34,658-Speed 3192.18 samples/sec Loss 5.7728 Epoch: 3 Global Step: 65100 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:29:50,426-Speed 3247.23 samples/sec Loss 5.7830 Epoch: 3 Global Step: 65150 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:30:06,721-Speed 3142.26 samples/sec Loss 5.8102 Epoch: 3 Global Step: 65200 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:30:22,726-Speed 3199.05 samples/sec Loss 5.8579 Epoch: 3 Global Step: 65250 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:30:38,779-Speed 3189.62 samples/sec Loss 5.8563 Epoch: 3 Global Step: 65300 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:30:54,540-Speed 3248.57 samples/sec Loss 5.8221 Epoch: 3 Global Step: 65350 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:31:10,743-Speed 3160.16 samples/sec Loss 5.7893 Epoch: 3 Global Step: 65400 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:31:26,709-Speed 3206.87 samples/sec Loss 5.8183 Epoch: 3 Global Step: 65450 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:31:42,691-Speed 3203.70 samples/sec Loss 5.8554 Epoch: 3 Global Step: 65500 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:31:59,492-Speed 3047.42 samples/sec Loss 5.7981 Epoch: 3 Global Step: 65550 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:32:15,416-Speed 3215.39 samples/sec Loss 5.8018 Epoch: 3 Global Step: 65600 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:32:31,284-Speed 3226.89 samples/sec Loss 5.8367 Epoch: 3 Global Step: 65650 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:32:47,706-Speed 3117.79 samples/sec Loss 5.8584 Epoch: 3 Global Step: 65700 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:33:04,577-Speed 3034.97 samples/sec Loss 5.9163 Epoch: 3 Global Step: 65750 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:33:20,754-Speed 3165.07 samples/sec Loss 5.8175 Epoch: 3 Global Step: 65800 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:33:36,538-Speed 3243.81 samples/sec Loss 5.8526 Epoch: 3 Global Step: 65850 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:33:52,434-Speed 3221.14 samples/sec Loss 5.8392 Epoch: 3 Global Step: 65900 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:34:08,407-Speed 3205.42 samples/sec Loss 5.8376 Epoch: 3 Global Step: 65950 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:34:24,439-Speed 3193.72 samples/sec Loss 5.8058 Epoch: 3 Global Step: 66000 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:35:17,393-[lfw][66000]XNorm: 22.059034 Training: 2021-03-16 07:35:17,394-[lfw][66000]Accuracy-Flip: 0.99533+-0.00386 Training: 2021-03-16 07:35:17,394-[lfw][66000]Accuracy-Highest: 0.99717 Training: 2021-03-16 07:36:19,223-[cfp_fp][66000]XNorm: 19.023002 Training: 2021-03-16 07:36:19,223-[cfp_fp][66000]Accuracy-Flip: 0.94043+-0.00748 Training: 2021-03-16 07:36:19,223-[cfp_fp][66000]Accuracy-Highest: 0.95329 Training: 2021-03-16 07:37:12,502-[agedb_30][66000]XNorm: 21.346816 Training: 2021-03-16 07:37:12,503-[agedb_30][66000]Accuracy-Flip: 0.96083+-0.00864 Training: 2021-03-16 07:37:12,503-[agedb_30][66000]Accuracy-Highest: 0.96750 Training: 2021-03-16 07:37:28,418-Speed 278.29 samples/sec Loss 5.8212 Epoch: 3 Global Step: 66050 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:37:44,221-Speed 3240.12 samples/sec Loss 5.8401 Epoch: 3 Global Step: 66100 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:38:00,135-Speed 3217.39 samples/sec Loss 5.8459 Epoch: 3 Global Step: 66150 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:38:16,195-Speed 3188.04 samples/sec Loss 5.8294 Epoch: 3 Global Step: 66200 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:38:33,378-Speed 2979.80 samples/sec Loss 5.7757 Epoch: 3 Global Step: 66250 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:38:49,206-Speed 3234.98 samples/sec Loss 5.7934 Epoch: 3 Global Step: 66300 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:39:05,391-Speed 3163.53 samples/sec Loss 5.8306 Epoch: 3 Global Step: 66350 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:39:21,321-Speed 3214.15 samples/sec Loss 5.7949 Epoch: 3 Global Step: 66400 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:39:37,612-Speed 3143.02 samples/sec Loss 5.7857 Epoch: 3 Global Step: 66450 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:39:53,511-Speed 3220.40 samples/sec Loss 5.8530 Epoch: 3 Global Step: 66500 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:40:09,445-Speed 3213.21 samples/sec Loss 5.8023 Epoch: 3 Global Step: 66550 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:40:25,421-Speed 3204.99 samples/sec Loss 5.8386 Epoch: 3 Global Step: 66600 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:40:41,286-Speed 3227.37 samples/sec Loss 5.8125 Epoch: 3 Global Step: 66650 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:40:57,253-Speed 3206.78 samples/sec Loss 5.8106 Epoch: 3 Global Step: 66700 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:41:13,313-Speed 3187.98 samples/sec Loss 5.8808 Epoch: 3 Global Step: 66750 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:41:42,662-Speed 1744.57 samples/sec Loss 5.4076 Epoch: 4 Global Step: 66800 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:41:59,743-Speed 2997.60 samples/sec Loss 5.1928 Epoch: 4 Global Step: 66850 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:42:16,494-Speed 3056.71 samples/sec Loss 5.2036 Epoch: 4 Global Step: 66900 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:42:33,223-Speed 3060.57 samples/sec Loss 5.2344 Epoch: 4 Global Step: 66950 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:42:49,984-Speed 3054.83 samples/sec Loss 5.2605 Epoch: 4 Global Step: 67000 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:43:06,179-Speed 3161.51 samples/sec Loss 5.2503 Epoch: 4 Global Step: 67050 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:43:22,294-Speed 3177.30 samples/sec Loss 5.2764 Epoch: 4 Global Step: 67100 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:43:38,942-Speed 3075.62 samples/sec Loss 5.2839 Epoch: 4 Global Step: 67150 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:43:55,104-Speed 3167.95 samples/sec Loss 5.3171 Epoch: 4 Global Step: 67200 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:44:11,454-Speed 3131.68 samples/sec Loss 5.3227 Epoch: 4 Global Step: 67250 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:44:27,700-Speed 3151.60 samples/sec Loss 5.3185 Epoch: 4 Global Step: 67300 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:44:43,855-Speed 3169.33 samples/sec Loss 5.3569 Epoch: 4 Global Step: 67350 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:45:00,049-Speed 3161.91 samples/sec Loss 5.3183 Epoch: 4 Global Step: 67400 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:45:15,808-Speed 3249.01 samples/sec Loss 5.3691 Epoch: 4 Global Step: 67450 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:45:31,538-Speed 3254.92 samples/sec Loss 5.3330 Epoch: 4 Global Step: 67500 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:45:47,573-Speed 3193.21 samples/sec Loss 5.3768 Epoch: 4 Global Step: 67550 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:46:03,334-Speed 3248.51 samples/sec Loss 5.3751 Epoch: 4 Global Step: 67600 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:46:19,382-Speed 3190.71 samples/sec Loss 5.4465 Epoch: 4 Global Step: 67650 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:46:36,199-Speed 3044.53 samples/sec Loss 5.3836 Epoch: 4 Global Step: 67700 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:46:52,136-Speed 3212.83 samples/sec Loss 5.4477 Epoch: 4 Global Step: 67750 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:47:08,114-Speed 3204.36 samples/sec Loss 5.4304 Epoch: 4 Global Step: 67800 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:47:24,014-Speed 3220.27 samples/sec Loss 5.4626 Epoch: 4 Global Step: 67850 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:47:40,786-Speed 3052.73 samples/sec Loss 5.4470 Epoch: 4 Global Step: 67900 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:47:56,697-Speed 3218.05 samples/sec Loss 5.4415 Epoch: 4 Global Step: 67950 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:48:12,885-Speed 3163.08 samples/sec Loss 5.4581 Epoch: 4 Global Step: 68000 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:49:06,000-[lfw][68000]XNorm: 22.825502 Training: 2021-03-16 07:49:06,001-[lfw][68000]Accuracy-Flip: 0.99667+-0.00333 Training: 2021-03-16 07:49:06,001-[lfw][68000]Accuracy-Highest: 0.99717 Training: 2021-03-16 07:50:08,236-[cfp_fp][68000]XNorm: 19.697316 Training: 2021-03-16 07:50:08,236-[cfp_fp][68000]Accuracy-Flip: 0.94843+-0.01140 Training: 2021-03-16 07:50:08,236-[cfp_fp][68000]Accuracy-Highest: 0.95329 Training: 2021-03-16 07:51:01,645-[agedb_30][68000]XNorm: 22.894038 Training: 2021-03-16 07:51:01,646-[agedb_30][68000]Accuracy-Flip: 0.95733+-0.01340 Training: 2021-03-16 07:51:01,646-[agedb_30][68000]Accuracy-Highest: 0.96750 Training: 2021-03-16 07:51:17,633-Speed 277.13 samples/sec Loss 5.5207 Epoch: 4 Global Step: 68050 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:51:33,589-Speed 3208.99 samples/sec Loss 5.5224 Epoch: 4 Global Step: 68100 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:51:49,610-Speed 3195.82 samples/sec Loss 5.5171 Epoch: 4 Global Step: 68150 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:52:05,503-Speed 3221.84 samples/sec Loss 5.4990 Epoch: 4 Global Step: 68200 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:52:21,302-Speed 3240.73 samples/sec Loss 5.4773 Epoch: 4 Global Step: 68250 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:52:37,354-Speed 3189.74 samples/sec Loss 5.5050 Epoch: 4 Global Step: 68300 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:52:53,311-Speed 3208.77 samples/sec Loss 5.4989 Epoch: 4 Global Step: 68350 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:53:09,307-Speed 3200.84 samples/sec Loss 5.5522 Epoch: 4 Global Step: 68400 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:53:25,472-Speed 3167.53 samples/sec Loss 5.5192 Epoch: 4 Global Step: 68450 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:53:42,150-Speed 3070.05 samples/sec Loss 5.5925 Epoch: 4 Global Step: 68500 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:53:57,959-Speed 3238.70 samples/sec Loss 5.5326 Epoch: 4 Global Step: 68550 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:54:13,603-Speed 3272.90 samples/sec Loss 5.4882 Epoch: 4 Global Step: 68600 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:54:29,493-Speed 3222.33 samples/sec Loss 5.5259 Epoch: 4 Global Step: 68650 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:54:45,452-Speed 3208.21 samples/sec Loss 5.5922 Epoch: 4 Global Step: 68700 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:55:01,390-Speed 3212.62 samples/sec Loss 5.5494 Epoch: 4 Global Step: 68750 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:55:17,311-Speed 3215.94 samples/sec Loss 5.5763 Epoch: 4 Global Step: 68800 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:55:33,150-Speed 3232.68 samples/sec Loss 5.5768 Epoch: 4 Global Step: 68850 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:55:49,157-Speed 3198.65 samples/sec Loss 5.5332 Epoch: 4 Global Step: 68900 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:56:05,117-Speed 3208.03 samples/sec Loss 5.6137 Epoch: 4 Global Step: 68950 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:56:21,079-Speed 3207.83 samples/sec Loss 5.6249 Epoch: 4 Global Step: 69000 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:56:37,233-Speed 3169.52 samples/sec Loss 5.5761 Epoch: 4 Global Step: 69050 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:56:54,728-Speed 2926.74 samples/sec Loss 5.5951 Epoch: 4 Global Step: 69100 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:57:11,509-Speed 3051.15 samples/sec Loss 5.5877 Epoch: 4 Global Step: 69150 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:57:28,322-Speed 3045.39 samples/sec Loss 5.5800 Epoch: 4 Global Step: 69200 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:57:44,097-Speed 3245.61 samples/sec Loss 5.5943 Epoch: 4 Global Step: 69250 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:58:00,691-Speed 3085.53 samples/sec Loss 5.6263 Epoch: 4 Global Step: 69300 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:58:16,454-Speed 3248.36 samples/sec Loss 5.6669 Epoch: 4 Global Step: 69350 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:58:32,660-Speed 3159.31 samples/sec Loss 5.6400 Epoch: 4 Global Step: 69400 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:58:48,595-Speed 3213.26 samples/sec Loss 5.6258 Epoch: 4 Global Step: 69450 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:59:04,613-Speed 3196.52 samples/sec Loss 5.6379 Epoch: 4 Global Step: 69500 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:59:20,617-Speed 3199.20 samples/sec Loss 5.6430 Epoch: 4 Global Step: 69550 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:59:36,848-Speed 3154.64 samples/sec Loss 5.5888 Epoch: 4 Global Step: 69600 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 07:59:53,089-Speed 3152.66 samples/sec Loss 5.6532 Epoch: 4 Global Step: 69650 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:00:08,924-Speed 3233.31 samples/sec Loss 5.6360 Epoch: 4 Global Step: 69700 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:00:24,639-Speed 3258.14 samples/sec Loss 5.6815 Epoch: 4 Global Step: 69750 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:00:41,440-Speed 3047.65 samples/sec Loss 5.6850 Epoch: 4 Global Step: 69800 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:00:57,238-Speed 3240.84 samples/sec Loss 5.6460 Epoch: 4 Global Step: 69850 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:01:13,099-Speed 3228.10 samples/sec Loss 5.6589 Epoch: 4 Global Step: 69900 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:01:29,280-Speed 3164.39 samples/sec Loss 5.6514 Epoch: 4 Global Step: 69950 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:01:46,016-Speed 3059.34 samples/sec Loss 5.6072 Epoch: 4 Global Step: 70000 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:02:39,397-[lfw][70000]XNorm: 21.769877 Training: 2021-03-16 08:02:39,397-[lfw][70000]Accuracy-Flip: 0.99450+-0.00334 Training: 2021-03-16 08:02:39,397-[lfw][70000]Accuracy-Highest: 0.99717 Training: 2021-03-16 08:03:43,007-[cfp_fp][70000]XNorm: 19.132068 Training: 2021-03-16 08:03:43,007-[cfp_fp][70000]Accuracy-Flip: 0.95414+-0.00614 Training: 2021-03-16 08:03:43,007-[cfp_fp][70000]Accuracy-Highest: 0.95414 Training: 2021-03-16 08:04:36,029-[agedb_30][70000]XNorm: 21.300509 Training: 2021-03-16 08:04:36,029-[agedb_30][70000]Accuracy-Flip: 0.96250+-0.00579 Training: 2021-03-16 08:04:36,030-[agedb_30][70000]Accuracy-Highest: 0.96750 Training: 2021-03-16 08:04:51,801-Speed 275.59 samples/sec Loss 5.6512 Epoch: 4 Global Step: 70050 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:05:07,740-Speed 3212.28 samples/sec Loss 5.6919 Epoch: 4 Global Step: 70100 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:05:23,612-Speed 3226.01 samples/sec Loss 5.6954 Epoch: 4 Global Step: 70150 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:05:39,489-Speed 3224.89 samples/sec Loss 5.6540 Epoch: 4 Global Step: 70200 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:05:55,360-Speed 3226.11 samples/sec Loss 5.7494 Epoch: 4 Global Step: 70250 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:06:11,724-Speed 3128.94 samples/sec Loss 5.6364 Epoch: 4 Global Step: 70300 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:06:27,551-Speed 3234.92 samples/sec Loss 5.6981 Epoch: 4 Global Step: 70350 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:06:43,329-Speed 3245.16 samples/sec Loss 5.6603 Epoch: 4 Global Step: 70400 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:06:59,174-Speed 3231.53 samples/sec Loss 5.6422 Epoch: 4 Global Step: 70450 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:07:15,002-Speed 3234.84 samples/sec Loss 5.6804 Epoch: 4 Global Step: 70500 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:07:30,906-Speed 3219.43 samples/sec Loss 5.6598 Epoch: 4 Global Step: 70550 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:07:46,845-Speed 3212.35 samples/sec Loss 5.7324 Epoch: 4 Global Step: 70600 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:08:03,606-Speed 3054.74 samples/sec Loss 5.6607 Epoch: 4 Global Step: 70650 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:08:19,576-Speed 3206.08 samples/sec Loss 5.7198 Epoch: 4 Global Step: 70700 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:08:35,519-Speed 3211.55 samples/sec Loss 5.7228 Epoch: 4 Global Step: 70750 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:08:51,361-Speed 3232.10 samples/sec Loss 5.7084 Epoch: 4 Global Step: 70800 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:09:07,195-Speed 3233.54 samples/sec Loss 5.6958 Epoch: 4 Global Step: 70850 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:09:23,032-Speed 3233.14 samples/sec Loss 5.7119 Epoch: 4 Global Step: 70900 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:09:38,853-Speed 3236.31 samples/sec Loss 5.7301 Epoch: 4 Global Step: 70950 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:09:54,846-Speed 3201.46 samples/sec Loss 5.6853 Epoch: 4 Global Step: 71000 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:10:10,676-Speed 3234.47 samples/sec Loss 5.6556 Epoch: 4 Global Step: 71050 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:10:26,493-Speed 3237.03 samples/sec Loss 5.6848 Epoch: 4 Global Step: 71100 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:10:42,477-Speed 3203.40 samples/sec Loss 5.6681 Epoch: 4 Global Step: 71150 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:10:58,413-Speed 3212.92 samples/sec Loss 5.7069 Epoch: 4 Global Step: 71200 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:11:16,180-Speed 2881.91 samples/sec Loss 5.6925 Epoch: 4 Global Step: 71250 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:11:33,483-Speed 2959.02 samples/sec Loss 5.7164 Epoch: 4 Global Step: 71300 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:11:49,775-Speed 3142.72 samples/sec Loss 5.6818 Epoch: 4 Global Step: 71350 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:12:05,600-Speed 3235.50 samples/sec Loss 5.6995 Epoch: 4 Global Step: 71400 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:12:22,425-Speed 3043.14 samples/sec Loss 5.7026 Epoch: 4 Global Step: 71450 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:12:38,447-Speed 3195.74 samples/sec Loss 5.6836 Epoch: 4 Global Step: 71500 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:12:54,385-Speed 3212.77 samples/sec Loss 5.7110 Epoch: 4 Global Step: 71550 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:13:10,391-Speed 3198.76 samples/sec Loss 5.7079 Epoch: 4 Global Step: 71600 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:13:26,232-Speed 3232.37 samples/sec Loss 5.7663 Epoch: 4 Global Step: 71650 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:13:42,209-Speed 3204.70 samples/sec Loss 5.7419 Epoch: 4 Global Step: 71700 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:13:58,315-Speed 3179.03 samples/sec Loss 5.7119 Epoch: 4 Global Step: 71750 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:14:14,177-Speed 3227.90 samples/sec Loss 5.6819 Epoch: 4 Global Step: 71800 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:14:30,078-Speed 3219.99 samples/sec Loss 5.7230 Epoch: 4 Global Step: 71850 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:14:46,586-Speed 3101.70 samples/sec Loss 5.7233 Epoch: 4 Global Step: 71900 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:15:02,487-Speed 3220.09 samples/sec Loss 5.7797 Epoch: 4 Global Step: 71950 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:15:18,425-Speed 3212.49 samples/sec Loss 5.7807 Epoch: 4 Global Step: 72000 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:16:13,363-[lfw][72000]XNorm: 22.730792 Training: 2021-03-16 08:16:13,363-[lfw][72000]Accuracy-Flip: 0.99650+-0.00293 Training: 2021-03-16 08:16:13,363-[lfw][72000]Accuracy-Highest: 0.99717 Training: 2021-03-16 08:17:15,667-[cfp_fp][72000]XNorm: 19.627744 Training: 2021-03-16 08:17:15,667-[cfp_fp][72000]Accuracy-Flip: 0.95457+-0.00923 Training: 2021-03-16 08:17:15,667-[cfp_fp][72000]Accuracy-Highest: 0.95457 Training: 2021-03-16 08:18:09,191-[agedb_30][72000]XNorm: 22.098233 Training: 2021-03-16 08:18:09,191-[agedb_30][72000]Accuracy-Flip: 0.96067+-0.01088 Training: 2021-03-16 08:18:09,192-[agedb_30][72000]Accuracy-Highest: 0.96750 Training: 2021-03-16 08:18:25,106-Speed 274.27 samples/sec Loss 5.7671 Epoch: 4 Global Step: 72050 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:18:41,910-Speed 3046.94 samples/sec Loss 5.7585 Epoch: 4 Global Step: 72100 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:18:57,609-Speed 3261.43 samples/sec Loss 5.7452 Epoch: 4 Global Step: 72150 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:19:13,515-Speed 3218.97 samples/sec Loss 5.7735 Epoch: 4 Global Step: 72200 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:19:29,394-Speed 3224.50 samples/sec Loss 5.7367 Epoch: 4 Global Step: 72250 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:19:45,581-Speed 3163.06 samples/sec Loss 5.7425 Epoch: 4 Global Step: 72300 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:20:01,440-Speed 3228.64 samples/sec Loss 5.7169 Epoch: 4 Global Step: 72350 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:20:17,426-Speed 3202.90 samples/sec Loss 5.7620 Epoch: 4 Global Step: 72400 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:20:33,427-Speed 3199.93 samples/sec Loss 5.7281 Epoch: 4 Global Step: 72450 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:20:49,344-Speed 3216.86 samples/sec Loss 5.7653 Epoch: 4 Global Step: 72500 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:21:05,319-Speed 3205.09 samples/sec Loss 5.6778 Epoch: 4 Global Step: 72550 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:21:21,336-Speed 3196.71 samples/sec Loss 5.7261 Epoch: 4 Global Step: 72600 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:21:37,177-Speed 3232.27 samples/sec Loss 5.7585 Epoch: 4 Global Step: 72650 Fp16 Grad Scale: 16384 Required: 30 hours Training: 2021-03-16 08:21:52,905-Speed 3255.28 samples/sec Loss 5.6772 Epoch: 4 Global Step: 72700 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:22:08,945-Speed 3192.17 samples/sec Loss 5.7610 Epoch: 4 Global Step: 72750 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:22:25,779-Speed 3041.66 samples/sec Loss 5.7061 Epoch: 4 Global Step: 72800 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:22:41,719-Speed 3212.14 samples/sec Loss 5.7715 Epoch: 4 Global Step: 72850 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:22:57,856-Speed 3172.90 samples/sec Loss 5.7521 Epoch: 4 Global Step: 72900 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:23:13,761-Speed 3219.25 samples/sec Loss 5.7415 Epoch: 4 Global Step: 72950 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:23:29,654-Speed 3221.63 samples/sec Loss 5.7603 Epoch: 4 Global Step: 73000 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:23:45,526-Speed 3225.88 samples/sec Loss 5.6874 Epoch: 4 Global Step: 73050 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:24:01,478-Speed 3209.73 samples/sec Loss 5.7035 Epoch: 4 Global Step: 73100 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:24:17,255-Speed 3245.31 samples/sec Loss 5.7544 Epoch: 4 Global Step: 73150 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:24:33,316-Speed 3188.04 samples/sec Loss 5.7818 Epoch: 4 Global Step: 73200 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:24:49,101-Speed 3243.54 samples/sec Loss 5.7719 Epoch: 4 Global Step: 73250 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:25:05,123-Speed 3195.71 samples/sec Loss 5.7207 Epoch: 4 Global Step: 73300 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:25:21,886-Speed 3054.50 samples/sec Loss 5.7727 Epoch: 4 Global Step: 73350 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:25:37,764-Speed 3224.78 samples/sec Loss 5.7277 Epoch: 4 Global Step: 73400 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:25:56,246-Speed 2770.34 samples/sec Loss 5.7216 Epoch: 4 Global Step: 73450 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:26:12,125-Speed 3224.46 samples/sec Loss 5.7765 Epoch: 4 Global Step: 73500 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:26:28,111-Speed 3202.73 samples/sec Loss 5.7359 Epoch: 4 Global Step: 73550 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:26:45,107-Speed 3012.62 samples/sec Loss 5.7297 Epoch: 4 Global Step: 73600 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:27:01,165-Speed 3188.60 samples/sec Loss 5.8029 Epoch: 4 Global Step: 73650 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:27:16,997-Speed 3234.04 samples/sec Loss 5.7475 Epoch: 4 Global Step: 73700 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:27:32,742-Speed 3251.91 samples/sec Loss 5.7493 Epoch: 4 Global Step: 73750 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:27:48,528-Speed 3243.38 samples/sec Loss 5.7646 Epoch: 4 Global Step: 73800 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:28:04,249-Speed 3256.98 samples/sec Loss 5.7236 Epoch: 4 Global Step: 73850 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:28:19,999-Speed 3250.95 samples/sec Loss 5.7783 Epoch: 4 Global Step: 73900 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:28:35,930-Speed 3213.95 samples/sec Loss 5.7669 Epoch: 4 Global Step: 73950 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:28:51,742-Speed 3238.21 samples/sec Loss 5.7593 Epoch: 4 Global Step: 74000 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:29:45,024-[lfw][74000]XNorm: 23.221317 Training: 2021-03-16 08:29:45,024-[lfw][74000]Accuracy-Flip: 0.99633+-0.00306 Training: 2021-03-16 08:29:45,024-[lfw][74000]Accuracy-Highest: 0.99717 Training: 2021-03-16 08:30:46,861-[cfp_fp][74000]XNorm: 19.947661 Training: 2021-03-16 08:30:46,862-[cfp_fp][74000]Accuracy-Flip: 0.94629+-0.01552 Training: 2021-03-16 08:30:46,862-[cfp_fp][74000]Accuracy-Highest: 0.95457 Training: 2021-03-16 08:31:39,960-[agedb_30][74000]XNorm: 22.521903 Training: 2021-03-16 08:31:39,960-[agedb_30][74000]Accuracy-Flip: 0.95517+-0.00732 Training: 2021-03-16 08:31:39,960-[agedb_30][74000]Accuracy-Highest: 0.96750 Training: 2021-03-16 08:31:56,595-Speed 276.98 samples/sec Loss 5.7784 Epoch: 4 Global Step: 74050 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:32:12,453-Speed 3228.90 samples/sec Loss 5.7430 Epoch: 4 Global Step: 74100 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:32:28,295-Speed 3231.91 samples/sec Loss 5.7735 Epoch: 4 Global Step: 74150 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:32:44,047-Speed 3250.55 samples/sec Loss 5.7323 Epoch: 4 Global Step: 74200 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:33:00,042-Speed 3201.22 samples/sec Loss 5.7923 Epoch: 4 Global Step: 74250 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:33:16,724-Speed 3069.20 samples/sec Loss 5.7421 Epoch: 4 Global Step: 74300 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:33:32,549-Speed 3235.43 samples/sec Loss 5.7456 Epoch: 4 Global Step: 74350 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:33:48,785-Speed 3153.56 samples/sec Loss 5.7921 Epoch: 4 Global Step: 74400 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:34:04,538-Speed 3250.41 samples/sec Loss 5.7479 Epoch: 4 Global Step: 74450 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:34:20,341-Speed 3239.96 samples/sec Loss 5.8002 Epoch: 4 Global Step: 74500 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:34:36,553-Speed 3158.32 samples/sec Loss 5.7834 Epoch: 4 Global Step: 74550 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:34:52,571-Speed 3196.39 samples/sec Loss 5.7485 Epoch: 4 Global Step: 74600 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:35:08,765-Speed 3161.78 samples/sec Loss 5.7458 Epoch: 4 Global Step: 74650 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:35:24,682-Speed 3216.84 samples/sec Loss 5.8044 Epoch: 4 Global Step: 74700 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:35:40,618-Speed 3212.91 samples/sec Loss 5.6989 Epoch: 4 Global Step: 74750 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:35:56,566-Speed 3210.52 samples/sec Loss 5.7488 Epoch: 4 Global Step: 74800 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:36:12,618-Speed 3189.85 samples/sec Loss 5.7950 Epoch: 4 Global Step: 74850 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:36:28,453-Speed 3233.41 samples/sec Loss 5.7803 Epoch: 4 Global Step: 74900 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:36:45,462-Speed 3010.21 samples/sec Loss 5.7722 Epoch: 4 Global Step: 74950 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:37:01,268-Speed 3239.33 samples/sec Loss 5.7384 Epoch: 4 Global Step: 75000 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:37:17,234-Speed 3206.97 samples/sec Loss 5.7885 Epoch: 4 Global Step: 75050 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:37:32,903-Speed 3267.65 samples/sec Loss 5.7356 Epoch: 4 Global Step: 75100 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:37:48,939-Speed 3192.86 samples/sec Loss 5.8091 Epoch: 4 Global Step: 75150 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:38:04,824-Speed 3223.36 samples/sec Loss 5.7772 Epoch: 4 Global Step: 75200 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:38:20,579-Speed 3249.85 samples/sec Loss 5.7711 Epoch: 4 Global Step: 75250 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:38:36,548-Speed 3206.37 samples/sec Loss 5.7561 Epoch: 4 Global Step: 75300 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:38:52,338-Speed 3242.49 samples/sec Loss 5.7887 Epoch: 4 Global Step: 75350 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:39:08,303-Speed 3207.27 samples/sec Loss 5.7331 Epoch: 4 Global Step: 75400 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:39:24,397-Speed 3181.41 samples/sec Loss 5.8167 Epoch: 4 Global Step: 75450 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:39:41,422-Speed 3007.37 samples/sec Loss 5.7773 Epoch: 4 Global Step: 75500 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:39:58,546-Speed 2990.01 samples/sec Loss 5.7618 Epoch: 4 Global Step: 75550 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:40:16,142-Speed 2909.79 samples/sec Loss 5.8143 Epoch: 4 Global Step: 75600 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:40:32,043-Speed 3220.10 samples/sec Loss 5.7169 Epoch: 4 Global Step: 75650 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:40:48,337-Speed 3142.30 samples/sec Loss 5.7404 Epoch: 4 Global Step: 75700 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:41:04,297-Speed 3208.18 samples/sec Loss 5.7358 Epoch: 4 Global Step: 75750 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:41:21,661-Speed 2948.77 samples/sec Loss 5.7681 Epoch: 4 Global Step: 75800 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:41:37,747-Speed 3183.00 samples/sec Loss 5.7487 Epoch: 4 Global Step: 75850 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:41:53,760-Speed 3197.57 samples/sec Loss 5.7794 Epoch: 4 Global Step: 75900 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:42:09,997-Speed 3153.41 samples/sec Loss 5.7460 Epoch: 4 Global Step: 75950 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:42:26,064-Speed 3186.59 samples/sec Loss 5.7213 Epoch: 4 Global Step: 76000 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:43:19,144-[lfw][76000]XNorm: 23.192200 Training: 2021-03-16 08:43:19,145-[lfw][76000]Accuracy-Flip: 0.99683+-0.00229 Training: 2021-03-16 08:43:19,145-[lfw][76000]Accuracy-Highest: 0.99717 Training: 2021-03-16 08:44:20,950-[cfp_fp][76000]XNorm: 19.861704 Training: 2021-03-16 08:44:20,950-[cfp_fp][76000]Accuracy-Flip: 0.94929+-0.00913 Training: 2021-03-16 08:44:20,950-[cfp_fp][76000]Accuracy-Highest: 0.95457 Training: 2021-03-16 08:45:14,190-[agedb_30][76000]XNorm: 22.472966 Training: 2021-03-16 08:45:14,190-[agedb_30][76000]Accuracy-Flip: 0.96383+-0.00827 Training: 2021-03-16 08:45:14,190-[agedb_30][76000]Accuracy-Highest: 0.96750 Training: 2021-03-16 08:45:30,034-Speed 278.31 samples/sec Loss 5.7586 Epoch: 4 Global Step: 76050 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:45:46,026-Speed 3201.64 samples/sec Loss 5.7255 Epoch: 4 Global Step: 76100 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:46:02,656-Speed 3078.83 samples/sec Loss 5.7552 Epoch: 4 Global Step: 76150 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:46:18,560-Speed 3219.49 samples/sec Loss 5.7447 Epoch: 4 Global Step: 76200 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:46:34,462-Speed 3219.77 samples/sec Loss 5.7491 Epoch: 4 Global Step: 76250 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:46:50,233-Speed 3246.63 samples/sec Loss 5.7432 Epoch: 4 Global Step: 76300 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:47:06,073-Speed 3232.36 samples/sec Loss 5.6960 Epoch: 4 Global Step: 76350 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:47:22,607-Speed 3096.84 samples/sec Loss 5.7036 Epoch: 4 Global Step: 76400 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:47:38,508-Speed 3219.90 samples/sec Loss 5.7157 Epoch: 4 Global Step: 76450 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:47:54,211-Speed 3260.73 samples/sec Loss 5.8368 Epoch: 4 Global Step: 76500 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:48:10,281-Speed 3186.06 samples/sec Loss 5.7578 Epoch: 4 Global Step: 76550 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:48:26,666-Speed 3125.05 samples/sec Loss 5.7558 Epoch: 4 Global Step: 76600 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:48:42,474-Speed 3238.82 samples/sec Loss 5.7658 Epoch: 4 Global Step: 76650 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:48:58,468-Speed 3201.38 samples/sec Loss 5.7200 Epoch: 4 Global Step: 76700 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:49:14,363-Speed 3221.33 samples/sec Loss 5.7586 Epoch: 4 Global Step: 76750 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:49:30,345-Speed 3203.53 samples/sec Loss 5.7498 Epoch: 4 Global Step: 76800 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:49:46,472-Speed 3174.97 samples/sec Loss 5.7431 Epoch: 4 Global Step: 76850 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:50:02,253-Speed 3244.43 samples/sec Loss 5.7982 Epoch: 4 Global Step: 76900 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:50:17,964-Speed 3259.00 samples/sec Loss 5.7478 Epoch: 4 Global Step: 76950 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:50:33,925-Speed 3207.95 samples/sec Loss 5.8358 Epoch: 4 Global Step: 77000 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:50:49,933-Speed 3198.52 samples/sec Loss 5.7768 Epoch: 4 Global Step: 77050 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:51:06,482-Speed 3093.87 samples/sec Loss 5.7532 Epoch: 4 Global Step: 77100 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:51:22,343-Speed 3228.16 samples/sec Loss 5.7485 Epoch: 4 Global Step: 77150 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:51:38,307-Speed 3207.32 samples/sec Loss 5.7856 Epoch: 4 Global Step: 77200 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:51:54,166-Speed 3228.65 samples/sec Loss 5.7695 Epoch: 4 Global Step: 77250 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:52:10,199-Speed 3193.45 samples/sec Loss 5.7697 Epoch: 4 Global Step: 77300 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:52:25,925-Speed 3255.94 samples/sec Loss 5.7893 Epoch: 4 Global Step: 77350 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:52:42,075-Speed 3170.26 samples/sec Loss 5.7099 Epoch: 4 Global Step: 77400 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:52:57,819-Speed 3252.06 samples/sec Loss 5.7247 Epoch: 4 Global Step: 77450 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:53:13,796-Speed 3204.77 samples/sec Loss 5.7682 Epoch: 4 Global Step: 77500 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:53:29,763-Speed 3206.70 samples/sec Loss 5.8214 Epoch: 4 Global Step: 77550 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:53:45,857-Speed 3181.44 samples/sec Loss 5.7737 Epoch: 4 Global Step: 77600 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:54:01,839-Speed 3203.72 samples/sec Loss 5.7273 Epoch: 4 Global Step: 77650 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:54:19,169-Speed 2954.52 samples/sec Loss 5.7534 Epoch: 4 Global Step: 77700 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:54:35,723-Speed 3092.94 samples/sec Loss 5.6896 Epoch: 4 Global Step: 77750 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:54:52,509-Speed 3050.18 samples/sec Loss 5.7643 Epoch: 4 Global Step: 77800 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:55:08,593-Speed 3183.41 samples/sec Loss 5.7459 Epoch: 4 Global Step: 77850 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:55:25,363-Speed 3053.14 samples/sec Loss 5.6892 Epoch: 4 Global Step: 77900 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:55:41,515-Speed 3169.97 samples/sec Loss 5.7238 Epoch: 4 Global Step: 77950 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:55:57,518-Speed 3199.55 samples/sec Loss 5.7638 Epoch: 4 Global Step: 78000 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:56:50,704-[lfw][78000]XNorm: 23.063291 Training: 2021-03-16 08:56:50,705-[lfw][78000]Accuracy-Flip: 0.99550+-0.00259 Training: 2021-03-16 08:56:50,705-[lfw][78000]Accuracy-Highest: 0.99717 Training: 2021-03-16 08:57:52,415-[cfp_fp][78000]XNorm: 19.808578 Training: 2021-03-16 08:57:52,415-[cfp_fp][78000]Accuracy-Flip: 0.94729+-0.00980 Training: 2021-03-16 08:57:52,415-[cfp_fp][78000]Accuracy-Highest: 0.95457 Training: 2021-03-16 08:58:45,903-[agedb_30][78000]XNorm: 22.853619 Training: 2021-03-16 08:58:45,904-[agedb_30][78000]Accuracy-Flip: 0.95967+-0.01130 Training: 2021-03-16 08:58:45,905-[agedb_30][78000]Accuracy-Highest: 0.96750 Training: 2021-03-16 08:59:01,816-Speed 277.81 samples/sec Loss 5.7801 Epoch: 4 Global Step: 78050 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:59:17,692-Speed 3225.11 samples/sec Loss 5.6608 Epoch: 4 Global Step: 78100 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:59:33,466-Speed 3245.85 samples/sec Loss 5.8619 Epoch: 4 Global Step: 78150 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 08:59:49,436-Speed 3206.22 samples/sec Loss 5.7913 Epoch: 4 Global Step: 78200 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:00:05,468-Speed 3193.73 samples/sec Loss 5.7334 Epoch: 4 Global Step: 78250 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:00:21,353-Speed 3223.27 samples/sec Loss 5.7403 Epoch: 4 Global Step: 78300 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:00:38,204-Speed 3038.45 samples/sec Loss 5.6998 Epoch: 4 Global Step: 78350 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:00:54,169-Speed 3207.03 samples/sec Loss 5.7999 Epoch: 4 Global Step: 78400 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:01:10,004-Speed 3233.62 samples/sec Loss 5.7268 Epoch: 4 Global Step: 78450 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:01:25,793-Speed 3242.74 samples/sec Loss 5.7595 Epoch: 4 Global Step: 78500 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:01:42,606-Speed 3045.37 samples/sec Loss 5.7852 Epoch: 4 Global Step: 78550 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:01:58,623-Speed 3196.76 samples/sec Loss 5.7749 Epoch: 4 Global Step: 78600 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:02:14,464-Speed 3232.17 samples/sec Loss 5.7575 Epoch: 4 Global Step: 78650 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:02:30,631-Speed 3167.16 samples/sec Loss 5.7465 Epoch: 4 Global Step: 78700 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:02:46,784-Speed 3169.82 samples/sec Loss 5.7185 Epoch: 4 Global Step: 78750 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:03:02,475-Speed 3263.04 samples/sec Loss 5.7242 Epoch: 4 Global Step: 78800 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:03:18,210-Speed 3254.03 samples/sec Loss 5.7925 Epoch: 4 Global Step: 78850 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:03:34,130-Speed 3216.25 samples/sec Loss 5.7697 Epoch: 4 Global Step: 78900 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:03:49,862-Speed 3254.53 samples/sec Loss 5.7499 Epoch: 4 Global Step: 78950 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:04:05,895-Speed 3193.58 samples/sec Loss 5.8094 Epoch: 4 Global Step: 79000 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:04:21,711-Speed 3237.17 samples/sec Loss 5.7506 Epoch: 4 Global Step: 79050 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:04:37,412-Speed 3261.13 samples/sec Loss 5.7700 Epoch: 4 Global Step: 79100 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:04:53,567-Speed 3169.45 samples/sec Loss 5.7730 Epoch: 4 Global Step: 79150 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:05:09,529-Speed 3207.68 samples/sec Loss 5.7380 Epoch: 4 Global Step: 79200 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:05:26,263-Speed 3059.64 samples/sec Loss 5.7273 Epoch: 4 Global Step: 79250 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:05:42,379-Speed 3177.21 samples/sec Loss 5.7078 Epoch: 4 Global Step: 79300 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:05:58,430-Speed 3189.97 samples/sec Loss 5.7946 Epoch: 4 Global Step: 79350 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:06:14,203-Speed 3245.99 samples/sec Loss 5.7398 Epoch: 4 Global Step: 79400 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:06:30,122-Speed 3216.38 samples/sec Loss 5.7905 Epoch: 4 Global Step: 79450 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:06:46,020-Speed 3220.67 samples/sec Loss 5.7734 Epoch: 4 Global Step: 79500 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:07:01,880-Speed 3228.37 samples/sec Loss 5.7583 Epoch: 4 Global Step: 79550 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:07:17,947-Speed 3186.72 samples/sec Loss 5.7052 Epoch: 4 Global Step: 79600 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:07:34,124-Speed 3165.13 samples/sec Loss 5.7679 Epoch: 4 Global Step: 79650 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:07:49,743-Speed 3278.18 samples/sec Loss 5.7791 Epoch: 4 Global Step: 79700 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:08:05,509-Speed 3247.58 samples/sec Loss 5.8319 Epoch: 4 Global Step: 79750 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:08:21,411-Speed 3219.84 samples/sec Loss 5.7767 Epoch: 4 Global Step: 79800 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:08:39,049-Speed 2902.97 samples/sec Loss 5.7069 Epoch: 4 Global Step: 79850 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:08:56,563-Speed 2923.39 samples/sec Loss 5.7671 Epoch: 4 Global Step: 79900 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:09:12,760-Speed 3161.21 samples/sec Loss 5.7762 Epoch: 4 Global Step: 79950 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:09:28,585-Speed 3235.56 samples/sec Loss 5.7635 Epoch: 4 Global Step: 80000 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:10:21,849-[lfw][80000]XNorm: 22.391722 Training: 2021-03-16 09:10:21,849-[lfw][80000]Accuracy-Flip: 0.99567+-0.00367 Training: 2021-03-16 09:10:21,849-[lfw][80000]Accuracy-Highest: 0.99717 Training: 2021-03-16 09:11:23,629-[cfp_fp][80000]XNorm: 19.355565 Training: 2021-03-16 09:11:23,629-[cfp_fp][80000]Accuracy-Flip: 0.95029+-0.00873 Training: 2021-03-16 09:11:23,629-[cfp_fp][80000]Accuracy-Highest: 0.95457 Training: 2021-03-16 09:12:16,764-[agedb_30][80000]XNorm: 21.922969 Training: 2021-03-16 09:12:16,764-[agedb_30][80000]Accuracy-Flip: 0.96133+-0.00819 Training: 2021-03-16 09:12:16,764-[agedb_30][80000]Accuracy-Highest: 0.96750 Training: 2021-03-16 09:12:33,649-Speed 276.66 samples/sec Loss 5.7558 Epoch: 4 Global Step: 80050 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:12:49,769-Speed 3176.30 samples/sec Loss 5.7374 Epoch: 4 Global Step: 80100 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:13:05,542-Speed 3246.14 samples/sec Loss 5.6958 Epoch: 4 Global Step: 80150 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:13:21,578-Speed 3192.84 samples/sec Loss 5.7302 Epoch: 4 Global Step: 80200 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:13:37,618-Speed 3192.15 samples/sec Loss 5.7242 Epoch: 4 Global Step: 80250 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:13:53,488-Speed 3226.27 samples/sec Loss 5.7058 Epoch: 4 Global Step: 80300 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:14:09,370-Speed 3223.93 samples/sec Loss 5.7071 Epoch: 4 Global Step: 80350 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:14:25,638-Speed 3147.28 samples/sec Loss 5.7507 Epoch: 4 Global Step: 80400 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:14:41,750-Speed 3177.94 samples/sec Loss 5.7397 Epoch: 4 Global Step: 80450 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:14:58,296-Speed 3094.43 samples/sec Loss 5.7667 Epoch: 4 Global Step: 80500 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:15:14,168-Speed 3226.01 samples/sec Loss 5.7268 Epoch: 4 Global Step: 80550 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:15:29,891-Speed 3256.41 samples/sec Loss 5.7667 Epoch: 4 Global Step: 80600 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:15:46,690-Speed 3047.86 samples/sec Loss 5.7112 Epoch: 4 Global Step: 80650 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:16:02,577-Speed 3222.97 samples/sec Loss 5.6755 Epoch: 4 Global Step: 80700 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:16:18,349-Speed 3246.31 samples/sec Loss 5.7675 Epoch: 4 Global Step: 80750 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:16:34,144-Speed 3241.58 samples/sec Loss 5.7528 Epoch: 4 Global Step: 80800 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:16:49,971-Speed 3235.08 samples/sec Loss 5.7247 Epoch: 4 Global Step: 80850 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:17:05,958-Speed 3202.71 samples/sec Loss 5.7595 Epoch: 4 Global Step: 80900 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:17:21,977-Speed 3196.35 samples/sec Loss 5.7265 Epoch: 4 Global Step: 80950 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:17:38,070-Speed 3181.61 samples/sec Loss 5.7160 Epoch: 4 Global Step: 81000 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:17:53,945-Speed 3225.39 samples/sec Loss 5.7168 Epoch: 4 Global Step: 81050 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:18:09,900-Speed 3208.97 samples/sec Loss 5.6878 Epoch: 4 Global Step: 81100 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:18:25,911-Speed 3197.89 samples/sec Loss 5.7283 Epoch: 4 Global Step: 81150 Fp16 Grad Scale: 16384 Required: 29 hours Training: 2021-03-16 09:18:41,728-Speed 3237.13 samples/sec Loss 5.7746 Epoch: 4 Global Step: 81200 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:18:57,639-Speed 3218.10 samples/sec Loss 5.7180 Epoch: 4 Global Step: 81250 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:19:13,716-Speed 3184.77 samples/sec Loss 5.7297 Epoch: 4 Global Step: 81300 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:19:29,684-Speed 3206.41 samples/sec Loss 5.7697 Epoch: 4 Global Step: 81350 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:19:45,738-Speed 3189.36 samples/sec Loss 5.7441 Epoch: 4 Global Step: 81400 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:20:01,624-Speed 3223.01 samples/sec Loss 5.7297 Epoch: 4 Global Step: 81450 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:20:18,401-Speed 3051.97 samples/sec Loss 5.7489 Epoch: 4 Global Step: 81500 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:20:34,209-Speed 3239.04 samples/sec Loss 5.7582 Epoch: 4 Global Step: 81550 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:20:50,162-Speed 3209.49 samples/sec Loss 5.7441 Epoch: 4 Global Step: 81600 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:21:05,827-Speed 3268.62 samples/sec Loss 5.7237 Epoch: 4 Global Step: 81650 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:21:21,640-Speed 3237.87 samples/sec Loss 5.8055 Epoch: 4 Global Step: 81700 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:21:37,435-Speed 3241.68 samples/sec Loss 5.6990 Epoch: 4 Global Step: 81750 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:21:53,580-Speed 3171.24 samples/sec Loss 5.7688 Epoch: 4 Global Step: 81800 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:22:09,738-Speed 3168.91 samples/sec Loss 5.7502 Epoch: 4 Global Step: 81850 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:22:25,817-Speed 3184.30 samples/sec Loss 5.7195 Epoch: 4 Global Step: 81900 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:22:41,597-Speed 3244.66 samples/sec Loss 5.7686 Epoch: 4 Global Step: 81950 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:22:58,817-Speed 2973.39 samples/sec Loss 5.7399 Epoch: 4 Global Step: 82000 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:23:51,921-[lfw][82000]XNorm: 22.953466 Training: 2021-03-16 09:23:51,921-[lfw][82000]Accuracy-Flip: 0.99633+-0.00348 Training: 2021-03-16 09:23:51,921-[lfw][82000]Accuracy-Highest: 0.99717 Training: 2021-03-16 09:24:53,745-[cfp_fp][82000]XNorm: 19.627939 Training: 2021-03-16 09:24:53,745-[cfp_fp][82000]Accuracy-Flip: 0.95371+-0.01036 Training: 2021-03-16 09:24:53,745-[cfp_fp][82000]Accuracy-Highest: 0.95457 Training: 2021-03-16 09:25:46,876-[agedb_30][82000]XNorm: 22.532300 Training: 2021-03-16 09:25:46,876-[agedb_30][82000]Accuracy-Flip: 0.96083+-0.00708 Training: 2021-03-16 09:25:46,876-[agedb_30][82000]Accuracy-Highest: 0.96750 Training: 2021-03-16 09:26:02,627-Speed 278.55 samples/sec Loss 5.7147 Epoch: 4 Global Step: 82050 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:26:21,734-Speed 2679.75 samples/sec Loss 5.7459 Epoch: 4 Global Step: 82100 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:26:38,458-Speed 3061.47 samples/sec Loss 5.7449 Epoch: 4 Global Step: 82150 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:26:54,447-Speed 3202.39 samples/sec Loss 5.7799 Epoch: 4 Global Step: 82200 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:27:10,622-Speed 3165.37 samples/sec Loss 5.6966 Epoch: 4 Global Step: 82250 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:27:26,986-Speed 3128.98 samples/sec Loss 5.7682 Epoch: 4 Global Step: 82300 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:27:43,011-Speed 3195.13 samples/sec Loss 5.6710 Epoch: 4 Global Step: 82350 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:27:58,710-Speed 3261.47 samples/sec Loss 5.7111 Epoch: 4 Global Step: 82400 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:28:14,713-Speed 3199.52 samples/sec Loss 5.7284 Epoch: 4 Global Step: 82450 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:28:30,611-Speed 3220.59 samples/sec Loss 5.7879 Epoch: 4 Global Step: 82500 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:28:46,828-Speed 3157.42 samples/sec Loss 5.7203 Epoch: 4 Global Step: 82550 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:29:02,945-Speed 3176.68 samples/sec Loss 5.7444 Epoch: 4 Global Step: 82600 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:29:18,853-Speed 3218.78 samples/sec Loss 5.7199 Epoch: 4 Global Step: 82650 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:29:34,669-Speed 3237.31 samples/sec Loss 5.6602 Epoch: 4 Global Step: 82700 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:29:52,083-Speed 2940.15 samples/sec Loss 5.7621 Epoch: 4 Global Step: 82750 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:30:07,958-Speed 3225.30 samples/sec Loss 5.7348 Epoch: 4 Global Step: 82800 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:30:23,986-Speed 3194.55 samples/sec Loss 5.7606 Epoch: 4 Global Step: 82850 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:30:39,603-Speed 3278.66 samples/sec Loss 5.7208 Epoch: 4 Global Step: 82900 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:30:55,595-Speed 3201.70 samples/sec Loss 5.7354 Epoch: 4 Global Step: 82950 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:31:11,590-Speed 3200.92 samples/sec Loss 5.8141 Epoch: 4 Global Step: 83000 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:31:27,575-Speed 3203.27 samples/sec Loss 5.7056 Epoch: 4 Global Step: 83050 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:31:43,510-Speed 3213.14 samples/sec Loss 5.7720 Epoch: 4 Global Step: 83100 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:31:59,735-Speed 3155.79 samples/sec Loss 5.7087 Epoch: 4 Global Step: 83150 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:32:15,648-Speed 3217.45 samples/sec Loss 5.7194 Epoch: 4 Global Step: 83200 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:32:31,597-Speed 3210.41 samples/sec Loss 5.7241 Epoch: 4 Global Step: 83250 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:32:47,640-Speed 3191.41 samples/sec Loss 5.6622 Epoch: 4 Global Step: 83300 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:33:03,358-Speed 3257.56 samples/sec Loss 5.6592 Epoch: 4 Global Step: 83350 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:33:19,265-Speed 3218.94 samples/sec Loss 5.7402 Epoch: 4 Global Step: 83400 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:33:35,339-Speed 3185.26 samples/sec Loss 5.7509 Epoch: 4 Global Step: 83450 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:34:04,974-Speed 1727.73 samples/sec Loss 5.2201 Epoch: 5 Global Step: 83500 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:34:21,981-Speed 3010.62 samples/sec Loss 5.0830 Epoch: 5 Global Step: 83550 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:34:37,766-Speed 3243.70 samples/sec Loss 5.0654 Epoch: 5 Global Step: 83600 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:34:53,798-Speed 3193.70 samples/sec Loss 5.1533 Epoch: 5 Global Step: 83650 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:35:09,782-Speed 3203.35 samples/sec Loss 5.1227 Epoch: 5 Global Step: 83700 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:35:25,663-Speed 3224.19 samples/sec Loss 5.0928 Epoch: 5 Global Step: 83750 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:35:41,802-Speed 3172.61 samples/sec Loss 5.1982 Epoch: 5 Global Step: 83800 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:35:57,855-Speed 3189.44 samples/sec Loss 5.1888 Epoch: 5 Global Step: 83850 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:36:13,723-Speed 3226.69 samples/sec Loss 5.2334 Epoch: 5 Global Step: 83900 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:36:29,768-Speed 3191.17 samples/sec Loss 5.2255 Epoch: 5 Global Step: 83950 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:36:45,793-Speed 3195.10 samples/sec Loss 5.2338 Epoch: 5 Global Step: 84000 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:37:38,886-[lfw][84000]XNorm: 21.853102 Training: 2021-03-16 09:37:38,886-[lfw][84000]Accuracy-Flip: 0.99683+-0.00353 Training: 2021-03-16 09:37:38,886-[lfw][84000]Accuracy-Highest: 0.99717 Training: 2021-03-16 09:38:40,928-[cfp_fp][84000]XNorm: 19.244800 Training: 2021-03-16 09:38:40,929-[cfp_fp][84000]Accuracy-Flip: 0.95457+-0.00928 Training: 2021-03-16 09:38:40,929-[cfp_fp][84000]Accuracy-Highest: 0.95457 Training: 2021-03-16 09:39:34,161-[agedb_30][84000]XNorm: 21.744532 Training: 2021-03-16 09:39:34,161-[agedb_30][84000]Accuracy-Flip: 0.96217+-0.01101 Training: 2021-03-16 09:39:34,161-[agedb_30][84000]Accuracy-Highest: 0.96750 Training: 2021-03-16 09:39:50,100-Speed 277.80 samples/sec Loss 5.2494 Epoch: 5 Global Step: 84050 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:40:05,966-Speed 3227.01 samples/sec Loss 5.3007 Epoch: 5 Global Step: 84100 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:40:22,674-Speed 3064.54 samples/sec Loss 5.3386 Epoch: 5 Global Step: 84150 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:40:39,428-Speed 3056.03 samples/sec Loss 5.2896 Epoch: 5 Global Step: 84200 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:40:56,187-Speed 3055.27 samples/sec Loss 5.2897 Epoch: 5 Global Step: 84250 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:41:12,700-Speed 3100.62 samples/sec Loss 5.3571 Epoch: 5 Global Step: 84300 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:41:29,619-Speed 3026.34 samples/sec Loss 5.3377 Epoch: 5 Global Step: 84350 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:41:45,974-Speed 3130.65 samples/sec Loss 5.3735 Epoch: 5 Global Step: 84400 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:42:01,745-Speed 3246.60 samples/sec Loss 5.3370 Epoch: 5 Global Step: 84450 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:42:18,020-Speed 3146.06 samples/sec Loss 5.3449 Epoch: 5 Global Step: 84500 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:42:33,734-Speed 3258.22 samples/sec Loss 5.3736 Epoch: 5 Global Step: 84550 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:42:49,560-Speed 3235.39 samples/sec Loss 5.4144 Epoch: 5 Global Step: 84600 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:43:05,293-Speed 3254.46 samples/sec Loss 5.3637 Epoch: 5 Global Step: 84650 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:43:21,499-Speed 3159.29 samples/sec Loss 5.3106 Epoch: 5 Global Step: 84700 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:43:37,595-Speed 3181.09 samples/sec Loss 5.4302 Epoch: 5 Global Step: 84750 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:43:53,477-Speed 3223.83 samples/sec Loss 5.4476 Epoch: 5 Global Step: 84800 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:44:09,338-Speed 3228.24 samples/sec Loss 5.4401 Epoch: 5 Global Step: 84850 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:44:26,876-Speed 2919.34 samples/sec Loss 5.4639 Epoch: 5 Global Step: 84900 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:44:42,811-Speed 3213.26 samples/sec Loss 5.4243 Epoch: 5 Global Step: 84950 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:44:58,725-Speed 3217.44 samples/sec Loss 5.4874 Epoch: 5 Global Step: 85000 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:45:14,606-Speed 3224.10 samples/sec Loss 5.4321 Epoch: 5 Global Step: 85050 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:45:30,402-Speed 3241.32 samples/sec Loss 5.4330 Epoch: 5 Global Step: 85100 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:45:46,300-Speed 3220.73 samples/sec Loss 5.5540 Epoch: 5 Global Step: 85150 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:46:02,400-Speed 3180.13 samples/sec Loss 5.5081 Epoch: 5 Global Step: 85200 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:46:18,400-Speed 3200.09 samples/sec Loss 5.4554 Epoch: 5 Global Step: 85250 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:46:34,505-Speed 3179.23 samples/sec Loss 5.4911 Epoch: 5 Global Step: 85300 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:46:50,346-Speed 3232.17 samples/sec Loss 5.4705 Epoch: 5 Global Step: 85350 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:47:06,357-Speed 3198.08 samples/sec Loss 5.4892 Epoch: 5 Global Step: 85400 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:47:22,129-Speed 3246.28 samples/sec Loss 5.4762 Epoch: 5 Global Step: 85450 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:47:38,202-Speed 3185.55 samples/sec Loss 5.4640 Epoch: 5 Global Step: 85500 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:47:54,091-Speed 3222.46 samples/sec Loss 5.4718 Epoch: 5 Global Step: 85550 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:48:09,927-Speed 3233.33 samples/sec Loss 5.5023 Epoch: 5 Global Step: 85600 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:48:25,757-Speed 3234.34 samples/sec Loss 5.4978 Epoch: 5 Global Step: 85650 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:48:42,648-Speed 3031.31 samples/sec Loss 5.4997 Epoch: 5 Global Step: 85700 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:48:58,575-Speed 3214.80 samples/sec Loss 5.5388 Epoch: 5 Global Step: 85750 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:49:14,249-Speed 3266.56 samples/sec Loss 5.4860 Epoch: 5 Global Step: 85800 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:49:30,314-Speed 3187.18 samples/sec Loss 5.5585 Epoch: 5 Global Step: 85850 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:49:46,066-Speed 3250.54 samples/sec Loss 5.4647 Epoch: 5 Global Step: 85900 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:50:01,768-Speed 3261.01 samples/sec Loss 5.4695 Epoch: 5 Global Step: 85950 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:50:17,685-Speed 3216.80 samples/sec Loss 5.4983 Epoch: 5 Global Step: 86000 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:51:10,784-[lfw][86000]XNorm: 22.793524 Training: 2021-03-16 09:51:10,784-[lfw][86000]Accuracy-Flip: 0.99683+-0.00217 Training: 2021-03-16 09:51:10,784-[lfw][86000]Accuracy-Highest: 0.99717 Training: 2021-03-16 09:52:12,549-[cfp_fp][86000]XNorm: 19.393418 Training: 2021-03-16 09:52:12,549-[cfp_fp][86000]Accuracy-Flip: 0.94771+-0.01163 Training: 2021-03-16 09:52:12,549-[cfp_fp][86000]Accuracy-Highest: 0.95457 Training: 2021-03-16 09:53:05,629-[agedb_30][86000]XNorm: 22.032263 Training: 2021-03-16 09:53:05,629-[agedb_30][86000]Accuracy-Flip: 0.96850+-0.00938 Training: 2021-03-16 09:53:05,629-[agedb_30][86000]Accuracy-Highest: 0.96850 Training: 2021-03-16 09:53:21,676-Speed 278.27 samples/sec Loss 5.5653 Epoch: 5 Global Step: 86050 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:53:37,429-Speed 3250.40 samples/sec Loss 5.5198 Epoch: 5 Global Step: 86100 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:53:53,490-Speed 3187.79 samples/sec Loss 5.5461 Epoch: 5 Global Step: 86150 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:54:09,485-Speed 3201.13 samples/sec Loss 5.5135 Epoch: 5 Global Step: 86200 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:54:25,610-Speed 3175.42 samples/sec Loss 5.5547 Epoch: 5 Global Step: 86250 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:54:42,301-Speed 3067.57 samples/sec Loss 5.5349 Epoch: 5 Global Step: 86300 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:54:58,066-Speed 3247.81 samples/sec Loss 5.6174 Epoch: 5 Global Step: 86350 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:55:15,634-Speed 2914.45 samples/sec Loss 5.5699 Epoch: 5 Global Step: 86400 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:55:32,583-Speed 3020.90 samples/sec Loss 5.5524 Epoch: 5 Global Step: 86450 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:55:49,245-Speed 3072.99 samples/sec Loss 5.5908 Epoch: 5 Global Step: 86500 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:56:05,110-Speed 3227.46 samples/sec Loss 5.5072 Epoch: 5 Global Step: 86550 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:56:20,938-Speed 3234.74 samples/sec Loss 5.5594 Epoch: 5 Global Step: 86600 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:56:36,719-Speed 3244.45 samples/sec Loss 5.5900 Epoch: 5 Global Step: 86650 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:56:52,796-Speed 3184.83 samples/sec Loss 5.5808 Epoch: 5 Global Step: 86700 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:57:08,683-Speed 3222.94 samples/sec Loss 5.5179 Epoch: 5 Global Step: 86750 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:57:24,656-Speed 3205.58 samples/sec Loss 5.5759 Epoch: 5 Global Step: 86800 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:57:40,511-Speed 3229.29 samples/sec Loss 5.6079 Epoch: 5 Global Step: 86850 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:57:56,540-Speed 3194.29 samples/sec Loss 5.5790 Epoch: 5 Global Step: 86900 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:58:12,251-Speed 3258.91 samples/sec Loss 5.5991 Epoch: 5 Global Step: 86950 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:58:28,164-Speed 3217.75 samples/sec Loss 5.6190 Epoch: 5 Global Step: 87000 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:58:45,397-Speed 2971.08 samples/sec Loss 5.5981 Epoch: 5 Global Step: 87050 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:59:01,346-Speed 3210.34 samples/sec Loss 5.6152 Epoch: 5 Global Step: 87100 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:59:18,250-Speed 3028.90 samples/sec Loss 5.5826 Epoch: 5 Global Step: 87150 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:59:34,022-Speed 3246.43 samples/sec Loss 5.6045 Epoch: 5 Global Step: 87200 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 09:59:49,811-Speed 3242.76 samples/sec Loss 5.5803 Epoch: 5 Global Step: 87250 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:00:05,494-Speed 3264.88 samples/sec Loss 5.6113 Epoch: 5 Global Step: 87300 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:00:21,483-Speed 3202.22 samples/sec Loss 5.5525 Epoch: 5 Global Step: 87350 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:00:37,470-Speed 3202.71 samples/sec Loss 5.5741 Epoch: 5 Global Step: 87400 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:00:53,588-Speed 3176.64 samples/sec Loss 5.7105 Epoch: 5 Global Step: 87450 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:01:09,534-Speed 3210.90 samples/sec Loss 5.6015 Epoch: 5 Global Step: 87500 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:01:25,304-Speed 3246.90 samples/sec Loss 5.5799 Epoch: 5 Global Step: 87550 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:01:41,060-Speed 3249.63 samples/sec Loss 5.5844 Epoch: 5 Global Step: 87600 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:01:56,933-Speed 3225.73 samples/sec Loss 5.6057 Epoch: 5 Global Step: 87650 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:02:12,841-Speed 3218.49 samples/sec Loss 5.5699 Epoch: 5 Global Step: 87700 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:02:28,787-Speed 3211.08 samples/sec Loss 5.6048 Epoch: 5 Global Step: 87750 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:02:45,820-Speed 3005.86 samples/sec Loss 5.6614 Epoch: 5 Global Step: 87800 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:03:01,752-Speed 3213.85 samples/sec Loss 5.6149 Epoch: 5 Global Step: 87850 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:03:17,736-Speed 3203.25 samples/sec Loss 5.6895 Epoch: 5 Global Step: 87900 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:03:33,453-Speed 3257.66 samples/sec Loss 5.6531 Epoch: 5 Global Step: 87950 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:03:49,494-Speed 3192.02 samples/sec Loss 5.6356 Epoch: 5 Global Step: 88000 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:04:42,624-[lfw][88000]XNorm: 22.821369 Training: 2021-03-16 10:04:42,625-[lfw][88000]Accuracy-Flip: 0.99633+-0.00332 Training: 2021-03-16 10:04:42,625-[lfw][88000]Accuracy-Highest: 0.99717 Training: 2021-03-16 10:05:44,678-[cfp_fp][88000]XNorm: 19.879451 Training: 2021-03-16 10:05:44,678-[cfp_fp][88000]Accuracy-Flip: 0.95529+-0.00881 Training: 2021-03-16 10:05:44,678-[cfp_fp][88000]Accuracy-Highest: 0.95529 Training: 2021-03-16 10:06:38,168-[agedb_30][88000]XNorm: 22.392394 Training: 2021-03-16 10:06:38,168-[agedb_30][88000]Accuracy-Flip: 0.96100+-0.01062 Training: 2021-03-16 10:06:38,168-[agedb_30][88000]Accuracy-Highest: 0.96850 Training: 2021-03-16 10:06:54,354-Speed 276.97 samples/sec Loss 5.6553 Epoch: 5 Global Step: 88050 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:07:10,043-Speed 3263.38 samples/sec Loss 5.7163 Epoch: 5 Global Step: 88100 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:07:26,078-Speed 3193.13 samples/sec Loss 5.6828 Epoch: 5 Global Step: 88150 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:07:42,100-Speed 3195.69 samples/sec Loss 5.6452 Epoch: 5 Global Step: 88200 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:07:58,216-Speed 3177.15 samples/sec Loss 5.5862 Epoch: 5 Global Step: 88250 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:08:13,974-Speed 3249.27 samples/sec Loss 5.6430 Epoch: 5 Global Step: 88300 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:08:29,814-Speed 3232.42 samples/sec Loss 5.6344 Epoch: 5 Global Step: 88350 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:08:45,828-Speed 3197.24 samples/sec Loss 5.6073 Epoch: 5 Global Step: 88400 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:09:02,497-Speed 3071.68 samples/sec Loss 5.6256 Epoch: 5 Global Step: 88450 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:09:19,153-Speed 3074.11 samples/sec Loss 5.6429 Epoch: 5 Global Step: 88500 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:09:36,114-Speed 3018.77 samples/sec Loss 5.6631 Epoch: 5 Global Step: 88550 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:09:53,026-Speed 3027.45 samples/sec Loss 5.6894 Epoch: 5 Global Step: 88600 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:10:09,926-Speed 3029.61 samples/sec Loss 5.6940 Epoch: 5 Global Step: 88650 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:10:26,076-Speed 3170.45 samples/sec Loss 5.6647 Epoch: 5 Global Step: 88700 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:10:42,016-Speed 3212.24 samples/sec Loss 5.6457 Epoch: 5 Global Step: 88750 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:10:57,814-Speed 3240.97 samples/sec Loss 5.6492 Epoch: 5 Global Step: 88800 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:11:13,567-Speed 3250.17 samples/sec Loss 5.5968 Epoch: 5 Global Step: 88850 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:11:29,462-Speed 3221.29 samples/sec Loss 5.6085 Epoch: 5 Global Step: 88900 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:11:45,263-Speed 3240.35 samples/sec Loss 5.6861 Epoch: 5 Global Step: 88950 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:12:01,389-Speed 3175.08 samples/sec Loss 5.6871 Epoch: 5 Global Step: 89000 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:12:17,067-Speed 3265.92 samples/sec Loss 5.6361 Epoch: 5 Global Step: 89050 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:12:33,089-Speed 3195.73 samples/sec Loss 5.6826 Epoch: 5 Global Step: 89100 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:12:48,821-Speed 3254.51 samples/sec Loss 5.6569 Epoch: 5 Global Step: 89150 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:13:05,715-Speed 3030.85 samples/sec Loss 5.6468 Epoch: 5 Global Step: 89200 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:13:22,333-Speed 3081.06 samples/sec Loss 5.6718 Epoch: 5 Global Step: 89250 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:13:38,461-Speed 3174.78 samples/sec Loss 5.6226 Epoch: 5 Global Step: 89300 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:13:54,579-Speed 3176.53 samples/sec Loss 5.6746 Epoch: 5 Global Step: 89350 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:14:10,562-Speed 3203.58 samples/sec Loss 5.6844 Epoch: 5 Global Step: 89400 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:14:26,663-Speed 3179.94 samples/sec Loss 5.7197 Epoch: 5 Global Step: 89450 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:14:42,697-Speed 3193.39 samples/sec Loss 5.6508 Epoch: 5 Global Step: 89500 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:14:58,949-Speed 3150.53 samples/sec Loss 5.6655 Epoch: 5 Global Step: 89550 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:15:14,760-Speed 3238.36 samples/sec Loss 5.6847 Epoch: 5 Global Step: 89600 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:15:30,989-Speed 3154.89 samples/sec Loss 5.6997 Epoch: 5 Global Step: 89650 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:15:47,055-Speed 3186.89 samples/sec Loss 5.6034 Epoch: 5 Global Step: 89700 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:16:03,015-Speed 3208.17 samples/sec Loss 5.6535 Epoch: 5 Global Step: 89750 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:16:18,969-Speed 3209.20 samples/sec Loss 5.6344 Epoch: 5 Global Step: 89800 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:16:34,745-Speed 3245.68 samples/sec Loss 5.6971 Epoch: 5 Global Step: 89850 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:16:50,526-Speed 3244.46 samples/sec Loss 5.6705 Epoch: 5 Global Step: 89900 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:17:07,330-Speed 3046.97 samples/sec Loss 5.6701 Epoch: 5 Global Step: 89950 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:17:23,417-Speed 3182.89 samples/sec Loss 5.5792 Epoch: 5 Global Step: 90000 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:18:16,497-[lfw][90000]XNorm: 24.148110 Training: 2021-03-16 10:18:16,497-[lfw][90000]Accuracy-Flip: 0.99600+-0.00271 Training: 2021-03-16 10:18:16,497-[lfw][90000]Accuracy-Highest: 0.99717 Training: 2021-03-16 10:19:18,050-[cfp_fp][90000]XNorm: 20.933203 Training: 2021-03-16 10:19:18,050-[cfp_fp][90000]Accuracy-Flip: 0.95757+-0.00988 Training: 2021-03-16 10:19:18,050-[cfp_fp][90000]Accuracy-Highest: 0.95757 Training: 2021-03-16 10:20:11,493-[agedb_30][90000]XNorm: 23.658337 Training: 2021-03-16 10:20:11,493-[agedb_30][90000]Accuracy-Flip: 0.96450+-0.00723 Training: 2021-03-16 10:20:11,493-[agedb_30][90000]Accuracy-Highest: 0.96850 Training: 2021-03-16 10:20:27,491-Speed 278.15 samples/sec Loss 5.7000 Epoch: 5 Global Step: 90050 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:20:43,380-Speed 3222.54 samples/sec Loss 5.6499 Epoch: 5 Global Step: 90100 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:20:59,124-Speed 3252.13 samples/sec Loss 5.6185 Epoch: 5 Global Step: 90150 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:21:15,162-Speed 3192.42 samples/sec Loss 5.6077 Epoch: 5 Global Step: 90200 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:21:31,005-Speed 3231.94 samples/sec Loss 5.7134 Epoch: 5 Global Step: 90250 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:21:46,795-Speed 3242.63 samples/sec Loss 5.6977 Epoch: 5 Global Step: 90300 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:22:02,690-Speed 3221.16 samples/sec Loss 5.7020 Epoch: 5 Global Step: 90350 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:22:18,580-Speed 3222.33 samples/sec Loss 5.7001 Epoch: 5 Global Step: 90400 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:22:34,794-Speed 3157.86 samples/sec Loss 5.6835 Epoch: 5 Global Step: 90450 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:22:50,662-Speed 3226.68 samples/sec Loss 5.7079 Epoch: 5 Global Step: 90500 Fp16 Grad Scale: 16384 Required: 28 hours Training: 2021-03-16 10:23:06,835-Speed 3165.98 samples/sec Loss 5.6624 Epoch: 5 Global Step: 90550 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:23:23,710-Speed 3034.10 samples/sec Loss 5.6004 Epoch: 5 Global Step: 90600 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:23:39,687-Speed 3204.64 samples/sec Loss 5.6431 Epoch: 5 Global Step: 90650 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:23:56,492-Speed 3046.81 samples/sec Loss 5.6738 Epoch: 5 Global Step: 90700 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:24:12,591-Speed 3180.47 samples/sec Loss 5.6365 Epoch: 5 Global Step: 90750 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:24:30,148-Speed 2916.32 samples/sec Loss 5.6653 Epoch: 5 Global Step: 90800 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:24:46,180-Speed 3193.74 samples/sec Loss 5.6153 Epoch: 5 Global Step: 90850 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:25:02,249-Speed 3186.25 samples/sec Loss 5.6571 Epoch: 5 Global Step: 90900 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:25:18,217-Speed 3206.48 samples/sec Loss 5.6376 Epoch: 5 Global Step: 90950 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:25:34,116-Speed 3220.60 samples/sec Loss 5.6873 Epoch: 5 Global Step: 91000 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:25:49,911-Speed 3241.46 samples/sec Loss 5.6887 Epoch: 5 Global Step: 91050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:26:06,149-Speed 3153.22 samples/sec Loss 5.6410 Epoch: 5 Global Step: 91100 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:26:22,323-Speed 3165.64 samples/sec Loss 5.6533 Epoch: 5 Global Step: 91150 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:26:38,355-Speed 3193.83 samples/sec Loss 5.6681 Epoch: 5 Global Step: 91200 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:26:54,314-Speed 3208.19 samples/sec Loss 5.6628 Epoch: 5 Global Step: 91250 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:27:10,646-Speed 3135.12 samples/sec Loss 5.7143 Epoch: 5 Global Step: 91300 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:27:26,424-Speed 3245.14 samples/sec Loss 5.7047 Epoch: 5 Global Step: 91350 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:27:42,337-Speed 3217.50 samples/sec Loss 5.6986 Epoch: 5 Global Step: 91400 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:27:59,998-Speed 2899.18 samples/sec Loss 5.7109 Epoch: 5 Global Step: 91450 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:28:15,882-Speed 3223.49 samples/sec Loss 5.7570 Epoch: 5 Global Step: 91500 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:28:31,787-Speed 3219.14 samples/sec Loss 5.7364 Epoch: 5 Global Step: 91550 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:28:47,546-Speed 3249.07 samples/sec Loss 5.6892 Epoch: 5 Global Step: 91600 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:29:03,439-Speed 3221.72 samples/sec Loss 5.7038 Epoch: 5 Global Step: 91650 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:29:19,472-Speed 3193.39 samples/sec Loss 5.6811 Epoch: 5 Global Step: 91700 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:29:35,709-Speed 3153.40 samples/sec Loss 5.7237 Epoch: 5 Global Step: 91750 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:29:51,903-Speed 3161.74 samples/sec Loss 5.6469 Epoch: 5 Global Step: 91800 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:30:07,877-Speed 3205.26 samples/sec Loss 5.6666 Epoch: 5 Global Step: 91850 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:30:23,693-Speed 3237.44 samples/sec Loss 5.6306 Epoch: 5 Global Step: 91900 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:30:39,715-Speed 3195.71 samples/sec Loss 5.6668 Epoch: 5 Global Step: 91950 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:30:55,804-Speed 3182.44 samples/sec Loss 5.6997 Epoch: 5 Global Step: 92000 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:31:49,164-[lfw][92000]XNorm: 23.354356 Training: 2021-03-16 10:31:49,165-[lfw][92000]Accuracy-Flip: 0.99700+-0.00221 Training: 2021-03-16 10:31:49,165-[lfw][92000]Accuracy-Highest: 0.99717 Training: 2021-03-16 10:32:51,227-[cfp_fp][92000]XNorm: 20.268730 Training: 2021-03-16 10:32:51,228-[cfp_fp][92000]Accuracy-Flip: 0.95186+-0.00764 Training: 2021-03-16 10:32:51,228-[cfp_fp][92000]Accuracy-Highest: 0.95757 Training: 2021-03-16 10:33:44,395-[agedb_30][92000]XNorm: 22.712846 Training: 2021-03-16 10:33:44,395-[agedb_30][92000]Accuracy-Flip: 0.96483+-0.00787 Training: 2021-03-16 10:33:44,395-[agedb_30][92000]Accuracy-Highest: 0.96850 Training: 2021-03-16 10:34:01,116-Speed 276.29 samples/sec Loss 5.7196 Epoch: 5 Global Step: 92050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:34:17,260-Speed 3171.50 samples/sec Loss 5.7348 Epoch: 5 Global Step: 92100 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:34:33,068-Speed 3238.95 samples/sec Loss 5.6760 Epoch: 5 Global Step: 92150 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:34:49,077-Speed 3198.30 samples/sec Loss 5.6931 Epoch: 5 Global Step: 92200 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:35:05,097-Speed 3196.15 samples/sec Loss 5.6702 Epoch: 5 Global Step: 92250 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:35:20,978-Speed 3223.95 samples/sec Loss 5.6629 Epoch: 5 Global Step: 92300 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:35:37,015-Speed 3192.78 samples/sec Loss 5.7051 Epoch: 5 Global Step: 92350 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:35:53,119-Speed 3179.48 samples/sec Loss 5.6916 Epoch: 5 Global Step: 92400 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:36:09,005-Speed 3222.97 samples/sec Loss 5.7114 Epoch: 5 Global Step: 92450 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:36:25,132-Speed 3174.98 samples/sec Loss 5.7227 Epoch: 5 Global Step: 92500 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:36:41,003-Speed 3226.02 samples/sec Loss 5.6442 Epoch: 5 Global Step: 92550 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:36:57,115-Speed 3177.96 samples/sec Loss 5.6646 Epoch: 5 Global Step: 92600 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:37:13,367-Speed 3150.52 samples/sec Loss 5.6841 Epoch: 5 Global Step: 92650 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:37:28,973-Speed 3280.70 samples/sec Loss 5.6972 Epoch: 5 Global Step: 92700 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:37:45,786-Speed 3045.35 samples/sec Loss 5.6565 Epoch: 5 Global Step: 92750 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:38:01,690-Speed 3219.56 samples/sec Loss 5.7071 Epoch: 5 Global Step: 92800 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:38:17,897-Speed 3159.14 samples/sec Loss 5.7097 Epoch: 5 Global Step: 92850 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:38:35,412-Speed 2923.36 samples/sec Loss 5.6301 Epoch: 5 Global Step: 92900 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:38:53,185-Speed 2880.78 samples/sec Loss 5.6612 Epoch: 5 Global Step: 92950 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:39:09,371-Speed 3163.50 samples/sec Loss 5.6522 Epoch: 5 Global Step: 93000 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:39:25,555-Speed 3163.58 samples/sec Loss 5.6867 Epoch: 5 Global Step: 93050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:39:41,634-Speed 3184.50 samples/sec Loss 5.6184 Epoch: 5 Global Step: 93100 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:39:57,981-Speed 3132.03 samples/sec Loss 5.7398 Epoch: 5 Global Step: 93150 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:40:13,725-Speed 3252.11 samples/sec Loss 5.7275 Epoch: 5 Global Step: 93200 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:40:29,790-Speed 3187.29 samples/sec Loss 5.7101 Epoch: 5 Global Step: 93250 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:40:45,943-Speed 3169.82 samples/sec Loss 5.6838 Epoch: 5 Global Step: 93300 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:41:02,099-Speed 3169.08 samples/sec Loss 5.6793 Epoch: 5 Global Step: 93350 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:41:17,922-Speed 3236.02 samples/sec Loss 5.7026 Epoch: 5 Global Step: 93400 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:41:34,059-Speed 3172.80 samples/sec Loss 5.6943 Epoch: 5 Global Step: 93450 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:41:50,119-Speed 3188.18 samples/sec Loss 5.6504 Epoch: 5 Global Step: 93500 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:42:06,266-Speed 3170.95 samples/sec Loss 5.6630 Epoch: 5 Global Step: 93550 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:42:23,918-Speed 2900.71 samples/sec Loss 5.6940 Epoch: 5 Global Step: 93600 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:42:39,910-Speed 3201.58 samples/sec Loss 5.7143 Epoch: 5 Global Step: 93650 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:42:55,913-Speed 3199.46 samples/sec Loss 5.6333 Epoch: 5 Global Step: 93700 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:43:11,841-Speed 3214.61 samples/sec Loss 5.6867 Epoch: 5 Global Step: 93750 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:43:27,677-Speed 3233.24 samples/sec Loss 5.7006 Epoch: 5 Global Step: 93800 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:43:43,526-Speed 3230.67 samples/sec Loss 5.6997 Epoch: 5 Global Step: 93850 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:43:59,211-Speed 3264.28 samples/sec Loss 5.7493 Epoch: 5 Global Step: 93900 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:44:15,216-Speed 3199.04 samples/sec Loss 5.7132 Epoch: 5 Global Step: 93950 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:44:31,184-Speed 3206.62 samples/sec Loss 5.7106 Epoch: 5 Global Step: 94000 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:45:24,500-[lfw][94000]XNorm: 21.469482 Training: 2021-03-16 10:45:24,501-[lfw][94000]Accuracy-Flip: 0.99700+-0.00296 Training: 2021-03-16 10:45:24,501-[lfw][94000]Accuracy-Highest: 0.99717 Training: 2021-03-16 10:46:26,614-[cfp_fp][94000]XNorm: 18.227408 Training: 2021-03-16 10:46:26,615-[cfp_fp][94000]Accuracy-Flip: 0.95514+-0.00881 Training: 2021-03-16 10:46:26,615-[cfp_fp][94000]Accuracy-Highest: 0.95757 Training: 2021-03-16 10:47:20,113-[agedb_30][94000]XNorm: 20.741873 Training: 2021-03-16 10:47:20,113-[agedb_30][94000]Accuracy-Flip: 0.95933+-0.00917 Training: 2021-03-16 10:47:20,113-[agedb_30][94000]Accuracy-Highest: 0.96850 Training: 2021-03-16 10:47:35,989-Speed 277.05 samples/sec Loss 5.6817 Epoch: 5 Global Step: 94050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:47:52,161-Speed 3165.94 samples/sec Loss 5.6145 Epoch: 5 Global Step: 94100 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:48:07,880-Speed 3257.30 samples/sec Loss 5.7035 Epoch: 5 Global Step: 94150 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:48:23,989-Speed 3178.52 samples/sec Loss 5.7541 Epoch: 5 Global Step: 94200 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:48:40,319-Speed 3135.51 samples/sec Loss 5.6751 Epoch: 5 Global Step: 94250 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:48:57,031-Speed 3063.63 samples/sec Loss 5.6820 Epoch: 5 Global Step: 94300 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:49:13,026-Speed 3201.04 samples/sec Loss 5.6763 Epoch: 5 Global Step: 94350 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:49:28,789-Speed 3248.21 samples/sec Loss 5.7113 Epoch: 5 Global Step: 94400 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:49:44,695-Speed 3219.15 samples/sec Loss 5.6886 Epoch: 5 Global Step: 94450 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:50:01,077-Speed 3125.39 samples/sec Loss 5.6701 Epoch: 5 Global Step: 94500 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:50:16,797-Speed 3257.21 samples/sec Loss 5.6735 Epoch: 5 Global Step: 94550 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:50:32,855-Speed 3188.41 samples/sec Loss 5.6985 Epoch: 5 Global Step: 94600 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:50:48,689-Speed 3233.77 samples/sec Loss 5.7065 Epoch: 5 Global Step: 94650 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:51:04,858-Speed 3166.59 samples/sec Loss 5.6971 Epoch: 5 Global Step: 94700 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:51:20,624-Speed 3247.62 samples/sec Loss 5.6198 Epoch: 5 Global Step: 94750 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:51:36,596-Speed 3205.75 samples/sec Loss 5.6577 Epoch: 5 Global Step: 94800 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:51:53,000-Speed 3121.13 samples/sec Loss 5.6846 Epoch: 5 Global Step: 94850 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:52:08,929-Speed 3214.37 samples/sec Loss 5.6462 Epoch: 5 Global Step: 94900 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:52:25,800-Speed 3035.06 samples/sec Loss 5.6818 Epoch: 5 Global Step: 94950 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:52:41,722-Speed 3215.61 samples/sec Loss 5.6521 Epoch: 5 Global Step: 95000 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:52:58,576-Speed 3038.06 samples/sec Loss 5.6619 Epoch: 5 Global Step: 95050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:53:14,692-Speed 3177.08 samples/sec Loss 5.6956 Epoch: 5 Global Step: 95100 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:53:33,239-Speed 2760.55 samples/sec Loss 5.7172 Epoch: 5 Global Step: 95150 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:53:49,059-Speed 3236.64 samples/sec Loss 5.6131 Epoch: 5 Global Step: 95200 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:54:04,971-Speed 3217.82 samples/sec Loss 5.6886 Epoch: 5 Global Step: 95250 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:54:20,920-Speed 3210.22 samples/sec Loss 5.6347 Epoch: 5 Global Step: 95300 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:54:37,116-Speed 3161.46 samples/sec Loss 5.6821 Epoch: 5 Global Step: 95350 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:54:52,993-Speed 3224.79 samples/sec Loss 5.7388 Epoch: 5 Global Step: 95400 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:55:08,883-Speed 3222.29 samples/sec Loss 5.6675 Epoch: 5 Global Step: 95450 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:55:24,599-Speed 3257.83 samples/sec Loss 5.6972 Epoch: 5 Global Step: 95500 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:55:40,645-Speed 3191.03 samples/sec Loss 5.6929 Epoch: 5 Global Step: 95550 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:55:56,949-Speed 3140.29 samples/sec Loss 5.6571 Epoch: 5 Global Step: 95600 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:56:12,835-Speed 3223.24 samples/sec Loss 5.7660 Epoch: 5 Global Step: 95650 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:56:28,712-Speed 3224.78 samples/sec Loss 5.7130 Epoch: 5 Global Step: 95700 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:56:45,771-Speed 3001.55 samples/sec Loss 5.6400 Epoch: 5 Global Step: 95750 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:57:01,782-Speed 3197.84 samples/sec Loss 5.6595 Epoch: 5 Global Step: 95800 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:57:18,834-Speed 3002.58 samples/sec Loss 5.6418 Epoch: 5 Global Step: 95850 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:57:35,008-Speed 3165.71 samples/sec Loss 5.6748 Epoch: 5 Global Step: 95900 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:57:51,290-Speed 3144.69 samples/sec Loss 5.6768 Epoch: 5 Global Step: 95950 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:58:07,267-Speed 3204.65 samples/sec Loss 5.6298 Epoch: 5 Global Step: 96000 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 10:59:00,453-[lfw][96000]XNorm: 21.435900 Training: 2021-03-16 10:59:00,453-[lfw][96000]Accuracy-Flip: 0.99633+-0.00245 Training: 2021-03-16 10:59:00,453-[lfw][96000]Accuracy-Highest: 0.99717 Training: 2021-03-16 11:00:02,180-[cfp_fp][96000]XNorm: 18.355847 Training: 2021-03-16 11:00:02,181-[cfp_fp][96000]Accuracy-Flip: 0.94814+-0.00943 Training: 2021-03-16 11:00:02,181-[cfp_fp][96000]Accuracy-Highest: 0.95757 Training: 2021-03-16 11:00:55,448-[agedb_30][96000]XNorm: 20.633250 Training: 2021-03-16 11:00:55,448-[agedb_30][96000]Accuracy-Flip: 0.95933+-0.00651 Training: 2021-03-16 11:00:55,448-[agedb_30][96000]Accuracy-Highest: 0.96850 Training: 2021-03-16 11:01:11,485-Speed 277.93 samples/sec Loss 5.7059 Epoch: 5 Global Step: 96050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:01:27,393-Speed 3218.67 samples/sec Loss 5.7370 Epoch: 5 Global Step: 96100 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:01:43,381-Speed 3202.63 samples/sec Loss 5.6660 Epoch: 5 Global Step: 96150 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:01:59,475-Speed 3181.29 samples/sec Loss 5.6941 Epoch: 5 Global Step: 96200 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:02:15,632-Speed 3169.08 samples/sec Loss 5.6788 Epoch: 5 Global Step: 96250 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:02:31,758-Speed 3175.10 samples/sec Loss 5.6926 Epoch: 5 Global Step: 96300 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:02:47,836-Speed 3184.55 samples/sec Loss 5.6564 Epoch: 5 Global Step: 96350 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:03:03,988-Speed 3169.91 samples/sec Loss 5.6791 Epoch: 5 Global Step: 96400 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:03:20,968-Speed 3015.52 samples/sec Loss 5.6488 Epoch: 5 Global Step: 96450 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:03:36,804-Speed 3233.18 samples/sec Loss 5.6724 Epoch: 5 Global Step: 96500 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:03:52,753-Speed 3210.37 samples/sec Loss 5.7035 Epoch: 5 Global Step: 96550 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:04:08,653-Speed 3220.22 samples/sec Loss 5.6920 Epoch: 5 Global Step: 96600 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:04:24,604-Speed 3209.85 samples/sec Loss 5.6950 Epoch: 5 Global Step: 96650 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:04:40,611-Speed 3198.68 samples/sec Loss 5.6854 Epoch: 5 Global Step: 96700 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:04:56,962-Speed 3131.41 samples/sec Loss 5.6676 Epoch: 5 Global Step: 96750 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:05:13,077-Speed 3177.38 samples/sec Loss 5.6908 Epoch: 5 Global Step: 96800 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:05:29,068-Speed 3201.84 samples/sec Loss 5.6596 Epoch: 5 Global Step: 96850 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:05:45,103-Speed 3193.19 samples/sec Loss 5.7162 Epoch: 5 Global Step: 96900 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:06:00,978-Speed 3225.34 samples/sec Loss 5.6374 Epoch: 5 Global Step: 96950 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:06:16,871-Speed 3221.60 samples/sec Loss 5.7018 Epoch: 5 Global Step: 97000 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:06:32,791-Speed 3216.07 samples/sec Loss 5.7191 Epoch: 5 Global Step: 97050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:06:48,533-Speed 3252.54 samples/sec Loss 5.6473 Epoch: 5 Global Step: 97100 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:07:05,760-Speed 2972.18 samples/sec Loss 5.7007 Epoch: 5 Global Step: 97150 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:07:22,399-Speed 3077.34 samples/sec Loss 5.7614 Epoch: 5 Global Step: 97200 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:07:38,256-Speed 3228.86 samples/sec Loss 5.6806 Epoch: 5 Global Step: 97250 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:07:55,965-Speed 2891.26 samples/sec Loss 5.6885 Epoch: 5 Global Step: 97300 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:08:11,928-Speed 3207.52 samples/sec Loss 5.6258 Epoch: 5 Global Step: 97350 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:08:29,136-Speed 2975.52 samples/sec Loss 5.6307 Epoch: 5 Global Step: 97400 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:08:44,923-Speed 3243.32 samples/sec Loss 5.6532 Epoch: 5 Global Step: 97450 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:09:01,073-Speed 3170.29 samples/sec Loss 5.7056 Epoch: 5 Global Step: 97500 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:09:16,849-Speed 3245.51 samples/sec Loss 5.6382 Epoch: 5 Global Step: 97550 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:09:32,921-Speed 3185.92 samples/sec Loss 5.6914 Epoch: 5 Global Step: 97600 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:09:49,026-Speed 3179.19 samples/sec Loss 5.6819 Epoch: 5 Global Step: 97650 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:10:04,965-Speed 3212.34 samples/sec Loss 5.6758 Epoch: 5 Global Step: 97700 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:10:20,908-Speed 3211.57 samples/sec Loss 5.6740 Epoch: 5 Global Step: 97750 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:10:36,652-Speed 3252.01 samples/sec Loss 5.7240 Epoch: 5 Global Step: 97800 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:10:52,762-Speed 3178.27 samples/sec Loss 5.6889 Epoch: 5 Global Step: 97850 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:11:08,772-Speed 3198.12 samples/sec Loss 5.6915 Epoch: 5 Global Step: 97900 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:11:25,666-Speed 3030.77 samples/sec Loss 5.6358 Epoch: 5 Global Step: 97950 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:11:42,506-Speed 3040.48 samples/sec Loss 5.6635 Epoch: 5 Global Step: 98000 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:12:35,924-[lfw][98000]XNorm: 22.657922 Training: 2021-03-16 11:12:35,925-[lfw][98000]Accuracy-Flip: 0.99600+-0.00389 Training: 2021-03-16 11:12:35,925-[lfw][98000]Accuracy-Highest: 0.99717 Training: 2021-03-16 11:13:38,089-[cfp_fp][98000]XNorm: 19.835295 Training: 2021-03-16 11:13:38,090-[cfp_fp][98000]Accuracy-Flip: 0.94586+-0.01009 Training: 2021-03-16 11:13:38,090-[cfp_fp][98000]Accuracy-Highest: 0.95757 Training: 2021-03-16 11:14:31,104-[agedb_30][98000]XNorm: 21.730660 Training: 2021-03-16 11:14:31,104-[agedb_30][98000]Accuracy-Flip: 0.96283+-0.00830 Training: 2021-03-16 11:14:31,104-[agedb_30][98000]Accuracy-Highest: 0.96850 Training: 2021-03-16 11:14:47,177-Speed 277.25 samples/sec Loss 5.6619 Epoch: 5 Global Step: 98050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:15:03,139-Speed 3207.77 samples/sec Loss 5.6320 Epoch: 5 Global Step: 98100 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:15:19,181-Speed 3191.76 samples/sec Loss 5.6524 Epoch: 5 Global Step: 98150 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:15:35,250-Speed 3186.38 samples/sec Loss 5.6421 Epoch: 5 Global Step: 98200 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:15:51,041-Speed 3242.35 samples/sec Loss 5.6369 Epoch: 5 Global Step: 98250 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:16:06,762-Speed 3256.90 samples/sec Loss 5.6778 Epoch: 5 Global Step: 98300 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:16:22,569-Speed 3239.18 samples/sec Loss 5.6698 Epoch: 5 Global Step: 98350 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:16:38,448-Speed 3224.49 samples/sec Loss 5.6434 Epoch: 5 Global Step: 98400 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:16:54,103-Speed 3270.75 samples/sec Loss 5.6881 Epoch: 5 Global Step: 98450 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:17:10,112-Speed 3198.14 samples/sec Loss 5.6555 Epoch: 5 Global Step: 98500 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:17:26,165-Speed 3189.62 samples/sec Loss 5.6851 Epoch: 5 Global Step: 98550 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:17:42,752-Speed 3086.84 samples/sec Loss 5.7117 Epoch: 5 Global Step: 98600 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:17:59,040-Speed 3143.56 samples/sec Loss 5.6851 Epoch: 5 Global Step: 98650 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:18:14,955-Speed 3217.09 samples/sec Loss 5.7072 Epoch: 5 Global Step: 98700 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:18:30,881-Speed 3215.07 samples/sec Loss 5.6542 Epoch: 5 Global Step: 98750 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:18:46,586-Speed 3260.21 samples/sec Loss 5.6686 Epoch: 5 Global Step: 98800 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:19:02,532-Speed 3210.95 samples/sec Loss 5.6954 Epoch: 5 Global Step: 98850 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:19:18,394-Speed 3227.94 samples/sec Loss 5.6923 Epoch: 5 Global Step: 98900 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:19:34,537-Speed 3171.81 samples/sec Loss 5.7010 Epoch: 5 Global Step: 98950 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:19:50,453-Speed 3216.84 samples/sec Loss 5.7197 Epoch: 5 Global Step: 99000 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:20:06,429-Speed 3204.89 samples/sec Loss 5.7034 Epoch: 5 Global Step: 99050 Fp16 Grad Scale: 16384 Required: 27 hours Training: 2021-03-16 11:20:22,410-Speed 3203.98 samples/sec Loss 5.6735 Epoch: 5 Global Step: 99100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:20:38,731-Speed 3137.12 samples/sec Loss 5.6420 Epoch: 5 Global Step: 99150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:20:54,647-Speed 3216.94 samples/sec Loss 5.6716 Epoch: 5 Global Step: 99200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:21:10,609-Speed 3207.72 samples/sec Loss 5.6678 Epoch: 5 Global Step: 99250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:21:27,225-Speed 3081.51 samples/sec Loss 5.6990 Epoch: 5 Global Step: 99300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:21:43,307-Speed 3183.78 samples/sec Loss 5.6247 Epoch: 5 Global Step: 99350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:22:00,098-Speed 3049.37 samples/sec Loss 5.6547 Epoch: 5 Global Step: 99400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:22:17,665-Speed 2914.74 samples/sec Loss 5.6634 Epoch: 5 Global Step: 99450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:22:33,736-Speed 3185.84 samples/sec Loss 5.6817 Epoch: 5 Global Step: 99500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:22:50,543-Speed 3046.43 samples/sec Loss 5.7271 Epoch: 5 Global Step: 99550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:23:06,743-Speed 3160.76 samples/sec Loss 5.6114 Epoch: 5 Global Step: 99600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:23:22,578-Speed 3233.39 samples/sec Loss 5.6873 Epoch: 5 Global Step: 99650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:23:38,481-Speed 3219.60 samples/sec Loss 5.6580 Epoch: 5 Global Step: 99700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:23:54,519-Speed 3192.50 samples/sec Loss 5.6976 Epoch: 5 Global Step: 99750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:24:10,516-Speed 3200.65 samples/sec Loss 5.6587 Epoch: 5 Global Step: 99800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:24:26,363-Speed 3231.01 samples/sec Loss 5.6879 Epoch: 5 Global Step: 99850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:24:42,139-Speed 3245.63 samples/sec Loss 5.7301 Epoch: 5 Global Step: 99900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:24:57,908-Speed 3246.89 samples/sec Loss 5.6724 Epoch: 5 Global Step: 99950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:25:13,928-Speed 3196.12 samples/sec Loss 5.6752 Epoch: 5 Global Step: 100000 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:26:07,379-[lfw][100000]XNorm: 23.170854 Training: 2021-03-16 11:26:07,380-[lfw][100000]Accuracy-Flip: 0.99633+-0.00296 Training: 2021-03-16 11:26:07,380-[lfw][100000]Accuracy-Highest: 0.99717 Training: 2021-03-16 11:27:09,494-[cfp_fp][100000]XNorm: 20.234653 Training: 2021-03-16 11:27:09,495-[cfp_fp][100000]Accuracy-Flip: 0.95114+-0.00645 Training: 2021-03-16 11:27:09,495-[cfp_fp][100000]Accuracy-Highest: 0.95757 Training: 2021-03-16 11:28:02,386-[agedb_30][100000]XNorm: 23.077390 Training: 2021-03-16 11:28:02,387-[agedb_30][100000]Accuracy-Flip: 0.95900+-0.00629 Training: 2021-03-16 11:28:02,387-[agedb_30][100000]Accuracy-Highest: 0.96850 Training: 2021-03-16 11:28:18,649-Speed 277.18 samples/sec Loss 5.6674 Epoch: 5 Global Step: 100050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:28:35,390-Speed 3058.49 samples/sec Loss 5.6750 Epoch: 5 Global Step: 100100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:29:04,737-Speed 1744.69 samples/sec Loss 5.6447 Epoch: 6 Global Step: 100150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:29:21,662-Speed 3025.20 samples/sec Loss 5.0280 Epoch: 6 Global Step: 100200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:29:37,470-Speed 3238.90 samples/sec Loss 5.0155 Epoch: 6 Global Step: 100250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:29:53,223-Speed 3250.46 samples/sec Loss 5.0497 Epoch: 6 Global Step: 100300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:30:09,217-Speed 3201.25 samples/sec Loss 5.0585 Epoch: 6 Global Step: 100350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:30:25,204-Speed 3202.77 samples/sec Loss 5.0738 Epoch: 6 Global Step: 100400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:30:41,357-Speed 3169.66 samples/sec Loss 5.1179 Epoch: 6 Global Step: 100450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:30:57,472-Speed 3177.26 samples/sec Loss 5.1353 Epoch: 6 Global Step: 100500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:31:13,291-Speed 3236.71 samples/sec Loss 5.1617 Epoch: 6 Global Step: 100550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:31:29,173-Speed 3224.03 samples/sec Loss 5.1924 Epoch: 6 Global Step: 100600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:31:44,759-Speed 3285.13 samples/sec Loss 5.2280 Epoch: 6 Global Step: 100650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:32:01,771-Speed 3009.74 samples/sec Loss 5.2364 Epoch: 6 Global Step: 100700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:32:17,796-Speed 3195.07 samples/sec Loss 5.2001 Epoch: 6 Global Step: 100750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:32:33,659-Speed 3227.76 samples/sec Loss 5.2325 Epoch: 6 Global Step: 100800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:32:49,764-Speed 3179.20 samples/sec Loss 5.2471 Epoch: 6 Global Step: 100850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:33:05,997-Speed 3154.22 samples/sec Loss 5.2587 Epoch: 6 Global Step: 100900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:33:21,854-Speed 3228.77 samples/sec Loss 5.2414 Epoch: 6 Global Step: 100950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:33:37,827-Speed 3205.66 samples/sec Loss 5.2636 Epoch: 6 Global Step: 101000 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:33:54,284-Speed 3111.13 samples/sec Loss 5.2864 Epoch: 6 Global Step: 101050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:34:10,265-Speed 3203.96 samples/sec Loss 5.2646 Epoch: 6 Global Step: 101100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:34:26,095-Speed 3234.41 samples/sec Loss 5.3032 Epoch: 6 Global Step: 101150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:34:41,997-Speed 3219.76 samples/sec Loss 5.2675 Epoch: 6 Global Step: 101200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:34:57,859-Speed 3228.08 samples/sec Loss 5.3776 Epoch: 6 Global Step: 101250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:35:13,520-Speed 3269.26 samples/sec Loss 5.3062 Epoch: 6 Global Step: 101300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:35:29,520-Speed 3200.08 samples/sec Loss 5.2926 Epoch: 6 Global Step: 101350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:35:45,659-Speed 3172.53 samples/sec Loss 5.3461 Epoch: 6 Global Step: 101400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:36:02,623-Speed 3018.38 samples/sec Loss 5.3056 Epoch: 6 Global Step: 101450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:36:18,632-Speed 3198.18 samples/sec Loss 5.3674 Epoch: 6 Global Step: 101500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:36:34,977-Speed 3132.55 samples/sec Loss 5.3109 Epoch: 6 Global Step: 101550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:36:52,549-Speed 2913.90 samples/sec Loss 5.3835 Epoch: 6 Global Step: 101600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:37:09,531-Speed 3015.05 samples/sec Loss 5.3492 Epoch: 6 Global Step: 101650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:37:26,608-Speed 2998.29 samples/sec Loss 5.3657 Epoch: 6 Global Step: 101700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:37:42,359-Speed 3250.62 samples/sec Loss 5.4126 Epoch: 6 Global Step: 101750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:37:58,210-Speed 3230.15 samples/sec Loss 5.3753 Epoch: 6 Global Step: 101800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:38:14,136-Speed 3215.05 samples/sec Loss 5.3554 Epoch: 6 Global Step: 101850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:38:30,267-Speed 3174.09 samples/sec Loss 5.3899 Epoch: 6 Global Step: 101900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:38:46,548-Speed 3144.83 samples/sec Loss 5.4237 Epoch: 6 Global Step: 101950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:39:02,350-Speed 3240.29 samples/sec Loss 5.4040 Epoch: 6 Global Step: 102000 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:39:55,365-[lfw][102000]XNorm: 23.957733 Training: 2021-03-16 11:39:55,365-[lfw][102000]Accuracy-Flip: 0.99600+-0.00359 Training: 2021-03-16 11:39:55,365-[lfw][102000]Accuracy-Highest: 0.99717 Training: 2021-03-16 11:40:57,309-[cfp_fp][102000]XNorm: 20.822612 Training: 2021-03-16 11:40:57,309-[cfp_fp][102000]Accuracy-Flip: 0.95100+-0.00998 Training: 2021-03-16 11:40:57,309-[cfp_fp][102000]Accuracy-Highest: 0.95757 Training: 2021-03-16 11:41:50,634-[agedb_30][102000]XNorm: 22.951965 Training: 2021-03-16 11:41:50,634-[agedb_30][102000]Accuracy-Flip: 0.95717+-0.00873 Training: 2021-03-16 11:41:50,634-[agedb_30][102000]Accuracy-Highest: 0.96850 Training: 2021-03-16 11:42:06,661-Speed 277.79 samples/sec Loss 5.4203 Epoch: 6 Global Step: 102050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:42:23,027-Speed 3128.55 samples/sec Loss 5.4298 Epoch: 6 Global Step: 102100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:42:38,904-Speed 3224.93 samples/sec Loss 5.4041 Epoch: 6 Global Step: 102150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:42:55,169-Speed 3147.93 samples/sec Loss 5.4195 Epoch: 6 Global Step: 102200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:43:11,936-Speed 3053.76 samples/sec Loss 5.4090 Epoch: 6 Global Step: 102250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:43:27,795-Speed 3228.43 samples/sec Loss 5.4175 Epoch: 6 Global Step: 102300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:43:44,688-Speed 3030.94 samples/sec Loss 5.4454 Epoch: 6 Global Step: 102350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:44:01,158-Speed 3108.79 samples/sec Loss 5.4180 Epoch: 6 Global Step: 102400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:44:17,349-Speed 3162.43 samples/sec Loss 5.4568 Epoch: 6 Global Step: 102450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:44:33,217-Speed 3226.56 samples/sec Loss 5.4453 Epoch: 6 Global Step: 102500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:44:48,947-Speed 3255.14 samples/sec Loss 5.4956 Epoch: 6 Global Step: 102550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:45:04,727-Speed 3244.75 samples/sec Loss 5.4380 Epoch: 6 Global Step: 102600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:45:20,521-Speed 3241.82 samples/sec Loss 5.4251 Epoch: 6 Global Step: 102650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:45:36,429-Speed 3218.46 samples/sec Loss 5.4946 Epoch: 6 Global Step: 102700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:45:52,309-Speed 3224.36 samples/sec Loss 5.4174 Epoch: 6 Global Step: 102750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:46:08,176-Speed 3226.93 samples/sec Loss 5.4612 Epoch: 6 Global Step: 102800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:46:24,561-Speed 3124.86 samples/sec Loss 5.4568 Epoch: 6 Global Step: 102850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:46:41,295-Speed 3059.75 samples/sec Loss 5.4699 Epoch: 6 Global Step: 102900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:46:57,132-Speed 3233.07 samples/sec Loss 5.4596 Epoch: 6 Global Step: 102950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:47:12,985-Speed 3229.74 samples/sec Loss 5.4920 Epoch: 6 Global Step: 103000 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:47:29,000-Speed 3197.04 samples/sec Loss 5.5278 Epoch: 6 Global Step: 103050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:47:45,022-Speed 3195.80 samples/sec Loss 5.4902 Epoch: 6 Global Step: 103100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:48:00,874-Speed 3229.96 samples/sec Loss 5.5061 Epoch: 6 Global Step: 103150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:48:16,790-Speed 3216.96 samples/sec Loss 5.4893 Epoch: 6 Global Step: 103200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:48:32,874-Speed 3183.40 samples/sec Loss 5.5238 Epoch: 6 Global Step: 103250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:48:48,731-Speed 3228.97 samples/sec Loss 5.4825 Epoch: 6 Global Step: 103300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:49:04,455-Speed 3256.23 samples/sec Loss 5.5310 Epoch: 6 Global Step: 103350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:49:20,555-Speed 3180.31 samples/sec Loss 5.5131 Epoch: 6 Global Step: 103400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:49:36,628-Speed 3185.61 samples/sec Loss 5.4882 Epoch: 6 Global Step: 103450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:49:52,616-Speed 3202.55 samples/sec Loss 5.4449 Epoch: 6 Global Step: 103500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:50:08,639-Speed 3195.52 samples/sec Loss 5.5016 Epoch: 6 Global Step: 103550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:50:25,297-Speed 3073.64 samples/sec Loss 5.5663 Epoch: 6 Global Step: 103600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:50:41,284-Speed 3202.71 samples/sec Loss 5.5221 Epoch: 6 Global Step: 103650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:50:57,236-Speed 3209.73 samples/sec Loss 5.5471 Epoch: 6 Global Step: 103700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:51:14,971-Speed 2887.08 samples/sec Loss 5.5729 Epoch: 6 Global Step: 103750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:51:31,482-Speed 3100.87 samples/sec Loss 5.5882 Epoch: 6 Global Step: 103800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:51:47,549-Speed 3186.93 samples/sec Loss 5.5203 Epoch: 6 Global Step: 103850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:52:04,354-Speed 3046.66 samples/sec Loss 5.5775 Epoch: 6 Global Step: 103900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:52:20,503-Speed 3170.67 samples/sec Loss 5.5514 Epoch: 6 Global Step: 103950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:52:36,530-Speed 3194.69 samples/sec Loss 5.5709 Epoch: 6 Global Step: 104000 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:53:29,887-[lfw][104000]XNorm: 23.344531 Training: 2021-03-16 11:53:29,887-[lfw][104000]Accuracy-Flip: 0.99500+-0.00342 Training: 2021-03-16 11:53:29,887-[lfw][104000]Accuracy-Highest: 0.99717 Training: 2021-03-16 11:54:31,659-[cfp_fp][104000]XNorm: 20.652955 Training: 2021-03-16 11:54:31,660-[cfp_fp][104000]Accuracy-Flip: 0.93214+-0.01499 Training: 2021-03-16 11:54:31,660-[cfp_fp][104000]Accuracy-Highest: 0.95757 Training: 2021-03-16 11:55:25,122-[agedb_30][104000]XNorm: 22.898489 Training: 2021-03-16 11:55:25,122-[agedb_30][104000]Accuracy-Flip: 0.95633+-0.01103 Training: 2021-03-16 11:55:25,122-[agedb_30][104000]Accuracy-Highest: 0.96850 Training: 2021-03-16 11:55:40,977-Speed 277.59 samples/sec Loss 5.5296 Epoch: 6 Global Step: 104050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:55:56,948-Speed 3206.12 samples/sec Loss 5.5253 Epoch: 6 Global Step: 104100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:56:12,737-Speed 3242.81 samples/sec Loss 5.5551 Epoch: 6 Global Step: 104150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:56:28,406-Speed 3267.61 samples/sec Loss 5.5370 Epoch: 6 Global Step: 104200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:56:44,116-Speed 3259.25 samples/sec Loss 5.5694 Epoch: 6 Global Step: 104250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:57:00,040-Speed 3215.27 samples/sec Loss 5.5688 Epoch: 6 Global Step: 104300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:57:15,849-Speed 3238.87 samples/sec Loss 5.5724 Epoch: 6 Global Step: 104350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:57:32,625-Speed 3052.09 samples/sec Loss 5.5892 Epoch: 6 Global Step: 104400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:57:48,790-Speed 3167.48 samples/sec Loss 5.6353 Epoch: 6 Global Step: 104450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:58:05,516-Speed 3061.11 samples/sec Loss 5.5223 Epoch: 6 Global Step: 104500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:58:21,601-Speed 3183.18 samples/sec Loss 5.5449 Epoch: 6 Global Step: 104550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:58:37,418-Speed 3237.19 samples/sec Loss 5.5575 Epoch: 6 Global Step: 104600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:58:53,364-Speed 3210.96 samples/sec Loss 5.5328 Epoch: 6 Global Step: 104650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:59:09,363-Speed 3200.32 samples/sec Loss 5.5750 Epoch: 6 Global Step: 104700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:59:25,431-Speed 3186.52 samples/sec Loss 5.5896 Epoch: 6 Global Step: 104750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:59:41,470-Speed 3192.29 samples/sec Loss 5.6252 Epoch: 6 Global Step: 104800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 11:59:57,559-Speed 3182.45 samples/sec Loss 5.5392 Epoch: 6 Global Step: 104850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:00:13,479-Speed 3216.01 samples/sec Loss 5.5696 Epoch: 6 Global Step: 104900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:00:29,349-Speed 3226.39 samples/sec Loss 5.5937 Epoch: 6 Global Step: 104950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:00:45,155-Speed 3239.37 samples/sec Loss 5.6099 Epoch: 6 Global Step: 105000 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:01:02,174-Speed 3008.44 samples/sec Loss 5.5883 Epoch: 6 Global Step: 105050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:01:18,254-Speed 3184.16 samples/sec Loss 5.6353 Epoch: 6 Global Step: 105100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:01:34,377-Speed 3175.72 samples/sec Loss 5.6085 Epoch: 6 Global Step: 105150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:01:50,429-Speed 3189.70 samples/sec Loss 5.6162 Epoch: 6 Global Step: 105200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:02:06,406-Speed 3204.76 samples/sec Loss 5.5520 Epoch: 6 Global Step: 105250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:02:22,548-Speed 3172.01 samples/sec Loss 5.5943 Epoch: 6 Global Step: 105300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:02:38,479-Speed 3213.95 samples/sec Loss 5.5490 Epoch: 6 Global Step: 105350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:02:54,291-Speed 3238.09 samples/sec Loss 5.6038 Epoch: 6 Global Step: 105400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:03:10,362-Speed 3185.98 samples/sec Loss 5.5573 Epoch: 6 Global Step: 105450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:03:26,293-Speed 3213.90 samples/sec Loss 5.6190 Epoch: 6 Global Step: 105500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:03:42,300-Speed 3198.65 samples/sec Loss 5.6287 Epoch: 6 Global Step: 105550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:03:58,620-Speed 3137.46 samples/sec Loss 5.5494 Epoch: 6 Global Step: 105600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:04:14,546-Speed 3215.01 samples/sec Loss 5.5870 Epoch: 6 Global Step: 105650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:04:30,311-Speed 3247.72 samples/sec Loss 5.6203 Epoch: 6 Global Step: 105700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:04:47,047-Speed 3059.47 samples/sec Loss 5.5675 Epoch: 6 Global Step: 105750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:05:03,150-Speed 3179.62 samples/sec Loss 5.5877 Epoch: 6 Global Step: 105800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:05:19,390-Speed 3152.69 samples/sec Loss 5.6230 Epoch: 6 Global Step: 105850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:05:35,452-Speed 3187.75 samples/sec Loss 5.6139 Epoch: 6 Global Step: 105900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:05:52,118-Speed 3072.13 samples/sec Loss 5.6189 Epoch: 6 Global Step: 105950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:06:09,860-Speed 2886.02 samples/sec Loss 5.6566 Epoch: 6 Global Step: 106000 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:07:03,229-[lfw][106000]XNorm: 20.929459 Training: 2021-03-16 12:07:03,230-[lfw][106000]Accuracy-Flip: 0.99717+-0.00289 Training: 2021-03-16 12:07:03,230-[lfw][106000]Accuracy-Highest: 0.99717 Training: 2021-03-16 12:08:05,318-[cfp_fp][106000]XNorm: 18.077987 Training: 2021-03-16 12:08:05,318-[cfp_fp][106000]Accuracy-Flip: 0.95443+-0.01100 Training: 2021-03-16 12:08:05,318-[cfp_fp][106000]Accuracy-Highest: 0.95757 Training: 2021-03-16 12:08:58,644-[agedb_30][106000]XNorm: 20.176012 Training: 2021-03-16 12:08:58,645-[agedb_30][106000]Accuracy-Flip: 0.95900+-0.01086 Training: 2021-03-16 12:08:58,645-[agedb_30][106000]Accuracy-Highest: 0.96850 Training: 2021-03-16 12:09:15,433-Speed 275.90 samples/sec Loss 5.5798 Epoch: 6 Global Step: 106050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:09:31,834-Speed 3122.01 samples/sec Loss 5.5408 Epoch: 6 Global Step: 106100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:09:47,983-Speed 3170.56 samples/sec Loss 5.5970 Epoch: 6 Global Step: 106150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:10:03,787-Speed 3239.66 samples/sec Loss 5.6154 Epoch: 6 Global Step: 106200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:10:19,623-Speed 3233.37 samples/sec Loss 5.6754 Epoch: 6 Global Step: 106250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:10:35,664-Speed 3191.93 samples/sec Loss 5.6650 Epoch: 6 Global Step: 106300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:10:51,638-Speed 3205.19 samples/sec Loss 5.6150 Epoch: 6 Global Step: 106350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:11:07,384-Speed 3251.81 samples/sec Loss 5.6359 Epoch: 6 Global Step: 106400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:11:23,228-Speed 3231.61 samples/sec Loss 5.6324 Epoch: 6 Global Step: 106450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:11:39,072-Speed 3231.61 samples/sec Loss 5.6219 Epoch: 6 Global Step: 106500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:11:55,297-Speed 3155.73 samples/sec Loss 5.6537 Epoch: 6 Global Step: 106550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:12:11,177-Speed 3224.21 samples/sec Loss 5.5442 Epoch: 6 Global Step: 106600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:12:28,003-Speed 3042.98 samples/sec Loss 5.6534 Epoch: 6 Global Step: 106650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:12:43,898-Speed 3221.25 samples/sec Loss 5.6114 Epoch: 6 Global Step: 106700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:13:00,420-Speed 3099.08 samples/sec Loss 5.6915 Epoch: 6 Global Step: 106750 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:13:16,645-Speed 3155.55 samples/sec Loss 5.6306 Epoch: 6 Global Step: 106800 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:13:32,533-Speed 3222.69 samples/sec Loss 5.6308 Epoch: 6 Global Step: 106850 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:13:48,546-Speed 3197.54 samples/sec Loss 5.5699 Epoch: 6 Global Step: 106900 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:14:04,663-Speed 3176.94 samples/sec Loss 5.5722 Epoch: 6 Global Step: 106950 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:14:20,985-Speed 3136.97 samples/sec Loss 5.6306 Epoch: 6 Global Step: 107000 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:14:37,048-Speed 3187.44 samples/sec Loss 5.6155 Epoch: 6 Global Step: 107050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:14:52,840-Speed 3242.26 samples/sec Loss 5.6209 Epoch: 6 Global Step: 107100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:15:08,731-Speed 3222.10 samples/sec Loss 5.6145 Epoch: 6 Global Step: 107150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:15:24,835-Speed 3179.45 samples/sec Loss 5.6434 Epoch: 6 Global Step: 107200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:15:40,660-Speed 3235.38 samples/sec Loss 5.6511 Epoch: 6 Global Step: 107250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:15:57,376-Speed 3063.00 samples/sec Loss 5.5857 Epoch: 6 Global Step: 107300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:16:13,150-Speed 3245.99 samples/sec Loss 5.6474 Epoch: 6 Global Step: 107350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:16:28,964-Speed 3237.74 samples/sec Loss 5.6480 Epoch: 6 Global Step: 107400 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:16:44,787-Speed 3235.88 samples/sec Loss 5.6411 Epoch: 6 Global Step: 107450 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:17:00,700-Speed 3217.61 samples/sec Loss 5.6662 Epoch: 6 Global Step: 107500 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:17:16,601-Speed 3220.11 samples/sec Loss 5.6072 Epoch: 6 Global Step: 107550 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:17:32,664-Speed 3187.41 samples/sec Loss 5.6403 Epoch: 6 Global Step: 107600 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:17:48,511-Speed 3231.14 samples/sec Loss 5.6107 Epoch: 6 Global Step: 107650 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:18:04,230-Speed 3257.30 samples/sec Loss 5.5911 Epoch: 6 Global Step: 107700 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:18:20,281-Speed 3189.92 samples/sec Loss 5.5938 Epoch: 6 Global Step: 107750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:18:36,221-Speed 3212.13 samples/sec Loss 5.6097 Epoch: 6 Global Step: 107800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:18:52,305-Speed 3183.35 samples/sec Loss 5.6349 Epoch: 6 Global Step: 107850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:19:08,368-Speed 3187.44 samples/sec Loss 5.6063 Epoch: 6 Global Step: 107900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:19:25,142-Speed 3052.51 samples/sec Loss 5.6387 Epoch: 6 Global Step: 107950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:19:41,072-Speed 3214.19 samples/sec Loss 5.5536 Epoch: 6 Global Step: 108000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:20:34,195-[lfw][108000]XNorm: 20.816370 Training: 2021-03-16 12:20:34,196-[lfw][108000]Accuracy-Flip: 0.99683+-0.00217 Training: 2021-03-16 12:20:34,196-[lfw][108000]Accuracy-Highest: 0.99717 Training: 2021-03-16 12:21:36,040-[cfp_fp][108000]XNorm: 18.456434 Training: 2021-03-16 12:21:36,040-[cfp_fp][108000]Accuracy-Flip: 0.94886+-0.01224 Training: 2021-03-16 12:21:36,040-[cfp_fp][108000]Accuracy-Highest: 0.95757 Training: 2021-03-16 12:22:29,248-[agedb_30][108000]XNorm: 20.622175 Training: 2021-03-16 12:22:29,248-[agedb_30][108000]Accuracy-Flip: 0.96167+-0.00986 Training: 2021-03-16 12:22:29,248-[agedb_30][108000]Accuracy-Highest: 0.96850 Training: 2021-03-16 12:22:45,350-Speed 277.84 samples/sec Loss 5.6322 Epoch: 6 Global Step: 108050 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:23:02,006-Speed 3074.04 samples/sec Loss 5.6421 Epoch: 6 Global Step: 108100 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:23:18,989-Speed 3014.88 samples/sec Loss 5.6700 Epoch: 6 Global Step: 108150 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:23:35,676-Speed 3068.43 samples/sec Loss 5.6178 Epoch: 6 Global Step: 108200 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:23:52,190-Speed 3100.49 samples/sec Loss 5.6271 Epoch: 6 Global Step: 108250 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:24:08,385-Speed 3161.51 samples/sec Loss 5.6127 Epoch: 6 Global Step: 108300 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:24:24,349-Speed 3207.44 samples/sec Loss 5.6308 Epoch: 6 Global Step: 108350 Fp16 Grad Scale: 16384 Required: 26 hours Training: 2021-03-16 12:24:40,278-Speed 3214.23 samples/sec Loss 5.6195 Epoch: 6 Global Step: 108400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:24:56,434-Speed 3169.24 samples/sec Loss 5.6521 Epoch: 6 Global Step: 108450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:25:12,287-Speed 3229.65 samples/sec Loss 5.6250 Epoch: 6 Global Step: 108500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:25:28,198-Speed 3218.00 samples/sec Loss 5.6085 Epoch: 6 Global Step: 108550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:25:44,094-Speed 3221.10 samples/sec Loss 5.6352 Epoch: 6 Global Step: 108600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:26:00,298-Speed 3159.83 samples/sec Loss 5.5844 Epoch: 6 Global Step: 108650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:26:16,557-Speed 3149.06 samples/sec Loss 5.5783 Epoch: 6 Global Step: 108700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:26:32,502-Speed 3211.16 samples/sec Loss 5.5475 Epoch: 6 Global Step: 108750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:26:48,438-Speed 3213.12 samples/sec Loss 5.6530 Epoch: 6 Global Step: 108800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:27:04,494-Speed 3188.81 samples/sec Loss 5.5634 Epoch: 6 Global Step: 108850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:27:21,796-Speed 2959.29 samples/sec Loss 5.6012 Epoch: 6 Global Step: 108900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:27:37,465-Speed 3267.76 samples/sec Loss 5.6421 Epoch: 6 Global Step: 108950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:27:53,311-Speed 3231.19 samples/sec Loss 5.6424 Epoch: 6 Global Step: 109000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:28:09,276-Speed 3207.00 samples/sec Loss 5.6440 Epoch: 6 Global Step: 109050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:28:25,082-Speed 3239.49 samples/sec Loss 5.5939 Epoch: 6 Global Step: 109100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:28:41,480-Speed 3122.51 samples/sec Loss 5.6022 Epoch: 6 Global Step: 109150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:28:57,451-Speed 3205.77 samples/sec Loss 5.6466 Epoch: 6 Global Step: 109200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:29:13,344-Speed 3221.69 samples/sec Loss 5.6581 Epoch: 6 Global Step: 109250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:29:29,823-Speed 3107.13 samples/sec Loss 5.6741 Epoch: 6 Global Step: 109300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:29:45,686-Speed 3227.72 samples/sec Loss 5.6037 Epoch: 6 Global Step: 109350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:30:01,563-Speed 3224.73 samples/sec Loss 5.6805 Epoch: 6 Global Step: 109400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:30:17,506-Speed 3211.53 samples/sec Loss 5.5804 Epoch: 6 Global Step: 109450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:30:34,266-Speed 3055.05 samples/sec Loss 5.5976 Epoch: 6 Global Step: 109500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:30:50,299-Speed 3193.47 samples/sec Loss 5.6548 Epoch: 6 Global Step: 109550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:31:06,044-Speed 3251.89 samples/sec Loss 5.6064 Epoch: 6 Global Step: 109600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:31:22,027-Speed 3203.52 samples/sec Loss 5.6349 Epoch: 6 Global Step: 109650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:31:37,924-Speed 3220.85 samples/sec Loss 5.6300 Epoch: 6 Global Step: 109700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:31:54,204-Speed 3145.19 samples/sec Loss 5.6276 Epoch: 6 Global Step: 109750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:32:10,189-Speed 3203.14 samples/sec Loss 5.6452 Epoch: 6 Global Step: 109800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:32:25,950-Speed 3248.63 samples/sec Loss 5.6018 Epoch: 6 Global Step: 109850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:32:42,061-Speed 3177.88 samples/sec Loss 5.5784 Epoch: 6 Global Step: 109900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:32:57,902-Speed 3232.31 samples/sec Loss 5.6332 Epoch: 6 Global Step: 109950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:33:13,732-Speed 3234.43 samples/sec Loss 5.6466 Epoch: 6 Global Step: 110000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:34:06,595-[lfw][110000]XNorm: 21.378414 Training: 2021-03-16 12:34:06,596-[lfw][110000]Accuracy-Flip: 0.99633+-0.00306 Training: 2021-03-16 12:34:06,596-[lfw][110000]Accuracy-Highest: 0.99717 Training: 2021-03-16 12:35:08,300-[cfp_fp][110000]XNorm: 18.842800 Training: 2021-03-16 12:35:08,300-[cfp_fp][110000]Accuracy-Flip: 0.94486+-0.00879 Training: 2021-03-16 12:35:08,300-[cfp_fp][110000]Accuracy-Highest: 0.95757 Training: 2021-03-16 12:36:01,372-[agedb_30][110000]XNorm: 21.038433 Training: 2021-03-16 12:36:01,372-[agedb_30][110000]Accuracy-Flip: 0.95883+-0.00895 Training: 2021-03-16 12:36:01,372-[agedb_30][110000]Accuracy-Highest: 0.96850 Training: 2021-03-16 12:36:17,382-Speed 278.79 samples/sec Loss 5.6398 Epoch: 6 Global Step: 110050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:36:33,129-Speed 3251.64 samples/sec Loss 5.6496 Epoch: 6 Global Step: 110100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:36:50,237-Speed 2992.84 samples/sec Loss 5.6451 Epoch: 6 Global Step: 110150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:37:06,185-Speed 3210.58 samples/sec Loss 5.5759 Epoch: 6 Global Step: 110200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:37:23,146-Speed 3018.61 samples/sec Loss 5.6762 Epoch: 6 Global Step: 110250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:37:39,196-Speed 3190.24 samples/sec Loss 5.6208 Epoch: 6 Global Step: 110300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:37:56,637-Speed 2935.74 samples/sec Loss 5.6497 Epoch: 6 Global Step: 110350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:38:13,610-Speed 3016.53 samples/sec Loss 5.6536 Epoch: 6 Global Step: 110400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:38:29,412-Speed 3240.20 samples/sec Loss 5.6870 Epoch: 6 Global Step: 110450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:38:45,507-Speed 3181.37 samples/sec Loss 5.5822 Epoch: 6 Global Step: 110500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:39:01,638-Speed 3174.05 samples/sec Loss 5.6543 Epoch: 6 Global Step: 110550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:39:17,469-Speed 3234.24 samples/sec Loss 5.6341 Epoch: 6 Global Step: 110600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:39:33,411-Speed 3211.81 samples/sec Loss 5.6328 Epoch: 6 Global Step: 110650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:39:49,448-Speed 3192.54 samples/sec Loss 5.5645 Epoch: 6 Global Step: 110700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:40:05,531-Speed 3183.68 samples/sec Loss 5.5758 Epoch: 6 Global Step: 110750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:40:21,600-Speed 3186.33 samples/sec Loss 5.6387 Epoch: 6 Global Step: 110800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:40:37,659-Speed 3188.37 samples/sec Loss 5.6160 Epoch: 6 Global Step: 110850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:40:53,734-Speed 3185.18 samples/sec Loss 5.6655 Epoch: 6 Global Step: 110900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:41:09,785-Speed 3190.02 samples/sec Loss 5.6230 Epoch: 6 Global Step: 110950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:41:25,554-Speed 3246.79 samples/sec Loss 5.6127 Epoch: 6 Global Step: 111000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:41:42,502-Speed 3021.19 samples/sec Loss 5.6245 Epoch: 6 Global Step: 111050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:41:59,593-Speed 2995.80 samples/sec Loss 5.6323 Epoch: 6 Global Step: 111100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:42:15,291-Speed 3261.75 samples/sec Loss 5.6079 Epoch: 6 Global Step: 111150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:42:31,170-Speed 3224.44 samples/sec Loss 5.6018 Epoch: 6 Global Step: 111200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:42:47,122-Speed 3209.73 samples/sec Loss 5.5926 Epoch: 6 Global Step: 111250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:43:03,295-Speed 3165.81 samples/sec Loss 5.6019 Epoch: 6 Global Step: 111300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:43:19,577-Speed 3144.70 samples/sec Loss 5.6856 Epoch: 6 Global Step: 111350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:43:35,518-Speed 3211.98 samples/sec Loss 5.7063 Epoch: 6 Global Step: 111400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:43:51,323-Speed 3239.73 samples/sec Loss 5.6105 Epoch: 6 Global Step: 111450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:44:07,540-Speed 3157.17 samples/sec Loss 5.6339 Epoch: 6 Global Step: 111500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:44:23,409-Speed 3226.59 samples/sec Loss 5.6746 Epoch: 6 Global Step: 111550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:44:39,297-Speed 3222.68 samples/sec Loss 5.6017 Epoch: 6 Global Step: 111600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:44:55,430-Speed 3173.69 samples/sec Loss 5.5815 Epoch: 6 Global Step: 111650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:45:12,201-Speed 3053.06 samples/sec Loss 5.6012 Epoch: 6 Global Step: 111700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:45:28,132-Speed 3213.84 samples/sec Loss 5.5636 Epoch: 6 Global Step: 111750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:45:44,246-Speed 3177.42 samples/sec Loss 5.6255 Epoch: 6 Global Step: 111800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:46:00,455-Speed 3158.99 samples/sec Loss 5.6695 Epoch: 6 Global Step: 111850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:46:16,387-Speed 3213.72 samples/sec Loss 5.6248 Epoch: 6 Global Step: 111900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:46:32,217-Speed 3234.40 samples/sec Loss 5.6196 Epoch: 6 Global Step: 111950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:46:48,122-Speed 3219.14 samples/sec Loss 5.6752 Epoch: 6 Global Step: 112000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:47:41,157-[lfw][112000]XNorm: 20.551115 Training: 2021-03-16 12:47:41,157-[lfw][112000]Accuracy-Flip: 0.99483+-0.00353 Training: 2021-03-16 12:47:41,157-[lfw][112000]Accuracy-Highest: 0.99717 Training: 2021-03-16 12:48:43,141-[cfp_fp][112000]XNorm: 17.849935 Training: 2021-03-16 12:48:43,141-[cfp_fp][112000]Accuracy-Flip: 0.94329+-0.00899 Training: 2021-03-16 12:48:43,141-[cfp_fp][112000]Accuracy-Highest: 0.95757 Training: 2021-03-16 12:49:36,432-[agedb_30][112000]XNorm: 19.994332 Training: 2021-03-16 12:49:36,433-[agedb_30][112000]Accuracy-Flip: 0.96317+-0.00721 Training: 2021-03-16 12:49:36,433-[agedb_30][112000]Accuracy-Highest: 0.96850 Training: 2021-03-16 12:49:52,216-Speed 278.12 samples/sec Loss 5.6654 Epoch: 6 Global Step: 112050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:50:08,239-Speed 3195.60 samples/sec Loss 5.6476 Epoch: 6 Global Step: 112100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:50:24,067-Speed 3234.84 samples/sec Loss 5.6168 Epoch: 6 Global Step: 112150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:50:39,902-Speed 3233.33 samples/sec Loss 5.5925 Epoch: 6 Global Step: 112200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:50:55,664-Speed 3248.43 samples/sec Loss 5.6306 Epoch: 6 Global Step: 112250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:51:12,430-Speed 3053.92 samples/sec Loss 5.6683 Epoch: 6 Global Step: 112300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:51:28,518-Speed 3182.62 samples/sec Loss 5.6203 Epoch: 6 Global Step: 112350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:51:44,655-Speed 3172.86 samples/sec Loss 5.5959 Epoch: 6 Global Step: 112400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:52:00,515-Speed 3228.46 samples/sec Loss 5.6184 Epoch: 6 Global Step: 112450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:52:18,607-Speed 2830.08 samples/sec Loss 5.6485 Epoch: 6 Global Step: 112500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:52:34,802-Speed 3161.89 samples/sec Loss 5.5864 Epoch: 6 Global Step: 112550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:52:50,972-Speed 3166.50 samples/sec Loss 5.6049 Epoch: 6 Global Step: 112600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:53:08,866-Speed 2861.30 samples/sec Loss 5.6091 Epoch: 6 Global Step: 112650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:53:25,023-Speed 3168.99 samples/sec Loss 5.6207 Epoch: 6 Global Step: 112700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:53:40,973-Speed 3210.13 samples/sec Loss 5.6293 Epoch: 6 Global Step: 112750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:53:57,311-Speed 3134.00 samples/sec Loss 5.6383 Epoch: 6 Global Step: 112800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:54:13,559-Speed 3151.28 samples/sec Loss 5.6782 Epoch: 6 Global Step: 112850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:54:29,433-Speed 3225.54 samples/sec Loss 5.6168 Epoch: 6 Global Step: 112900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:54:45,092-Speed 3269.60 samples/sec Loss 5.6364 Epoch: 6 Global Step: 112950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:55:00,759-Speed 3268.11 samples/sec Loss 5.6258 Epoch: 6 Global Step: 113000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:55:16,618-Speed 3228.64 samples/sec Loss 5.6186 Epoch: 6 Global Step: 113050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:55:32,363-Speed 3251.98 samples/sec Loss 5.6492 Epoch: 6 Global Step: 113100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:55:48,598-Speed 3153.72 samples/sec Loss 5.6764 Epoch: 6 Global Step: 113150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:56:04,542-Speed 3211.32 samples/sec Loss 5.5949 Epoch: 6 Global Step: 113200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:56:22,085-Speed 2918.59 samples/sec Loss 5.6982 Epoch: 6 Global Step: 113250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:56:38,341-Speed 3149.75 samples/sec Loss 5.6777 Epoch: 6 Global Step: 113300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:56:54,220-Speed 3224.51 samples/sec Loss 5.6350 Epoch: 6 Global Step: 113350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:57:10,431-Speed 3158.39 samples/sec Loss 5.5919 Epoch: 6 Global Step: 113400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:57:26,460-Speed 3194.39 samples/sec Loss 5.6451 Epoch: 6 Global Step: 113450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:57:42,474-Speed 3197.20 samples/sec Loss 5.6050 Epoch: 6 Global Step: 113500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:57:58,419-Speed 3211.07 samples/sec Loss 5.6230 Epoch: 6 Global Step: 113550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:58:14,378-Speed 3208.33 samples/sec Loss 5.6126 Epoch: 6 Global Step: 113600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:58:30,274-Speed 3221.14 samples/sec Loss 5.6472 Epoch: 6 Global Step: 113650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:58:46,289-Speed 3197.01 samples/sec Loss 5.5973 Epoch: 6 Global Step: 113700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:59:02,303-Speed 3197.39 samples/sec Loss 5.6114 Epoch: 6 Global Step: 113750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:59:18,054-Speed 3250.60 samples/sec Loss 5.6265 Epoch: 6 Global Step: 113800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:59:33,964-Speed 3218.17 samples/sec Loss 5.6250 Epoch: 6 Global Step: 113850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 12:59:50,001-Speed 3192.71 samples/sec Loss 5.6376 Epoch: 6 Global Step: 113900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:00:06,799-Speed 3048.09 samples/sec Loss 5.6616 Epoch: 6 Global Step: 113950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:00:22,718-Speed 3216.50 samples/sec Loss 5.5824 Epoch: 6 Global Step: 114000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:01:15,920-[lfw][114000]XNorm: 22.381001 Training: 2021-03-16 13:01:15,920-[lfw][114000]Accuracy-Flip: 0.99600+-0.00226 Training: 2021-03-16 13:01:15,920-[lfw][114000]Accuracy-Highest: 0.99717 Training: 2021-03-16 13:02:17,773-[cfp_fp][114000]XNorm: 18.870735 Training: 2021-03-16 13:02:17,773-[cfp_fp][114000]Accuracy-Flip: 0.95586+-0.00922 Training: 2021-03-16 13:02:17,773-[cfp_fp][114000]Accuracy-Highest: 0.95757 Training: 2021-03-16 13:03:11,022-[agedb_30][114000]XNorm: 21.653531 Training: 2021-03-16 13:03:11,023-[agedb_30][114000]Accuracy-Flip: 0.96317+-0.01089 Training: 2021-03-16 13:03:11,023-[agedb_30][114000]Accuracy-Highest: 0.96850 Training: 2021-03-16 13:03:26,854-Speed 278.05 samples/sec Loss 5.6345 Epoch: 6 Global Step: 114050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:03:42,701-Speed 3231.17 samples/sec Loss 5.6698 Epoch: 6 Global Step: 114100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:03:58,584-Speed 3223.60 samples/sec Loss 5.5973 Epoch: 6 Global Step: 114150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:04:14,335-Speed 3250.68 samples/sec Loss 5.5712 Epoch: 6 Global Step: 114200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:04:30,390-Speed 3189.12 samples/sec Loss 5.5895 Epoch: 6 Global Step: 114250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:04:46,216-Speed 3235.43 samples/sec Loss 5.6259 Epoch: 6 Global Step: 114300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:05:02,157-Speed 3211.81 samples/sec Loss 5.6129 Epoch: 6 Global Step: 114350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:05:18,406-Speed 3151.05 samples/sec Loss 5.6160 Epoch: 6 Global Step: 114400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:05:34,531-Speed 3175.31 samples/sec Loss 5.6767 Epoch: 6 Global Step: 114450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:05:51,258-Speed 3061.11 samples/sec Loss 5.6073 Epoch: 6 Global Step: 114500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:06:07,189-Speed 3213.85 samples/sec Loss 5.6005 Epoch: 6 Global Step: 114550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:06:23,176-Speed 3202.65 samples/sec Loss 5.5212 Epoch: 6 Global Step: 114600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:06:39,051-Speed 3225.37 samples/sec Loss 5.6544 Epoch: 6 Global Step: 114650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:06:55,884-Speed 3041.79 samples/sec Loss 5.6286 Epoch: 6 Global Step: 114700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:07:12,545-Speed 3073.13 samples/sec Loss 5.6243 Epoch: 6 Global Step: 114750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:07:30,300-Speed 2883.68 samples/sec Loss 5.6050 Epoch: 6 Global Step: 114800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:07:46,244-Speed 3211.40 samples/sec Loss 5.6307 Epoch: 6 Global Step: 114850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:08:02,126-Speed 3223.87 samples/sec Loss 5.6467 Epoch: 6 Global Step: 114900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:08:18,073-Speed 3210.63 samples/sec Loss 5.6063 Epoch: 6 Global Step: 114950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:08:33,977-Speed 3219.52 samples/sec Loss 5.5841 Epoch: 6 Global Step: 115000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:08:49,686-Speed 3259.31 samples/sec Loss 5.6084 Epoch: 6 Global Step: 115050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:09:05,525-Speed 3232.73 samples/sec Loss 5.6044 Epoch: 6 Global Step: 115100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:09:21,496-Speed 3205.77 samples/sec Loss 5.6449 Epoch: 6 Global Step: 115150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:09:37,531-Speed 3193.24 samples/sec Loss 5.6112 Epoch: 6 Global Step: 115200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:09:53,435-Speed 3219.36 samples/sec Loss 5.6522 Epoch: 6 Global Step: 115250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:10:09,211-Speed 3245.45 samples/sec Loss 5.6596 Epoch: 6 Global Step: 115300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:10:25,026-Speed 3237.52 samples/sec Loss 5.5970 Epoch: 6 Global Step: 115350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:10:41,927-Speed 3029.51 samples/sec Loss 5.5616 Epoch: 6 Global Step: 115400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:10:58,586-Speed 3073.52 samples/sec Loss 5.6170 Epoch: 6 Global Step: 115450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:11:14,500-Speed 3217.45 samples/sec Loss 5.6101 Epoch: 6 Global Step: 115500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:11:30,539-Speed 3192.32 samples/sec Loss 5.6177 Epoch: 6 Global Step: 115550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:11:46,254-Speed 3258.20 samples/sec Loss 5.5833 Epoch: 6 Global Step: 115600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:12:02,160-Speed 3219.08 samples/sec Loss 5.6394 Epoch: 6 Global Step: 115650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:12:18,191-Speed 3193.81 samples/sec Loss 5.5909 Epoch: 6 Global Step: 115700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:12:34,437-Speed 3151.57 samples/sec Loss 5.6104 Epoch: 6 Global Step: 115750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:12:50,124-Speed 3264.00 samples/sec Loss 5.6587 Epoch: 6 Global Step: 115800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:13:06,120-Speed 3200.86 samples/sec Loss 5.5989 Epoch: 6 Global Step: 115850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:13:21,971-Speed 3230.29 samples/sec Loss 5.6292 Epoch: 6 Global Step: 115900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:13:37,711-Speed 3253.02 samples/sec Loss 5.6114 Epoch: 6 Global Step: 115950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:13:53,747-Speed 3192.91 samples/sec Loss 5.6396 Epoch: 6 Global Step: 116000 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:14:47,197-[lfw][116000]XNorm: 23.849930 Training: 2021-03-16 13:14:47,198-[lfw][116000]Accuracy-Flip: 0.99583+-0.00271 Training: 2021-03-16 13:14:47,198-[lfw][116000]Accuracy-Highest: 0.99717 Training: 2021-03-16 13:15:49,755-[cfp_fp][116000]XNorm: 20.654094 Training: 2021-03-16 13:15:49,755-[cfp_fp][116000]Accuracy-Flip: 0.94886+-0.00845 Training: 2021-03-16 13:15:49,755-[cfp_fp][116000]Accuracy-Highest: 0.95757 Training: 2021-03-16 13:16:43,126-[agedb_30][116000]XNorm: 23.234454 Training: 2021-03-16 13:16:43,126-[agedb_30][116000]Accuracy-Flip: 0.96233+-0.00742 Training: 2021-03-16 13:16:43,126-[agedb_30][116000]Accuracy-Highest: 0.96850 Training: 2021-03-16 13:16:59,043-Speed 276.31 samples/sec Loss 5.6436 Epoch: 6 Global Step: 116050 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:17:15,015-Speed 3205.80 samples/sec Loss 5.5683 Epoch: 6 Global Step: 116100 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:17:31,803-Speed 3049.82 samples/sec Loss 5.5979 Epoch: 6 Global Step: 116150 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:17:47,704-Speed 3220.03 samples/sec Loss 5.6262 Epoch: 6 Global Step: 116200 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:18:03,496-Speed 3242.37 samples/sec Loss 5.5873 Epoch: 6 Global Step: 116250 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:18:19,435-Speed 3212.30 samples/sec Loss 5.6236 Epoch: 6 Global Step: 116300 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:18:35,421-Speed 3202.88 samples/sec Loss 5.5743 Epoch: 6 Global Step: 116350 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:18:51,175-Speed 3250.08 samples/sec Loss 5.7002 Epoch: 6 Global Step: 116400 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:19:07,176-Speed 3200.00 samples/sec Loss 5.5988 Epoch: 6 Global Step: 116450 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:19:23,120-Speed 3211.28 samples/sec Loss 5.5725 Epoch: 6 Global Step: 116500 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:19:38,907-Speed 3243.25 samples/sec Loss 5.6470 Epoch: 6 Global Step: 116550 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:19:54,952-Speed 3191.08 samples/sec Loss 5.6495 Epoch: 6 Global Step: 116600 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:20:11,592-Speed 3077.07 samples/sec Loss 5.6154 Epoch: 6 Global Step: 116650 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:20:27,414-Speed 3235.96 samples/sec Loss 5.6383 Epoch: 6 Global Step: 116700 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:20:43,541-Speed 3174.89 samples/sec Loss 5.5570 Epoch: 6 Global Step: 116750 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:20:59,484-Speed 3211.64 samples/sec Loss 5.6511 Epoch: 6 Global Step: 116800 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:21:29,991-Speed 1678.34 samples/sec Loss 5.3595 Epoch: 7 Global Step: 116850 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:21:46,016-Speed 3195.18 samples/sec Loss 4.0727 Epoch: 7 Global Step: 116900 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:22:03,666-Speed 2900.86 samples/sec Loss 3.7354 Epoch: 7 Global Step: 116950 Fp16 Grad Scale: 16384 Required: 25 hours Training: 2021-03-16 13:22:20,738-Speed 2999.19 samples/sec Loss 3.5491 Epoch: 7 Global Step: 117000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:22:36,481-Speed 3252.39 samples/sec Loss 3.5024 Epoch: 7 Global Step: 117050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:22:52,292-Speed 3238.24 samples/sec Loss 3.3829 Epoch: 7 Global Step: 117100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:23:08,415-Speed 3175.71 samples/sec Loss 3.3202 Epoch: 7 Global Step: 117150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:23:24,436-Speed 3196.05 samples/sec Loss 3.2830 Epoch: 7 Global Step: 117200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:23:40,333-Speed 3220.70 samples/sec Loss 3.2182 Epoch: 7 Global Step: 117250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:23:56,150-Speed 3237.23 samples/sec Loss 3.1952 Epoch: 7 Global Step: 117300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:24:12,105-Speed 3209.09 samples/sec Loss 3.1579 Epoch: 7 Global Step: 117350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:24:28,075-Speed 3206.09 samples/sec Loss 3.1522 Epoch: 7 Global Step: 117400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:24:44,180-Speed 3179.38 samples/sec Loss 3.0958 Epoch: 7 Global Step: 117450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:25:00,008-Speed 3234.80 samples/sec Loss 3.0465 Epoch: 7 Global Step: 117500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:25:16,803-Speed 3048.70 samples/sec Loss 3.0578 Epoch: 7 Global Step: 117550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:25:32,855-Speed 3189.62 samples/sec Loss 3.0015 Epoch: 7 Global Step: 117600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:25:48,744-Speed 3222.53 samples/sec Loss 2.9795 Epoch: 7 Global Step: 117650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:26:04,707-Speed 3207.44 samples/sec Loss 2.9930 Epoch: 7 Global Step: 117700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:26:21,959-Speed 2967.89 samples/sec Loss 2.9689 Epoch: 7 Global Step: 117750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:26:38,099-Speed 3172.24 samples/sec Loss 2.9112 Epoch: 7 Global Step: 117800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:26:54,045-Speed 3210.93 samples/sec Loss 2.9670 Epoch: 7 Global Step: 117850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:27:10,027-Speed 3203.85 samples/sec Loss 2.8982 Epoch: 7 Global Step: 117900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:27:26,132-Speed 3179.23 samples/sec Loss 2.9148 Epoch: 7 Global Step: 117950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:27:42,225-Speed 3181.65 samples/sec Loss 2.8127 Epoch: 7 Global Step: 118000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:28:35,623-[lfw][118000]XNorm: 22.650957 Training: 2021-03-16 13:28:35,623-[lfw][118000]Accuracy-Flip: 0.99767+-0.00300 Training: 2021-03-16 13:28:35,623-[lfw][118000]Accuracy-Highest: 0.99767 Training: 2021-03-16 13:29:37,470-[cfp_fp][118000]XNorm: 20.645051 Training: 2021-03-16 13:29:37,470-[cfp_fp][118000]Accuracy-Flip: 0.98143+-0.00682 Training: 2021-03-16 13:29:37,470-[cfp_fp][118000]Accuracy-Highest: 0.98143 Training: 2021-03-16 13:30:30,403-[agedb_30][118000]XNorm: 22.471236 Training: 2021-03-16 13:30:30,403-[agedb_30][118000]Accuracy-Flip: 0.97683+-0.00639 Training: 2021-03-16 13:30:30,403-[agedb_30][118000]Accuracy-Highest: 0.97683 Training: 2021-03-16 13:30:46,727-Speed 277.50 samples/sec Loss 2.8489 Epoch: 7 Global Step: 118050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:31:02,648-Speed 3216.02 samples/sec Loss 2.8298 Epoch: 7 Global Step: 118100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:31:18,423-Speed 3245.62 samples/sec Loss 2.8238 Epoch: 7 Global Step: 118150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:31:34,631-Speed 3159.09 samples/sec Loss 2.7955 Epoch: 7 Global Step: 118200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:31:50,781-Speed 3170.45 samples/sec Loss 2.7672 Epoch: 7 Global Step: 118250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:32:07,856-Speed 2998.52 samples/sec Loss 2.7757 Epoch: 7 Global Step: 118300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:32:23,734-Speed 3224.87 samples/sec Loss 2.7840 Epoch: 7 Global Step: 118350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:32:39,956-Speed 3156.34 samples/sec Loss 2.7335 Epoch: 7 Global Step: 118400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:32:56,321-Speed 3128.73 samples/sec Loss 2.7233 Epoch: 7 Global Step: 118450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:33:12,343-Speed 3195.55 samples/sec Loss 2.7016 Epoch: 7 Global Step: 118500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:33:28,245-Speed 3219.92 samples/sec Loss 2.6957 Epoch: 7 Global Step: 118550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:33:44,062-Speed 3237.02 samples/sec Loss 2.7248 Epoch: 7 Global Step: 118600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:33:59,827-Speed 3247.82 samples/sec Loss 2.6668 Epoch: 7 Global Step: 118650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:34:15,783-Speed 3208.94 samples/sec Loss 2.6549 Epoch: 7 Global Step: 118700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:34:31,649-Speed 3227.19 samples/sec Loss 2.6549 Epoch: 7 Global Step: 118750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:34:48,359-Speed 3064.03 samples/sec Loss 2.6495 Epoch: 7 Global Step: 118800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:35:04,578-Speed 3157.08 samples/sec Loss 2.6476 Epoch: 7 Global Step: 118850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:35:20,604-Speed 3194.79 samples/sec Loss 2.6621 Epoch: 7 Global Step: 118900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:35:36,466-Speed 3227.89 samples/sec Loss 2.6237 Epoch: 7 Global Step: 118950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:35:53,242-Speed 3052.07 samples/sec Loss 2.6299 Epoch: 7 Global Step: 119000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:36:09,257-Speed 3197.22 samples/sec Loss 2.6060 Epoch: 7 Global Step: 119050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:36:26,852-Speed 2909.88 samples/sec Loss 2.6182 Epoch: 7 Global Step: 119100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:36:42,695-Speed 3231.78 samples/sec Loss 2.5828 Epoch: 7 Global Step: 119150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:37:00,020-Speed 2955.37 samples/sec Loss 2.5924 Epoch: 7 Global Step: 119200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:37:16,054-Speed 3193.37 samples/sec Loss 2.5800 Epoch: 7 Global Step: 119250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:37:32,205-Speed 3170.11 samples/sec Loss 2.5591 Epoch: 7 Global Step: 119300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:37:48,274-Speed 3186.47 samples/sec Loss 2.5728 Epoch: 7 Global Step: 119350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:38:04,494-Speed 3156.72 samples/sec Loss 2.5226 Epoch: 7 Global Step: 119400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:38:20,632-Speed 3172.78 samples/sec Loss 2.5600 Epoch: 7 Global Step: 119450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:38:36,355-Speed 3256.38 samples/sec Loss 2.5432 Epoch: 7 Global Step: 119500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:38:52,099-Speed 3252.22 samples/sec Loss 2.5329 Epoch: 7 Global Step: 119550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:39:08,012-Speed 3217.62 samples/sec Loss 2.5157 Epoch: 7 Global Step: 119600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:39:24,013-Speed 3199.89 samples/sec Loss 2.4898 Epoch: 7 Global Step: 119650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:39:40,071-Speed 3188.41 samples/sec Loss 2.5255 Epoch: 7 Global Step: 119700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:39:57,000-Speed 3024.51 samples/sec Loss 2.4843 Epoch: 7 Global Step: 119750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:40:13,083-Speed 3183.54 samples/sec Loss 2.5055 Epoch: 7 Global Step: 119800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:40:28,752-Speed 3267.83 samples/sec Loss 2.4899 Epoch: 7 Global Step: 119850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:40:45,540-Speed 3049.86 samples/sec Loss 2.4676 Epoch: 7 Global Step: 119900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:41:01,365-Speed 3235.39 samples/sec Loss 2.4697 Epoch: 7 Global Step: 119950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:41:17,521-Speed 3169.32 samples/sec Loss 2.4537 Epoch: 7 Global Step: 120000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:42:10,770-[lfw][120000]XNorm: 22.569425 Training: 2021-03-16 13:42:10,770-[lfw][120000]Accuracy-Flip: 0.99733+-0.00291 Training: 2021-03-16 13:42:10,770-[lfw][120000]Accuracy-Highest: 0.99767 Training: 2021-03-16 13:43:12,874-[cfp_fp][120000]XNorm: 20.818332 Training: 2021-03-16 13:43:12,874-[cfp_fp][120000]Accuracy-Flip: 0.98586+-0.00580 Training: 2021-03-16 13:43:12,874-[cfp_fp][120000]Accuracy-Highest: 0.98586 Training: 2021-03-16 13:44:06,082-[agedb_30][120000]XNorm: 22.613479 Training: 2021-03-16 13:44:06,082-[agedb_30][120000]Accuracy-Flip: 0.97817+-0.00608 Training: 2021-03-16 13:44:06,082-[agedb_30][120000]Accuracy-Highest: 0.97817 Training: 2021-03-16 13:44:22,217-Speed 277.21 samples/sec Loss 2.4506 Epoch: 7 Global Step: 120050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:44:38,087-Speed 3226.22 samples/sec Loss 2.4362 Epoch: 7 Global Step: 120100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:44:54,053-Speed 3206.95 samples/sec Loss 2.4239 Epoch: 7 Global Step: 120150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:45:09,742-Speed 3263.52 samples/sec Loss 2.4475 Epoch: 7 Global Step: 120200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:45:25,749-Speed 3198.68 samples/sec Loss 2.4297 Epoch: 7 Global Step: 120250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:45:42,130-Speed 3125.82 samples/sec Loss 2.4156 Epoch: 7 Global Step: 120300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:45:57,954-Speed 3235.52 samples/sec Loss 2.4437 Epoch: 7 Global Step: 120350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:46:13,800-Speed 3231.28 samples/sec Loss 2.4444 Epoch: 7 Global Step: 120400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:46:30,555-Speed 3055.84 samples/sec Loss 2.4325 Epoch: 7 Global Step: 120450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:46:46,349-Speed 3241.99 samples/sec Loss 2.4076 Epoch: 7 Global Step: 120500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:47:02,391-Speed 3191.69 samples/sec Loss 2.3895 Epoch: 7 Global Step: 120550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:47:18,240-Speed 3230.52 samples/sec Loss 2.4218 Epoch: 7 Global Step: 120600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:47:34,466-Speed 3155.64 samples/sec Loss 2.4146 Epoch: 7 Global Step: 120650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:47:50,555-Speed 3182.32 samples/sec Loss 2.3854 Epoch: 7 Global Step: 120700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:48:06,333-Speed 3245.15 samples/sec Loss 2.3691 Epoch: 7 Global Step: 120750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:48:22,045-Speed 3258.66 samples/sec Loss 2.3964 Epoch: 7 Global Step: 120800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:48:38,104-Speed 3188.34 samples/sec Loss 2.3606 Epoch: 7 Global Step: 120850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:48:54,131-Speed 3194.76 samples/sec Loss 2.3668 Epoch: 7 Global Step: 120900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:49:10,345-Speed 3157.98 samples/sec Loss 2.3373 Epoch: 7 Global Step: 120950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:49:27,053-Speed 3064.37 samples/sec Loss 2.3223 Epoch: 7 Global Step: 121000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:49:43,152-Speed 3180.48 samples/sec Loss 2.3415 Epoch: 7 Global Step: 121050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:49:59,303-Speed 3170.22 samples/sec Loss 2.3464 Epoch: 7 Global Step: 121100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:50:15,249-Speed 3210.91 samples/sec Loss 2.3591 Epoch: 7 Global Step: 121150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:50:31,333-Speed 3183.43 samples/sec Loss 2.3581 Epoch: 7 Global Step: 121200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:50:48,032-Speed 3066.23 samples/sec Loss 2.3518 Epoch: 7 Global Step: 121250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:51:04,735-Speed 3065.24 samples/sec Loss 2.2967 Epoch: 7 Global Step: 121300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:51:22,623-Speed 2862.47 samples/sec Loss 2.3591 Epoch: 7 Global Step: 121350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:51:38,458-Speed 3233.37 samples/sec Loss 2.3398 Epoch: 7 Global Step: 121400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:51:54,238-Speed 3244.63 samples/sec Loss 2.3259 Epoch: 7 Global Step: 121450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:52:10,367-Speed 3174.64 samples/sec Loss 2.3161 Epoch: 7 Global Step: 121500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:52:26,158-Speed 3242.43 samples/sec Loss 2.3340 Epoch: 7 Global Step: 121550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:52:42,035-Speed 3224.82 samples/sec Loss 2.2970 Epoch: 7 Global Step: 121600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:52:57,974-Speed 3212.30 samples/sec Loss 2.3166 Epoch: 7 Global Step: 121650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:53:13,934-Speed 3208.14 samples/sec Loss 2.3049 Epoch: 7 Global Step: 121700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:53:30,129-Speed 3161.66 samples/sec Loss 2.2847 Epoch: 7 Global Step: 121750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:53:46,012-Speed 3223.56 samples/sec Loss 2.2969 Epoch: 7 Global Step: 121800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:54:02,198-Speed 3163.43 samples/sec Loss 2.2933 Epoch: 7 Global Step: 121850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:54:17,940-Speed 3252.55 samples/sec Loss 2.3182 Epoch: 7 Global Step: 121900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:54:33,796-Speed 3229.06 samples/sec Loss 2.3094 Epoch: 7 Global Step: 121950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:54:50,635-Speed 3040.69 samples/sec Loss 2.2970 Epoch: 7 Global Step: 122000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:55:43,628-[lfw][122000]XNorm: 22.192981 Training: 2021-03-16 13:55:43,628-[lfw][122000]Accuracy-Flip: 0.99767+-0.00300 Training: 2021-03-16 13:55:43,628-[lfw][122000]Accuracy-Highest: 0.99767 Training: 2021-03-16 13:56:45,283-[cfp_fp][122000]XNorm: 20.659423 Training: 2021-03-16 13:56:45,283-[cfp_fp][122000]Accuracy-Flip: 0.98357+-0.00614 Training: 2021-03-16 13:56:45,283-[cfp_fp][122000]Accuracy-Highest: 0.98586 Training: 2021-03-16 13:57:38,303-[agedb_30][122000]XNorm: 22.166280 Training: 2021-03-16 13:57:38,303-[agedb_30][122000]Accuracy-Flip: 0.97750+-0.00696 Training: 2021-03-16 13:57:38,303-[agedb_30][122000]Accuracy-Highest: 0.97817 Training: 2021-03-16 13:57:54,097-Speed 279.08 samples/sec Loss 2.3041 Epoch: 7 Global Step: 122050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:58:09,955-Speed 3228.82 samples/sec Loss 2.2870 Epoch: 7 Global Step: 122100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:58:25,944-Speed 3202.31 samples/sec Loss 2.2967 Epoch: 7 Global Step: 122150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:58:42,758-Speed 3045.13 samples/sec Loss 2.2633 Epoch: 7 Global Step: 122200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:58:58,831-Speed 3185.63 samples/sec Loss 2.3074 Epoch: 7 Global Step: 122250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:59:14,657-Speed 3235.15 samples/sec Loss 2.2666 Epoch: 7 Global Step: 122300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:59:30,720-Speed 3187.49 samples/sec Loss 2.2836 Epoch: 7 Global Step: 122350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 13:59:46,451-Speed 3254.84 samples/sec Loss 2.2793 Epoch: 7 Global Step: 122400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:00:02,317-Speed 3227.12 samples/sec Loss 2.2746 Epoch: 7 Global Step: 122450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:00:18,477-Speed 3168.56 samples/sec Loss 2.2558 Epoch: 7 Global Step: 122500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:00:34,677-Speed 3160.48 samples/sec Loss 2.2562 Epoch: 7 Global Step: 122550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:00:51,261-Speed 3087.41 samples/sec Loss 2.2604 Epoch: 7 Global Step: 122600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:01:07,109-Speed 3230.81 samples/sec Loss 2.2813 Epoch: 7 Global Step: 122650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:01:23,125-Speed 3196.94 samples/sec Loss 2.2645 Epoch: 7 Global Step: 122700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:01:39,188-Speed 3187.53 samples/sec Loss 2.2582 Epoch: 7 Global Step: 122750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:01:55,109-Speed 3216.14 samples/sec Loss 2.2359 Epoch: 7 Global Step: 122800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:02:11,244-Speed 3173.27 samples/sec Loss 2.2162 Epoch: 7 Global Step: 122850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:02:27,048-Speed 3239.71 samples/sec Loss 2.2673 Epoch: 7 Global Step: 122900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:02:43,292-Speed 3152.13 samples/sec Loss 2.2279 Epoch: 7 Global Step: 122950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:02:59,109-Speed 3237.17 samples/sec Loss 2.2416 Epoch: 7 Global Step: 123000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:03:15,024-Speed 3217.22 samples/sec Loss 2.2524 Epoch: 7 Global Step: 123050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:03:31,022-Speed 3200.42 samples/sec Loss 2.2510 Epoch: 7 Global Step: 123100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:03:47,010-Speed 3202.48 samples/sec Loss 2.2444 Epoch: 7 Global Step: 123150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:04:02,943-Speed 3213.67 samples/sec Loss 2.2457 Epoch: 7 Global Step: 123200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:04:19,702-Speed 3055.01 samples/sec Loss 2.2327 Epoch: 7 Global Step: 123250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:04:35,626-Speed 3215.50 samples/sec Loss 2.2062 Epoch: 7 Global Step: 123300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:04:51,401-Speed 3245.77 samples/sec Loss 2.2507 Epoch: 7 Global Step: 123350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:05:07,079-Speed 3265.83 samples/sec Loss 2.2410 Epoch: 7 Global Step: 123400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:05:23,886-Speed 3046.32 samples/sec Loss 2.2375 Epoch: 7 Global Step: 123450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:05:41,360-Speed 2930.13 samples/sec Loss 2.2223 Epoch: 7 Global Step: 123500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:05:57,415-Speed 3189.16 samples/sec Loss 2.1971 Epoch: 7 Global Step: 123550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:06:14,105-Speed 3067.81 samples/sec Loss 2.2245 Epoch: 7 Global Step: 123600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:06:30,130-Speed 3195.25 samples/sec Loss 2.2289 Epoch: 7 Global Step: 123650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:06:46,020-Speed 3222.30 samples/sec Loss 2.2422 Epoch: 7 Global Step: 123700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:07:02,019-Speed 3200.14 samples/sec Loss 2.2393 Epoch: 7 Global Step: 123750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:07:17,950-Speed 3214.02 samples/sec Loss 2.2233 Epoch: 7 Global Step: 123800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:07:33,962-Speed 3197.78 samples/sec Loss 2.2308 Epoch: 7 Global Step: 123850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:07:49,992-Speed 3194.04 samples/sec Loss 2.2118 Epoch: 7 Global Step: 123900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:08:06,097-Speed 3179.31 samples/sec Loss 2.2005 Epoch: 7 Global Step: 123950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:08:21,929-Speed 3234.06 samples/sec Loss 2.2311 Epoch: 7 Global Step: 124000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:09:15,057-[lfw][124000]XNorm: 22.540994 Training: 2021-03-16 14:09:15,058-[lfw][124000]Accuracy-Flip: 0.99733+-0.00249 Training: 2021-03-16 14:09:15,058-[lfw][124000]Accuracy-Highest: 0.99767 Training: 2021-03-16 14:10:17,083-[cfp_fp][124000]XNorm: 20.601274 Training: 2021-03-16 14:10:17,083-[cfp_fp][124000]Accuracy-Flip: 0.98571+-0.00679 Training: 2021-03-16 14:10:17,083-[cfp_fp][124000]Accuracy-Highest: 0.98586 Training: 2021-03-16 14:11:10,166-[agedb_30][124000]XNorm: 22.339273 Training: 2021-03-16 14:11:10,166-[agedb_30][124000]Accuracy-Flip: 0.97883+-0.00654 Training: 2021-03-16 14:11:10,167-[agedb_30][124000]Accuracy-Highest: 0.97883 Training: 2021-03-16 14:11:25,861-Speed 278.36 samples/sec Loss 2.2038 Epoch: 7 Global Step: 124050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:11:41,816-Speed 3209.15 samples/sec Loss 2.1847 Epoch: 7 Global Step: 124100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:11:57,900-Speed 3183.32 samples/sec Loss 2.2069 Epoch: 7 Global Step: 124150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:12:13,946-Speed 3190.91 samples/sec Loss 2.2241 Epoch: 7 Global Step: 124200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:12:31,057-Speed 2992.40 samples/sec Loss 2.1974 Epoch: 7 Global Step: 124250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:12:47,336-Speed 3145.32 samples/sec Loss 2.2181 Epoch: 7 Global Step: 124300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:13:04,135-Speed 3047.94 samples/sec Loss 2.1990 Epoch: 7 Global Step: 124350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:13:20,519-Speed 3125.04 samples/sec Loss 2.2125 Epoch: 7 Global Step: 124400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:13:36,614-Speed 3181.25 samples/sec Loss 2.2353 Epoch: 7 Global Step: 124450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:13:52,698-Speed 3183.23 samples/sec Loss 2.1916 Epoch: 7 Global Step: 124500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:14:08,662-Speed 3207.34 samples/sec Loss 2.1771 Epoch: 7 Global Step: 124550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:14:24,931-Speed 3147.19 samples/sec Loss 2.1771 Epoch: 7 Global Step: 124600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:14:40,925-Speed 3201.34 samples/sec Loss 2.2268 Epoch: 7 Global Step: 124650 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:14:56,846-Speed 3215.89 samples/sec Loss 2.1908 Epoch: 7 Global Step: 124700 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:15:12,875-Speed 3194.41 samples/sec Loss 2.1951 Epoch: 7 Global Step: 124750 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:15:29,828-Speed 3020.20 samples/sec Loss 2.1803 Epoch: 7 Global Step: 124800 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:15:45,953-Speed 3175.27 samples/sec Loss 2.1651 Epoch: 7 Global Step: 124850 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:16:01,704-Speed 3250.79 samples/sec Loss 2.1754 Epoch: 7 Global Step: 124900 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:16:17,989-Speed 3143.94 samples/sec Loss 2.1659 Epoch: 7 Global Step: 124950 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:16:34,066-Speed 3184.91 samples/sec Loss 2.1649 Epoch: 7 Global Step: 125000 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:16:49,779-Speed 3258.48 samples/sec Loss 2.2127 Epoch: 7 Global Step: 125050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:17:05,728-Speed 3210.23 samples/sec Loss 2.1947 Epoch: 7 Global Step: 125100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:17:21,963-Speed 3153.82 samples/sec Loss 2.2048 Epoch: 7 Global Step: 125150 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:17:37,706-Speed 3252.33 samples/sec Loss 2.1619 Epoch: 7 Global Step: 125200 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:17:53,646-Speed 3212.19 samples/sec Loss 2.1728 Epoch: 7 Global Step: 125250 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:18:09,590-Speed 3211.26 samples/sec Loss 2.2069 Epoch: 7 Global Step: 125300 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:18:25,535-Speed 3211.27 samples/sec Loss 2.1890 Epoch: 7 Global Step: 125350 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:18:41,468-Speed 3213.55 samples/sec Loss 2.1514 Epoch: 7 Global Step: 125400 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:18:57,554-Speed 3182.86 samples/sec Loss 2.1442 Epoch: 7 Global Step: 125450 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:19:14,213-Speed 3073.45 samples/sec Loss 2.1291 Epoch: 7 Global Step: 125500 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:19:30,322-Speed 3178.62 samples/sec Loss 2.1693 Epoch: 7 Global Step: 125550 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:19:47,332-Speed 3010.07 samples/sec Loss 2.1747 Epoch: 7 Global Step: 125600 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:20:03,435-Speed 3179.55 samples/sec Loss 2.1840 Epoch: 7 Global Step: 125650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:20:20,442-Speed 3010.71 samples/sec Loss 2.1568 Epoch: 7 Global Step: 125700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:20:37,050-Speed 3082.88 samples/sec Loss 2.1584 Epoch: 7 Global Step: 125750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:20:53,199-Speed 3170.65 samples/sec Loss 2.1315 Epoch: 7 Global Step: 125800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:21:10,211-Speed 3009.69 samples/sec Loss 2.1314 Epoch: 7 Global Step: 125850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:21:26,222-Speed 3197.81 samples/sec Loss 2.1755 Epoch: 7 Global Step: 125900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:21:42,063-Speed 3232.30 samples/sec Loss 2.1985 Epoch: 7 Global Step: 125950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:21:57,850-Speed 3243.23 samples/sec Loss 2.1784 Epoch: 7 Global Step: 126000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:22:50,813-[lfw][126000]XNorm: 22.602396 Training: 2021-03-16 14:22:50,814-[lfw][126000]Accuracy-Flip: 0.99700+-0.00267 Training: 2021-03-16 14:22:50,814-[lfw][126000]Accuracy-Highest: 0.99767 Training: 2021-03-16 14:23:52,726-[cfp_fp][126000]XNorm: 21.055857 Training: 2021-03-16 14:23:52,726-[cfp_fp][126000]Accuracy-Flip: 0.98129+-0.00718 Training: 2021-03-16 14:23:52,727-[cfp_fp][126000]Accuracy-Highest: 0.98586 Training: 2021-03-16 14:24:45,795-[agedb_30][126000]XNorm: 22.484362 Training: 2021-03-16 14:24:45,795-[agedb_30][126000]Accuracy-Flip: 0.97567+-0.00616 Training: 2021-03-16 14:24:45,796-[agedb_30][126000]Accuracy-Highest: 0.97883 Training: 2021-03-16 14:25:01,924-Speed 278.15 samples/sec Loss 2.1448 Epoch: 7 Global Step: 126050 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:25:17,803-Speed 3224.48 samples/sec Loss 2.1567 Epoch: 7 Global Step: 126100 Fp16 Grad Scale: 16384 Required: 24 hours Training: 2021-03-16 14:25:33,724-Speed 3215.97 samples/sec Loss 2.1495 Epoch: 7 Global Step: 126150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:25:49,547-Speed 3235.90 samples/sec Loss 2.1671 Epoch: 7 Global Step: 126200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:26:05,453-Speed 3219.08 samples/sec Loss 2.1647 Epoch: 7 Global Step: 126250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:26:21,392-Speed 3212.33 samples/sec Loss 2.1465 Epoch: 7 Global Step: 126300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:26:37,364-Speed 3205.69 samples/sec Loss 2.1443 Epoch: 7 Global Step: 126350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:26:54,212-Speed 3039.05 samples/sec Loss 2.1381 Epoch: 7 Global Step: 126400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:27:09,960-Speed 3251.24 samples/sec Loss 2.1212 Epoch: 7 Global Step: 126450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:27:26,579-Speed 3080.93 samples/sec Loss 2.1451 Epoch: 7 Global Step: 126500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:27:42,865-Speed 3143.90 samples/sec Loss 2.1155 Epoch: 7 Global Step: 126550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:27:58,941-Speed 3185.01 samples/sec Loss 2.1698 Epoch: 7 Global Step: 126600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:28:14,858-Speed 3216.67 samples/sec Loss 2.1401 Epoch: 7 Global Step: 126650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:28:30,683-Speed 3235.47 samples/sec Loss 2.1289 Epoch: 7 Global Step: 126700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:28:46,522-Speed 3232.71 samples/sec Loss 2.1390 Epoch: 7 Global Step: 126750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:29:02,545-Speed 3195.37 samples/sec Loss 2.1354 Epoch: 7 Global Step: 126800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:29:18,572-Speed 3194.83 samples/sec Loss 2.1448 Epoch: 7 Global Step: 126850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:29:34,421-Speed 3230.67 samples/sec Loss 2.1423 Epoch: 7 Global Step: 126900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:29:51,035-Speed 3081.78 samples/sec Loss 2.1247 Epoch: 7 Global Step: 126950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:30:07,030-Speed 3201.02 samples/sec Loss 2.1234 Epoch: 7 Global Step: 127000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:30:23,054-Speed 3195.22 samples/sec Loss 2.1296 Epoch: 7 Global Step: 127050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:30:38,833-Speed 3244.96 samples/sec Loss 2.1260 Epoch: 7 Global Step: 127100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:30:54,966-Speed 3173.85 samples/sec Loss 2.1276 Epoch: 7 Global Step: 127150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:31:10,758-Speed 3242.19 samples/sec Loss 2.1614 Epoch: 7 Global Step: 127200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:31:26,630-Speed 3225.83 samples/sec Loss 2.1059 Epoch: 7 Global Step: 127250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:31:42,739-Speed 3178.48 samples/sec Loss 2.0959 Epoch: 7 Global Step: 127300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:31:58,605-Speed 3227.13 samples/sec Loss 2.1661 Epoch: 7 Global Step: 127350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:32:14,562-Speed 3208.76 samples/sec Loss 2.1125 Epoch: 7 Global Step: 127400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:32:30,415-Speed 3229.80 samples/sec Loss 2.0962 Epoch: 7 Global Step: 127450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:32:46,426-Speed 3197.79 samples/sec Loss 2.1274 Epoch: 7 Global Step: 127500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:33:02,474-Speed 3190.48 samples/sec Loss 2.1002 Epoch: 7 Global Step: 127550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:33:18,240-Speed 3247.58 samples/sec Loss 2.1001 Epoch: 7 Global Step: 127600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:33:34,820-Speed 3088.19 samples/sec Loss 2.1033 Epoch: 7 Global Step: 127650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:33:51,198-Speed 3126.24 samples/sec Loss 2.1073 Epoch: 7 Global Step: 127700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:34:07,263-Speed 3187.16 samples/sec Loss 2.1160 Epoch: 7 Global Step: 127750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:34:23,622-Speed 3129.85 samples/sec Loss 2.1083 Epoch: 7 Global Step: 127800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:34:39,738-Speed 3177.23 samples/sec Loss 2.1123 Epoch: 7 Global Step: 127850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:34:57,165-Speed 2937.98 samples/sec Loss 2.1168 Epoch: 7 Global Step: 127900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:35:13,875-Speed 3064.07 samples/sec Loss 2.0995 Epoch: 7 Global Step: 127950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:35:30,917-Speed 3004.54 samples/sec Loss 2.0926 Epoch: 7 Global Step: 128000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:36:24,050-[lfw][128000]XNorm: 21.980892 Training: 2021-03-16 14:36:24,051-[lfw][128000]Accuracy-Flip: 0.99783+-0.00236 Training: 2021-03-16 14:36:24,051-[lfw][128000]Accuracy-Highest: 0.99783 Training: 2021-03-16 14:37:26,082-[cfp_fp][128000]XNorm: 20.500718 Training: 2021-03-16 14:37:26,082-[cfp_fp][128000]Accuracy-Flip: 0.98500+-0.00630 Training: 2021-03-16 14:37:26,082-[cfp_fp][128000]Accuracy-Highest: 0.98586 Training: 2021-03-16 14:38:19,394-[agedb_30][128000]XNorm: 22.300678 Training: 2021-03-16 14:38:19,394-[agedb_30][128000]Accuracy-Flip: 0.97767+-0.00593 Training: 2021-03-16 14:38:19,395-[agedb_30][128000]Accuracy-Highest: 0.97883 Training: 2021-03-16 14:38:35,413-Speed 277.51 samples/sec Loss 2.0817 Epoch: 7 Global Step: 128050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:38:52,117-Speed 3065.09 samples/sec Loss 2.0780 Epoch: 7 Global Step: 128100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:39:08,390-Speed 3146.47 samples/sec Loss 2.0966 Epoch: 7 Global Step: 128150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:39:24,155-Speed 3247.92 samples/sec Loss 2.0855 Epoch: 7 Global Step: 128200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:39:40,094-Speed 3212.35 samples/sec Loss 2.1256 Epoch: 7 Global Step: 128250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:39:56,135-Speed 3191.87 samples/sec Loss 2.1010 Epoch: 7 Global Step: 128300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:40:12,244-Speed 3178.35 samples/sec Loss 2.1030 Epoch: 7 Global Step: 128350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:40:28,048-Speed 3239.88 samples/sec Loss 2.1223 Epoch: 7 Global Step: 128400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:40:44,153-Speed 3179.09 samples/sec Loss 2.0966 Epoch: 7 Global Step: 128450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:41:00,072-Speed 3216.46 samples/sec Loss 2.0912 Epoch: 7 Global Step: 128500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:41:16,165-Speed 3181.71 samples/sec Loss 2.0818 Epoch: 7 Global Step: 128550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:41:33,142-Speed 3015.82 samples/sec Loss 2.1035 Epoch: 7 Global Step: 128600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:41:49,592-Speed 3112.58 samples/sec Loss 2.0553 Epoch: 7 Global Step: 128650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:42:06,435-Speed 3039.98 samples/sec Loss 2.0760 Epoch: 7 Global Step: 128700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:42:22,541-Speed 3178.92 samples/sec Loss 2.0783 Epoch: 7 Global Step: 128750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:42:38,390-Speed 3230.58 samples/sec Loss 2.0761 Epoch: 7 Global Step: 128800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:42:54,314-Speed 3215.54 samples/sec Loss 2.0715 Epoch: 7 Global Step: 128850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:43:10,498-Speed 3163.69 samples/sec Loss 2.0818 Epoch: 7 Global Step: 128900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:43:26,602-Speed 3179.34 samples/sec Loss 2.0971 Epoch: 7 Global Step: 128950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:43:42,679-Speed 3184.93 samples/sec Loss 2.0542 Epoch: 7 Global Step: 129000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:43:58,674-Speed 3200.97 samples/sec Loss 2.0632 Epoch: 7 Global Step: 129050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:44:15,402-Speed 3060.84 samples/sec Loss 2.0837 Epoch: 7 Global Step: 129100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:44:31,204-Speed 3240.23 samples/sec Loss 2.0683 Epoch: 7 Global Step: 129150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:44:47,004-Speed 3240.63 samples/sec Loss 2.0743 Epoch: 7 Global Step: 129200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:45:02,990-Speed 3202.76 samples/sec Loss 2.0880 Epoch: 7 Global Step: 129250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:45:18,895-Speed 3219.22 samples/sec Loss 2.0857 Epoch: 7 Global Step: 129300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:45:34,981-Speed 3183.15 samples/sec Loss 2.0725 Epoch: 7 Global Step: 129350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:45:50,858-Speed 3224.74 samples/sec Loss 2.0731 Epoch: 7 Global Step: 129400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:46:07,331-Speed 3108.29 samples/sec Loss 2.0630 Epoch: 7 Global Step: 129450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:46:23,495-Speed 3167.50 samples/sec Loss 2.0674 Epoch: 7 Global Step: 129500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:46:39,461-Speed 3207.08 samples/sec Loss 2.0517 Epoch: 7 Global Step: 129550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:46:55,331-Speed 3226.23 samples/sec Loss 2.0417 Epoch: 7 Global Step: 129600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:47:11,459-Speed 3174.73 samples/sec Loss 2.0579 Epoch: 7 Global Step: 129650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:47:27,375-Speed 3216.90 samples/sec Loss 2.0490 Epoch: 7 Global Step: 129700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:47:43,497-Speed 3176.04 samples/sec Loss 2.0640 Epoch: 7 Global Step: 129750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:47:59,623-Speed 3175.05 samples/sec Loss 2.0581 Epoch: 7 Global Step: 129800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:48:16,599-Speed 3016.16 samples/sec Loss 2.0549 Epoch: 7 Global Step: 129850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:48:32,621-Speed 3195.55 samples/sec Loss 2.0636 Epoch: 7 Global Step: 129900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:48:48,462-Speed 3232.34 samples/sec Loss 2.0399 Epoch: 7 Global Step: 129950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:49:04,515-Speed 3189.48 samples/sec Loss 2.0450 Epoch: 7 Global Step: 130000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:49:57,655-[lfw][130000]XNorm: 23.881637 Training: 2021-03-16 14:49:57,656-[lfw][130000]Accuracy-Flip: 0.99750+-0.00291 Training: 2021-03-16 14:49:57,656-[lfw][130000]Accuracy-Highest: 0.99783 Training: 2021-03-16 14:50:59,478-[cfp_fp][130000]XNorm: 22.139364 Training: 2021-03-16 14:50:59,479-[cfp_fp][130000]Accuracy-Flip: 0.98243+-0.00553 Training: 2021-03-16 14:50:59,479-[cfp_fp][130000]Accuracy-Highest: 0.98586 Training: 2021-03-16 14:51:52,504-[agedb_30][130000]XNorm: 23.656091 Training: 2021-03-16 14:51:52,504-[agedb_30][130000]Accuracy-Flip: 0.97817+-0.00598 Training: 2021-03-16 14:51:52,504-[agedb_30][130000]Accuracy-Highest: 0.97883 Training: 2021-03-16 14:52:09,325-Speed 277.04 samples/sec Loss 2.0307 Epoch: 7 Global Step: 130050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:52:25,949-Speed 3079.99 samples/sec Loss 2.0462 Epoch: 7 Global Step: 130100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:52:41,824-Speed 3225.41 samples/sec Loss 2.0308 Epoch: 7 Global Step: 130150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:52:57,716-Speed 3221.86 samples/sec Loss 2.0181 Epoch: 7 Global Step: 130200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:53:14,575-Speed 3037.07 samples/sec Loss 2.0219 Epoch: 7 Global Step: 130250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:53:30,798-Speed 3156.14 samples/sec Loss 2.0173 Epoch: 7 Global Step: 130300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:53:47,740-Speed 3022.09 samples/sec Loss 2.0060 Epoch: 7 Global Step: 130350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:54:03,505-Speed 3247.76 samples/sec Loss 2.0044 Epoch: 7 Global Step: 130400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:54:19,665-Speed 3168.47 samples/sec Loss 2.0176 Epoch: 7 Global Step: 130450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:54:35,765-Speed 3180.22 samples/sec Loss 2.0629 Epoch: 7 Global Step: 130500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:54:51,410-Speed 3272.73 samples/sec Loss 2.0092 Epoch: 7 Global Step: 130550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:55:07,681-Speed 3146.83 samples/sec Loss 2.0623 Epoch: 7 Global Step: 130600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:55:23,842-Speed 3168.07 samples/sec Loss 2.0093 Epoch: 7 Global Step: 130650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:55:40,034-Speed 3162.31 samples/sec Loss 2.0264 Epoch: 7 Global Step: 130700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:55:57,171-Speed 2987.64 samples/sec Loss 2.0292 Epoch: 7 Global Step: 130750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:56:13,217-Speed 3191.02 samples/sec Loss 2.0146 Epoch: 7 Global Step: 130800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:56:29,161-Speed 3211.36 samples/sec Loss 2.0262 Epoch: 7 Global Step: 130850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:56:45,272-Speed 3178.05 samples/sec Loss 1.9991 Epoch: 7 Global Step: 130900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:57:02,092-Speed 3043.99 samples/sec Loss 2.0009 Epoch: 7 Global Step: 130950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:57:18,349-Speed 3149.46 samples/sec Loss 2.0242 Epoch: 7 Global Step: 131000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:57:34,693-Speed 3132.82 samples/sec Loss 1.9925 Epoch: 7 Global Step: 131050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:57:50,598-Speed 3219.22 samples/sec Loss 2.0054 Epoch: 7 Global Step: 131100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:58:06,440-Speed 3232.09 samples/sec Loss 2.0062 Epoch: 7 Global Step: 131150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:58:22,252-Speed 3238.02 samples/sec Loss 2.0184 Epoch: 7 Global Step: 131200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:58:39,145-Speed 3030.97 samples/sec Loss 1.9955 Epoch: 7 Global Step: 131250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:58:54,964-Speed 3236.63 samples/sec Loss 2.0129 Epoch: 7 Global Step: 131300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:59:10,893-Speed 3214.42 samples/sec Loss 2.0148 Epoch: 7 Global Step: 131350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:59:26,798-Speed 3219.18 samples/sec Loss 1.9905 Epoch: 7 Global Step: 131400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:59:42,613-Speed 3237.50 samples/sec Loss 2.0076 Epoch: 7 Global Step: 131450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 14:59:58,772-Speed 3168.61 samples/sec Loss 2.0087 Epoch: 7 Global Step: 131500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:00:14,711-Speed 3212.34 samples/sec Loss 2.0133 Epoch: 7 Global Step: 131550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:00:30,676-Speed 3207.16 samples/sec Loss 2.0266 Epoch: 7 Global Step: 131600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:00:46,933-Speed 3149.58 samples/sec Loss 1.9961 Epoch: 7 Global Step: 131650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:01:03,132-Speed 3160.71 samples/sec Loss 2.0004 Epoch: 7 Global Step: 131700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:01:19,000-Speed 3226.82 samples/sec Loss 2.0027 Epoch: 7 Global Step: 131750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:01:35,046-Speed 3190.92 samples/sec Loss 1.9884 Epoch: 7 Global Step: 131800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:01:51,049-Speed 3199.45 samples/sec Loss 1.9876 Epoch: 7 Global Step: 131850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:02:07,057-Speed 3198.44 samples/sec Loss 1.9858 Epoch: 7 Global Step: 131900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:02:23,188-Speed 3174.19 samples/sec Loss 2.0074 Epoch: 7 Global Step: 131950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:02:39,419-Speed 3154.60 samples/sec Loss 2.0069 Epoch: 7 Global Step: 132000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:03:32,828-[lfw][132000]XNorm: 23.318636 Training: 2021-03-16 15:03:32,828-[lfw][132000]Accuracy-Flip: 0.99750+-0.00291 Training: 2021-03-16 15:03:32,828-[lfw][132000]Accuracy-Highest: 0.99783 Training: 2021-03-16 15:04:34,880-[cfp_fp][132000]XNorm: 21.643806 Training: 2021-03-16 15:04:34,880-[cfp_fp][132000]Accuracy-Flip: 0.98186+-0.00623 Training: 2021-03-16 15:04:34,880-[cfp_fp][132000]Accuracy-Highest: 0.98586 Training: 2021-03-16 15:05:28,462-[agedb_30][132000]XNorm: 23.193969 Training: 2021-03-16 15:05:28,462-[agedb_30][132000]Accuracy-Flip: 0.97600+-0.00727 Training: 2021-03-16 15:05:28,462-[agedb_30][132000]Accuracy-Highest: 0.97883 Training: 2021-03-16 15:05:44,532-Speed 276.59 samples/sec Loss 1.9875 Epoch: 7 Global Step: 132050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:06:01,553-Speed 3008.10 samples/sec Loss 1.9959 Epoch: 7 Global Step: 132100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:06:18,078-Speed 3098.35 samples/sec Loss 1.9678 Epoch: 7 Global Step: 132150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:06:34,609-Speed 3097.42 samples/sec Loss 1.9887 Epoch: 7 Global Step: 132200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:06:50,684-Speed 3185.10 samples/sec Loss 1.9618 Epoch: 7 Global Step: 132250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:07:06,588-Speed 3219.44 samples/sec Loss 1.9825 Epoch: 7 Global Step: 132300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:07:23,637-Speed 3003.25 samples/sec Loss 1.9871 Epoch: 7 Global Step: 132350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:07:39,679-Speed 3191.71 samples/sec Loss 1.9910 Epoch: 7 Global Step: 132400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:07:55,872-Speed 3161.91 samples/sec Loss 1.9861 Epoch: 7 Global Step: 132450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:08:13,575-Speed 2892.25 samples/sec Loss 1.9696 Epoch: 7 Global Step: 132500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:08:29,408-Speed 3233.91 samples/sec Loss 1.9652 Epoch: 7 Global Step: 132550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:08:45,485-Speed 3184.71 samples/sec Loss 1.9780 Epoch: 7 Global Step: 132600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:09:01,424-Speed 3212.45 samples/sec Loss 1.9726 Epoch: 7 Global Step: 132650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:09:17,500-Speed 3184.92 samples/sec Loss 1.9859 Epoch: 7 Global Step: 132700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:09:33,237-Speed 3253.71 samples/sec Loss 1.9530 Epoch: 7 Global Step: 132750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:09:49,190-Speed 3209.35 samples/sec Loss 1.9529 Epoch: 7 Global Step: 132800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:10:05,308-Speed 3176.77 samples/sec Loss 1.9671 Epoch: 7 Global Step: 132850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:10:21,249-Speed 3211.91 samples/sec Loss 1.9476 Epoch: 7 Global Step: 132900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:10:38,043-Speed 3048.83 samples/sec Loss 1.9783 Epoch: 7 Global Step: 132950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:10:54,321-Speed 3145.34 samples/sec Loss 1.9665 Epoch: 7 Global Step: 133000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:11:10,228-Speed 3218.97 samples/sec Loss 1.9524 Epoch: 7 Global Step: 133050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:11:26,332-Speed 3179.38 samples/sec Loss 1.9555 Epoch: 7 Global Step: 133100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:11:43,306-Speed 3016.49 samples/sec Loss 2.0031 Epoch: 7 Global Step: 133150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:11:59,367-Speed 3187.92 samples/sec Loss 1.9380 Epoch: 7 Global Step: 133200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:12:15,240-Speed 3225.66 samples/sec Loss 1.9656 Epoch: 7 Global Step: 133250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:12:31,222-Speed 3203.82 samples/sec Loss 1.9484 Epoch: 7 Global Step: 133300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:12:47,515-Speed 3142.43 samples/sec Loss 1.9489 Epoch: 7 Global Step: 133350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:13:03,396-Speed 3224.20 samples/sec Loss 1.9424 Epoch: 7 Global Step: 133400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:13:20,516-Speed 2990.63 samples/sec Loss 1.9648 Epoch: 7 Global Step: 133450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:13:36,898-Speed 3125.43 samples/sec Loss 1.9239 Epoch: 7 Global Step: 133500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:14:06,890-Speed 1707.19 samples/sec Loss 1.8268 Epoch: 8 Global Step: 133550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:14:23,169-Speed 3145.19 samples/sec Loss 1.6547 Epoch: 8 Global Step: 133600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:14:39,269-Speed 3180.35 samples/sec Loss 1.6673 Epoch: 8 Global Step: 133650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:14:55,196-Speed 3214.64 samples/sec Loss 1.6835 Epoch: 8 Global Step: 133700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:15:11,245-Speed 3190.40 samples/sec Loss 1.6541 Epoch: 8 Global Step: 133750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:15:27,228-Speed 3203.62 samples/sec Loss 1.6363 Epoch: 8 Global Step: 133800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:15:43,465-Speed 3153.26 samples/sec Loss 1.6529 Epoch: 8 Global Step: 133850 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:15:59,191-Speed 3255.95 samples/sec Loss 1.6547 Epoch: 8 Global Step: 133900 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:16:15,046-Speed 3229.36 samples/sec Loss 1.6545 Epoch: 8 Global Step: 133950 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:16:31,160-Speed 3177.54 samples/sec Loss 1.6569 Epoch: 8 Global Step: 134000 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:17:24,572-[lfw][134000]XNorm: 22.888676 Training: 2021-03-16 15:17:24,573-[lfw][134000]Accuracy-Flip: 0.99800+-0.00287 Training: 2021-03-16 15:17:24,573-[lfw][134000]Accuracy-Highest: 0.99800 Training: 2021-03-16 15:18:26,363-[cfp_fp][134000]XNorm: 21.260297 Training: 2021-03-16 15:18:26,364-[cfp_fp][134000]Accuracy-Flip: 0.98657+-0.00479 Training: 2021-03-16 15:18:26,364-[cfp_fp][134000]Accuracy-Highest: 0.98657 Training: 2021-03-16 15:19:19,507-[agedb_30][134000]XNorm: 22.780793 Training: 2021-03-16 15:19:19,507-[agedb_30][134000]Accuracy-Flip: 0.97850+-0.00787 Training: 2021-03-16 15:19:19,508-[agedb_30][134000]Accuracy-Highest: 0.97883 Training: 2021-03-16 15:19:35,225-Speed 278.16 samples/sec Loss 1.6352 Epoch: 8 Global Step: 134050 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:19:50,961-Speed 3253.94 samples/sec Loss 1.6510 Epoch: 8 Global Step: 134100 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:20:07,156-Speed 3161.43 samples/sec Loss 1.6694 Epoch: 8 Global Step: 134150 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:20:23,030-Speed 3225.51 samples/sec Loss 1.6701 Epoch: 8 Global Step: 134200 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:20:40,092-Speed 3001.00 samples/sec Loss 1.6714 Epoch: 8 Global Step: 134250 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:20:55,965-Speed 3225.64 samples/sec Loss 1.6567 Epoch: 8 Global Step: 134300 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:21:12,759-Speed 3048.72 samples/sec Loss 1.6740 Epoch: 8 Global Step: 134350 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:21:28,627-Speed 3226.89 samples/sec Loss 1.6507 Epoch: 8 Global Step: 134400 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:21:44,665-Speed 3192.40 samples/sec Loss 1.6507 Epoch: 8 Global Step: 134450 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:22:00,443-Speed 3245.13 samples/sec Loss 1.6738 Epoch: 8 Global Step: 134500 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:22:17,038-Speed 3085.40 samples/sec Loss 1.6721 Epoch: 8 Global Step: 134550 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:22:33,528-Speed 3104.98 samples/sec Loss 1.6661 Epoch: 8 Global Step: 134600 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:22:49,363-Speed 3233.52 samples/sec Loss 1.6884 Epoch: 8 Global Step: 134650 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:23:07,722-Speed 2788.89 samples/sec Loss 1.6672 Epoch: 8 Global Step: 134700 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:23:23,621-Speed 3220.41 samples/sec Loss 1.6521 Epoch: 8 Global Step: 134750 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:23:39,563-Speed 3211.80 samples/sec Loss 1.6832 Epoch: 8 Global Step: 134800 Fp16 Grad Scale: 16384 Required: 23 hours Training: 2021-03-16 15:23:55,716-Speed 3169.80 samples/sec Loss 1.6911 Epoch: 8 Global Step: 134850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:24:11,495-Speed 3244.78 samples/sec Loss 1.6722 Epoch: 8 Global Step: 134900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:24:27,472-Speed 3204.74 samples/sec Loss 1.6637 Epoch: 8 Global Step: 134950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:24:43,438-Speed 3206.99 samples/sec Loss 1.6837 Epoch: 8 Global Step: 135000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:24:59,296-Speed 3228.77 samples/sec Loss 1.6549 Epoch: 8 Global Step: 135050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:25:16,131-Speed 3041.35 samples/sec Loss 1.6613 Epoch: 8 Global Step: 135100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:25:32,123-Speed 3201.75 samples/sec Loss 1.6863 Epoch: 8 Global Step: 135150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:25:47,966-Speed 3231.74 samples/sec Loss 1.7038 Epoch: 8 Global Step: 135200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:26:03,933-Speed 3206.74 samples/sec Loss 1.7022 Epoch: 8 Global Step: 135250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:26:20,797-Speed 3036.08 samples/sec Loss 1.6907 Epoch: 8 Global Step: 135300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:26:36,810-Speed 3197.49 samples/sec Loss 1.6883 Epoch: 8 Global Step: 135350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:26:52,867-Speed 3188.76 samples/sec Loss 1.6892 Epoch: 8 Global Step: 135400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:27:08,911-Speed 3191.32 samples/sec Loss 1.6761 Epoch: 8 Global Step: 135450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:27:24,989-Speed 3184.59 samples/sec Loss 1.7082 Epoch: 8 Global Step: 135500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:27:42,398-Speed 2941.19 samples/sec Loss 1.6690 Epoch: 8 Global Step: 135550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:27:58,647-Speed 3151.12 samples/sec Loss 1.6943 Epoch: 8 Global Step: 135600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:28:14,646-Speed 3200.27 samples/sec Loss 1.7148 Epoch: 8 Global Step: 135650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:28:30,575-Speed 3214.25 samples/sec Loss 1.7081 Epoch: 8 Global Step: 135700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:28:46,733-Speed 3168.82 samples/sec Loss 1.6694 Epoch: 8 Global Step: 135750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:29:02,676-Speed 3211.61 samples/sec Loss 1.6586 Epoch: 8 Global Step: 135800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:29:18,600-Speed 3215.37 samples/sec Loss 1.6907 Epoch: 8 Global Step: 135850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:29:34,524-Speed 3215.46 samples/sec Loss 1.6796 Epoch: 8 Global Step: 135900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:29:50,567-Speed 3191.51 samples/sec Loss 1.7097 Epoch: 8 Global Step: 135950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:30:06,781-Speed 3157.79 samples/sec Loss 1.7031 Epoch: 8 Global Step: 136000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:30:59,941-[lfw][136000]XNorm: 24.132196 Training: 2021-03-16 15:30:59,941-[lfw][136000]Accuracy-Flip: 0.99800+-0.00287 Training: 2021-03-16 15:30:59,941-[lfw][136000]Accuracy-Highest: 0.99800 Training: 2021-03-16 15:32:01,745-[cfp_fp][136000]XNorm: 22.402091 Training: 2021-03-16 15:32:01,745-[cfp_fp][136000]Accuracy-Flip: 0.98300+-0.00493 Training: 2021-03-16 15:32:01,745-[cfp_fp][136000]Accuracy-Highest: 0.98657 Training: 2021-03-16 15:32:55,276-[agedb_30][136000]XNorm: 23.919130 Training: 2021-03-16 15:32:55,276-[agedb_30][136000]Accuracy-Flip: 0.97650+-0.00647 Training: 2021-03-16 15:32:55,276-[agedb_30][136000]Accuracy-Highest: 0.97883 Training: 2021-03-16 15:33:11,034-Speed 277.88 samples/sec Loss 1.6898 Epoch: 8 Global Step: 136050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:33:27,391-Speed 3130.29 samples/sec Loss 1.6860 Epoch: 8 Global Step: 136100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:33:43,242-Speed 3230.32 samples/sec Loss 1.6897 Epoch: 8 Global Step: 136150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:33:58,966-Speed 3256.24 samples/sec Loss 1.6654 Epoch: 8 Global Step: 136200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:34:14,996-Speed 3193.96 samples/sec Loss 1.6884 Epoch: 8 Global Step: 136250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:34:31,087-Speed 3182.08 samples/sec Loss 1.6992 Epoch: 8 Global Step: 136300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:34:47,157-Speed 3186.25 samples/sec Loss 1.6729 Epoch: 8 Global Step: 136350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:35:03,067-Speed 3218.12 samples/sec Loss 1.6537 Epoch: 8 Global Step: 136400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:35:19,277-Speed 3158.67 samples/sec Loss 1.6936 Epoch: 8 Global Step: 136450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:35:36,054-Speed 3051.89 samples/sec Loss 1.6875 Epoch: 8 Global Step: 136500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:35:52,731-Speed 3070.13 samples/sec Loss 1.6912 Epoch: 8 Global Step: 136550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:36:08,672-Speed 3211.97 samples/sec Loss 1.6931 Epoch: 8 Global Step: 136600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:36:24,375-Speed 3260.63 samples/sec Loss 1.7113 Epoch: 8 Global Step: 136650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:36:40,212-Speed 3232.98 samples/sec Loss 1.6997 Epoch: 8 Global Step: 136700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:36:56,655-Speed 3113.91 samples/sec Loss 1.6717 Epoch: 8 Global Step: 136750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:37:12,747-Speed 3181.86 samples/sec Loss 1.6787 Epoch: 8 Global Step: 136800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:37:29,553-Speed 3046.60 samples/sec Loss 1.7059 Epoch: 8 Global Step: 136850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:37:46,680-Speed 2989.46 samples/sec Loss 1.6851 Epoch: 8 Global Step: 136900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:38:02,476-Speed 3241.42 samples/sec Loss 1.7147 Epoch: 8 Global Step: 136950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:38:18,771-Speed 3142.27 samples/sec Loss 1.7069 Epoch: 8 Global Step: 137000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:38:34,746-Speed 3205.00 samples/sec Loss 1.7079 Epoch: 8 Global Step: 137050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:38:50,705-Speed 3208.31 samples/sec Loss 1.6917 Epoch: 8 Global Step: 137100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:39:06,873-Speed 3166.81 samples/sec Loss 1.7170 Epoch: 8 Global Step: 137150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:39:23,098-Speed 3155.75 samples/sec Loss 1.6897 Epoch: 8 Global Step: 137200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:39:39,227-Speed 3174.49 samples/sec Loss 1.7062 Epoch: 8 Global Step: 137250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:39:55,281-Speed 3189.33 samples/sec Loss 1.7085 Epoch: 8 Global Step: 137300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:40:11,304-Speed 3195.55 samples/sec Loss 1.7080 Epoch: 8 Global Step: 137350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:40:27,859-Speed 3092.82 samples/sec Loss 1.6888 Epoch: 8 Global Step: 137400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:40:43,978-Speed 3176.54 samples/sec Loss 1.6796 Epoch: 8 Global Step: 137450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:41:00,720-Speed 3058.21 samples/sec Loss 1.6813 Epoch: 8 Global Step: 137500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:41:16,774-Speed 3189.32 samples/sec Loss 1.6884 Epoch: 8 Global Step: 137550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:41:32,866-Speed 3181.71 samples/sec Loss 1.7076 Epoch: 8 Global Step: 137600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:41:48,760-Speed 3221.56 samples/sec Loss 1.6679 Epoch: 8 Global Step: 137650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:42:04,563-Speed 3239.89 samples/sec Loss 1.6883 Epoch: 8 Global Step: 137700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:42:20,441-Speed 3224.75 samples/sec Loss 1.6753 Epoch: 8 Global Step: 137750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:42:37,291-Speed 3038.70 samples/sec Loss 1.6774 Epoch: 8 Global Step: 137800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:42:53,422-Speed 3174.01 samples/sec Loss 1.6800 Epoch: 8 Global Step: 137850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:43:09,807-Speed 3125.05 samples/sec Loss 1.6903 Epoch: 8 Global Step: 137900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:43:25,611-Speed 3239.77 samples/sec Loss 1.7113 Epoch: 8 Global Step: 137950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:43:41,497-Speed 3222.94 samples/sec Loss 1.6994 Epoch: 8 Global Step: 138000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:44:34,901-[lfw][138000]XNorm: 22.276598 Training: 2021-03-16 15:44:34,901-[lfw][138000]Accuracy-Flip: 0.99800+-0.00277 Training: 2021-03-16 15:44:34,901-[lfw][138000]Accuracy-Highest: 0.99800 Training: 2021-03-16 15:45:36,938-[cfp_fp][138000]XNorm: 20.959152 Training: 2021-03-16 15:45:36,939-[cfp_fp][138000]Accuracy-Flip: 0.98057+-0.00620 Training: 2021-03-16 15:45:36,939-[cfp_fp][138000]Accuracy-Highest: 0.98657 Training: 2021-03-16 15:46:30,227-[agedb_30][138000]XNorm: 22.378507 Training: 2021-03-16 15:46:30,228-[agedb_30][138000]Accuracy-Flip: 0.97817+-0.00689 Training: 2021-03-16 15:46:30,228-[agedb_30][138000]Accuracy-Highest: 0.97883 Training: 2021-03-16 15:46:46,184-Speed 277.23 samples/sec Loss 1.7147 Epoch: 8 Global Step: 138050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:47:02,169-Speed 3203.07 samples/sec Loss 1.6927 Epoch: 8 Global Step: 138100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:47:18,224-Speed 3189.29 samples/sec Loss 1.6818 Epoch: 8 Global Step: 138150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:47:33,919-Speed 3262.21 samples/sec Loss 1.6906 Epoch: 8 Global Step: 138200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:47:49,861-Speed 3211.73 samples/sec Loss 1.7005 Epoch: 8 Global Step: 138250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:48:05,995-Speed 3173.49 samples/sec Loss 1.6903 Epoch: 8 Global Step: 138300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:48:21,901-Speed 3219.07 samples/sec Loss 1.7374 Epoch: 8 Global Step: 138350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:48:38,021-Speed 3176.30 samples/sec Loss 1.7044 Epoch: 8 Global Step: 138400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:48:54,068-Speed 3190.66 samples/sec Loss 1.6882 Epoch: 8 Global Step: 138450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:49:09,891-Speed 3235.97 samples/sec Loss 1.6766 Epoch: 8 Global Step: 138500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:49:26,031-Speed 3172.41 samples/sec Loss 1.7149 Epoch: 8 Global Step: 138550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:49:42,248-Speed 3157.22 samples/sec Loss 1.6991 Epoch: 8 Global Step: 138600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:49:58,549-Speed 3141.06 samples/sec Loss 1.6610 Epoch: 8 Global Step: 138650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:50:16,589-Speed 2838.10 samples/sec Loss 1.7133 Epoch: 8 Global Step: 138700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:50:32,717-Speed 3174.74 samples/sec Loss 1.6975 Epoch: 8 Global Step: 138750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:50:48,555-Speed 3232.81 samples/sec Loss 1.7006 Epoch: 8 Global Step: 138800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:51:04,654-Speed 3180.37 samples/sec Loss 1.7063 Epoch: 8 Global Step: 138850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:51:20,690-Speed 3192.96 samples/sec Loss 1.6789 Epoch: 8 Global Step: 138900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:51:37,598-Speed 3028.32 samples/sec Loss 1.6619 Epoch: 8 Global Step: 138950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:51:54,305-Speed 3064.60 samples/sec Loss 1.6851 Epoch: 8 Global Step: 139000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:52:11,201-Speed 3030.46 samples/sec Loss 1.6912 Epoch: 8 Global Step: 139050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:52:27,135-Speed 3213.36 samples/sec Loss 1.6776 Epoch: 8 Global Step: 139100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:52:43,228-Speed 3181.53 samples/sec Loss 1.6785 Epoch: 8 Global Step: 139150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:52:59,305-Speed 3184.85 samples/sec Loss 1.6644 Epoch: 8 Global Step: 139200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:53:15,347-Speed 3191.69 samples/sec Loss 1.6925 Epoch: 8 Global Step: 139250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:53:31,436-Speed 3182.38 samples/sec Loss 1.6940 Epoch: 8 Global Step: 139300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:53:47,418-Speed 3203.76 samples/sec Loss 1.6683 Epoch: 8 Global Step: 139350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:54:03,419-Speed 3199.81 samples/sec Loss 1.6785 Epoch: 8 Global Step: 139400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:54:19,509-Speed 3182.37 samples/sec Loss 1.6786 Epoch: 8 Global Step: 139450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:54:35,609-Speed 3180.21 samples/sec Loss 1.6974 Epoch: 8 Global Step: 139500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:54:52,379-Speed 3053.04 samples/sec Loss 1.6889 Epoch: 8 Global Step: 139550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:55:08,448-Speed 3186.40 samples/sec Loss 1.6958 Epoch: 8 Global Step: 139600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:55:25,288-Speed 3040.49 samples/sec Loss 1.6804 Epoch: 8 Global Step: 139650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:55:41,279-Speed 3201.82 samples/sec Loss 1.6745 Epoch: 8 Global Step: 139700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:55:57,392-Speed 3177.83 samples/sec Loss 1.6807 Epoch: 8 Global Step: 139750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:56:13,422-Speed 3193.93 samples/sec Loss 1.6796 Epoch: 8 Global Step: 139800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:56:29,406-Speed 3203.49 samples/sec Loss 1.6768 Epoch: 8 Global Step: 139850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:56:45,972-Speed 3090.67 samples/sec Loss 1.6803 Epoch: 8 Global Step: 139900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:57:01,849-Speed 3224.81 samples/sec Loss 1.6709 Epoch: 8 Global Step: 139950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:57:18,209-Speed 3129.72 samples/sec Loss 1.7089 Epoch: 8 Global Step: 140000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 15:58:11,489-[lfw][140000]XNorm: 22.818354 Training: 2021-03-16 15:58:11,489-[lfw][140000]Accuracy-Flip: 0.99750+-0.00271 Training: 2021-03-16 15:58:11,489-[lfw][140000]Accuracy-Highest: 0.99800 Training: 2021-03-16 15:59:13,157-[cfp_fp][140000]XNorm: 21.583626 Training: 2021-03-16 15:59:13,157-[cfp_fp][140000]Accuracy-Flip: 0.98514+-0.00520 Training: 2021-03-16 15:59:13,157-[cfp_fp][140000]Accuracy-Highest: 0.98657 Training: 2021-03-16 16:00:06,552-[agedb_30][140000]XNorm: 22.922046 Training: 2021-03-16 16:00:06,552-[agedb_30][140000]Accuracy-Flip: 0.97950+-0.00727 Training: 2021-03-16 16:00:06,552-[agedb_30][140000]Accuracy-Highest: 0.97950 Training: 2021-03-16 16:00:23,528-Speed 276.28 samples/sec Loss 1.6946 Epoch: 8 Global Step: 140050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:00:39,556-Speed 3194.63 samples/sec Loss 1.6763 Epoch: 8 Global Step: 140100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:00:55,934-Speed 3126.24 samples/sec Loss 1.7038 Epoch: 8 Global Step: 140150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:01:11,958-Speed 3195.25 samples/sec Loss 1.7099 Epoch: 8 Global Step: 140200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:01:27,798-Speed 3232.46 samples/sec Loss 1.7192 Epoch: 8 Global Step: 140250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:01:43,796-Speed 3200.48 samples/sec Loss 1.7013 Epoch: 8 Global Step: 140300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:01:59,644-Speed 3230.78 samples/sec Loss 1.7172 Epoch: 8 Global Step: 140350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:02:15,646-Speed 3199.62 samples/sec Loss 1.6828 Epoch: 8 Global Step: 140400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:02:32,350-Speed 3065.28 samples/sec Loss 1.7144 Epoch: 8 Global Step: 140450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:02:48,247-Speed 3220.87 samples/sec Loss 1.6979 Epoch: 8 Global Step: 140500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:03:04,483-Speed 3153.66 samples/sec Loss 1.7146 Epoch: 8 Global Step: 140550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:03:20,371-Speed 3222.56 samples/sec Loss 1.7018 Epoch: 8 Global Step: 140600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:03:36,345-Speed 3205.37 samples/sec Loss 1.7078 Epoch: 8 Global Step: 140650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:03:52,668-Speed 3136.72 samples/sec Loss 1.6765 Epoch: 8 Global Step: 140700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:04:08,731-Speed 3187.64 samples/sec Loss 1.6849 Epoch: 8 Global Step: 140750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:04:25,002-Speed 3146.77 samples/sec Loss 1.6993 Epoch: 8 Global Step: 140800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:04:42,509-Speed 2924.55 samples/sec Loss 1.6683 Epoch: 8 Global Step: 140850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:04:58,858-Speed 3131.74 samples/sec Loss 1.7120 Epoch: 8 Global Step: 140900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:05:14,893-Speed 3193.21 samples/sec Loss 1.6597 Epoch: 8 Global Step: 140950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:05:31,093-Speed 3160.50 samples/sec Loss 1.6865 Epoch: 8 Global Step: 141000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:05:47,340-Speed 3151.42 samples/sec Loss 1.6955 Epoch: 8 Global Step: 141050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:06:03,318-Speed 3204.52 samples/sec Loss 1.6613 Epoch: 8 Global Step: 141100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:06:20,990-Speed 2897.45 samples/sec Loss 1.6911 Epoch: 8 Global Step: 141150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:06:37,039-Speed 3190.27 samples/sec Loss 1.7006 Epoch: 8 Global Step: 141200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:06:53,811-Speed 3052.70 samples/sec Loss 1.6833 Epoch: 8 Global Step: 141250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:07:10,015-Speed 3159.99 samples/sec Loss 1.6922 Epoch: 8 Global Step: 141300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:07:26,233-Speed 3156.97 samples/sec Loss 1.6762 Epoch: 8 Global Step: 141350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:07:42,373-Speed 3172.41 samples/sec Loss 1.6702 Epoch: 8 Global Step: 141400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:07:58,293-Speed 3216.09 samples/sec Loss 1.6734 Epoch: 8 Global Step: 141450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:08:14,174-Speed 3224.15 samples/sec Loss 1.6562 Epoch: 8 Global Step: 141500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:08:30,421-Speed 3151.43 samples/sec Loss 1.7070 Epoch: 8 Global Step: 141550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:08:46,664-Speed 3152.22 samples/sec Loss 1.7228 Epoch: 8 Global Step: 141600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:09:02,734-Speed 3186.20 samples/sec Loss 1.6955 Epoch: 8 Global Step: 141650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:09:19,302-Speed 3090.39 samples/sec Loss 1.6634 Epoch: 8 Global Step: 141700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:09:36,498-Speed 2977.51 samples/sec Loss 1.6720 Epoch: 8 Global Step: 141750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:09:53,553-Speed 3002.15 samples/sec Loss 1.6972 Epoch: 8 Global Step: 141800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:10:09,615-Speed 3187.65 samples/sec Loss 1.6813 Epoch: 8 Global Step: 141850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:10:25,520-Speed 3219.24 samples/sec Loss 1.6845 Epoch: 8 Global Step: 141900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:10:41,581-Speed 3188.02 samples/sec Loss 1.7021 Epoch: 8 Global Step: 141950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:10:57,689-Speed 3178.64 samples/sec Loss 1.6952 Epoch: 8 Global Step: 142000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:11:50,783-[lfw][142000]XNorm: 22.387110 Training: 2021-03-16 16:11:50,784-[lfw][142000]Accuracy-Flip: 0.99767+-0.00300 Training: 2021-03-16 16:11:50,784-[lfw][142000]Accuracy-Highest: 0.99800 Training: 2021-03-16 16:12:52,588-[cfp_fp][142000]XNorm: 21.183766 Training: 2021-03-16 16:12:52,588-[cfp_fp][142000]Accuracy-Flip: 0.98329+-0.00670 Training: 2021-03-16 16:12:52,589-[cfp_fp][142000]Accuracy-Highest: 0.98657 Training: 2021-03-16 16:13:45,798-[agedb_30][142000]XNorm: 22.510329 Training: 2021-03-16 16:13:45,798-[agedb_30][142000]Accuracy-Flip: 0.97950+-0.00654 Training: 2021-03-16 16:13:45,798-[agedb_30][142000]Accuracy-Highest: 0.97950 Training: 2021-03-16 16:14:01,627-Speed 278.36 samples/sec Loss 1.6749 Epoch: 8 Global Step: 142050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:14:18,224-Speed 3085.02 samples/sec Loss 1.6972 Epoch: 8 Global Step: 142100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:14:34,149-Speed 3215.14 samples/sec Loss 1.6818 Epoch: 8 Global Step: 142150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:14:49,938-Speed 3243.00 samples/sec Loss 1.6704 Epoch: 8 Global Step: 142200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:15:07,164-Speed 2972.26 samples/sec Loss 1.6850 Epoch: 8 Global Step: 142250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:15:23,428-Speed 3148.09 samples/sec Loss 1.6947 Epoch: 8 Global Step: 142300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:15:39,328-Speed 3220.39 samples/sec Loss 1.6814 Epoch: 8 Global Step: 142350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:15:55,358-Speed 3194.07 samples/sec Loss 1.6712 Epoch: 8 Global Step: 142400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:16:11,516-Speed 3168.82 samples/sec Loss 1.6710 Epoch: 8 Global Step: 142450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:16:27,349-Speed 3233.76 samples/sec Loss 1.7020 Epoch: 8 Global Step: 142500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:16:43,464-Speed 3177.35 samples/sec Loss 1.6780 Epoch: 8 Global Step: 142550 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:16:59,297-Speed 3233.83 samples/sec Loss 1.6682 Epoch: 8 Global Step: 142600 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:17:15,202-Speed 3219.25 samples/sec Loss 1.6638 Epoch: 8 Global Step: 142650 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:17:31,082-Speed 3224.20 samples/sec Loss 1.7037 Epoch: 8 Global Step: 142700 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:17:47,308-Speed 3155.58 samples/sec Loss 1.6739 Epoch: 8 Global Step: 142750 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:18:03,391-Speed 3183.68 samples/sec Loss 1.6808 Epoch: 8 Global Step: 142800 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:18:19,443-Speed 3189.56 samples/sec Loss 1.6894 Epoch: 8 Global Step: 142850 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:18:35,554-Speed 3178.14 samples/sec Loss 1.6844 Epoch: 8 Global Step: 142900 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:18:51,655-Speed 3180.03 samples/sec Loss 1.6883 Epoch: 8 Global Step: 142950 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:19:08,500-Speed 3039.51 samples/sec Loss 1.6746 Epoch: 8 Global Step: 143000 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:19:25,338-Speed 3040.91 samples/sec Loss 1.6806 Epoch: 8 Global Step: 143050 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:19:41,328-Speed 3202.09 samples/sec Loss 1.6887 Epoch: 8 Global Step: 143100 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:19:57,230-Speed 3219.78 samples/sec Loss 1.6732 Epoch: 8 Global Step: 143150 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:20:13,146-Speed 3216.92 samples/sec Loss 1.6612 Epoch: 8 Global Step: 143200 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:20:29,184-Speed 3192.65 samples/sec Loss 1.7031 Epoch: 8 Global Step: 143250 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:20:47,019-Speed 2870.84 samples/sec Loss 1.6609 Epoch: 8 Global Step: 143300 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:21:03,251-Speed 3154.29 samples/sec Loss 1.6805 Epoch: 8 Global Step: 143350 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:21:19,809-Speed 3092.28 samples/sec Loss 1.6768 Epoch: 8 Global Step: 143400 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:21:35,864-Speed 3189.07 samples/sec Loss 1.6813 Epoch: 8 Global Step: 143450 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:21:52,669-Speed 3046.87 samples/sec Loss 1.6831 Epoch: 8 Global Step: 143500 Fp16 Grad Scale: 16384 Required: 22 hours Training: 2021-03-16 16:22:08,649-Speed 3204.06 samples/sec Loss 1.6934 Epoch: 8 Global Step: 143550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:22:24,487-Speed 3232.79 samples/sec Loss 1.6789 Epoch: 8 Global Step: 143600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:22:40,677-Speed 3162.65 samples/sec Loss 1.6933 Epoch: 8 Global Step: 143650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:22:56,718-Speed 3191.80 samples/sec Loss 1.6977 Epoch: 8 Global Step: 143700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:23:12,863-Speed 3171.42 samples/sec Loss 1.6945 Epoch: 8 Global Step: 143750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:23:29,009-Speed 3171.11 samples/sec Loss 1.6733 Epoch: 8 Global Step: 143800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:23:45,252-Speed 3152.36 samples/sec Loss 1.6477 Epoch: 8 Global Step: 143850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:24:01,999-Speed 3057.30 samples/sec Loss 1.6576 Epoch: 8 Global Step: 143900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:24:18,050-Speed 3189.84 samples/sec Loss 1.6558 Epoch: 8 Global Step: 143950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:24:34,987-Speed 3023.19 samples/sec Loss 1.6713 Epoch: 8 Global Step: 144000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:25:28,425-[lfw][144000]XNorm: 21.146122 Training: 2021-03-16 16:25:28,425-[lfw][144000]Accuracy-Flip: 0.99750+-0.00281 Training: 2021-03-16 16:25:28,425-[lfw][144000]Accuracy-Highest: 0.99800 Training: 2021-03-16 16:26:30,931-[cfp_fp][144000]XNorm: 19.892753 Training: 2021-03-16 16:26:30,931-[cfp_fp][144000]Accuracy-Flip: 0.98271+-0.00587 Training: 2021-03-16 16:26:30,932-[cfp_fp][144000]Accuracy-Highest: 0.98657 Training: 2021-03-16 16:27:24,317-[agedb_30][144000]XNorm: 21.322820 Training: 2021-03-16 16:27:24,318-[agedb_30][144000]Accuracy-Flip: 0.97650+-0.00713 Training: 2021-03-16 16:27:24,318-[agedb_30][144000]Accuracy-Highest: 0.97950 Training: 2021-03-16 16:27:40,246-Speed 276.37 samples/sec Loss 1.6731 Epoch: 8 Global Step: 144050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:27:56,336-Speed 3182.11 samples/sec Loss 1.6644 Epoch: 8 Global Step: 144100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:28:12,664-Speed 3135.84 samples/sec Loss 1.6867 Epoch: 8 Global Step: 144150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:28:28,743-Speed 3184.46 samples/sec Loss 1.7075 Epoch: 8 Global Step: 144200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:28:44,877-Speed 3173.49 samples/sec Loss 1.6600 Epoch: 8 Global Step: 144250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:29:01,033-Speed 3169.34 samples/sec Loss 1.6685 Epoch: 8 Global Step: 144300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:29:17,012-Speed 3204.16 samples/sec Loss 1.6822 Epoch: 8 Global Step: 144350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:29:33,200-Speed 3162.98 samples/sec Loss 1.6715 Epoch: 8 Global Step: 144400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:29:50,359-Speed 2983.95 samples/sec Loss 1.6668 Epoch: 8 Global Step: 144450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:30:06,330-Speed 3205.99 samples/sec Loss 1.6973 Epoch: 8 Global Step: 144500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:30:22,310-Speed 3204.07 samples/sec Loss 1.6897 Epoch: 8 Global Step: 144550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:30:38,759-Speed 3112.75 samples/sec Loss 1.6660 Epoch: 8 Global Step: 144600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:30:55,204-Speed 3113.49 samples/sec Loss 1.6826 Epoch: 8 Global Step: 144650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:31:11,011-Speed 3239.11 samples/sec Loss 1.6709 Epoch: 8 Global Step: 144700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:31:27,192-Speed 3164.42 samples/sec Loss 1.6641 Epoch: 8 Global Step: 144750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:31:43,024-Speed 3233.93 samples/sec Loss 1.6940 Epoch: 8 Global Step: 144800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:31:59,316-Speed 3142.89 samples/sec Loss 1.6635 Epoch: 8 Global Step: 144850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:32:15,311-Speed 3201.01 samples/sec Loss 1.6729 Epoch: 8 Global Step: 144900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:32:31,480-Speed 3166.69 samples/sec Loss 1.6647 Epoch: 8 Global Step: 144950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:32:47,664-Speed 3163.79 samples/sec Loss 1.6524 Epoch: 8 Global Step: 145000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:33:03,756-Speed 3181.73 samples/sec Loss 1.6622 Epoch: 8 Global Step: 145050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:33:20,015-Speed 3149.21 samples/sec Loss 1.6691 Epoch: 8 Global Step: 145100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:33:35,903-Speed 3222.61 samples/sec Loss 1.6714 Epoch: 8 Global Step: 145150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:33:52,884-Speed 3015.19 samples/sec Loss 1.6368 Epoch: 8 Global Step: 145200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:34:09,980-Speed 2994.89 samples/sec Loss 1.6939 Epoch: 8 Global Step: 145250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:34:25,973-Speed 3201.60 samples/sec Loss 1.6700 Epoch: 8 Global Step: 145300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:34:42,093-Speed 3176.19 samples/sec Loss 1.6559 Epoch: 8 Global Step: 145350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:34:58,061-Speed 3206.60 samples/sec Loss 1.6508 Epoch: 8 Global Step: 145400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:35:13,837-Speed 3245.45 samples/sec Loss 1.6647 Epoch: 8 Global Step: 145450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:35:30,857-Speed 3008.39 samples/sec Loss 1.6706 Epoch: 8 Global Step: 145500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:35:46,954-Speed 3180.89 samples/sec Loss 1.6708 Epoch: 8 Global Step: 145550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:36:04,147-Speed 2978.04 samples/sec Loss 1.6830 Epoch: 8 Global Step: 145600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:36:20,457-Speed 3139.11 samples/sec Loss 1.6768 Epoch: 8 Global Step: 145650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:36:37,554-Speed 2994.77 samples/sec Loss 1.6963 Epoch: 8 Global Step: 145700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:36:53,654-Speed 3180.23 samples/sec Loss 1.6334 Epoch: 8 Global Step: 145750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:37:09,707-Speed 3189.64 samples/sec Loss 1.6612 Epoch: 8 Global Step: 145800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:37:25,782-Speed 3185.10 samples/sec Loss 1.6584 Epoch: 8 Global Step: 145850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:37:42,154-Speed 3127.40 samples/sec Loss 1.6871 Epoch: 8 Global Step: 145900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:37:58,017-Speed 3227.64 samples/sec Loss 1.6805 Epoch: 8 Global Step: 145950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:38:14,399-Speed 3125.60 samples/sec Loss 1.6800 Epoch: 8 Global Step: 146000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:39:07,736-[lfw][146000]XNorm: 24.661244 Training: 2021-03-16 16:39:07,737-[lfw][146000]Accuracy-Flip: 0.99700+-0.00256 Training: 2021-03-16 16:39:07,737-[lfw][146000]Accuracy-Highest: 0.99800 Training: 2021-03-16 16:40:09,537-[cfp_fp][146000]XNorm: 22.514436 Training: 2021-03-16 16:40:09,537-[cfp_fp][146000]Accuracy-Flip: 0.98229+-0.00492 Training: 2021-03-16 16:40:09,538-[cfp_fp][146000]Accuracy-Highest: 0.98657 Training: 2021-03-16 16:41:02,706-[agedb_30][146000]XNorm: 24.257440 Training: 2021-03-16 16:41:02,707-[agedb_30][146000]Accuracy-Flip: 0.97733+-0.00807 Training: 2021-03-16 16:41:02,707-[agedb_30][146000]Accuracy-Highest: 0.97950 Training: 2021-03-16 16:41:18,885-Speed 277.53 samples/sec Loss 1.7202 Epoch: 8 Global Step: 146050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:41:36,130-Speed 2969.14 samples/sec Loss 1.6657 Epoch: 8 Global Step: 146100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:41:52,689-Speed 3091.97 samples/sec Loss 1.6771 Epoch: 8 Global Step: 146150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:42:09,932-Speed 2969.49 samples/sec Loss 1.6930 Epoch: 8 Global Step: 146200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:42:25,899-Speed 3206.62 samples/sec Loss 1.6670 Epoch: 8 Global Step: 146250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:42:41,760-Speed 3228.31 samples/sec Loss 1.6861 Epoch: 8 Global Step: 146300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:42:57,934-Speed 3165.53 samples/sec Loss 1.6592 Epoch: 8 Global Step: 146350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:43:13,797-Speed 3227.87 samples/sec Loss 1.6664 Epoch: 8 Global Step: 146400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:43:29,789-Speed 3201.71 samples/sec Loss 1.7037 Epoch: 8 Global Step: 146450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:43:45,725-Speed 3212.93 samples/sec Loss 1.6568 Epoch: 8 Global Step: 146500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:44:01,795-Speed 3186.13 samples/sec Loss 1.6698 Epoch: 8 Global Step: 146550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:44:17,838-Speed 3191.44 samples/sec Loss 1.6700 Epoch: 8 Global Step: 146600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:44:35,048-Speed 2975.17 samples/sec Loss 1.6899 Epoch: 8 Global Step: 146650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:44:51,023-Speed 3205.10 samples/sec Loss 1.6830 Epoch: 8 Global Step: 146700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:45:06,824-Speed 3240.44 samples/sec Loss 1.6729 Epoch: 8 Global Step: 146750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:45:22,941-Speed 3176.78 samples/sec Loss 1.6641 Epoch: 8 Global Step: 146800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:45:39,025-Speed 3183.38 samples/sec Loss 1.6598 Epoch: 8 Global Step: 146850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:45:55,436-Speed 3120.00 samples/sec Loss 1.6622 Epoch: 8 Global Step: 146900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:46:11,404-Speed 3206.57 samples/sec Loss 1.6734 Epoch: 8 Global Step: 146950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:46:27,674-Speed 3146.98 samples/sec Loss 1.6740 Epoch: 8 Global Step: 147000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:46:43,972-Speed 3141.63 samples/sec Loss 1.6770 Epoch: 8 Global Step: 147050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:46:59,954-Speed 3203.58 samples/sec Loss 1.6693 Epoch: 8 Global Step: 147100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:47:15,832-Speed 3224.69 samples/sec Loss 1.6748 Epoch: 8 Global Step: 147150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:47:31,811-Speed 3204.38 samples/sec Loss 1.6685 Epoch: 8 Global Step: 147200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:47:47,976-Speed 3167.48 samples/sec Loss 1.6821 Epoch: 8 Global Step: 147250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:48:04,166-Speed 3162.59 samples/sec Loss 1.6781 Epoch: 8 Global Step: 147300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:48:21,330-Speed 2982.90 samples/sec Loss 1.6587 Epoch: 8 Global Step: 147350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:48:38,161-Speed 3042.18 samples/sec Loss 1.6500 Epoch: 8 Global Step: 147400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:48:54,074-Speed 3217.62 samples/sec Loss 1.6849 Epoch: 8 Global Step: 147450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:49:10,422-Speed 3132.01 samples/sec Loss 1.6605 Epoch: 8 Global Step: 147500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:49:26,654-Speed 3154.38 samples/sec Loss 1.6515 Epoch: 8 Global Step: 147550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:49:42,499-Speed 3231.42 samples/sec Loss 1.6822 Epoch: 8 Global Step: 147600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:49:59,555-Speed 3001.94 samples/sec Loss 1.6866 Epoch: 8 Global Step: 147650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:50:15,495-Speed 3212.14 samples/sec Loss 1.6624 Epoch: 8 Global Step: 147700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:50:31,563-Speed 3186.37 samples/sec Loss 1.6533 Epoch: 8 Global Step: 147750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:50:48,288-Speed 3061.40 samples/sec Loss 1.6522 Epoch: 8 Global Step: 147800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:51:05,353-Speed 3000.51 samples/sec Loss 1.6538 Epoch: 8 Global Step: 147850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:51:21,504-Speed 3170.09 samples/sec Loss 1.6607 Epoch: 8 Global Step: 147900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:51:37,459-Speed 3209.07 samples/sec Loss 1.6593 Epoch: 8 Global Step: 147950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:51:53,550-Speed 3182.11 samples/sec Loss 1.6519 Epoch: 8 Global Step: 148000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:52:46,879-[lfw][148000]XNorm: 23.397545 Training: 2021-03-16 16:52:46,879-[lfw][148000]Accuracy-Flip: 0.99733+-0.00271 Training: 2021-03-16 16:52:46,880-[lfw][148000]Accuracy-Highest: 0.99800 Training: 2021-03-16 16:53:49,015-[cfp_fp][148000]XNorm: 21.836096 Training: 2021-03-16 16:53:49,016-[cfp_fp][148000]Accuracy-Flip: 0.98586+-0.00529 Training: 2021-03-16 16:53:49,016-[cfp_fp][148000]Accuracy-Highest: 0.98657 Training: 2021-03-16 16:54:42,484-[agedb_30][148000]XNorm: 23.505014 Training: 2021-03-16 16:54:42,484-[agedb_30][148000]Accuracy-Flip: 0.97617+-0.00863 Training: 2021-03-16 16:54:42,484-[agedb_30][148000]Accuracy-Highest: 0.97950 Training: 2021-03-16 16:54:58,368-Speed 277.03 samples/sec Loss 1.6603 Epoch: 8 Global Step: 148050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:55:14,223-Speed 3229.32 samples/sec Loss 1.6576 Epoch: 8 Global Step: 148100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:55:30,155-Speed 3213.95 samples/sec Loss 1.6495 Epoch: 8 Global Step: 148150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:55:46,344-Speed 3162.74 samples/sec Loss 1.6666 Epoch: 8 Global Step: 148200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:56:02,437-Speed 3181.51 samples/sec Loss 1.6614 Epoch: 8 Global Step: 148250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:56:18,813-Speed 3126.69 samples/sec Loss 1.6764 Epoch: 8 Global Step: 148300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:56:35,776-Speed 3018.41 samples/sec Loss 1.6639 Epoch: 8 Global Step: 148350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:56:52,754-Speed 3015.69 samples/sec Loss 1.6570 Epoch: 8 Global Step: 148400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:57:08,846-Speed 3181.88 samples/sec Loss 1.6694 Epoch: 8 Global Step: 148450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:57:24,990-Speed 3171.62 samples/sec Loss 1.6606 Epoch: 8 Global Step: 148500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:57:41,060-Speed 3186.09 samples/sec Loss 1.6734 Epoch: 8 Global Step: 148550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:57:57,221-Speed 3168.28 samples/sec Loss 1.6638 Epoch: 8 Global Step: 148600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:58:13,316-Speed 3181.23 samples/sec Loss 1.6514 Epoch: 8 Global Step: 148650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:58:29,548-Speed 3154.33 samples/sec Loss 1.6664 Epoch: 8 Global Step: 148700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:58:45,779-Speed 3154.45 samples/sec Loss 1.6392 Epoch: 8 Global Step: 148750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:59:01,810-Speed 3193.91 samples/sec Loss 1.6473 Epoch: 8 Global Step: 148800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:59:18,716-Speed 3028.61 samples/sec Loss 1.6838 Epoch: 8 Global Step: 148850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:59:34,694-Speed 3204.51 samples/sec Loss 1.6490 Epoch: 8 Global Step: 148900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 16:59:50,707-Speed 3197.62 samples/sec Loss 1.6390 Epoch: 8 Global Step: 148950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:00:06,599-Speed 3221.77 samples/sec Loss 1.6370 Epoch: 8 Global Step: 149000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:00:22,766-Speed 3167.09 samples/sec Loss 1.6493 Epoch: 8 Global Step: 149050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:00:38,966-Speed 3160.51 samples/sec Loss 1.6555 Epoch: 8 Global Step: 149100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:00:54,803-Speed 3233.00 samples/sec Loss 1.6559 Epoch: 8 Global Step: 149150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:01:10,874-Speed 3186.05 samples/sec Loss 1.6442 Epoch: 8 Global Step: 149200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:01:26,992-Speed 3176.61 samples/sec Loss 1.6532 Epoch: 8 Global Step: 149250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:01:43,315-Speed 3136.74 samples/sec Loss 1.6463 Epoch: 8 Global Step: 149300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:01:59,267-Speed 3209.86 samples/sec Loss 1.6772 Epoch: 8 Global Step: 149350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:02:15,525-Speed 3149.33 samples/sec Loss 1.6555 Epoch: 8 Global Step: 149400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:02:31,646-Speed 3175.99 samples/sec Loss 1.6537 Epoch: 8 Global Step: 149450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:02:48,054-Speed 3120.54 samples/sec Loss 1.6484 Epoch: 8 Global Step: 149500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:03:04,932-Speed 3033.62 samples/sec Loss 1.6610 Epoch: 8 Global Step: 149550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:03:21,903-Speed 3017.06 samples/sec Loss 1.6355 Epoch: 8 Global Step: 149600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:03:37,979-Speed 3184.86 samples/sec Loss 1.6396 Epoch: 8 Global Step: 149650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:03:54,341-Speed 3129.43 samples/sec Loss 1.6412 Epoch: 8 Global Step: 149700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:04:10,435-Speed 3181.33 samples/sec Loss 1.6714 Epoch: 8 Global Step: 149750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:04:26,416-Speed 3203.88 samples/sec Loss 1.6485 Epoch: 8 Global Step: 149800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:04:43,346-Speed 3024.30 samples/sec Loss 1.6779 Epoch: 8 Global Step: 149850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:04:59,406-Speed 3188.21 samples/sec Loss 1.6479 Epoch: 8 Global Step: 149900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:05:15,496-Speed 3182.16 samples/sec Loss 1.6630 Epoch: 8 Global Step: 149950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:05:32,392-Speed 3030.41 samples/sec Loss 1.6319 Epoch: 8 Global Step: 150000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:06:25,457-[lfw][150000]XNorm: 22.964981 Training: 2021-03-16 17:06:25,457-[lfw][150000]Accuracy-Flip: 0.99733+-0.00291 Training: 2021-03-16 17:06:25,457-[lfw][150000]Accuracy-Highest: 0.99800 Training: 2021-03-16 17:07:27,433-[cfp_fp][150000]XNorm: 21.356889 Training: 2021-03-16 17:07:27,434-[cfp_fp][150000]Accuracy-Flip: 0.98543+-0.00660 Training: 2021-03-16 17:07:27,436-[cfp_fp][150000]Accuracy-Highest: 0.98657 Training: 2021-03-16 17:08:20,697-[agedb_30][150000]XNorm: 23.200550 Training: 2021-03-16 17:08:20,697-[agedb_30][150000]Accuracy-Flip: 0.97583+-0.00704 Training: 2021-03-16 17:08:20,697-[agedb_30][150000]Accuracy-Highest: 0.97950 Training: 2021-03-16 17:08:37,860-Speed 276.06 samples/sec Loss 1.6572 Epoch: 8 Global Step: 150050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:08:53,922-Speed 3187.87 samples/sec Loss 1.6644 Epoch: 8 Global Step: 150100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:09:10,058-Speed 3173.13 samples/sec Loss 1.6292 Epoch: 8 Global Step: 150150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:09:26,038-Speed 3204.03 samples/sec Loss 1.6669 Epoch: 8 Global Step: 150200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:09:56,112-Speed 1702.54 samples/sec Loss 1.5023 Epoch: 9 Global Step: 150250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:10:12,263-Speed 3170.10 samples/sec Loss 1.3852 Epoch: 9 Global Step: 150300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:10:27,990-Speed 3255.74 samples/sec Loss 1.3837 Epoch: 9 Global Step: 150350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:10:43,851-Speed 3228.14 samples/sec Loss 1.3739 Epoch: 9 Global Step: 150400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:10:59,867-Speed 3196.92 samples/sec Loss 1.3921 Epoch: 9 Global Step: 150450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:11:16,749-Speed 3032.92 samples/sec Loss 1.3750 Epoch: 9 Global Step: 150500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:11:32,537-Speed 3243.02 samples/sec Loss 1.3926 Epoch: 9 Global Step: 150550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:11:49,006-Speed 3108.97 samples/sec Loss 1.3786 Epoch: 9 Global Step: 150600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:12:04,868-Speed 3228.09 samples/sec Loss 1.4065 Epoch: 9 Global Step: 150650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:12:20,813-Speed 3211.12 samples/sec Loss 1.3774 Epoch: 9 Global Step: 150700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:12:36,877-Speed 3187.37 samples/sec Loss 1.4109 Epoch: 9 Global Step: 150750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:12:52,600-Speed 3256.34 samples/sec Loss 1.4065 Epoch: 9 Global Step: 150800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:13:08,478-Speed 3224.78 samples/sec Loss 1.4129 Epoch: 9 Global Step: 150850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:13:25,353-Speed 3034.17 samples/sec Loss 1.4401 Epoch: 9 Global Step: 150900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:13:41,380-Speed 3194.58 samples/sec Loss 1.4031 Epoch: 9 Global Step: 150950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:13:57,389-Speed 3198.34 samples/sec Loss 1.4138 Epoch: 9 Global Step: 151000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:14:13,437-Speed 3190.62 samples/sec Loss 1.4029 Epoch: 9 Global Step: 151050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:14:29,296-Speed 3228.58 samples/sec Loss 1.4254 Epoch: 9 Global Step: 151100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:14:45,135-Speed 3232.56 samples/sec Loss 1.4000 Epoch: 9 Global Step: 151150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:15:01,205-Speed 3186.27 samples/sec Loss 1.4192 Epoch: 9 Global Step: 151200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:15:17,124-Speed 3216.29 samples/sec Loss 1.4302 Epoch: 9 Global Step: 151250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:15:33,294-Speed 3166.38 samples/sec Loss 1.4275 Epoch: 9 Global Step: 151300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:15:49,360-Speed 3187.04 samples/sec Loss 1.3988 Epoch: 9 Global Step: 151350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:16:05,467-Speed 3178.86 samples/sec Loss 1.4320 Epoch: 9 Global Step: 151400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:16:21,265-Speed 3241.03 samples/sec Loss 1.3995 Epoch: 9 Global Step: 151450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:16:37,110-Speed 3231.48 samples/sec Loss 1.4076 Epoch: 9 Global Step: 151500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:16:52,993-Speed 3223.57 samples/sec Loss 1.4243 Epoch: 9 Global Step: 151550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:17:09,024-Speed 3193.87 samples/sec Loss 1.4065 Epoch: 9 Global Step: 151600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:17:25,204-Speed 3164.62 samples/sec Loss 1.4244 Epoch: 9 Global Step: 151650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:17:41,399-Speed 3161.52 samples/sec Loss 1.4296 Epoch: 9 Global Step: 151700 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:17:58,608-Speed 2975.22 samples/sec Loss 1.4183 Epoch: 9 Global Step: 151750 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:18:15,231-Speed 3080.18 samples/sec Loss 1.4148 Epoch: 9 Global Step: 151800 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:18:31,535-Speed 3140.42 samples/sec Loss 1.4463 Epoch: 9 Global Step: 151850 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:18:47,689-Speed 3169.69 samples/sec Loss 1.4203 Epoch: 9 Global Step: 151900 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:19:03,789-Speed 3180.20 samples/sec Loss 1.4413 Epoch: 9 Global Step: 151950 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:19:19,817-Speed 3194.48 samples/sec Loss 1.4402 Epoch: 9 Global Step: 152000 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:20:13,315-[lfw][152000]XNorm: 22.470513 Training: 2021-03-16 17:20:13,315-[lfw][152000]Accuracy-Flip: 0.99733+-0.00271 Training: 2021-03-16 17:20:13,315-[lfw][152000]Accuracy-Highest: 0.99800 Training: 2021-03-16 17:21:15,329-[cfp_fp][152000]XNorm: 21.535843 Training: 2021-03-16 17:21:15,330-[cfp_fp][152000]Accuracy-Flip: 0.98471+-0.00553 Training: 2021-03-16 17:21:15,330-[cfp_fp][152000]Accuracy-Highest: 0.98657 Training: 2021-03-16 17:22:08,866-[agedb_30][152000]XNorm: 22.604320 Training: 2021-03-16 17:22:08,866-[agedb_30][152000]Accuracy-Flip: 0.97867+-0.00690 Training: 2021-03-16 17:22:08,866-[agedb_30][152000]Accuracy-Highest: 0.97950 Training: 2021-03-16 17:22:25,495-Speed 275.75 samples/sec Loss 1.4218 Epoch: 9 Global Step: 152050 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:22:41,765-Speed 3146.95 samples/sec Loss 1.4716 Epoch: 9 Global Step: 152100 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:22:57,633-Speed 3226.73 samples/sec Loss 1.4364 Epoch: 9 Global Step: 152150 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:23:15,565-Speed 2855.22 samples/sec Loss 1.4657 Epoch: 9 Global Step: 152200 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:23:31,396-Speed 3234.44 samples/sec Loss 1.4441 Epoch: 9 Global Step: 152250 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:23:47,645-Speed 3150.96 samples/sec Loss 1.4664 Epoch: 9 Global Step: 152300 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:24:03,422-Speed 3245.35 samples/sec Loss 1.4423 Epoch: 9 Global Step: 152350 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:24:19,530-Speed 3178.55 samples/sec Loss 1.4591 Epoch: 9 Global Step: 152400 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:24:35,417-Speed 3222.99 samples/sec Loss 1.4501 Epoch: 9 Global Step: 152450 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:24:51,241-Speed 3235.64 samples/sec Loss 1.4522 Epoch: 9 Global Step: 152500 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:25:07,198-Speed 3208.81 samples/sec Loss 1.4400 Epoch: 9 Global Step: 152550 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:25:23,316-Speed 3176.55 samples/sec Loss 1.4552 Epoch: 9 Global Step: 152600 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:25:40,247-Speed 3024.16 samples/sec Loss 1.4692 Epoch: 9 Global Step: 152650 Fp16 Grad Scale: 16384 Required: 21 hours Training: 2021-03-16 17:25:55,971-Speed 3256.36 samples/sec Loss 1.4596 Epoch: 9 Global Step: 152700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:26:11,850-Speed 3224.36 samples/sec Loss 1.4810 Epoch: 9 Global Step: 152750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:26:27,660-Speed 3238.68 samples/sec Loss 1.4735 Epoch: 9 Global Step: 152800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:26:44,370-Speed 3064.06 samples/sec Loss 1.4617 Epoch: 9 Global Step: 152850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:27:00,640-Speed 3147.01 samples/sec Loss 1.4494 Epoch: 9 Global Step: 152900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:27:16,609-Speed 3206.30 samples/sec Loss 1.4473 Epoch: 9 Global Step: 152950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:27:32,563-Speed 3209.38 samples/sec Loss 1.4490 Epoch: 9 Global Step: 153000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:27:48,557-Speed 3201.18 samples/sec Loss 1.4862 Epoch: 9 Global Step: 153050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:28:05,314-Speed 3055.65 samples/sec Loss 1.4606 Epoch: 9 Global Step: 153100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:28:21,730-Speed 3118.88 samples/sec Loss 1.4819 Epoch: 9 Global Step: 153150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:28:37,747-Speed 3196.81 samples/sec Loss 1.4822 Epoch: 9 Global Step: 153200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:28:53,899-Speed 3169.84 samples/sec Loss 1.4973 Epoch: 9 Global Step: 153250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:29:09,776-Speed 3224.98 samples/sec Loss 1.4605 Epoch: 9 Global Step: 153300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:29:25,845-Speed 3186.31 samples/sec Loss 1.4810 Epoch: 9 Global Step: 153350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:29:41,918-Speed 3185.64 samples/sec Loss 1.4637 Epoch: 9 Global Step: 153400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:29:57,782-Speed 3227.48 samples/sec Loss 1.4768 Epoch: 9 Global Step: 153450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:30:13,917-Speed 3173.34 samples/sec Loss 1.4789 Epoch: 9 Global Step: 153500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:30:29,779-Speed 3227.97 samples/sec Loss 1.4714 Epoch: 9 Global Step: 153550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:30:46,018-Speed 3152.92 samples/sec Loss 1.4670 Epoch: 9 Global Step: 153600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:31:02,157-Speed 3172.50 samples/sec Loss 1.4783 Epoch: 9 Global Step: 153650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:31:18,223-Speed 3187.02 samples/sec Loss 1.4857 Epoch: 9 Global Step: 153700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:31:34,469-Speed 3151.68 samples/sec Loss 1.4783 Epoch: 9 Global Step: 153750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:31:50,479-Speed 3198.18 samples/sec Loss 1.4770 Epoch: 9 Global Step: 153800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:32:06,546-Speed 3186.66 samples/sec Loss 1.4661 Epoch: 9 Global Step: 153850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:32:22,764-Speed 3157.04 samples/sec Loss 1.4862 Epoch: 9 Global Step: 153900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:32:39,819-Speed 3002.19 samples/sec Loss 1.4799 Epoch: 9 Global Step: 153950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:32:55,761-Speed 3211.81 samples/sec Loss 1.4557 Epoch: 9 Global Step: 154000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:33:48,990-[lfw][154000]XNorm: 22.690746 Training: 2021-03-16 17:33:48,991-[lfw][154000]Accuracy-Flip: 0.99750+-0.00271 Training: 2021-03-16 17:33:48,991-[lfw][154000]Accuracy-Highest: 0.99800 Training: 2021-03-16 17:34:50,690-[cfp_fp][154000]XNorm: 21.768844 Training: 2021-03-16 17:34:50,690-[cfp_fp][154000]Accuracy-Flip: 0.98457+-0.00549 Training: 2021-03-16 17:34:50,690-[cfp_fp][154000]Accuracy-Highest: 0.98657 Training: 2021-03-16 17:35:43,980-[agedb_30][154000]XNorm: 22.881735 Training: 2021-03-16 17:35:43,981-[agedb_30][154000]Accuracy-Flip: 0.97800+-0.00649 Training: 2021-03-16 17:35:43,981-[agedb_30][154000]Accuracy-Highest: 0.97950 Training: 2021-03-16 17:35:59,907-Speed 278.04 samples/sec Loss 1.5133 Epoch: 9 Global Step: 154050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:36:16,615-Speed 3064.58 samples/sec Loss 1.4689 Epoch: 9 Global Step: 154100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:36:32,722-Speed 3178.75 samples/sec Loss 1.4951 Epoch: 9 Global Step: 154150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:36:49,588-Speed 3035.82 samples/sec Loss 1.4824 Epoch: 9 Global Step: 154200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:37:06,115-Speed 3098.07 samples/sec Loss 1.4750 Epoch: 9 Global Step: 154250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:37:22,170-Speed 3189.15 samples/sec Loss 1.4556 Epoch: 9 Global Step: 154300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:37:38,833-Speed 3072.81 samples/sec Loss 1.4787 Epoch: 9 Global Step: 154350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:37:54,809-Speed 3204.82 samples/sec Loss 1.4707 Epoch: 9 Global Step: 154400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:38:10,821-Speed 3197.72 samples/sec Loss 1.4683 Epoch: 9 Global Step: 154450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:38:27,637-Speed 3044.84 samples/sec Loss 1.4805 Epoch: 9 Global Step: 154500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:38:43,578-Speed 3211.87 samples/sec Loss 1.4735 Epoch: 9 Global Step: 154550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:38:59,773-Speed 3161.64 samples/sec Loss 1.4753 Epoch: 9 Global Step: 154600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:39:15,794-Speed 3195.88 samples/sec Loss 1.4827 Epoch: 9 Global Step: 154650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:39:32,034-Speed 3152.88 samples/sec Loss 1.5281 Epoch: 9 Global Step: 154700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:39:48,084-Speed 3190.05 samples/sec Loss 1.4914 Epoch: 9 Global Step: 154750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:40:03,983-Speed 3220.40 samples/sec Loss 1.5084 Epoch: 9 Global Step: 154800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:40:20,810-Speed 3042.81 samples/sec Loss 1.4955 Epoch: 9 Global Step: 154850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:40:36,895-Speed 3183.18 samples/sec Loss 1.5115 Epoch: 9 Global Step: 154900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:40:52,817-Speed 3215.82 samples/sec Loss 1.5038 Epoch: 9 Global Step: 154950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:41:08,931-Speed 3177.41 samples/sec Loss 1.5003 Epoch: 9 Global Step: 155000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:41:25,873-Speed 3022.28 samples/sec Loss 1.5091 Epoch: 9 Global Step: 155050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:41:41,791-Speed 3216.51 samples/sec Loss 1.5044 Epoch: 9 Global Step: 155100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:41:57,651-Speed 3228.41 samples/sec Loss 1.5140 Epoch: 9 Global Step: 155150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:42:13,508-Speed 3228.86 samples/sec Loss 1.4945 Epoch: 9 Global Step: 155200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:42:29,415-Speed 3218.90 samples/sec Loss 1.4854 Epoch: 9 Global Step: 155250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:42:46,192-Speed 3051.85 samples/sec Loss 1.5000 Epoch: 9 Global Step: 155300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:43:02,336-Speed 3171.46 samples/sec Loss 1.5030 Epoch: 9 Global Step: 155350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:43:18,346-Speed 3198.28 samples/sec Loss 1.5052 Epoch: 9 Global Step: 155400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:43:34,346-Speed 3199.96 samples/sec Loss 1.5108 Epoch: 9 Global Step: 155450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:43:50,609-Speed 3148.33 samples/sec Loss 1.5117 Epoch: 9 Global Step: 155500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:44:06,569-Speed 3208.10 samples/sec Loss 1.4897 Epoch: 9 Global Step: 155550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:44:22,764-Speed 3161.67 samples/sec Loss 1.4870 Epoch: 9 Global Step: 155600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:44:38,726-Speed 3207.57 samples/sec Loss 1.5071 Epoch: 9 Global Step: 155650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:44:54,957-Speed 3154.66 samples/sec Loss 1.5109 Epoch: 9 Global Step: 155700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:45:11,193-Speed 3153.64 samples/sec Loss 1.4967 Epoch: 9 Global Step: 155750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:45:26,995-Speed 3240.22 samples/sec Loss 1.5026 Epoch: 9 Global Step: 155800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:45:43,206-Speed 3158.44 samples/sec Loss 1.4889 Epoch: 9 Global Step: 155850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:45:59,226-Speed 3196.08 samples/sec Loss 1.4923 Epoch: 9 Global Step: 155900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:46:15,209-Speed 3203.35 samples/sec Loss 1.5056 Epoch: 9 Global Step: 155950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:46:31,404-Speed 3161.58 samples/sec Loss 1.4899 Epoch: 9 Global Step: 156000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:47:24,694-[lfw][156000]XNorm: 20.479022 Training: 2021-03-16 17:47:24,695-[lfw][156000]Accuracy-Flip: 0.99733+-0.00281 Training: 2021-03-16 17:47:24,695-[lfw][156000]Accuracy-Highest: 0.99800 Training: 2021-03-16 17:48:26,880-[cfp_fp][156000]XNorm: 20.244226 Training: 2021-03-16 17:48:26,880-[cfp_fp][156000]Accuracy-Flip: 0.98600+-0.00408 Training: 2021-03-16 17:48:26,880-[cfp_fp][156000]Accuracy-Highest: 0.98657 Training: 2021-03-16 17:49:20,047-[agedb_30][156000]XNorm: 21.204759 Training: 2021-03-16 17:49:20,048-[agedb_30][156000]Accuracy-Flip: 0.97867+-0.00888 Training: 2021-03-16 17:49:20,048-[agedb_30][156000]Accuracy-Highest: 0.97950 Training: 2021-03-16 17:49:36,044-Speed 277.30 samples/sec Loss 1.5020 Epoch: 9 Global Step: 156050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:49:52,962-Speed 3026.41 samples/sec Loss 1.5029 Epoch: 9 Global Step: 156100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:50:09,029-Speed 3186.83 samples/sec Loss 1.5291 Epoch: 9 Global Step: 156150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:50:25,118-Speed 3182.41 samples/sec Loss 1.5147 Epoch: 9 Global Step: 156200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:50:42,208-Speed 2996.02 samples/sec Loss 1.5230 Epoch: 9 Global Step: 156250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:50:58,275-Speed 3186.77 samples/sec Loss 1.5042 Epoch: 9 Global Step: 156300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:51:15,198-Speed 3025.58 samples/sec Loss 1.5239 Epoch: 9 Global Step: 156350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:51:31,292-Speed 3181.38 samples/sec Loss 1.5298 Epoch: 9 Global Step: 156400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:51:47,925-Speed 3078.35 samples/sec Loss 1.5038 Epoch: 9 Global Step: 156450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:52:03,710-Speed 3243.68 samples/sec Loss 1.4851 Epoch: 9 Global Step: 156500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:52:20,785-Speed 2998.54 samples/sec Loss 1.5122 Epoch: 9 Global Step: 156550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:52:36,910-Speed 3175.36 samples/sec Loss 1.5330 Epoch: 9 Global Step: 156600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:52:53,715-Speed 3046.72 samples/sec Loss 1.5019 Epoch: 9 Global Step: 156650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:53:10,252-Speed 3096.16 samples/sec Loss 1.5438 Epoch: 9 Global Step: 156700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:53:26,295-Speed 3191.65 samples/sec Loss 1.5215 Epoch: 9 Global Step: 156750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:53:42,330-Speed 3193.02 samples/sec Loss 1.5268 Epoch: 9 Global Step: 156800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:53:58,256-Speed 3215.02 samples/sec Loss 1.5186 Epoch: 9 Global Step: 156850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:54:14,366-Speed 3178.20 samples/sec Loss 1.5200 Epoch: 9 Global Step: 156900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:54:30,413-Speed 3190.74 samples/sec Loss 1.4960 Epoch: 9 Global Step: 156950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:54:46,592-Speed 3164.80 samples/sec Loss 1.5344 Epoch: 9 Global Step: 157000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:55:03,282-Speed 3067.80 samples/sec Loss 1.4790 Epoch: 9 Global Step: 157050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:55:19,255-Speed 3205.41 samples/sec Loss 1.5444 Epoch: 9 Global Step: 157100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:55:35,177-Speed 3215.91 samples/sec Loss 1.5546 Epoch: 9 Global Step: 157150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:55:51,586-Speed 3120.23 samples/sec Loss 1.5330 Epoch: 9 Global Step: 157200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:56:08,438-Speed 3038.43 samples/sec Loss 1.5363 Epoch: 9 Global Step: 157250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:56:24,596-Speed 3168.80 samples/sec Loss 1.5254 Epoch: 9 Global Step: 157300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:56:40,672-Speed 3185.01 samples/sec Loss 1.5075 Epoch: 9 Global Step: 157350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:56:56,587-Speed 3217.16 samples/sec Loss 1.5038 Epoch: 9 Global Step: 157400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:57:12,491-Speed 3219.28 samples/sec Loss 1.5217 Epoch: 9 Global Step: 157450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:57:29,406-Speed 3027.08 samples/sec Loss 1.5236 Epoch: 9 Global Step: 157500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:57:45,527-Speed 3175.98 samples/sec Loss 1.5187 Epoch: 9 Global Step: 157550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:58:01,521-Speed 3201.35 samples/sec Loss 1.5267 Epoch: 9 Global Step: 157600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:58:17,560-Speed 3192.37 samples/sec Loss 1.5256 Epoch: 9 Global Step: 157650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:58:33,803-Speed 3152.11 samples/sec Loss 1.5147 Epoch: 9 Global Step: 157700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:58:49,955-Speed 3170.02 samples/sec Loss 1.5174 Epoch: 9 Global Step: 157750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:59:06,170-Speed 3157.66 samples/sec Loss 1.5528 Epoch: 9 Global Step: 157800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:59:22,493-Speed 3136.87 samples/sec Loss 1.5255 Epoch: 9 Global Step: 157850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:59:38,808-Speed 3138.22 samples/sec Loss 1.5389 Epoch: 9 Global Step: 157900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 17:59:54,794-Speed 3202.92 samples/sec Loss 1.5333 Epoch: 9 Global Step: 157950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:00:10,703-Speed 3218.46 samples/sec Loss 1.5187 Epoch: 9 Global Step: 158000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:01:04,039-[lfw][158000]XNorm: 23.750954 Training: 2021-03-16 18:01:04,039-[lfw][158000]Accuracy-Flip: 0.99667+-0.00289 Training: 2021-03-16 18:01:04,039-[lfw][158000]Accuracy-Highest: 0.99800 Training: 2021-03-16 18:02:05,843-[cfp_fp][158000]XNorm: 22.262983 Training: 2021-03-16 18:02:05,843-[cfp_fp][158000]Accuracy-Flip: 0.98229+-0.00826 Training: 2021-03-16 18:02:05,843-[cfp_fp][158000]Accuracy-Highest: 0.98657 Training: 2021-03-16 18:02:58,857-[agedb_30][158000]XNorm: 23.652658 Training: 2021-03-16 18:02:58,858-[agedb_30][158000]Accuracy-Flip: 0.97783+-0.00799 Training: 2021-03-16 18:02:58,858-[agedb_30][158000]Accuracy-Highest: 0.97950 Training: 2021-03-16 18:03:14,872-Speed 278.01 samples/sec Loss 1.5272 Epoch: 9 Global Step: 158050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:03:30,940-Speed 3186.56 samples/sec Loss 1.5141 Epoch: 9 Global Step: 158100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:03:47,131-Speed 3162.24 samples/sec Loss 1.5218 Epoch: 9 Global Step: 158150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:04:03,091-Speed 3208.33 samples/sec Loss 1.5300 Epoch: 9 Global Step: 158200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:04:19,284-Speed 3161.78 samples/sec Loss 1.5079 Epoch: 9 Global Step: 158250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:04:36,184-Speed 3029.80 samples/sec Loss 1.5342 Epoch: 9 Global Step: 158300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:04:52,319-Speed 3173.35 samples/sec Loss 1.5395 Epoch: 9 Global Step: 158350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:05:08,774-Speed 3111.46 samples/sec Loss 1.5126 Epoch: 9 Global Step: 158400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:05:25,981-Speed 2975.76 samples/sec Loss 1.5348 Epoch: 9 Global Step: 158450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:05:42,120-Speed 3172.50 samples/sec Loss 1.5581 Epoch: 9 Global Step: 158500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:05:58,869-Speed 3056.88 samples/sec Loss 1.5373 Epoch: 9 Global Step: 158550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:06:14,776-Speed 3218.96 samples/sec Loss 1.5174 Epoch: 9 Global Step: 158600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:06:31,029-Speed 3150.17 samples/sec Loss 1.5019 Epoch: 9 Global Step: 158650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:06:48,014-Speed 3014.54 samples/sec Loss 1.5476 Epoch: 9 Global Step: 158700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:07:03,912-Speed 3220.71 samples/sec Loss 1.5362 Epoch: 9 Global Step: 158750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:07:20,291-Speed 3126.00 samples/sec Loss 1.5305 Epoch: 9 Global Step: 158800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:07:37,138-Speed 3039.33 samples/sec Loss 1.5317 Epoch: 9 Global Step: 158850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:07:53,017-Speed 3224.49 samples/sec Loss 1.5295 Epoch: 9 Global Step: 158900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:08:09,035-Speed 3196.36 samples/sec Loss 1.5415 Epoch: 9 Global Step: 158950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:08:24,956-Speed 3216.09 samples/sec Loss 1.5233 Epoch: 9 Global Step: 159000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:08:40,971-Speed 3197.03 samples/sec Loss 1.5261 Epoch: 9 Global Step: 159050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:08:57,074-Speed 3179.58 samples/sec Loss 1.5283 Epoch: 9 Global Step: 159100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:09:13,354-Speed 3145.05 samples/sec Loss 1.5391 Epoch: 9 Global Step: 159150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:09:30,084-Speed 3060.62 samples/sec Loss 1.5089 Epoch: 9 Global Step: 159200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:09:46,347-Speed 3148.30 samples/sec Loss 1.5576 Epoch: 9 Global Step: 159250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:10:02,490-Speed 3171.75 samples/sec Loss 1.5447 Epoch: 9 Global Step: 159300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:10:18,787-Speed 3141.71 samples/sec Loss 1.5614 Epoch: 9 Global Step: 159350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:10:35,078-Speed 3142.96 samples/sec Loss 1.5455 Epoch: 9 Global Step: 159400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:10:51,800-Speed 3061.99 samples/sec Loss 1.5338 Epoch: 9 Global Step: 159450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:11:07,858-Speed 3188.58 samples/sec Loss 1.5294 Epoch: 9 Global Step: 159500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:11:23,971-Speed 3177.60 samples/sec Loss 1.5164 Epoch: 9 Global Step: 159550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:11:39,876-Speed 3219.26 samples/sec Loss 1.5619 Epoch: 9 Global Step: 159600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:11:56,858-Speed 3015.09 samples/sec Loss 1.5432 Epoch: 9 Global Step: 159650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:12:13,016-Speed 3168.75 samples/sec Loss 1.5395 Epoch: 9 Global Step: 159700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:12:29,033-Speed 3196.69 samples/sec Loss 1.5200 Epoch: 9 Global Step: 159750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:12:45,123-Speed 3182.19 samples/sec Loss 1.5451 Epoch: 9 Global Step: 159800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:13:01,024-Speed 3220.05 samples/sec Loss 1.5307 Epoch: 9 Global Step: 159850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:13:17,323-Speed 3141.44 samples/sec Loss 1.5587 Epoch: 9 Global Step: 159900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:13:33,361-Speed 3192.46 samples/sec Loss 1.5366 Epoch: 9 Global Step: 159950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:13:49,269-Speed 3218.62 samples/sec Loss 1.5284 Epoch: 9 Global Step: 160000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:14:42,563-[lfw][160000]XNorm: 22.213187 Training: 2021-03-16 18:14:42,563-[lfw][160000]Accuracy-Flip: 0.99783+-0.00269 Training: 2021-03-16 18:14:42,564-[lfw][160000]Accuracy-Highest: 0.99800 Training: 2021-03-16 18:15:44,480-[cfp_fp][160000]XNorm: 20.807653 Training: 2021-03-16 18:15:44,481-[cfp_fp][160000]Accuracy-Flip: 0.98500+-0.00512 Training: 2021-03-16 18:15:44,481-[cfp_fp][160000]Accuracy-Highest: 0.98657 Training: 2021-03-16 18:16:37,804-[agedb_30][160000]XNorm: 22.470062 Training: 2021-03-16 18:16:37,805-[agedb_30][160000]Accuracy-Flip: 0.97867+-0.00702 Training: 2021-03-16 18:16:37,805-[agedb_30][160000]Accuracy-Highest: 0.97950 Training: 2021-03-16 18:16:53,717-Speed 277.59 samples/sec Loss 1.5385 Epoch: 9 Global Step: 160050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:17:10,024-Speed 3139.84 samples/sec Loss 1.5859 Epoch: 9 Global Step: 160100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:17:26,433-Speed 3120.27 samples/sec Loss 1.5355 Epoch: 9 Global Step: 160150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:17:42,381-Speed 3210.50 samples/sec Loss 1.5402 Epoch: 9 Global Step: 160200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:17:58,436-Speed 3189.22 samples/sec Loss 1.5439 Epoch: 9 Global Step: 160250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:18:14,389-Speed 3209.56 samples/sec Loss 1.5374 Epoch: 9 Global Step: 160300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:18:30,695-Speed 3140.08 samples/sec Loss 1.5610 Epoch: 9 Global Step: 160350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:18:46,727-Speed 3193.59 samples/sec Loss 1.5502 Epoch: 9 Global Step: 160400 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:19:02,839-Speed 3177.95 samples/sec Loss 1.5571 Epoch: 9 Global Step: 160450 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:19:18,997-Speed 3168.84 samples/sec Loss 1.5377 Epoch: 9 Global Step: 160500 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:19:35,951-Speed 3020.00 samples/sec Loss 1.5460 Epoch: 9 Global Step: 160550 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:19:52,729-Speed 3051.65 samples/sec Loss 1.5246 Epoch: 9 Global Step: 160600 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:20:08,969-Speed 3152.74 samples/sec Loss 1.5746 Epoch: 9 Global Step: 160650 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:20:24,877-Speed 3218.67 samples/sec Loss 1.5520 Epoch: 9 Global Step: 160700 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:20:40,739-Speed 3228.01 samples/sec Loss 1.5541 Epoch: 9 Global Step: 160750 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:20:57,864-Speed 2989.88 samples/sec Loss 1.5431 Epoch: 9 Global Step: 160800 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:21:13,978-Speed 3177.42 samples/sec Loss 1.4988 Epoch: 9 Global Step: 160850 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:21:30,871-Speed 3030.90 samples/sec Loss 1.5527 Epoch: 9 Global Step: 160900 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:21:47,254-Speed 3125.32 samples/sec Loss 1.5430 Epoch: 9 Global Step: 160950 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:22:03,260-Speed 3198.94 samples/sec Loss 1.5180 Epoch: 9 Global Step: 161000 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:22:20,329-Speed 2999.70 samples/sec Loss 1.5560 Epoch: 9 Global Step: 161050 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:22:36,241-Speed 3217.65 samples/sec Loss 1.5357 Epoch: 9 Global Step: 161100 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:22:52,437-Speed 3161.38 samples/sec Loss 1.5592 Epoch: 9 Global Step: 161150 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:23:08,375-Speed 3212.66 samples/sec Loss 1.5604 Epoch: 9 Global Step: 161200 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:23:24,403-Speed 3194.50 samples/sec Loss 1.5369 Epoch: 9 Global Step: 161250 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:23:40,386-Speed 3203.46 samples/sec Loss 1.5641 Epoch: 9 Global Step: 161300 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:23:56,506-Speed 3176.28 samples/sec Loss 1.5610 Epoch: 9 Global Step: 161350 Fp16 Grad Scale: 16384 Required: 20 hours Training: 2021-03-16 18:24:12,572-Speed 3186.86 samples/sec Loss 1.5606 Epoch: 9 Global Step: 161400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:24:28,719-Speed 3171.05 samples/sec Loss 1.5457 Epoch: 9 Global Step: 161450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:24:45,796-Speed 2998.30 samples/sec Loss 1.5524 Epoch: 9 Global Step: 161500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:25:01,704-Speed 3218.53 samples/sec Loss 1.5492 Epoch: 9 Global Step: 161550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:25:17,638-Speed 3213.40 samples/sec Loss 1.5545 Epoch: 9 Global Step: 161600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:25:33,858-Speed 3156.78 samples/sec Loss 1.5572 Epoch: 9 Global Step: 161650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:25:50,920-Speed 3000.93 samples/sec Loss 1.5460 Epoch: 9 Global Step: 161700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:26:07,146-Speed 3155.54 samples/sec Loss 1.5567 Epoch: 9 Global Step: 161750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:26:24,011-Speed 3035.90 samples/sec Loss 1.5366 Epoch: 9 Global Step: 161800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:26:40,281-Speed 3146.96 samples/sec Loss 1.5839 Epoch: 9 Global Step: 161850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:26:56,314-Speed 3193.44 samples/sec Loss 1.5612 Epoch: 9 Global Step: 161900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:27:12,381-Speed 3186.82 samples/sec Loss 1.5550 Epoch: 9 Global Step: 161950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:27:28,668-Speed 3143.76 samples/sec Loss 1.5769 Epoch: 9 Global Step: 162000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:28:21,944-[lfw][162000]XNorm: 21.888287 Training: 2021-03-16 18:28:21,944-[lfw][162000]Accuracy-Flip: 0.99783+-0.00269 Training: 2021-03-16 18:28:21,944-[lfw][162000]Accuracy-Highest: 0.99800 Training: 2021-03-16 18:29:24,051-[cfp_fp][162000]XNorm: 20.864556 Training: 2021-03-16 18:29:24,052-[cfp_fp][162000]Accuracy-Flip: 0.98357+-0.00851 Training: 2021-03-16 18:29:24,052-[cfp_fp][162000]Accuracy-Highest: 0.98657 Training: 2021-03-16 18:30:17,494-[agedb_30][162000]XNorm: 21.986226 Training: 2021-03-16 18:30:17,495-[agedb_30][162000]Accuracy-Flip: 0.97783+-0.00563 Training: 2021-03-16 18:30:17,495-[agedb_30][162000]Accuracy-Highest: 0.97950 Training: 2021-03-16 18:30:33,385-Speed 277.18 samples/sec Loss 1.5518 Epoch: 9 Global Step: 162050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:30:49,779-Speed 3123.21 samples/sec Loss 1.5687 Epoch: 9 Global Step: 162100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:31:05,689-Speed 3218.26 samples/sec Loss 1.5305 Epoch: 9 Global Step: 162150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:31:21,640-Speed 3209.98 samples/sec Loss 1.5534 Epoch: 9 Global Step: 162200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:31:37,754-Speed 3177.36 samples/sec Loss 1.5584 Epoch: 9 Global Step: 162250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:31:54,009-Speed 3150.00 samples/sec Loss 1.5623 Epoch: 9 Global Step: 162300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:32:10,288-Speed 3145.24 samples/sec Loss 1.5581 Epoch: 9 Global Step: 162350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:32:26,148-Speed 3228.24 samples/sec Loss 1.5336 Epoch: 9 Global Step: 162400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:32:42,121-Speed 3205.66 samples/sec Loss 1.5583 Epoch: 9 Global Step: 162450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:32:58,208-Speed 3182.70 samples/sec Loss 1.5739 Epoch: 9 Global Step: 162500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:33:14,325-Speed 3176.95 samples/sec Loss 1.5500 Epoch: 9 Global Step: 162550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:33:30,557-Speed 3154.30 samples/sec Loss 1.5524 Epoch: 9 Global Step: 162600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:33:46,944-Speed 3124.62 samples/sec Loss 1.5534 Epoch: 9 Global Step: 162650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:34:03,242-Speed 3141.56 samples/sec Loss 1.5831 Epoch: 9 Global Step: 162700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:34:21,555-Speed 2795.94 samples/sec Loss 1.5801 Epoch: 9 Global Step: 162750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:34:37,442-Speed 3222.74 samples/sec Loss 1.5432 Epoch: 9 Global Step: 162800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:34:53,524-Speed 3183.90 samples/sec Loss 1.5716 Epoch: 9 Global Step: 162850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:35:09,429-Speed 3219.13 samples/sec Loss 1.5663 Epoch: 9 Global Step: 162900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:35:25,529-Speed 3180.17 samples/sec Loss 1.5489 Epoch: 9 Global Step: 162950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:35:42,413-Speed 3032.60 samples/sec Loss 1.5581 Epoch: 9 Global Step: 163000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:35:59,329-Speed 3026.74 samples/sec Loss 1.5530 Epoch: 9 Global Step: 163050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:36:15,412-Speed 3183.70 samples/sec Loss 1.5780 Epoch: 9 Global Step: 163100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:36:32,003-Speed 3086.06 samples/sec Loss 1.5811 Epoch: 9 Global Step: 163150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:36:47,877-Speed 3225.46 samples/sec Loss 1.5627 Epoch: 9 Global Step: 163200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:37:04,169-Speed 3142.78 samples/sec Loss 1.5853 Epoch: 9 Global Step: 163250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:37:20,428-Speed 3149.10 samples/sec Loss 1.5577 Epoch: 9 Global Step: 163300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:37:37,425-Speed 3012.36 samples/sec Loss 1.5820 Epoch: 9 Global Step: 163350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:37:53,588-Speed 3167.78 samples/sec Loss 1.5449 Epoch: 9 Global Step: 163400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:38:09,721-Speed 3173.77 samples/sec Loss 1.5517 Epoch: 9 Global Step: 163450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:38:25,853-Speed 3174.05 samples/sec Loss 1.5865 Epoch: 9 Global Step: 163500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:38:41,938-Speed 3183.10 samples/sec Loss 1.5758 Epoch: 9 Global Step: 163550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:38:57,871-Speed 3213.52 samples/sec Loss 1.5390 Epoch: 9 Global Step: 163600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:39:14,612-Speed 3058.57 samples/sec Loss 1.5547 Epoch: 9 Global Step: 163650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:39:30,988-Speed 3126.57 samples/sec Loss 1.5744 Epoch: 9 Global Step: 163700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:39:46,973-Speed 3203.17 samples/sec Loss 1.5378 Epoch: 9 Global Step: 163750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:40:03,168-Speed 3161.44 samples/sec Loss 1.5591 Epoch: 9 Global Step: 163800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:40:20,036-Speed 3035.40 samples/sec Loss 1.5568 Epoch: 9 Global Step: 163850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:40:36,213-Speed 3165.06 samples/sec Loss 1.5789 Epoch: 9 Global Step: 163900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:40:52,170-Speed 3208.74 samples/sec Loss 1.5584 Epoch: 9 Global Step: 163950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:41:08,446-Speed 3145.85 samples/sec Loss 1.5865 Epoch: 9 Global Step: 164000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:42:01,508-[lfw][164000]XNorm: 21.835348 Training: 2021-03-16 18:42:01,508-[lfw][164000]Accuracy-Flip: 0.99800+-0.00277 Training: 2021-03-16 18:42:01,508-[lfw][164000]Accuracy-Highest: 0.99800 Training: 2021-03-16 18:43:03,285-[cfp_fp][164000]XNorm: 20.925740 Training: 2021-03-16 18:43:03,285-[cfp_fp][164000]Accuracy-Flip: 0.98429+-0.00759 Training: 2021-03-16 18:43:03,285-[cfp_fp][164000]Accuracy-Highest: 0.98657 Training: 2021-03-16 18:43:56,407-[agedb_30][164000]XNorm: 22.208057 Training: 2021-03-16 18:43:56,408-[agedb_30][164000]Accuracy-Flip: 0.97900+-0.00642 Training: 2021-03-16 18:43:56,408-[agedb_30][164000]Accuracy-Highest: 0.97950 Training: 2021-03-16 18:44:13,749-Speed 276.30 samples/sec Loss 1.5756 Epoch: 9 Global Step: 164050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:44:29,767-Speed 3196.50 samples/sec Loss 1.5577 Epoch: 9 Global Step: 164100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:44:45,899-Speed 3174.03 samples/sec Loss 1.5521 Epoch: 9 Global Step: 164150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:45:02,133-Speed 3153.84 samples/sec Loss 1.5677 Epoch: 9 Global Step: 164200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:45:18,185-Speed 3189.81 samples/sec Loss 1.5665 Epoch: 9 Global Step: 164250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:45:34,176-Speed 3201.95 samples/sec Loss 1.5659 Epoch: 9 Global Step: 164300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:45:50,289-Speed 3177.68 samples/sec Loss 1.5937 Epoch: 9 Global Step: 164350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:46:06,412-Speed 3175.59 samples/sec Loss 1.5760 Epoch: 9 Global Step: 164400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:46:23,048-Speed 3077.85 samples/sec Loss 1.5917 Epoch: 9 Global Step: 164450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:46:39,342-Speed 3142.22 samples/sec Loss 1.5601 Epoch: 9 Global Step: 164500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:46:55,444-Speed 3180.00 samples/sec Loss 1.5477 Epoch: 9 Global Step: 164550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:47:11,598-Speed 3169.45 samples/sec Loss 1.5703 Epoch: 9 Global Step: 164600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:47:27,809-Speed 3158.42 samples/sec Loss 1.5566 Epoch: 9 Global Step: 164650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:47:44,130-Speed 3137.21 samples/sec Loss 1.5662 Epoch: 9 Global Step: 164700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:48:00,289-Speed 3168.70 samples/sec Loss 1.5535 Epoch: 9 Global Step: 164750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:48:16,141-Speed 3229.88 samples/sec Loss 1.5825 Epoch: 9 Global Step: 164800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:48:32,348-Speed 3159.21 samples/sec Loss 1.5600 Epoch: 9 Global Step: 164850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:48:49,129-Speed 3051.27 samples/sec Loss 1.5663 Epoch: 9 Global Step: 164900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:49:05,944-Speed 3044.96 samples/sec Loss 1.5605 Epoch: 9 Global Step: 164950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:49:22,071-Speed 3174.79 samples/sec Loss 1.5477 Epoch: 9 Global Step: 165000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:49:38,086-Speed 3197.06 samples/sec Loss 1.5555 Epoch: 9 Global Step: 165050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:49:54,523-Speed 3115.03 samples/sec Loss 1.5888 Epoch: 9 Global Step: 165100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:50:10,774-Speed 3150.74 samples/sec Loss 1.5657 Epoch: 9 Global Step: 165150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:50:27,628-Speed 3037.97 samples/sec Loss 1.5805 Epoch: 9 Global Step: 165200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:50:44,664-Speed 3005.52 samples/sec Loss 1.5767 Epoch: 9 Global Step: 165250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:51:00,924-Speed 3148.88 samples/sec Loss 1.5550 Epoch: 9 Global Step: 165300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:51:16,934-Speed 3198.13 samples/sec Loss 1.5802 Epoch: 9 Global Step: 165350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:51:32,985-Speed 3189.88 samples/sec Loss 1.5693 Epoch: 9 Global Step: 165400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:51:48,990-Speed 3199.09 samples/sec Loss 1.5688 Epoch: 9 Global Step: 165450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:52:06,182-Speed 2978.26 samples/sec Loss 1.5617 Epoch: 9 Global Step: 165500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:52:22,236-Speed 3189.31 samples/sec Loss 1.5625 Epoch: 9 Global Step: 165550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:52:38,393-Speed 3169.07 samples/sec Loss 1.5540 Epoch: 9 Global Step: 165600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:52:54,571-Speed 3164.79 samples/sec Loss 1.5793 Epoch: 9 Global Step: 165650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:53:10,712-Speed 3172.23 samples/sec Loss 1.5711 Epoch: 9 Global Step: 165700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:53:26,718-Speed 3198.86 samples/sec Loss 1.5525 Epoch: 9 Global Step: 165750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:53:43,476-Speed 3055.30 samples/sec Loss 1.5627 Epoch: 9 Global Step: 165800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:53:59,674-Speed 3160.94 samples/sec Loss 1.5831 Epoch: 9 Global Step: 165850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:54:16,031-Speed 3130.34 samples/sec Loss 1.5600 Epoch: 9 Global Step: 165900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:54:32,371-Speed 3133.48 samples/sec Loss 1.5935 Epoch: 9 Global Step: 165950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:54:48,566-Speed 3161.59 samples/sec Loss 1.5691 Epoch: 9 Global Step: 166000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:55:41,835-[lfw][166000]XNorm: 21.832394 Training: 2021-03-16 18:55:41,836-[lfw][166000]Accuracy-Flip: 0.99767+-0.00260 Training: 2021-03-16 18:55:41,836-[lfw][166000]Accuracy-Highest: 0.99800 Training: 2021-03-16 18:56:43,329-[cfp_fp][166000]XNorm: 20.607286 Training: 2021-03-16 18:56:43,330-[cfp_fp][166000]Accuracy-Flip: 0.98414+-0.00591 Training: 2021-03-16 18:56:43,330-[cfp_fp][166000]Accuracy-Highest: 0.98657 Training: 2021-03-16 18:57:36,496-[agedb_30][166000]XNorm: 21.967437 Training: 2021-03-16 18:57:36,496-[agedb_30][166000]Accuracy-Flip: 0.97867+-0.00547 Training: 2021-03-16 18:57:36,496-[agedb_30][166000]Accuracy-Highest: 0.97950 Training: 2021-03-16 18:57:53,713-Speed 276.54 samples/sec Loss 1.5700 Epoch: 9 Global Step: 166050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:58:09,875-Speed 3168.07 samples/sec Loss 1.5558 Epoch: 9 Global Step: 166100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:58:26,152-Speed 3145.69 samples/sec Loss 1.5696 Epoch: 9 Global Step: 166150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:58:43,279-Speed 2989.42 samples/sec Loss 1.5821 Epoch: 9 Global Step: 166200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:58:59,852-Speed 3089.47 samples/sec Loss 1.5744 Epoch: 9 Global Step: 166250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:59:15,937-Speed 3183.19 samples/sec Loss 1.5871 Epoch: 9 Global Step: 166300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:59:32,200-Speed 3148.44 samples/sec Loss 1.5600 Epoch: 9 Global Step: 166350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 18:59:48,388-Speed 3162.90 samples/sec Loss 1.5739 Epoch: 9 Global Step: 166400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:00:04,622-Speed 3153.90 samples/sec Loss 1.5772 Epoch: 9 Global Step: 166450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:00:20,667-Speed 3191.19 samples/sec Loss 1.5740 Epoch: 9 Global Step: 166500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:00:37,146-Speed 3107.14 samples/sec Loss 1.5677 Epoch: 9 Global Step: 166550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:00:53,257-Speed 3178.00 samples/sec Loss 1.5720 Epoch: 9 Global Step: 166600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:01:09,429-Speed 3166.10 samples/sec Loss 1.5746 Epoch: 9 Global Step: 166650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:01:25,572-Speed 3171.80 samples/sec Loss 1.5796 Epoch: 9 Global Step: 166700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:01:41,761-Speed 3162.71 samples/sec Loss 1.5627 Epoch: 9 Global Step: 166750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:01:57,779-Speed 3196.36 samples/sec Loss 1.5791 Epoch: 9 Global Step: 166800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:02:13,968-Speed 3162.75 samples/sec Loss 1.5524 Epoch: 9 Global Step: 166850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:02:29,845-Speed 3225.02 samples/sec Loss 1.5742 Epoch: 9 Global Step: 166900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:03:00,917-Speed 1647.81 samples/sec Loss 1.3534 Epoch: 10 Global Step: 166950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:03:16,896-Speed 3204.27 samples/sec Loss 1.3031 Epoch: 10 Global Step: 167000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:03:32,856-Speed 3208.09 samples/sec Loss 1.2913 Epoch: 10 Global Step: 167050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:03:49,516-Speed 3073.35 samples/sec Loss 1.3026 Epoch: 10 Global Step: 167100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:04:06,294-Speed 3051.78 samples/sec Loss 1.3119 Epoch: 10 Global Step: 167150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:04:22,008-Speed 3258.41 samples/sec Loss 1.2989 Epoch: 10 Global Step: 167200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:04:38,027-Speed 3196.29 samples/sec Loss 1.2907 Epoch: 10 Global Step: 167250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:04:53,861-Speed 3233.65 samples/sec Loss 1.3199 Epoch: 10 Global Step: 167300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:05:09,542-Speed 3265.07 samples/sec Loss 1.3150 Epoch: 10 Global Step: 167350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:05:25,548-Speed 3199.00 samples/sec Loss 1.3096 Epoch: 10 Global Step: 167400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:05:42,662-Speed 2991.82 samples/sec Loss 1.3216 Epoch: 10 Global Step: 167450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:05:59,755-Speed 2995.51 samples/sec Loss 1.3103 Epoch: 10 Global Step: 167500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:06:15,739-Speed 3203.31 samples/sec Loss 1.3474 Epoch: 10 Global Step: 167550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:06:31,596-Speed 3228.82 samples/sec Loss 1.3206 Epoch: 10 Global Step: 167600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:06:47,440-Speed 3231.63 samples/sec Loss 1.3228 Epoch: 10 Global Step: 167650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:07:03,189-Speed 3251.20 samples/sec Loss 1.3448 Epoch: 10 Global Step: 167700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:07:19,099-Speed 3218.07 samples/sec Loss 1.3217 Epoch: 10 Global Step: 167750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:07:36,095-Speed 3012.61 samples/sec Loss 1.3413 Epoch: 10 Global Step: 167800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:07:51,816-Speed 3256.88 samples/sec Loss 1.3379 Epoch: 10 Global Step: 167850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:08:07,971-Speed 3169.41 samples/sec Loss 1.3471 Epoch: 10 Global Step: 167900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:08:24,057-Speed 3182.96 samples/sec Loss 1.3350 Epoch: 10 Global Step: 167950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:08:41,342-Speed 2962.30 samples/sec Loss 1.3456 Epoch: 10 Global Step: 168000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:09:34,511-[lfw][168000]XNorm: 22.613010 Training: 2021-03-16 19:09:34,511-[lfw][168000]Accuracy-Flip: 0.99767+-0.00291 Training: 2021-03-16 19:09:34,511-[lfw][168000]Accuracy-Highest: 0.99800 Training: 2021-03-16 19:10:36,430-[cfp_fp][168000]XNorm: 21.859385 Training: 2021-03-16 19:10:36,431-[cfp_fp][168000]Accuracy-Flip: 0.98357+-0.00532 Training: 2021-03-16 19:10:36,431-[cfp_fp][168000]Accuracy-Highest: 0.98657 Training: 2021-03-16 19:11:29,906-[agedb_30][168000]XNorm: 22.959050 Training: 2021-03-16 19:11:29,906-[agedb_30][168000]Accuracy-Flip: 0.97933+-0.00676 Training: 2021-03-16 19:11:29,906-[agedb_30][168000]Accuracy-Highest: 0.97950 Training: 2021-03-16 19:11:45,724-Speed 277.68 samples/sec Loss 1.3478 Epoch: 10 Global Step: 168050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:12:01,455-Speed 3254.82 samples/sec Loss 1.3337 Epoch: 10 Global Step: 168100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:12:17,631-Speed 3165.21 samples/sec Loss 1.3465 Epoch: 10 Global Step: 168150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:12:33,792-Speed 3168.33 samples/sec Loss 1.3300 Epoch: 10 Global Step: 168200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:12:50,518-Speed 3061.08 samples/sec Loss 1.3570 Epoch: 10 Global Step: 168250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:13:06,642-Speed 3175.56 samples/sec Loss 1.3398 Epoch: 10 Global Step: 168300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:13:22,294-Speed 3271.29 samples/sec Loss 1.3465 Epoch: 10 Global Step: 168350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:13:38,220-Speed 3214.79 samples/sec Loss 1.3587 Epoch: 10 Global Step: 168400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:13:54,201-Speed 3203.89 samples/sec Loss 1.3568 Epoch: 10 Global Step: 168450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:14:11,243-Speed 3004.46 samples/sec Loss 1.3794 Epoch: 10 Global Step: 168500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:14:27,089-Speed 3231.26 samples/sec Loss 1.3765 Epoch: 10 Global Step: 168550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:14:42,884-Speed 3241.67 samples/sec Loss 1.3543 Epoch: 10 Global Step: 168600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:14:58,765-Speed 3224.17 samples/sec Loss 1.3818 Epoch: 10 Global Step: 168650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:15:14,602-Speed 3232.88 samples/sec Loss 1.3735 Epoch: 10 Global Step: 168700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:15:30,560-Speed 3208.66 samples/sec Loss 1.3868 Epoch: 10 Global Step: 168750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:15:46,695-Speed 3173.29 samples/sec Loss 1.3982 Epoch: 10 Global Step: 168800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:16:02,543-Speed 3230.78 samples/sec Loss 1.3707 Epoch: 10 Global Step: 168850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:16:18,667-Speed 3175.46 samples/sec Loss 1.3621 Epoch: 10 Global Step: 168900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:16:34,705-Speed 3192.54 samples/sec Loss 1.3772 Epoch: 10 Global Step: 168950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:16:50,474-Speed 3246.86 samples/sec Loss 1.3800 Epoch: 10 Global Step: 169000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:17:06,379-Speed 3219.28 samples/sec Loss 1.3650 Epoch: 10 Global Step: 169050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:17:22,365-Speed 3202.85 samples/sec Loss 1.3897 Epoch: 10 Global Step: 169100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:17:38,169-Speed 3239.83 samples/sec Loss 1.4197 Epoch: 10 Global Step: 169150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:17:54,092-Speed 3215.60 samples/sec Loss 1.3965 Epoch: 10 Global Step: 169200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:18:11,304-Speed 2974.65 samples/sec Loss 1.3852 Epoch: 10 Global Step: 169250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:18:27,092-Speed 3243.22 samples/sec Loss 1.3744 Epoch: 10 Global Step: 169300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:18:43,493-Speed 3121.79 samples/sec Loss 1.4009 Epoch: 10 Global Step: 169350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:19:00,191-Speed 3066.38 samples/sec Loss 1.4118 Epoch: 10 Global Step: 169400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:19:16,207-Speed 3196.89 samples/sec Loss 1.3885 Epoch: 10 Global Step: 169450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:19:32,250-Speed 3191.47 samples/sec Loss 1.3966 Epoch: 10 Global Step: 169500 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:19:48,356-Speed 3179.00 samples/sec Loss 1.3931 Epoch: 10 Global Step: 169550 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:20:04,359-Speed 3199.51 samples/sec Loss 1.3912 Epoch: 10 Global Step: 169600 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:20:21,594-Speed 2970.74 samples/sec Loss 1.4094 Epoch: 10 Global Step: 169650 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:20:37,527-Speed 3213.64 samples/sec Loss 1.4051 Epoch: 10 Global Step: 169700 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:20:54,445-Speed 3026.33 samples/sec Loss 1.4072 Epoch: 10 Global Step: 169750 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:21:10,577-Speed 3174.02 samples/sec Loss 1.4035 Epoch: 10 Global Step: 169800 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:21:26,646-Speed 3186.34 samples/sec Loss 1.4025 Epoch: 10 Global Step: 169850 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:21:42,997-Speed 3131.48 samples/sec Loss 1.4237 Epoch: 10 Global Step: 169900 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:21:59,924-Speed 3024.77 samples/sec Loss 1.4175 Epoch: 10 Global Step: 169950 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:22:15,955-Speed 3193.91 samples/sec Loss 1.4099 Epoch: 10 Global Step: 170000 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:23:09,172-[lfw][170000]XNorm: 21.735423 Training: 2021-03-16 19:23:09,173-[lfw][170000]Accuracy-Flip: 0.99767+-0.00260 Training: 2021-03-16 19:23:09,173-[lfw][170000]Accuracy-Highest: 0.99800 Training: 2021-03-16 19:24:11,259-[cfp_fp][170000]XNorm: 20.529365 Training: 2021-03-16 19:24:11,260-[cfp_fp][170000]Accuracy-Flip: 0.98329+-0.00661 Training: 2021-03-16 19:24:11,260-[cfp_fp][170000]Accuracy-Highest: 0.98657 Training: 2021-03-16 19:25:04,393-[agedb_30][170000]XNorm: 22.010352 Training: 2021-03-16 19:25:04,394-[agedb_30][170000]Accuracy-Flip: 0.98017+-0.00612 Training: 2021-03-16 19:25:04,394-[agedb_30][170000]Accuracy-Highest: 0.98017 Training: 2021-03-16 19:25:20,288-Speed 277.76 samples/sec Loss 1.4240 Epoch: 10 Global Step: 170050 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:25:36,447-Speed 3168.53 samples/sec Loss 1.4231 Epoch: 10 Global Step: 170100 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:25:52,440-Speed 3201.54 samples/sec Loss 1.4095 Epoch: 10 Global Step: 170150 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:26:09,517-Speed 2998.30 samples/sec Loss 1.4074 Epoch: 10 Global Step: 170200 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:26:25,581-Speed 3187.30 samples/sec Loss 1.4027 Epoch: 10 Global Step: 170250 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:26:41,686-Speed 3179.30 samples/sec Loss 1.4065 Epoch: 10 Global Step: 170300 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:26:57,817-Speed 3174.16 samples/sec Loss 1.4097 Epoch: 10 Global Step: 170350 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:27:14,468-Speed 3074.94 samples/sec Loss 1.4309 Epoch: 10 Global Step: 170400 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:27:30,503-Speed 3193.20 samples/sec Loss 1.4134 Epoch: 10 Global Step: 170450 Fp16 Grad Scale: 16384 Required: 19 hours Training: 2021-03-16 19:27:46,485-Speed 3203.74 samples/sec Loss 1.4222 Epoch: 10 Global Step: 170500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:28:02,542-Speed 3188.60 samples/sec Loss 1.4391 Epoch: 10 Global Step: 170550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:28:18,787-Speed 3151.95 samples/sec Loss 1.4411 Epoch: 10 Global Step: 170600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:28:35,935-Speed 2985.88 samples/sec Loss 1.4379 Epoch: 10 Global Step: 170650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:28:51,795-Speed 3228.24 samples/sec Loss 1.4527 Epoch: 10 Global Step: 170700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:29:07,781-Speed 3202.91 samples/sec Loss 1.4296 Epoch: 10 Global Step: 170750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:29:24,001-Speed 3156.74 samples/sec Loss 1.4214 Epoch: 10 Global Step: 170800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:29:40,015-Speed 3197.15 samples/sec Loss 1.4130 Epoch: 10 Global Step: 170850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:29:56,102-Speed 3182.87 samples/sec Loss 1.4177 Epoch: 10 Global Step: 170900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:30:12,304-Speed 3160.16 samples/sec Loss 1.4257 Epoch: 10 Global Step: 170950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:30:28,250-Speed 3210.96 samples/sec Loss 1.4226 Epoch: 10 Global Step: 171000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:30:44,094-Speed 3231.52 samples/sec Loss 1.4283 Epoch: 10 Global Step: 171050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:31:00,551-Speed 3111.36 samples/sec Loss 1.4289 Epoch: 10 Global Step: 171100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:31:16,494-Speed 3211.48 samples/sec Loss 1.4501 Epoch: 10 Global Step: 171150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:31:33,033-Speed 3095.74 samples/sec Loss 1.4400 Epoch: 10 Global Step: 171200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:31:49,271-Speed 3153.28 samples/sec Loss 1.4476 Epoch: 10 Global Step: 171250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:32:05,091-Speed 3236.49 samples/sec Loss 1.4397 Epoch: 10 Global Step: 171300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:32:20,985-Speed 3221.54 samples/sec Loss 1.4261 Epoch: 10 Global Step: 171350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:32:37,054-Speed 3186.24 samples/sec Loss 1.4254 Epoch: 10 Global Step: 171400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:32:53,987-Speed 3023.88 samples/sec Loss 1.4382 Epoch: 10 Global Step: 171450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:33:09,802-Speed 3237.34 samples/sec Loss 1.4462 Epoch: 10 Global Step: 171500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:33:25,612-Speed 3238.70 samples/sec Loss 1.4102 Epoch: 10 Global Step: 171550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:33:42,607-Speed 3012.77 samples/sec Loss 1.4378 Epoch: 10 Global Step: 171600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:33:58,642-Speed 3192.94 samples/sec Loss 1.4134 Epoch: 10 Global Step: 171650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:34:14,680-Speed 3192.68 samples/sec Loss 1.4440 Epoch: 10 Global Step: 171700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:34:30,698-Speed 3196.49 samples/sec Loss 1.4267 Epoch: 10 Global Step: 171750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:34:47,816-Speed 2991.01 samples/sec Loss 1.4517 Epoch: 10 Global Step: 171800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:35:04,348-Speed 3097.22 samples/sec Loss 1.4499 Epoch: 10 Global Step: 171850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:35:20,242-Speed 3221.32 samples/sec Loss 1.4521 Epoch: 10 Global Step: 171900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:35:36,179-Speed 3212.78 samples/sec Loss 1.4765 Epoch: 10 Global Step: 171950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:35:53,464-Speed 2962.24 samples/sec Loss 1.4684 Epoch: 10 Global Step: 172000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:36:46,763-[lfw][172000]XNorm: 22.441147 Training: 2021-03-16 19:36:46,764-[lfw][172000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 19:36:46,764-[lfw][172000]Accuracy-Highest: 0.99800 Training: 2021-03-16 19:37:48,257-[cfp_fp][172000]XNorm: 20.875009 Training: 2021-03-16 19:37:48,257-[cfp_fp][172000]Accuracy-Flip: 0.98300+-0.00674 Training: 2021-03-16 19:37:48,257-[cfp_fp][172000]Accuracy-Highest: 0.98657 Training: 2021-03-16 19:38:41,830-[agedb_30][172000]XNorm: 22.411693 Training: 2021-03-16 19:38:41,830-[agedb_30][172000]Accuracy-Flip: 0.98017+-0.00697 Training: 2021-03-16 19:38:41,830-[agedb_30][172000]Accuracy-Highest: 0.98017 Training: 2021-03-16 19:38:58,317-Speed 276.98 samples/sec Loss 1.4651 Epoch: 10 Global Step: 172050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:39:14,437-Speed 3176.41 samples/sec Loss 1.4512 Epoch: 10 Global Step: 172100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:39:31,559-Speed 2990.29 samples/sec Loss 1.4646 Epoch: 10 Global Step: 172150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:39:47,653-Speed 3181.41 samples/sec Loss 1.4467 Epoch: 10 Global Step: 172200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:40:03,722-Speed 3186.34 samples/sec Loss 1.4615 Epoch: 10 Global Step: 172250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:40:20,097-Speed 3126.91 samples/sec Loss 1.4421 Epoch: 10 Global Step: 172300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:40:36,315-Speed 3157.00 samples/sec Loss 1.4415 Epoch: 10 Global Step: 172350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:40:52,327-Speed 3197.70 samples/sec Loss 1.4415 Epoch: 10 Global Step: 172400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:41:09,154-Speed 3042.89 samples/sec Loss 1.4481 Epoch: 10 Global Step: 172450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:41:25,452-Speed 3141.62 samples/sec Loss 1.4539 Epoch: 10 Global Step: 172500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:41:41,464-Speed 3197.68 samples/sec Loss 1.4567 Epoch: 10 Global Step: 172550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:41:58,470-Speed 3010.72 samples/sec Loss 1.4492 Epoch: 10 Global Step: 172600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:42:14,583-Speed 3177.67 samples/sec Loss 1.4604 Epoch: 10 Global Step: 172650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:42:30,924-Speed 3133.39 samples/sec Loss 1.4613 Epoch: 10 Global Step: 172700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:42:47,350-Speed 3116.97 samples/sec Loss 1.4840 Epoch: 10 Global Step: 172750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:43:03,430-Speed 3184.27 samples/sec Loss 1.4606 Epoch: 10 Global Step: 172800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:43:20,050-Speed 3080.76 samples/sec Loss 1.4660 Epoch: 10 Global Step: 172850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:43:36,916-Speed 3035.63 samples/sec Loss 1.4625 Epoch: 10 Global Step: 172900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:43:52,997-Speed 3183.96 samples/sec Loss 1.4858 Epoch: 10 Global Step: 172950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:44:09,292-Speed 3142.26 samples/sec Loss 1.4574 Epoch: 10 Global Step: 173000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:44:25,344-Speed 3189.80 samples/sec Loss 1.4512 Epoch: 10 Global Step: 173050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:44:41,710-Speed 3128.48 samples/sec Loss 1.4632 Epoch: 10 Global Step: 173100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:44:58,195-Speed 3105.94 samples/sec Loss 1.4986 Epoch: 10 Global Step: 173150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:45:14,278-Speed 3183.47 samples/sec Loss 1.4822 Epoch: 10 Global Step: 173200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:45:30,535-Speed 3149.63 samples/sec Loss 1.4673 Epoch: 10 Global Step: 173250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:45:46,717-Speed 3164.01 samples/sec Loss 1.4588 Epoch: 10 Global Step: 173300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:46:03,043-Speed 3136.22 samples/sec Loss 1.4832 Epoch: 10 Global Step: 173350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:46:19,124-Speed 3183.96 samples/sec Loss 1.4745 Epoch: 10 Global Step: 173400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:46:35,280-Speed 3169.22 samples/sec Loss 1.4683 Epoch: 10 Global Step: 173450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:46:51,435-Speed 3169.40 samples/sec Loss 1.4901 Epoch: 10 Global Step: 173500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:47:07,746-Speed 3139.12 samples/sec Loss 1.4654 Epoch: 10 Global Step: 173550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:47:24,700-Speed 3020.03 samples/sec Loss 1.4862 Epoch: 10 Global Step: 173600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:47:40,766-Speed 3186.92 samples/sec Loss 1.4761 Epoch: 10 Global Step: 173650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:47:56,690-Speed 3215.40 samples/sec Loss 1.4560 Epoch: 10 Global Step: 173700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:48:13,920-Speed 2971.63 samples/sec Loss 1.4897 Epoch: 10 Global Step: 173750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:48:30,196-Speed 3145.90 samples/sec Loss 1.5234 Epoch: 10 Global Step: 173800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:48:46,403-Speed 3159.22 samples/sec Loss 1.4904 Epoch: 10 Global Step: 173850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:49:02,852-Speed 3112.66 samples/sec Loss 1.4795 Epoch: 10 Global Step: 173900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:49:19,207-Speed 3130.73 samples/sec Loss 1.5036 Epoch: 10 Global Step: 173950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:49:36,200-Speed 3013.13 samples/sec Loss 1.4631 Epoch: 10 Global Step: 174000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:50:29,491-[lfw][174000]XNorm: 22.599994 Training: 2021-03-16 19:50:29,491-[lfw][174000]Accuracy-Flip: 0.99800+-0.00287 Training: 2021-03-16 19:50:29,491-[lfw][174000]Accuracy-Highest: 0.99800 Training: 2021-03-16 19:51:31,496-[cfp_fp][174000]XNorm: 21.724595 Training: 2021-03-16 19:51:31,497-[cfp_fp][174000]Accuracy-Flip: 0.98500+-0.00680 Training: 2021-03-16 19:51:31,497-[cfp_fp][174000]Accuracy-Highest: 0.98657 Training: 2021-03-16 19:52:25,785-[agedb_30][174000]XNorm: 23.106625 Training: 2021-03-16 19:52:25,785-[agedb_30][174000]Accuracy-Flip: 0.97850+-0.00765 Training: 2021-03-16 19:52:25,785-[agedb_30][174000]Accuracy-Highest: 0.98017 Training: 2021-03-16 19:52:41,754-Speed 275.93 samples/sec Loss 1.5023 Epoch: 10 Global Step: 174050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:52:58,069-Speed 3138.28 samples/sec Loss 1.4541 Epoch: 10 Global Step: 174100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:53:14,095-Speed 3194.79 samples/sec Loss 1.5043 Epoch: 10 Global Step: 174150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:53:31,258-Speed 2983.30 samples/sec Loss 1.4894 Epoch: 10 Global Step: 174200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:53:47,653-Speed 3122.93 samples/sec Loss 1.4942 Epoch: 10 Global Step: 174250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:54:03,704-Speed 3190.02 samples/sec Loss 1.4814 Epoch: 10 Global Step: 174300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:54:19,966-Speed 3148.50 samples/sec Loss 1.4861 Epoch: 10 Global Step: 174350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:54:36,102-Speed 3173.12 samples/sec Loss 1.4649 Epoch: 10 Global Step: 174400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:54:52,962-Speed 3036.87 samples/sec Loss 1.5070 Epoch: 10 Global Step: 174450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:55:09,279-Speed 3137.93 samples/sec Loss 1.4999 Epoch: 10 Global Step: 174500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:55:25,274-Speed 3201.15 samples/sec Loss 1.4943 Epoch: 10 Global Step: 174550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:55:42,415-Speed 2987.03 samples/sec Loss 1.4873 Epoch: 10 Global Step: 174600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:55:58,597-Speed 3164.09 samples/sec Loss 1.4928 Epoch: 10 Global Step: 174650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:56:14,768-Speed 3166.27 samples/sec Loss 1.4852 Epoch: 10 Global Step: 174700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:56:30,680-Speed 3217.92 samples/sec Loss 1.4984 Epoch: 10 Global Step: 174750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:56:46,758-Speed 3184.62 samples/sec Loss 1.4771 Epoch: 10 Global Step: 174800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:57:03,039-Speed 3144.70 samples/sec Loss 1.4919 Epoch: 10 Global Step: 174850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:57:19,926-Speed 3032.02 samples/sec Loss 1.5044 Epoch: 10 Global Step: 174900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:57:36,246-Speed 3137.51 samples/sec Loss 1.4810 Epoch: 10 Global Step: 174950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:57:52,373-Speed 3174.84 samples/sec Loss 1.4884 Epoch: 10 Global Step: 175000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:58:08,396-Speed 3195.47 samples/sec Loss 1.4881 Epoch: 10 Global Step: 175050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:58:25,256-Speed 3036.90 samples/sec Loss 1.5111 Epoch: 10 Global Step: 175100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:58:41,711-Speed 3111.57 samples/sec Loss 1.4965 Epoch: 10 Global Step: 175150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:58:57,856-Speed 3171.42 samples/sec Loss 1.5077 Epoch: 10 Global Step: 175200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:59:14,219-Speed 3129.01 samples/sec Loss 1.4949 Epoch: 10 Global Step: 175250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:59:30,554-Speed 3134.47 samples/sec Loss 1.5128 Epoch: 10 Global Step: 175300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 19:59:46,543-Speed 3202.25 samples/sec Loss 1.4823 Epoch: 10 Global Step: 175350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:00:02,469-Speed 3214.97 samples/sec Loss 1.4941 Epoch: 10 Global Step: 175400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:00:18,709-Speed 3152.83 samples/sec Loss 1.5301 Epoch: 10 Global Step: 175450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:00:34,775-Speed 3186.92 samples/sec Loss 1.5204 Epoch: 10 Global Step: 175500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:00:51,286-Speed 3101.19 samples/sec Loss 1.4863 Epoch: 10 Global Step: 175550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:01:07,385-Speed 3180.30 samples/sec Loss 1.5088 Epoch: 10 Global Step: 175600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:01:23,728-Speed 3132.99 samples/sec Loss 1.4937 Epoch: 10 Global Step: 175650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:01:39,669-Speed 3211.84 samples/sec Loss 1.4923 Epoch: 10 Global Step: 175700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:01:56,106-Speed 3115.02 samples/sec Loss 1.5058 Epoch: 10 Global Step: 175750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:02:13,372-Speed 2965.51 samples/sec Loss 1.5252 Epoch: 10 Global Step: 175800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:02:29,277-Speed 3219.14 samples/sec Loss 1.5022 Epoch: 10 Global Step: 175850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:02:46,314-Speed 3005.42 samples/sec Loss 1.5197 Epoch: 10 Global Step: 175900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:03:02,330-Speed 3196.82 samples/sec Loss 1.4845 Epoch: 10 Global Step: 175950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:03:18,417-Speed 3182.82 samples/sec Loss 1.5064 Epoch: 10 Global Step: 176000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:04:11,667-[lfw][176000]XNorm: 22.417570 Training: 2021-03-16 20:04:11,668-[lfw][176000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-16 20:04:11,668-[lfw][176000]Accuracy-Highest: 0.99817 Training: 2021-03-16 20:05:13,499-[cfp_fp][176000]XNorm: 21.317308 Training: 2021-03-16 20:05:13,500-[cfp_fp][176000]Accuracy-Flip: 0.98471+-0.00813 Training: 2021-03-16 20:05:13,500-[cfp_fp][176000]Accuracy-Highest: 0.98657 Training: 2021-03-16 20:06:06,885-[agedb_30][176000]XNorm: 22.594206 Training: 2021-03-16 20:06:06,886-[agedb_30][176000]Accuracy-Flip: 0.97900+-0.00629 Training: 2021-03-16 20:06:06,886-[agedb_30][176000]Accuracy-Highest: 0.98017 Training: 2021-03-16 20:06:23,382-Speed 276.81 samples/sec Loss 1.5212 Epoch: 10 Global Step: 176050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:06:39,156-Speed 3245.86 samples/sec Loss 1.4834 Epoch: 10 Global Step: 176100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:06:55,394-Speed 3153.13 samples/sec Loss 1.5073 Epoch: 10 Global Step: 176150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:07:12,079-Speed 3068.87 samples/sec Loss 1.5175 Epoch: 10 Global Step: 176200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:07:28,505-Speed 3117.06 samples/sec Loss 1.5349 Epoch: 10 Global Step: 176250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:07:44,881-Speed 3126.63 samples/sec Loss 1.4933 Epoch: 10 Global Step: 176300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:08:01,742-Speed 3036.70 samples/sec Loss 1.4994 Epoch: 10 Global Step: 176350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:08:17,983-Speed 3152.63 samples/sec Loss 1.5241 Epoch: 10 Global Step: 176400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:08:33,997-Speed 3197.29 samples/sec Loss 1.5155 Epoch: 10 Global Step: 176450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:08:50,211-Speed 3157.86 samples/sec Loss 1.5116 Epoch: 10 Global Step: 176500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:09:06,471-Speed 3148.81 samples/sec Loss 1.5289 Epoch: 10 Global Step: 176550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:09:22,754-Speed 3144.55 samples/sec Loss 1.5237 Epoch: 10 Global Step: 176600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:09:38,659-Speed 3219.26 samples/sec Loss 1.5126 Epoch: 10 Global Step: 176650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:09:55,696-Speed 3005.34 samples/sec Loss 1.5005 Epoch: 10 Global Step: 176700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:10:11,882-Speed 3163.16 samples/sec Loss 1.5172 Epoch: 10 Global Step: 176750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:10:29,235-Speed 2950.55 samples/sec Loss 1.5292 Epoch: 10 Global Step: 176800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:10:45,234-Speed 3200.40 samples/sec Loss 1.5111 Epoch: 10 Global Step: 176850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:11:01,698-Speed 3109.90 samples/sec Loss 1.5128 Epoch: 10 Global Step: 176900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:11:17,803-Speed 3179.23 samples/sec Loss 1.5295 Epoch: 10 Global Step: 176950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:11:33,906-Speed 3179.61 samples/sec Loss 1.5165 Epoch: 10 Global Step: 177000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:11:50,186-Speed 3145.08 samples/sec Loss 1.5187 Epoch: 10 Global Step: 177050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:12:06,581-Speed 3122.95 samples/sec Loss 1.5345 Epoch: 10 Global Step: 177100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:12:23,495-Speed 3027.30 samples/sec Loss 1.5313 Epoch: 10 Global Step: 177150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:12:39,379-Speed 3223.41 samples/sec Loss 1.5235 Epoch: 10 Global Step: 177200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:12:55,377-Speed 3200.38 samples/sec Loss 1.5034 Epoch: 10 Global Step: 177250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:13:12,505-Speed 2989.51 samples/sec Loss 1.5119 Epoch: 10 Global Step: 177300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:13:28,857-Speed 3131.14 samples/sec Loss 1.5440 Epoch: 10 Global Step: 177350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:13:44,851-Speed 3201.18 samples/sec Loss 1.5247 Epoch: 10 Global Step: 177400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:14:00,897-Speed 3190.96 samples/sec Loss 1.5265 Epoch: 10 Global Step: 177450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:14:17,097-Speed 3160.71 samples/sec Loss 1.5350 Epoch: 10 Global Step: 177500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:14:33,334-Speed 3153.25 samples/sec Loss 1.5353 Epoch: 10 Global Step: 177550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:14:49,490-Speed 3169.31 samples/sec Loss 1.5387 Epoch: 10 Global Step: 177600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:15:05,645-Speed 3169.26 samples/sec Loss 1.5590 Epoch: 10 Global Step: 177650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:15:21,628-Speed 3203.63 samples/sec Loss 1.5317 Epoch: 10 Global Step: 177700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:15:37,736-Speed 3178.55 samples/sec Loss 1.5071 Epoch: 10 Global Step: 177750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:15:53,958-Speed 3156.33 samples/sec Loss 1.5347 Epoch: 10 Global Step: 177800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:16:10,047-Speed 3182.50 samples/sec Loss 1.5276 Epoch: 10 Global Step: 177850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:16:26,180-Speed 3173.53 samples/sec Loss 1.5388 Epoch: 10 Global Step: 177900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:16:42,621-Speed 3114.35 samples/sec Loss 1.5369 Epoch: 10 Global Step: 177950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:16:59,568-Speed 3021.23 samples/sec Loss 1.5105 Epoch: 10 Global Step: 178000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:17:52,650-[lfw][178000]XNorm: 22.317445 Training: 2021-03-16 20:17:52,651-[lfw][178000]Accuracy-Flip: 0.99800+-0.00256 Training: 2021-03-16 20:17:52,651-[lfw][178000]Accuracy-Highest: 0.99817 Training: 2021-03-16 20:18:54,466-[cfp_fp][178000]XNorm: 20.822620 Training: 2021-03-16 20:18:54,466-[cfp_fp][178000]Accuracy-Flip: 0.98429+-0.00616 Training: 2021-03-16 20:18:54,466-[cfp_fp][178000]Accuracy-Highest: 0.98657 Training: 2021-03-16 20:19:47,390-[agedb_30][178000]XNorm: 22.707155 Training: 2021-03-16 20:19:47,391-[agedb_30][178000]Accuracy-Flip: 0.97917+-0.00847 Training: 2021-03-16 20:19:47,391-[agedb_30][178000]Accuracy-Highest: 0.98017 Training: 2021-03-16 20:20:03,485-Speed 278.39 samples/sec Loss 1.5321 Epoch: 10 Global Step: 178050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:20:20,615-Speed 2989.08 samples/sec Loss 1.5406 Epoch: 10 Global Step: 178100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:20:37,088-Speed 3108.15 samples/sec Loss 1.5156 Epoch: 10 Global Step: 178150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:20:53,140-Speed 3189.78 samples/sec Loss 1.5245 Epoch: 10 Global Step: 178200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:21:09,286-Speed 3171.02 samples/sec Loss 1.5080 Epoch: 10 Global Step: 178250 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:21:25,487-Speed 3160.51 samples/sec Loss 1.5559 Epoch: 10 Global Step: 178300 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:21:41,525-Speed 3192.39 samples/sec Loss 1.5296 Epoch: 10 Global Step: 178350 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:21:58,584-Speed 3001.46 samples/sec Loss 1.5339 Epoch: 10 Global Step: 178400 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:22:14,698-Speed 3177.55 samples/sec Loss 1.5477 Epoch: 10 Global Step: 178450 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:22:30,727-Speed 3194.29 samples/sec Loss 1.5231 Epoch: 10 Global Step: 178500 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:22:47,731-Speed 3011.08 samples/sec Loss 1.5307 Epoch: 10 Global Step: 178550 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:23:04,029-Speed 3141.68 samples/sec Loss 1.5291 Epoch: 10 Global Step: 178600 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:23:20,193-Speed 3167.54 samples/sec Loss 1.5680 Epoch: 10 Global Step: 178650 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:23:36,487-Speed 3142.37 samples/sec Loss 1.5277 Epoch: 10 Global Step: 178700 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:23:52,744-Speed 3149.62 samples/sec Loss 1.5185 Epoch: 10 Global Step: 178750 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:24:09,274-Speed 3097.42 samples/sec Loss 1.5225 Epoch: 10 Global Step: 178800 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:24:26,091-Speed 3044.73 samples/sec Loss 1.5364 Epoch: 10 Global Step: 178850 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:24:42,581-Speed 3104.96 samples/sec Loss 1.5397 Epoch: 10 Global Step: 178900 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:24:58,997-Speed 3119.01 samples/sec Loss 1.5381 Epoch: 10 Global Step: 178950 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:25:16,457-Speed 2932.44 samples/sec Loss 1.5519 Epoch: 10 Global Step: 179000 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:25:32,982-Speed 3098.55 samples/sec Loss 1.5426 Epoch: 10 Global Step: 179050 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:25:48,995-Speed 3197.50 samples/sec Loss 1.5309 Epoch: 10 Global Step: 179100 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:26:05,309-Speed 3138.53 samples/sec Loss 1.5570 Epoch: 10 Global Step: 179150 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:26:21,468-Speed 3168.62 samples/sec Loss 1.5519 Epoch: 10 Global Step: 179200 Fp16 Grad Scale: 16384 Required: 18 hours Training: 2021-03-16 20:26:37,571-Speed 3179.61 samples/sec Loss 1.5367 Epoch: 10 Global Step: 179250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:26:53,602-Speed 3193.76 samples/sec Loss 1.5304 Epoch: 10 Global Step: 179300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:27:10,482-Speed 3033.45 samples/sec Loss 1.5347 Epoch: 10 Global Step: 179350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:27:26,522-Speed 3191.97 samples/sec Loss 1.5478 Epoch: 10 Global Step: 179400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:27:42,708-Speed 3163.30 samples/sec Loss 1.5466 Epoch: 10 Global Step: 179450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:27:59,835-Speed 2989.52 samples/sec Loss 1.5171 Epoch: 10 Global Step: 179500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:28:15,885-Speed 3190.15 samples/sec Loss 1.5513 Epoch: 10 Global Step: 179550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:28:32,094-Speed 3158.76 samples/sec Loss 1.5202 Epoch: 10 Global Step: 179600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:28:48,248-Speed 3169.72 samples/sec Loss 1.5542 Epoch: 10 Global Step: 179650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:29:04,738-Speed 3105.02 samples/sec Loss 1.5635 Epoch: 10 Global Step: 179700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:29:20,842-Speed 3179.36 samples/sec Loss 1.5618 Epoch: 10 Global Step: 179750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:29:36,976-Speed 3173.53 samples/sec Loss 1.5550 Epoch: 10 Global Step: 179800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:29:53,142-Speed 3167.30 samples/sec Loss 1.5717 Epoch: 10 Global Step: 179850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:30:09,132-Speed 3202.03 samples/sec Loss 1.5531 Epoch: 10 Global Step: 179900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:30:25,590-Speed 3110.99 samples/sec Loss 1.5631 Epoch: 10 Global Step: 179950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:30:42,059-Speed 3109.08 samples/sec Loss 1.5705 Epoch: 10 Global Step: 180000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:31:35,265-[lfw][180000]XNorm: 23.839899 Training: 2021-03-16 20:31:35,265-[lfw][180000]Accuracy-Flip: 0.99750+-0.00261 Training: 2021-03-16 20:31:35,265-[lfw][180000]Accuracy-Highest: 0.99817 Training: 2021-03-16 20:32:37,003-[cfp_fp][180000]XNorm: 22.344888 Training: 2021-03-16 20:32:37,003-[cfp_fp][180000]Accuracy-Flip: 0.98543+-0.00556 Training: 2021-03-16 20:32:37,004-[cfp_fp][180000]Accuracy-Highest: 0.98657 Training: 2021-03-16 20:33:30,982-[agedb_30][180000]XNorm: 23.915926 Training: 2021-03-16 20:33:30,982-[agedb_30][180000]Accuracy-Flip: 0.97883+-0.00715 Training: 2021-03-16 20:33:30,982-[agedb_30][180000]Accuracy-Highest: 0.98017 Training: 2021-03-16 20:33:47,129-Speed 276.65 samples/sec Loss 1.5598 Epoch: 10 Global Step: 180050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:34:03,211-Speed 3183.80 samples/sec Loss 1.5575 Epoch: 10 Global Step: 180100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:34:20,208-Speed 3012.43 samples/sec Loss 1.5455 Epoch: 10 Global Step: 180150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:34:36,397-Speed 3162.68 samples/sec Loss 1.5437 Epoch: 10 Global Step: 180200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:34:53,174-Speed 3051.80 samples/sec Loss 1.5481 Epoch: 10 Global Step: 180250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:35:09,297-Speed 3175.77 samples/sec Loss 1.5404 Epoch: 10 Global Step: 180300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:35:25,417-Speed 3176.24 samples/sec Loss 1.5556 Epoch: 10 Global Step: 180350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:35:41,451-Speed 3193.43 samples/sec Loss 1.5400 Epoch: 10 Global Step: 180400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:35:57,510-Speed 3188.36 samples/sec Loss 1.5441 Epoch: 10 Global Step: 180450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:36:13,572-Speed 3187.69 samples/sec Loss 1.5602 Epoch: 10 Global Step: 180500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:36:30,561-Speed 3013.74 samples/sec Loss 1.5388 Epoch: 10 Global Step: 180550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:36:46,813-Speed 3150.58 samples/sec Loss 1.5534 Epoch: 10 Global Step: 180600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:37:02,898-Speed 3183.06 samples/sec Loss 1.5631 Epoch: 10 Global Step: 180650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:37:19,054-Speed 3169.18 samples/sec Loss 1.5429 Epoch: 10 Global Step: 180700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:37:35,207-Speed 3169.84 samples/sec Loss 1.5733 Epoch: 10 Global Step: 180750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:37:51,342-Speed 3173.32 samples/sec Loss 1.5501 Epoch: 10 Global Step: 180800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:38:08,552-Speed 2975.05 samples/sec Loss 1.5381 Epoch: 10 Global Step: 180850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:38:24,809-Speed 3149.64 samples/sec Loss 1.5642 Epoch: 10 Global Step: 180900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:38:41,253-Speed 3113.65 samples/sec Loss 1.5772 Epoch: 10 Global Step: 180950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:38:57,423-Speed 3166.39 samples/sec Loss 1.5768 Epoch: 10 Global Step: 181000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:39:13,528-Speed 3179.35 samples/sec Loss 1.5587 Epoch: 10 Global Step: 181050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:39:30,249-Speed 3062.03 samples/sec Loss 1.5451 Epoch: 10 Global Step: 181100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:39:47,479-Speed 2971.64 samples/sec Loss 1.5454 Epoch: 10 Global Step: 181150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:40:03,523-Speed 3191.29 samples/sec Loss 1.5165 Epoch: 10 Global Step: 181200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:40:20,501-Speed 3015.81 samples/sec Loss 1.5754 Epoch: 10 Global Step: 181250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:40:36,917-Speed 3118.92 samples/sec Loss 1.5561 Epoch: 10 Global Step: 181300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:40:53,073-Speed 3169.19 samples/sec Loss 1.5684 Epoch: 10 Global Step: 181350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:41:09,401-Speed 3135.92 samples/sec Loss 1.5570 Epoch: 10 Global Step: 181400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:41:25,830-Speed 3116.48 samples/sec Loss 1.5440 Epoch: 10 Global Step: 181450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:41:43,046-Speed 2974.10 samples/sec Loss 1.5632 Epoch: 10 Global Step: 181500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:41:59,179-Speed 3173.68 samples/sec Loss 1.5638 Epoch: 10 Global Step: 181550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:42:15,701-Speed 3098.91 samples/sec Loss 1.5608 Epoch: 10 Global Step: 181600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:42:32,134-Speed 3115.88 samples/sec Loss 1.5165 Epoch: 10 Global Step: 181650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:42:48,524-Speed 3123.96 samples/sec Loss 1.5732 Epoch: 10 Global Step: 181700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:43:05,440-Speed 3026.81 samples/sec Loss 1.5540 Epoch: 10 Global Step: 181750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:43:21,677-Speed 3153.33 samples/sec Loss 1.5658 Epoch: 10 Global Step: 181800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:43:38,027-Speed 3131.61 samples/sec Loss 1.5416 Epoch: 10 Global Step: 181850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:43:54,452-Speed 3117.24 samples/sec Loss 1.5498 Epoch: 10 Global Step: 181900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:44:10,534-Speed 3183.88 samples/sec Loss 1.5387 Epoch: 10 Global Step: 181950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:44:26,637-Speed 3179.58 samples/sec Loss 1.5351 Epoch: 10 Global Step: 182000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:45:19,707-[lfw][182000]XNorm: 21.301658 Training: 2021-03-16 20:45:19,707-[lfw][182000]Accuracy-Flip: 0.99783+-0.00279 Training: 2021-03-16 20:45:19,707-[lfw][182000]Accuracy-Highest: 0.99817 Training: 2021-03-16 20:46:21,566-[cfp_fp][182000]XNorm: 20.327853 Training: 2021-03-16 20:46:21,567-[cfp_fp][182000]Accuracy-Flip: 0.98386+-0.00655 Training: 2021-03-16 20:46:21,567-[cfp_fp][182000]Accuracy-Highest: 0.98657 Training: 2021-03-16 20:47:14,938-[agedb_30][182000]XNorm: 21.678610 Training: 2021-03-16 20:47:14,938-[agedb_30][182000]Accuracy-Flip: 0.97933+-0.00680 Training: 2021-03-16 20:47:14,938-[agedb_30][182000]Accuracy-Highest: 0.98017 Training: 2021-03-16 20:47:31,231-Speed 277.37 samples/sec Loss 1.5698 Epoch: 10 Global Step: 182050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:47:47,373-Speed 3172.09 samples/sec Loss 1.5698 Epoch: 10 Global Step: 182100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:48:03,398-Speed 3194.94 samples/sec Loss 1.5713 Epoch: 10 Global Step: 182150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:48:19,819-Speed 3118.14 samples/sec Loss 1.5598 Epoch: 10 Global Step: 182200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:48:36,106-Speed 3143.75 samples/sec Loss 1.5627 Epoch: 10 Global Step: 182250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:48:52,145-Speed 3192.40 samples/sec Loss 1.5721 Epoch: 10 Global Step: 182300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:49:09,264-Speed 2990.79 samples/sec Loss 1.5846 Epoch: 10 Global Step: 182350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:49:26,308-Speed 3004.05 samples/sec Loss 1.5573 Epoch: 10 Global Step: 182400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:49:42,376-Speed 3186.55 samples/sec Loss 1.5433 Epoch: 10 Global Step: 182450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:49:58,583-Speed 3159.41 samples/sec Loss 1.5685 Epoch: 10 Global Step: 182500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:50:15,109-Speed 3098.23 samples/sec Loss 1.5433 Epoch: 10 Global Step: 182550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:50:31,249-Speed 3172.30 samples/sec Loss 1.5657 Epoch: 10 Global Step: 182600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:50:47,447-Speed 3160.90 samples/sec Loss 1.5728 Epoch: 10 Global Step: 182650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:51:03,486-Speed 3192.38 samples/sec Loss 1.5373 Epoch: 10 Global Step: 182700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:51:20,382-Speed 3030.43 samples/sec Loss 1.5843 Epoch: 10 Global Step: 182750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:51:36,584-Speed 3160.20 samples/sec Loss 1.5689 Epoch: 10 Global Step: 182800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:51:52,959-Speed 3126.66 samples/sec Loss 1.5630 Epoch: 10 Global Step: 182850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:52:09,098-Speed 3172.66 samples/sec Loss 1.5643 Epoch: 10 Global Step: 182900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:52:25,549-Speed 3112.31 samples/sec Loss 1.5535 Epoch: 10 Global Step: 182950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:52:41,920-Speed 3127.50 samples/sec Loss 1.5511 Epoch: 10 Global Step: 183000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:52:59,196-Speed 2963.75 samples/sec Loss 1.5671 Epoch: 10 Global Step: 183050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:53:15,346-Speed 3170.34 samples/sec Loss 1.5713 Epoch: 10 Global Step: 183100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:53:31,508-Speed 3168.14 samples/sec Loss 1.5761 Epoch: 10 Global Step: 183150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:53:47,897-Speed 3124.11 samples/sec Loss 1.5780 Epoch: 10 Global Step: 183200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:54:03,974-Speed 3184.80 samples/sec Loss 1.5810 Epoch: 10 Global Step: 183250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:54:21,009-Speed 3005.69 samples/sec Loss 1.5726 Epoch: 10 Global Step: 183300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:54:37,332-Speed 3136.79 samples/sec Loss 1.5661 Epoch: 10 Global Step: 183350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:54:54,359-Speed 3007.11 samples/sec Loss 1.5522 Epoch: 10 Global Step: 183400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:55:10,512-Speed 3169.74 samples/sec Loss 1.5540 Epoch: 10 Global Step: 183450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:55:26,996-Speed 3106.14 samples/sec Loss 1.5640 Epoch: 10 Global Step: 183500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:55:43,428-Speed 3116.10 samples/sec Loss 1.5792 Epoch: 10 Global Step: 183550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:56:00,186-Speed 3055.21 samples/sec Loss 1.5438 Epoch: 10 Global Step: 183600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:56:30,559-Speed 1685.78 samples/sec Loss 1.1503 Epoch: 11 Global Step: 183650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:56:46,296-Speed 3253.54 samples/sec Loss 1.0398 Epoch: 11 Global Step: 183700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:57:02,249-Speed 3209.61 samples/sec Loss 1.0098 Epoch: 11 Global Step: 183750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:57:18,023-Speed 3245.89 samples/sec Loss 0.9927 Epoch: 11 Global Step: 183800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:57:33,826-Speed 3239.89 samples/sec Loss 0.9503 Epoch: 11 Global Step: 183850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:57:50,439-Speed 3082.07 samples/sec Loss 0.9493 Epoch: 11 Global Step: 183900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:58:06,227-Speed 3243.05 samples/sec Loss 0.9240 Epoch: 11 Global Step: 183950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:58:22,102-Speed 3225.39 samples/sec Loss 0.9242 Epoch: 11 Global Step: 184000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 20:59:15,196-[lfw][184000]XNorm: 22.382132 Training: 2021-03-16 20:59:15,196-[lfw][184000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-16 20:59:15,196-[lfw][184000]Accuracy-Highest: 0.99817 Training: 2021-03-16 21:00:17,131-[cfp_fp][184000]XNorm: 21.504406 Training: 2021-03-16 21:00:17,132-[cfp_fp][184000]Accuracy-Flip: 0.98786+-0.00488 Training: 2021-03-16 21:00:17,132-[cfp_fp][184000]Accuracy-Highest: 0.98786 Training: 2021-03-16 21:01:10,315-[agedb_30][184000]XNorm: 22.534973 Training: 2021-03-16 21:01:10,315-[agedb_30][184000]Accuracy-Flip: 0.98217+-0.00764 Training: 2021-03-16 21:01:10,315-[agedb_30][184000]Accuracy-Highest: 0.98217 Training: 2021-03-16 21:01:26,597-Speed 277.51 samples/sec Loss 0.9164 Epoch: 11 Global Step: 184050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:01:42,312-Speed 3258.13 samples/sec Loss 0.9158 Epoch: 11 Global Step: 184100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:01:58,112-Speed 3240.74 samples/sec Loss 0.9083 Epoch: 11 Global Step: 184150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:02:14,143-Speed 3193.84 samples/sec Loss 0.8894 Epoch: 11 Global Step: 184200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:02:30,015-Speed 3225.88 samples/sec Loss 0.8822 Epoch: 11 Global Step: 184250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:02:45,841-Speed 3235.31 samples/sec Loss 0.8702 Epoch: 11 Global Step: 184300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:03:01,852-Speed 3197.96 samples/sec Loss 0.8735 Epoch: 11 Global Step: 184350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:03:17,622-Speed 3246.82 samples/sec Loss 0.8606 Epoch: 11 Global Step: 184400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:03:33,588-Speed 3206.74 samples/sec Loss 0.8688 Epoch: 11 Global Step: 184450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:03:50,163-Speed 3089.18 samples/sec Loss 0.8681 Epoch: 11 Global Step: 184500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:04:05,837-Speed 3266.68 samples/sec Loss 0.8354 Epoch: 11 Global Step: 184550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:04:22,467-Speed 3078.91 samples/sec Loss 0.8599 Epoch: 11 Global Step: 184600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:04:38,143-Speed 3266.12 samples/sec Loss 0.8302 Epoch: 11 Global Step: 184650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:04:53,992-Speed 3230.62 samples/sec Loss 0.8410 Epoch: 11 Global Step: 184700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:05:09,762-Speed 3246.80 samples/sec Loss 0.8529 Epoch: 11 Global Step: 184750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:05:25,787-Speed 3195.15 samples/sec Loss 0.8216 Epoch: 11 Global Step: 184800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:05:42,100-Speed 3138.65 samples/sec Loss 0.8362 Epoch: 11 Global Step: 184850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:05:59,028-Speed 3024.76 samples/sec Loss 0.8331 Epoch: 11 Global Step: 184900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:06:14,916-Speed 3222.63 samples/sec Loss 0.8348 Epoch: 11 Global Step: 184950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:06:30,761-Speed 3231.41 samples/sec Loss 0.8396 Epoch: 11 Global Step: 185000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:06:46,654-Speed 3221.51 samples/sec Loss 0.8243 Epoch: 11 Global Step: 185050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:07:02,393-Speed 3253.17 samples/sec Loss 0.8240 Epoch: 11 Global Step: 185100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:07:18,189-Speed 3241.61 samples/sec Loss 0.8220 Epoch: 11 Global Step: 185150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:07:34,856-Speed 3071.92 samples/sec Loss 0.8228 Epoch: 11 Global Step: 185200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:07:50,636-Speed 3244.75 samples/sec Loss 0.8261 Epoch: 11 Global Step: 185250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:08:06,883-Speed 3151.44 samples/sec Loss 0.8092 Epoch: 11 Global Step: 185300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:08:22,703-Speed 3236.50 samples/sec Loss 0.8123 Epoch: 11 Global Step: 185350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:08:38,447-Speed 3252.17 samples/sec Loss 0.7911 Epoch: 11 Global Step: 185400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:08:54,538-Speed 3181.95 samples/sec Loss 0.7957 Epoch: 11 Global Step: 185450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:09:11,480-Speed 3022.18 samples/sec Loss 0.7966 Epoch: 11 Global Step: 185500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:09:28,696-Speed 2974.14 samples/sec Loss 0.8032 Epoch: 11 Global Step: 185550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:09:44,619-Speed 3215.58 samples/sec Loss 0.8149 Epoch: 11 Global Step: 185600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:10:00,489-Speed 3226.27 samples/sec Loss 0.8017 Epoch: 11 Global Step: 185650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:10:16,194-Speed 3260.11 samples/sec Loss 0.8032 Epoch: 11 Global Step: 185700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:10:31,885-Speed 3263.14 samples/sec Loss 0.7932 Epoch: 11 Global Step: 185750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:10:47,768-Speed 3223.64 samples/sec Loss 0.7844 Epoch: 11 Global Step: 185800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:11:04,667-Speed 3029.85 samples/sec Loss 0.7970 Epoch: 11 Global Step: 185850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:11:20,375-Speed 3259.75 samples/sec Loss 0.7986 Epoch: 11 Global Step: 185900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:11:36,332-Speed 3208.58 samples/sec Loss 0.7919 Epoch: 11 Global Step: 185950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:11:52,208-Speed 3225.23 samples/sec Loss 0.7857 Epoch: 11 Global Step: 186000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:12:45,555-[lfw][186000]XNorm: 22.182693 Training: 2021-03-16 21:12:45,555-[lfw][186000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 21:12:45,555-[lfw][186000]Accuracy-Highest: 0.99817 Training: 2021-03-16 21:13:47,376-[cfp_fp][186000]XNorm: 21.824832 Training: 2021-03-16 21:13:47,376-[cfp_fp][186000]Accuracy-Flip: 0.99086+-0.00439 Training: 2021-03-16 21:13:47,376-[cfp_fp][186000]Accuracy-Highest: 0.99086 Training: 2021-03-16 21:14:40,802-[agedb_30][186000]XNorm: 22.689192 Training: 2021-03-16 21:14:40,802-[agedb_30][186000]Accuracy-Flip: 0.98367+-0.00767 Training: 2021-03-16 21:14:40,802-[agedb_30][186000]Accuracy-Highest: 0.98367 Training: 2021-03-16 21:14:57,624-Speed 276.14 samples/sec Loss 0.8048 Epoch: 11 Global Step: 186050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:15:13,386-Speed 3248.44 samples/sec Loss 0.7847 Epoch: 11 Global Step: 186100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:15:29,372-Speed 3202.90 samples/sec Loss 0.7948 Epoch: 11 Global Step: 186150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:15:45,290-Speed 3216.70 samples/sec Loss 0.7930 Epoch: 11 Global Step: 186200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:16:01,237-Speed 3210.72 samples/sec Loss 0.7815 Epoch: 11 Global Step: 186250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:16:17,329-Speed 3181.75 samples/sec Loss 0.7698 Epoch: 11 Global Step: 186300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:16:33,060-Speed 3254.89 samples/sec Loss 0.7817 Epoch: 11 Global Step: 186350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:16:48,813-Speed 3250.32 samples/sec Loss 0.7770 Epoch: 11 Global Step: 186400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:17:04,661-Speed 3230.61 samples/sec Loss 0.7613 Epoch: 11 Global Step: 186450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:17:20,408-Speed 3251.67 samples/sec Loss 0.7926 Epoch: 11 Global Step: 186500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:17:36,167-Speed 3248.95 samples/sec Loss 0.7765 Epoch: 11 Global Step: 186550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:17:52,111-Speed 3211.27 samples/sec Loss 0.7535 Epoch: 11 Global Step: 186600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:18:07,809-Speed 3261.78 samples/sec Loss 0.7641 Epoch: 11 Global Step: 186650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:18:24,348-Speed 3095.80 samples/sec Loss 0.7781 Epoch: 11 Global Step: 186700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:18:40,112-Speed 3248.03 samples/sec Loss 0.7836 Epoch: 11 Global Step: 186750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:18:55,986-Speed 3225.49 samples/sec Loss 0.7801 Epoch: 11 Global Step: 186800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:19:12,454-Speed 3109.21 samples/sec Loss 0.7729 Epoch: 11 Global Step: 186850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:19:28,224-Speed 3246.58 samples/sec Loss 0.7610 Epoch: 11 Global Step: 186900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:19:44,525-Speed 3141.18 samples/sec Loss 0.7646 Epoch: 11 Global Step: 186950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:20:00,241-Speed 3257.85 samples/sec Loss 0.7664 Epoch: 11 Global Step: 187000 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:20:17,317-Speed 2998.37 samples/sec Loss 0.7636 Epoch: 11 Global Step: 187050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:20:33,541-Speed 3155.87 samples/sec Loss 0.7622 Epoch: 11 Global Step: 187100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:20:49,366-Speed 3235.66 samples/sec Loss 0.7459 Epoch: 11 Global Step: 187150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:21:05,270-Speed 3219.34 samples/sec Loss 0.7585 Epoch: 11 Global Step: 187200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:21:21,028-Speed 3249.27 samples/sec Loss 0.7468 Epoch: 11 Global Step: 187250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:21:36,962-Speed 3213.25 samples/sec Loss 0.7532 Epoch: 11 Global Step: 187300 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:21:52,935-Speed 3205.57 samples/sec Loss 0.7539 Epoch: 11 Global Step: 187350 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:22:08,712-Speed 3245.30 samples/sec Loss 0.7576 Epoch: 11 Global Step: 187400 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:22:25,282-Speed 3090.05 samples/sec Loss 0.7551 Epoch: 11 Global Step: 187450 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:22:40,961-Speed 3265.63 samples/sec Loss 0.7436 Epoch: 11 Global Step: 187500 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:22:56,709-Speed 3251.22 samples/sec Loss 0.7552 Epoch: 11 Global Step: 187550 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:23:12,612-Speed 3219.56 samples/sec Loss 0.7530 Epoch: 11 Global Step: 187600 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:23:28,389-Speed 3245.45 samples/sec Loss 0.7464 Epoch: 11 Global Step: 187650 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:23:44,954-Speed 3090.97 samples/sec Loss 0.7370 Epoch: 11 Global Step: 187700 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:24:01,563-Speed 3082.77 samples/sec Loss 0.7427 Epoch: 11 Global Step: 187750 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:24:17,357-Speed 3241.82 samples/sec Loss 0.7517 Epoch: 11 Global Step: 187800 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:24:33,110-Speed 3250.13 samples/sec Loss 0.7388 Epoch: 11 Global Step: 187850 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:24:48,842-Speed 3254.80 samples/sec Loss 0.7522 Epoch: 11 Global Step: 187900 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:25:04,528-Speed 3264.15 samples/sec Loss 0.7457 Epoch: 11 Global Step: 187950 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:25:20,345-Speed 3236.99 samples/sec Loss 0.7478 Epoch: 11 Global Step: 188000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:26:13,245-[lfw][188000]XNorm: 22.228471 Training: 2021-03-16 21:26:13,245-[lfw][188000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-16 21:26:13,245-[lfw][188000]Accuracy-Highest: 0.99817 Training: 2021-03-16 21:27:16,038-[cfp_fp][188000]XNorm: 21.880526 Training: 2021-03-16 21:27:16,038-[cfp_fp][188000]Accuracy-Flip: 0.99114+-0.00522 Training: 2021-03-16 21:27:16,038-[cfp_fp][188000]Accuracy-Highest: 0.99114 Training: 2021-03-16 21:28:09,136-[agedb_30][188000]XNorm: 22.689111 Training: 2021-03-16 21:28:09,136-[agedb_30][188000]Accuracy-Flip: 0.98367+-0.00710 Training: 2021-03-16 21:28:09,137-[agedb_30][188000]Accuracy-Highest: 0.98367 Training: 2021-03-16 21:28:24,762-Speed 277.63 samples/sec Loss 0.7487 Epoch: 11 Global Step: 188050 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:28:40,758-Speed 3200.98 samples/sec Loss 0.7538 Epoch: 11 Global Step: 188100 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:28:57,780-Speed 3007.86 samples/sec Loss 0.7557 Epoch: 11 Global Step: 188150 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:29:13,844-Speed 3187.53 samples/sec Loss 0.7482 Epoch: 11 Global Step: 188200 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:29:29,612-Speed 3247.02 samples/sec Loss 0.7433 Epoch: 11 Global Step: 188250 Fp16 Grad Scale: 16384 Required: 17 hours Training: 2021-03-16 21:29:45,304-Speed 3262.87 samples/sec Loss 0.7400 Epoch: 11 Global Step: 188300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:30:02,060-Speed 3055.82 samples/sec Loss 0.7296 Epoch: 11 Global Step: 188350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:30:17,957-Speed 3220.72 samples/sec Loss 0.7353 Epoch: 11 Global Step: 188400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:30:33,850-Speed 3221.78 samples/sec Loss 0.7273 Epoch: 11 Global Step: 188450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:30:49,521-Speed 3267.25 samples/sec Loss 0.7201 Epoch: 11 Global Step: 188500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:31:05,212-Speed 3263.13 samples/sec Loss 0.7386 Epoch: 11 Global Step: 188550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:31:21,059-Speed 3231.00 samples/sec Loss 0.7257 Epoch: 11 Global Step: 188600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:31:36,814-Speed 3249.89 samples/sec Loss 0.7361 Epoch: 11 Global Step: 188650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:31:52,621-Speed 3239.21 samples/sec Loss 0.7141 Epoch: 11 Global Step: 188700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:32:08,573-Speed 3209.63 samples/sec Loss 0.7347 Epoch: 11 Global Step: 188750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:32:24,480-Speed 3218.85 samples/sec Loss 0.7425 Epoch: 11 Global Step: 188800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:32:40,335-Speed 3229.27 samples/sec Loss 0.7385 Epoch: 11 Global Step: 188850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:32:56,990-Speed 3074.35 samples/sec Loss 0.7337 Epoch: 11 Global Step: 188900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:33:12,924-Speed 3213.30 samples/sec Loss 0.7427 Epoch: 11 Global Step: 188950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:33:28,762-Speed 3232.79 samples/sec Loss 0.7255 Epoch: 11 Global Step: 189000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:33:45,789-Speed 3007.12 samples/sec Loss 0.7275 Epoch: 11 Global Step: 189050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:34:01,613-Speed 3235.75 samples/sec Loss 0.7237 Epoch: 11 Global Step: 189100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:34:17,475-Speed 3227.96 samples/sec Loss 0.7305 Epoch: 11 Global Step: 189150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:34:34,704-Speed 2971.80 samples/sec Loss 0.7285 Epoch: 11 Global Step: 189200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:34:50,425-Speed 3256.86 samples/sec Loss 0.7478 Epoch: 11 Global Step: 189250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:35:06,147-Speed 3256.76 samples/sec Loss 0.7356 Epoch: 11 Global Step: 189300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:35:21,995-Speed 3230.72 samples/sec Loss 0.7235 Epoch: 11 Global Step: 189350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:35:37,817-Speed 3236.04 samples/sec Loss 0.7267 Epoch: 11 Global Step: 189400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:35:53,850-Speed 3193.60 samples/sec Loss 0.7237 Epoch: 11 Global Step: 189450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:36:10,016-Speed 3167.25 samples/sec Loss 0.7347 Epoch: 11 Global Step: 189500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:36:25,992-Speed 3204.90 samples/sec Loss 0.7452 Epoch: 11 Global Step: 189550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:36:41,776-Speed 3243.85 samples/sec Loss 0.7366 Epoch: 11 Global Step: 189600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:36:57,467-Speed 3263.13 samples/sec Loss 0.7125 Epoch: 11 Global Step: 189650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:37:14,134-Speed 3072.01 samples/sec Loss 0.7203 Epoch: 11 Global Step: 189700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:37:29,912-Speed 3245.08 samples/sec Loss 0.7239 Epoch: 11 Global Step: 189750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:37:45,956-Speed 3191.38 samples/sec Loss 0.7116 Epoch: 11 Global Step: 189800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:38:01,737-Speed 3244.54 samples/sec Loss 0.7216 Epoch: 11 Global Step: 189850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:38:18,335-Speed 3084.84 samples/sec Loss 0.7196 Epoch: 11 Global Step: 189900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:38:35,059-Speed 3061.51 samples/sec Loss 0.7187 Epoch: 11 Global Step: 189950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:38:51,152-Speed 3181.65 samples/sec Loss 0.7267 Epoch: 11 Global Step: 190000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:39:44,367-[lfw][190000]XNorm: 22.427373 Training: 2021-03-16 21:39:44,368-[lfw][190000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 21:39:44,368-[lfw][190000]Accuracy-Highest: 0.99817 Training: 2021-03-16 21:40:46,655-[cfp_fp][190000]XNorm: 22.170092 Training: 2021-03-16 21:40:46,656-[cfp_fp][190000]Accuracy-Flip: 0.99157+-0.00435 Training: 2021-03-16 21:40:46,656-[cfp_fp][190000]Accuracy-Highest: 0.99157 Training: 2021-03-16 21:41:39,804-[agedb_30][190000]XNorm: 23.060443 Training: 2021-03-16 21:41:39,804-[agedb_30][190000]Accuracy-Flip: 0.98433+-0.00793 Training: 2021-03-16 21:41:39,804-[agedb_30][190000]Accuracy-Highest: 0.98433 Training: 2021-03-16 21:41:55,591-Speed 277.60 samples/sec Loss 0.7299 Epoch: 11 Global Step: 190050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:42:11,465-Speed 3225.49 samples/sec Loss 0.7227 Epoch: 11 Global Step: 190100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:42:27,296-Speed 3234.24 samples/sec Loss 0.7112 Epoch: 11 Global Step: 190150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:42:43,186-Speed 3222.34 samples/sec Loss 0.7106 Epoch: 11 Global Step: 190200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:42:59,476-Speed 3143.17 samples/sec Loss 0.7279 Epoch: 11 Global Step: 190250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:43:15,497-Speed 3195.83 samples/sec Loss 0.7179 Epoch: 11 Global Step: 190300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:43:31,139-Speed 3273.29 samples/sec Loss 0.7385 Epoch: 11 Global Step: 190350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:43:47,061-Speed 3215.90 samples/sec Loss 0.7265 Epoch: 11 Global Step: 190400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:44:03,892-Speed 3042.02 samples/sec Loss 0.7146 Epoch: 11 Global Step: 190450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:44:20,621-Speed 3060.79 samples/sec Loss 0.7081 Epoch: 11 Global Step: 190500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:44:36,370-Speed 3251.14 samples/sec Loss 0.7119 Epoch: 11 Global Step: 190550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:44:52,085-Speed 3258.08 samples/sec Loss 0.6956 Epoch: 11 Global Step: 190600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:45:07,733-Speed 3272.01 samples/sec Loss 0.7200 Epoch: 11 Global Step: 190650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:45:23,566-Speed 3233.93 samples/sec Loss 0.7162 Epoch: 11 Global Step: 190700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:45:39,375-Speed 3238.77 samples/sec Loss 0.7226 Epoch: 11 Global Step: 190750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:45:55,196-Speed 3236.31 samples/sec Loss 0.7092 Epoch: 11 Global Step: 190800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:46:11,144-Speed 3210.46 samples/sec Loss 0.7078 Epoch: 11 Global Step: 190850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:46:27,091-Speed 3210.71 samples/sec Loss 0.7147 Epoch: 11 Global Step: 190900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:46:42,993-Speed 3219.80 samples/sec Loss 0.7155 Epoch: 11 Global Step: 190950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:46:58,855-Speed 3227.96 samples/sec Loss 0.7139 Epoch: 11 Global Step: 191000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:47:14,831-Speed 3204.99 samples/sec Loss 0.7162 Epoch: 11 Global Step: 191050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:47:31,665-Speed 3041.48 samples/sec Loss 0.7040 Epoch: 11 Global Step: 191100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:47:47,659-Speed 3201.43 samples/sec Loss 0.7124 Epoch: 11 Global Step: 191150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:48:03,352-Speed 3262.70 samples/sec Loss 0.7060 Epoch: 11 Global Step: 191200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:48:19,020-Speed 3267.99 samples/sec Loss 0.6980 Epoch: 11 Global Step: 191250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:48:35,708-Speed 3068.01 samples/sec Loss 0.7025 Epoch: 11 Global Step: 191300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:48:51,399-Speed 3263.18 samples/sec Loss 0.7019 Epoch: 11 Global Step: 191350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:49:07,947-Speed 3094.16 samples/sec Loss 0.7257 Epoch: 11 Global Step: 191400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:49:23,684-Speed 3253.54 samples/sec Loss 0.7255 Epoch: 11 Global Step: 191450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:49:39,521-Speed 3233.04 samples/sec Loss 0.7032 Epoch: 11 Global Step: 191500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:49:55,635-Speed 3177.58 samples/sec Loss 0.7044 Epoch: 11 Global Step: 191550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:50:11,626-Speed 3201.85 samples/sec Loss 0.7126 Epoch: 11 Global Step: 191600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:50:27,498-Speed 3225.75 samples/sec Loss 0.7015 Epoch: 11 Global Step: 191650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:50:43,277-Speed 3244.94 samples/sec Loss 0.6987 Epoch: 11 Global Step: 191700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:50:59,006-Speed 3255.25 samples/sec Loss 0.7109 Epoch: 11 Global Step: 191750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:51:14,698-Speed 3263.00 samples/sec Loss 0.7079 Epoch: 11 Global Step: 191800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:51:31,279-Speed 3087.99 samples/sec Loss 0.7161 Epoch: 11 Global Step: 191850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:51:47,039-Speed 3248.68 samples/sec Loss 0.7020 Epoch: 11 Global Step: 191900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:52:02,837-Speed 3241.11 samples/sec Loss 0.7065 Epoch: 11 Global Step: 191950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:52:18,895-Speed 3188.58 samples/sec Loss 0.7064 Epoch: 11 Global Step: 192000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:53:12,049-[lfw][192000]XNorm: 22.455447 Training: 2021-03-16 21:53:12,049-[lfw][192000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 21:53:12,049-[lfw][192000]Accuracy-Highest: 0.99817 Training: 2021-03-16 21:54:14,375-[cfp_fp][192000]XNorm: 22.293139 Training: 2021-03-16 21:54:14,375-[cfp_fp][192000]Accuracy-Flip: 0.99071+-0.00543 Training: 2021-03-16 21:54:14,375-[cfp_fp][192000]Accuracy-Highest: 0.99157 Training: 2021-03-16 21:55:07,360-[agedb_30][192000]XNorm: 23.127723 Training: 2021-03-16 21:55:07,360-[agedb_30][192000]Accuracy-Flip: 0.98317+-0.00693 Training: 2021-03-16 21:55:07,360-[agedb_30][192000]Accuracy-Highest: 0.98433 Training: 2021-03-16 21:55:23,050-Speed 278.03 samples/sec Loss 0.7050 Epoch: 11 Global Step: 192050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:55:39,473-Speed 3117.64 samples/sec Loss 0.6984 Epoch: 11 Global Step: 192100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:55:56,294-Speed 3043.94 samples/sec Loss 0.7165 Epoch: 11 Global Step: 192150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:56:12,394-Speed 3180.15 samples/sec Loss 0.7130 Epoch: 11 Global Step: 192200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:56:28,226-Speed 3234.23 samples/sec Loss 0.7065 Epoch: 11 Global Step: 192250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:56:44,001-Speed 3245.71 samples/sec Loss 0.7108 Epoch: 11 Global Step: 192300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:57:00,185-Speed 3163.61 samples/sec Loss 0.7167 Epoch: 11 Global Step: 192350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:57:15,867-Speed 3265.16 samples/sec Loss 0.6910 Epoch: 11 Global Step: 192400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:57:31,544-Speed 3265.95 samples/sec Loss 0.6949 Epoch: 11 Global Step: 192450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:57:47,383-Speed 3232.63 samples/sec Loss 0.7015 Epoch: 11 Global Step: 192500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:58:03,545-Speed 3168.10 samples/sec Loss 0.7054 Epoch: 11 Global Step: 192550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:58:20,180-Speed 3077.99 samples/sec Loss 0.6933 Epoch: 11 Global Step: 192600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:58:36,862-Speed 3069.15 samples/sec Loss 0.6855 Epoch: 11 Global Step: 192650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:58:53,240-Speed 3126.37 samples/sec Loss 0.7015 Epoch: 11 Global Step: 192700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:59:09,140-Speed 3220.24 samples/sec Loss 0.6988 Epoch: 11 Global Step: 192750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:59:25,269-Speed 3174.50 samples/sec Loss 0.6885 Epoch: 11 Global Step: 192800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:59:41,020-Speed 3250.56 samples/sec Loss 0.6923 Epoch: 11 Global Step: 192850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 21:59:57,237-Speed 3157.36 samples/sec Loss 0.7130 Epoch: 11 Global Step: 192900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:00:12,937-Speed 3261.21 samples/sec Loss 0.7056 Epoch: 11 Global Step: 192950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:00:28,982-Speed 3191.16 samples/sec Loss 0.6930 Epoch: 11 Global Step: 193000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:00:44,862-Speed 3224.18 samples/sec Loss 0.6922 Epoch: 11 Global Step: 193050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:01:00,659-Speed 3241.29 samples/sec Loss 0.6953 Epoch: 11 Global Step: 193100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:01:16,346-Speed 3263.99 samples/sec Loss 0.7104 Epoch: 11 Global Step: 193150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:01:32,380-Speed 3193.34 samples/sec Loss 0.6972 Epoch: 11 Global Step: 193200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:01:48,086-Speed 3260.07 samples/sec Loss 0.6926 Epoch: 11 Global Step: 193250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:02:04,848-Speed 3054.54 samples/sec Loss 0.6901 Epoch: 11 Global Step: 193300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:02:20,695-Speed 3231.04 samples/sec Loss 0.6878 Epoch: 11 Global Step: 193350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:02:36,745-Speed 3190.12 samples/sec Loss 0.6816 Epoch: 11 Global Step: 193400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:02:52,369-Speed 3276.99 samples/sec Loss 0.6978 Epoch: 11 Global Step: 193450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:03:08,874-Speed 3102.31 samples/sec Loss 0.6881 Epoch: 11 Global Step: 193500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:03:25,540-Speed 3072.12 samples/sec Loss 0.7025 Epoch: 11 Global Step: 193550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:03:41,357-Speed 3237.24 samples/sec Loss 0.7056 Epoch: 11 Global Step: 193600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:03:57,130-Speed 3246.00 samples/sec Loss 0.6755 Epoch: 11 Global Step: 193650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:04:12,890-Speed 3248.87 samples/sec Loss 0.6811 Epoch: 11 Global Step: 193700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:04:28,619-Speed 3255.26 samples/sec Loss 0.6880 Epoch: 11 Global Step: 193750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:04:44,412-Speed 3242.17 samples/sec Loss 0.6809 Epoch: 11 Global Step: 193800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:05:00,511-Speed 3180.30 samples/sec Loss 0.6911 Epoch: 11 Global Step: 193850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:05:16,296-Speed 3243.67 samples/sec Loss 0.6710 Epoch: 11 Global Step: 193900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:05:32,107-Speed 3238.41 samples/sec Loss 0.6914 Epoch: 11 Global Step: 193950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:05:47,842-Speed 3253.98 samples/sec Loss 0.6963 Epoch: 11 Global Step: 194000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:06:40,813-[lfw][194000]XNorm: 22.030973 Training: 2021-03-16 22:06:40,814-[lfw][194000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 22:06:40,814-[lfw][194000]Accuracy-Highest: 0.99817 Training: 2021-03-16 22:07:42,705-[cfp_fp][194000]XNorm: 22.070574 Training: 2021-03-16 22:07:42,705-[cfp_fp][194000]Accuracy-Flip: 0.99043+-0.00596 Training: 2021-03-16 22:07:42,705-[cfp_fp][194000]Accuracy-Highest: 0.99157 Training: 2021-03-16 22:08:35,995-[agedb_30][194000]XNorm: 22.838572 Training: 2021-03-16 22:08:35,995-[agedb_30][194000]Accuracy-Flip: 0.98317+-0.00697 Training: 2021-03-16 22:08:35,995-[agedb_30][194000]Accuracy-Highest: 0.98433 Training: 2021-03-16 22:08:52,483-Speed 277.30 samples/sec Loss 0.6877 Epoch: 11 Global Step: 194050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:09:08,332-Speed 3230.58 samples/sec Loss 0.6833 Epoch: 11 Global Step: 194100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:09:24,153-Speed 3236.31 samples/sec Loss 0.6770 Epoch: 11 Global Step: 194150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:09:40,067-Speed 3217.29 samples/sec Loss 0.6903 Epoch: 11 Global Step: 194200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:09:56,292-Speed 3155.77 samples/sec Loss 0.6901 Epoch: 11 Global Step: 194250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:10:12,223-Speed 3214.03 samples/sec Loss 0.6797 Epoch: 11 Global Step: 194300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:10:29,867-Speed 2901.98 samples/sec Loss 0.6718 Epoch: 11 Global Step: 194350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:10:45,600-Speed 3254.23 samples/sec Loss 0.6907 Epoch: 11 Global Step: 194400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:11:01,303-Speed 3260.79 samples/sec Loss 0.6969 Epoch: 11 Global Step: 194450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:11:17,017-Speed 3258.22 samples/sec Loss 0.6830 Epoch: 11 Global Step: 194500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:11:32,749-Speed 3254.74 samples/sec Loss 0.7040 Epoch: 11 Global Step: 194550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:11:48,520-Speed 3246.52 samples/sec Loss 0.6902 Epoch: 11 Global Step: 194600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:12:04,294-Speed 3246.00 samples/sec Loss 0.6854 Epoch: 11 Global Step: 194650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:12:19,930-Speed 3274.50 samples/sec Loss 0.6906 Epoch: 11 Global Step: 194700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:12:35,529-Speed 3282.35 samples/sec Loss 0.6919 Epoch: 11 Global Step: 194750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:12:52,276-Speed 3057.33 samples/sec Loss 0.6808 Epoch: 11 Global Step: 194800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:13:08,782-Speed 3102.02 samples/sec Loss 0.6825 Epoch: 11 Global Step: 194850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:13:24,708-Speed 3214.98 samples/sec Loss 0.6874 Epoch: 11 Global Step: 194900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:13:40,496-Speed 3243.13 samples/sec Loss 0.6895 Epoch: 11 Global Step: 194950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:13:56,185-Speed 3263.59 samples/sec Loss 0.6855 Epoch: 11 Global Step: 195000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:14:11,910-Speed 3255.92 samples/sec Loss 0.6921 Epoch: 11 Global Step: 195050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:14:27,736-Speed 3235.31 samples/sec Loss 0.6865 Epoch: 11 Global Step: 195100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:14:43,658-Speed 3215.87 samples/sec Loss 0.6720 Epoch: 11 Global Step: 195150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:14:59,436-Speed 3245.01 samples/sec Loss 0.6860 Epoch: 11 Global Step: 195200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:15:15,159-Speed 3256.52 samples/sec Loss 0.6812 Epoch: 11 Global Step: 195250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:15:30,956-Speed 3241.30 samples/sec Loss 0.6926 Epoch: 11 Global Step: 195300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:15:46,916-Speed 3207.98 samples/sec Loss 0.6698 Epoch: 11 Global Step: 195350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:16:02,773-Speed 3229.08 samples/sec Loss 0.6764 Epoch: 11 Global Step: 195400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:16:18,691-Speed 3216.44 samples/sec Loss 0.6732 Epoch: 11 Global Step: 195450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:16:34,786-Speed 3181.31 samples/sec Loss 0.6743 Epoch: 11 Global Step: 195500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:16:50,370-Speed 3285.51 samples/sec Loss 0.6815 Epoch: 11 Global Step: 195550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:17:07,247-Speed 3033.84 samples/sec Loss 0.6744 Epoch: 11 Global Step: 195600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:17:22,948-Speed 3260.92 samples/sec Loss 0.6733 Epoch: 11 Global Step: 195650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:17:40,280-Speed 2954.27 samples/sec Loss 0.6622 Epoch: 11 Global Step: 195700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:17:56,047-Speed 3247.41 samples/sec Loss 0.6816 Epoch: 11 Global Step: 195750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:18:12,013-Speed 3206.84 samples/sec Loss 0.6654 Epoch: 11 Global Step: 195800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:18:27,882-Speed 3226.58 samples/sec Loss 0.6581 Epoch: 11 Global Step: 195850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:18:43,915-Speed 3193.53 samples/sec Loss 0.6730 Epoch: 11 Global Step: 195900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:18:59,796-Speed 3223.93 samples/sec Loss 0.6781 Epoch: 11 Global Step: 195950 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:19:15,638-Speed 3232.02 samples/sec Loss 0.6701 Epoch: 11 Global Step: 196000 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:20:08,775-[lfw][196000]XNorm: 21.926666 Training: 2021-03-16 22:20:08,775-[lfw][196000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 22:20:08,775-[lfw][196000]Accuracy-Highest: 0.99817 Training: 2021-03-16 22:21:10,830-[cfp_fp][196000]XNorm: 22.038842 Training: 2021-03-16 22:21:10,830-[cfp_fp][196000]Accuracy-Flip: 0.99100+-0.00599 Training: 2021-03-16 22:21:10,831-[cfp_fp][196000]Accuracy-Highest: 0.99157 Training: 2021-03-16 22:22:04,308-[agedb_30][196000]XNorm: 22.776065 Training: 2021-03-16 22:22:04,308-[agedb_30][196000]Accuracy-Flip: 0.98300+-0.00636 Training: 2021-03-16 22:22:04,308-[agedb_30][196000]Accuracy-Highest: 0.98433 Training: 2021-03-16 22:22:19,981-Speed 277.74 samples/sec Loss 0.6791 Epoch: 11 Global Step: 196050 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:22:35,779-Speed 3241.07 samples/sec Loss 0.6916 Epoch: 11 Global Step: 196100 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:22:51,517-Speed 3253.41 samples/sec Loss 0.6698 Epoch: 11 Global Step: 196150 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:23:07,339-Speed 3236.00 samples/sec Loss 0.6630 Epoch: 11 Global Step: 196200 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:23:23,960-Speed 3080.54 samples/sec Loss 0.6728 Epoch: 11 Global Step: 196250 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:23:39,690-Speed 3254.95 samples/sec Loss 0.6700 Epoch: 11 Global Step: 196300 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:23:55,482-Speed 3242.45 samples/sec Loss 0.6505 Epoch: 11 Global Step: 196350 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:24:11,511-Speed 3194.14 samples/sec Loss 0.6783 Epoch: 11 Global Step: 196400 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:24:27,293-Speed 3244.29 samples/sec Loss 0.6866 Epoch: 11 Global Step: 196450 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:24:44,338-Speed 3004.06 samples/sec Loss 0.6704 Epoch: 11 Global Step: 196500 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:25:00,903-Speed 3090.95 samples/sec Loss 0.6690 Epoch: 11 Global Step: 196550 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:25:16,716-Speed 3237.91 samples/sec Loss 0.6575 Epoch: 11 Global Step: 196600 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:25:32,415-Speed 3261.49 samples/sec Loss 0.6768 Epoch: 11 Global Step: 196650 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:25:48,604-Speed 3162.75 samples/sec Loss 0.6875 Epoch: 11 Global Step: 196700 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:26:04,294-Speed 3263.17 samples/sec Loss 0.6765 Epoch: 11 Global Step: 196750 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:26:20,073-Speed 3244.93 samples/sec Loss 0.6726 Epoch: 11 Global Step: 196800 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:26:35,832-Speed 3249.17 samples/sec Loss 0.6878 Epoch: 11 Global Step: 196850 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:26:51,883-Speed 3189.78 samples/sec Loss 0.6719 Epoch: 11 Global Step: 196900 Fp16 Grad Scale: 16384 Required: 16 hours Training: 2021-03-16 22:27:07,786-Speed 3219.69 samples/sec Loss 0.6796 Epoch: 11 Global Step: 196950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:27:23,769-Speed 3203.61 samples/sec Loss 0.6818 Epoch: 11 Global Step: 197000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:27:40,395-Speed 3079.49 samples/sec Loss 0.6811 Epoch: 11 Global Step: 197050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:27:57,023-Speed 3079.33 samples/sec Loss 0.6860 Epoch: 11 Global Step: 197100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:28:12,969-Speed 3210.92 samples/sec Loss 0.6730 Epoch: 11 Global Step: 197150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:28:28,851-Speed 3223.85 samples/sec Loss 0.6692 Epoch: 11 Global Step: 197200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:28:44,528-Speed 3266.05 samples/sec Loss 0.6741 Epoch: 11 Global Step: 197250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:29:00,296-Speed 3247.07 samples/sec Loss 0.6752 Epoch: 11 Global Step: 197300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:29:16,058-Speed 3248.57 samples/sec Loss 0.6783 Epoch: 11 Global Step: 197350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:29:31,721-Speed 3268.89 samples/sec Loss 0.6755 Epoch: 11 Global Step: 197400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:29:47,426-Speed 3260.25 samples/sec Loss 0.6674 Epoch: 11 Global Step: 197450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:30:03,132-Speed 3259.88 samples/sec Loss 0.6690 Epoch: 11 Global Step: 197500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:30:19,322-Speed 3162.54 samples/sec Loss 0.6671 Epoch: 11 Global Step: 197550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:30:35,436-Speed 3177.60 samples/sec Loss 0.6673 Epoch: 11 Global Step: 197600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:30:51,231-Speed 3241.55 samples/sec Loss 0.6680 Epoch: 11 Global Step: 197650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:31:07,045-Speed 3237.85 samples/sec Loss 0.6593 Epoch: 11 Global Step: 197700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:31:23,738-Speed 3067.07 samples/sec Loss 0.6738 Epoch: 11 Global Step: 197750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:31:39,730-Speed 3201.76 samples/sec Loss 0.6782 Epoch: 11 Global Step: 197800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:31:56,592-Speed 3036.62 samples/sec Loss 0.6606 Epoch: 11 Global Step: 197850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:32:12,258-Speed 3268.25 samples/sec Loss 0.6788 Epoch: 11 Global Step: 197900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:32:28,967-Speed 3064.32 samples/sec Loss 0.6727 Epoch: 11 Global Step: 197950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:32:44,753-Speed 3243.45 samples/sec Loss 0.6707 Epoch: 11 Global Step: 198000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:33:38,160-[lfw][198000]XNorm: 22.039262 Training: 2021-03-16 22:33:38,160-[lfw][198000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 22:33:38,161-[lfw][198000]Accuracy-Highest: 0.99817 Training: 2021-03-16 22:34:40,277-[cfp_fp][198000]XNorm: 22.169008 Training: 2021-03-16 22:34:40,278-[cfp_fp][198000]Accuracy-Flip: 0.99143+-0.00478 Training: 2021-03-16 22:34:40,278-[cfp_fp][198000]Accuracy-Highest: 0.99157 Training: 2021-03-16 22:35:33,770-[agedb_30][198000]XNorm: 22.923061 Training: 2021-03-16 22:35:33,771-[agedb_30][198000]Accuracy-Flip: 0.98300+-0.00678 Training: 2021-03-16 22:35:33,771-[agedb_30][198000]Accuracy-Highest: 0.98433 Training: 2021-03-16 22:35:49,447-Speed 277.22 samples/sec Loss 0.6579 Epoch: 11 Global Step: 198050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:36:05,169-Speed 3256.76 samples/sec Loss 0.6608 Epoch: 11 Global Step: 198100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:36:20,915-Speed 3251.63 samples/sec Loss 0.6758 Epoch: 11 Global Step: 198150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:36:36,938-Speed 3195.51 samples/sec Loss 0.6716 Epoch: 11 Global Step: 198200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:36:52,732-Speed 3241.90 samples/sec Loss 0.6647 Epoch: 11 Global Step: 198250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:37:08,730-Speed 3200.44 samples/sec Loss 0.6575 Epoch: 11 Global Step: 198300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:37:24,565-Speed 3233.55 samples/sec Loss 0.6747 Epoch: 11 Global Step: 198350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:37:40,326-Speed 3248.63 samples/sec Loss 0.6614 Epoch: 11 Global Step: 198400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:37:56,964-Speed 3077.34 samples/sec Loss 0.6476 Epoch: 11 Global Step: 198450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:38:12,652-Speed 3263.74 samples/sec Loss 0.6525 Epoch: 11 Global Step: 198500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:38:28,394-Speed 3252.55 samples/sec Loss 0.6618 Epoch: 11 Global Step: 198550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:38:44,366-Speed 3205.86 samples/sec Loss 0.6702 Epoch: 11 Global Step: 198600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:39:00,256-Speed 3222.10 samples/sec Loss 0.6598 Epoch: 11 Global Step: 198650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:39:17,658-Speed 2942.32 samples/sec Loss 0.6676 Epoch: 11 Global Step: 198700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:39:33,408-Speed 3250.92 samples/sec Loss 0.6557 Epoch: 11 Global Step: 198750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:39:49,464-Speed 3189.01 samples/sec Loss 0.6625 Epoch: 11 Global Step: 198800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:40:05,234-Speed 3246.65 samples/sec Loss 0.6763 Epoch: 11 Global Step: 198850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:40:20,944-Speed 3259.19 samples/sec Loss 0.6743 Epoch: 11 Global Step: 198900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:40:36,619-Speed 3266.49 samples/sec Loss 0.6678 Epoch: 11 Global Step: 198950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:40:52,409-Speed 3242.75 samples/sec Loss 0.6547 Epoch: 11 Global Step: 199000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:41:08,203-Speed 3241.74 samples/sec Loss 0.6503 Epoch: 11 Global Step: 199050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:41:24,214-Speed 3197.93 samples/sec Loss 0.6681 Epoch: 11 Global Step: 199100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:41:40,054-Speed 3232.44 samples/sec Loss 0.6666 Epoch: 11 Global Step: 199150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:41:55,963-Speed 3218.32 samples/sec Loss 0.6793 Epoch: 11 Global Step: 199200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:42:13,611-Speed 2901.31 samples/sec Loss 0.6597 Epoch: 11 Global Step: 199250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:42:29,359-Speed 3251.23 samples/sec Loss 0.6748 Epoch: 11 Global Step: 199300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:42:45,432-Speed 3185.55 samples/sec Loss 0.6615 Epoch: 11 Global Step: 199350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:43:01,316-Speed 3223.50 samples/sec Loss 0.6626 Epoch: 11 Global Step: 199400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:43:17,177-Speed 3228.08 samples/sec Loss 0.6640 Epoch: 11 Global Step: 199450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:43:32,964-Speed 3243.30 samples/sec Loss 0.6709 Epoch: 11 Global Step: 199500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:43:48,686-Speed 3256.85 samples/sec Loss 0.6642 Epoch: 11 Global Step: 199550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:44:04,869-Speed 3163.72 samples/sec Loss 0.6571 Epoch: 11 Global Step: 199600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:44:20,744-Speed 3225.46 samples/sec Loss 0.6664 Epoch: 11 Global Step: 199650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:44:36,651-Speed 3218.74 samples/sec Loss 0.6486 Epoch: 11 Global Step: 199700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:44:52,391-Speed 3252.91 samples/sec Loss 0.6825 Epoch: 11 Global Step: 199750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:45:08,334-Speed 3211.59 samples/sec Loss 0.6548 Epoch: 11 Global Step: 199800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:45:24,474-Speed 3172.38 samples/sec Loss 0.6768 Epoch: 11 Global Step: 199850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:45:40,212-Speed 3253.35 samples/sec Loss 0.6526 Epoch: 11 Global Step: 199900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:45:56,894-Speed 3069.32 samples/sec Loss 0.6537 Epoch: 11 Global Step: 199950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:46:12,703-Speed 3238.64 samples/sec Loss 0.6762 Epoch: 11 Global Step: 200000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:47:05,901-[lfw][200000]XNorm: 22.045735 Training: 2021-03-16 22:47:05,901-[lfw][200000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 22:47:05,901-[lfw][200000]Accuracy-Highest: 0.99817 Training: 2021-03-16 22:48:07,867-[cfp_fp][200000]XNorm: 22.203108 Training: 2021-03-16 22:48:07,868-[cfp_fp][200000]Accuracy-Flip: 0.99186+-0.00523 Training: 2021-03-16 22:48:07,868-[cfp_fp][200000]Accuracy-Highest: 0.99186 Training: 2021-03-16 22:49:01,568-[agedb_30][200000]XNorm: 22.811448 Training: 2021-03-16 22:49:01,568-[agedb_30][200000]Accuracy-Flip: 0.98250+-0.00651 Training: 2021-03-16 22:49:01,568-[agedb_30][200000]Accuracy-Highest: 0.98433 Training: 2021-03-16 22:49:18,249-Speed 275.94 samples/sec Loss 0.6716 Epoch: 11 Global Step: 200050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:49:35,130-Speed 3033.11 samples/sec Loss 0.6526 Epoch: 11 Global Step: 200100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:49:50,913-Speed 3244.00 samples/sec Loss 0.6567 Epoch: 11 Global Step: 200150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:50:07,153-Speed 3152.74 samples/sec Loss 0.6712 Epoch: 11 Global Step: 200200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:50:23,153-Speed 3200.22 samples/sec Loss 0.6555 Epoch: 11 Global Step: 200250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:50:53,292-Speed 1698.82 samples/sec Loss 0.6594 Epoch: 12 Global Step: 200300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:51:09,286-Speed 3201.24 samples/sec Loss 0.6038 Epoch: 12 Global Step: 200350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:51:25,000-Speed 3258.37 samples/sec Loss 0.5931 Epoch: 12 Global Step: 200400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:51:41,008-Speed 3198.58 samples/sec Loss 0.6035 Epoch: 12 Global Step: 200450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:51:56,952-Speed 3211.36 samples/sec Loss 0.5891 Epoch: 12 Global Step: 200500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:52:12,740-Speed 3243.03 samples/sec Loss 0.6014 Epoch: 12 Global Step: 200550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:52:28,526-Speed 3243.53 samples/sec Loss 0.6035 Epoch: 12 Global Step: 200600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:52:44,712-Speed 3163.37 samples/sec Loss 0.5936 Epoch: 12 Global Step: 200650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:53:01,691-Speed 3015.51 samples/sec Loss 0.5933 Epoch: 12 Global Step: 200700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:53:17,579-Speed 3222.74 samples/sec Loss 0.6109 Epoch: 12 Global Step: 200750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:53:33,482-Speed 3219.62 samples/sec Loss 0.5925 Epoch: 12 Global Step: 200800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:53:49,240-Speed 3249.07 samples/sec Loss 0.5963 Epoch: 12 Global Step: 200850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:54:07,310-Speed 2833.64 samples/sec Loss 0.5921 Epoch: 12 Global Step: 200900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:54:23,402-Speed 3181.77 samples/sec Loss 0.6043 Epoch: 12 Global Step: 200950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:54:39,376-Speed 3205.30 samples/sec Loss 0.5973 Epoch: 12 Global Step: 201000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:54:55,441-Speed 3187.18 samples/sec Loss 0.5848 Epoch: 12 Global Step: 201050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:55:11,306-Speed 3227.21 samples/sec Loss 0.5920 Epoch: 12 Global Step: 201100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:55:27,288-Speed 3203.87 samples/sec Loss 0.5923 Epoch: 12 Global Step: 201150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:55:43,166-Speed 3224.57 samples/sec Loss 0.6141 Epoch: 12 Global Step: 201200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:55:58,970-Speed 3239.86 samples/sec Loss 0.5898 Epoch: 12 Global Step: 201250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:56:14,991-Speed 3195.95 samples/sec Loss 0.5941 Epoch: 12 Global Step: 201300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:56:30,880-Speed 3222.47 samples/sec Loss 0.5938 Epoch: 12 Global Step: 201350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:56:47,630-Speed 3056.74 samples/sec Loss 0.6046 Epoch: 12 Global Step: 201400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:57:04,754-Speed 2990.02 samples/sec Loss 0.6031 Epoch: 12 Global Step: 201450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:57:20,493-Speed 3253.06 samples/sec Loss 0.5894 Epoch: 12 Global Step: 201500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:57:36,297-Speed 3239.83 samples/sec Loss 0.6017 Epoch: 12 Global Step: 201550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:57:52,182-Speed 3223.30 samples/sec Loss 0.5939 Epoch: 12 Global Step: 201600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:58:08,053-Speed 3226.02 samples/sec Loss 0.5979 Epoch: 12 Global Step: 201650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:58:23,837-Speed 3243.97 samples/sec Loss 0.5932 Epoch: 12 Global Step: 201700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:58:39,658-Speed 3236.35 samples/sec Loss 0.6074 Epoch: 12 Global Step: 201750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:58:55,649-Speed 3201.79 samples/sec Loss 0.5943 Epoch: 12 Global Step: 201800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:59:11,786-Speed 3172.95 samples/sec Loss 0.5984 Epoch: 12 Global Step: 201850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:59:27,705-Speed 3216.50 samples/sec Loss 0.5932 Epoch: 12 Global Step: 201900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:59:43,597-Speed 3221.80 samples/sec Loss 0.5883 Epoch: 12 Global Step: 201950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 22:59:59,665-Speed 3186.51 samples/sec Loss 0.5885 Epoch: 12 Global Step: 202000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:00:52,832-[lfw][202000]XNorm: 22.109381 Training: 2021-03-16 23:00:52,833-[lfw][202000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 23:00:52,833-[lfw][202000]Accuracy-Highest: 0.99817 Training: 2021-03-16 23:01:55,053-[cfp_fp][202000]XNorm: 22.367066 Training: 2021-03-16 23:01:55,054-[cfp_fp][202000]Accuracy-Flip: 0.99086+-0.00487 Training: 2021-03-16 23:01:55,054-[cfp_fp][202000]Accuracy-Highest: 0.99186 Training: 2021-03-16 23:02:48,511-[agedb_30][202000]XNorm: 23.047221 Training: 2021-03-16 23:02:48,512-[agedb_30][202000]Accuracy-Flip: 0.98217+-0.00771 Training: 2021-03-16 23:02:48,512-[agedb_30][202000]Accuracy-Highest: 0.98433 Training: 2021-03-16 23:03:04,500-Speed 277.01 samples/sec Loss 0.6081 Epoch: 12 Global Step: 202050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:03:20,225-Speed 3255.96 samples/sec Loss 0.5990 Epoch: 12 Global Step: 202100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:03:36,867-Speed 3076.67 samples/sec Loss 0.6078 Epoch: 12 Global Step: 202150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:03:52,808-Speed 3211.94 samples/sec Loss 0.6170 Epoch: 12 Global Step: 202200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:04:09,969-Speed 2983.67 samples/sec Loss 0.6054 Epoch: 12 Global Step: 202250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:04:25,952-Speed 3203.46 samples/sec Loss 0.5939 Epoch: 12 Global Step: 202300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:04:42,024-Speed 3185.72 samples/sec Loss 0.5880 Epoch: 12 Global Step: 202350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:04:58,621-Speed 3085.08 samples/sec Loss 0.5947 Epoch: 12 Global Step: 202400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:05:14,479-Speed 3228.60 samples/sec Loss 0.5999 Epoch: 12 Global Step: 202450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:05:30,291-Speed 3238.23 samples/sec Loss 0.5889 Epoch: 12 Global Step: 202500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:05:46,223-Speed 3213.86 samples/sec Loss 0.5922 Epoch: 12 Global Step: 202550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:06:02,113-Speed 3222.17 samples/sec Loss 0.6116 Epoch: 12 Global Step: 202600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:06:17,947-Speed 3233.81 samples/sec Loss 0.5873 Epoch: 12 Global Step: 202650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:06:33,774-Speed 3235.07 samples/sec Loss 0.6253 Epoch: 12 Global Step: 202700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:06:49,880-Speed 3179.03 samples/sec Loss 0.5958 Epoch: 12 Global Step: 202750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:07:05,757-Speed 3224.80 samples/sec Loss 0.6085 Epoch: 12 Global Step: 202800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:07:21,675-Speed 3216.67 samples/sec Loss 0.5881 Epoch: 12 Global Step: 202850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:07:38,479-Speed 3046.83 samples/sec Loss 0.5952 Epoch: 12 Global Step: 202900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:07:54,346-Speed 3226.95 samples/sec Loss 0.6042 Epoch: 12 Global Step: 202950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:08:10,705-Speed 3130.00 samples/sec Loss 0.6082 Epoch: 12 Global Step: 203000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:08:26,588-Speed 3223.56 samples/sec Loss 0.6012 Epoch: 12 Global Step: 203050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:08:43,195-Speed 3083.10 samples/sec Loss 0.5911 Epoch: 12 Global Step: 203100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:09:00,048-Speed 3038.14 samples/sec Loss 0.5964 Epoch: 12 Global Step: 203150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:09:15,849-Speed 3240.52 samples/sec Loss 0.6034 Epoch: 12 Global Step: 203200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:09:31,890-Speed 3191.85 samples/sec Loss 0.5963 Epoch: 12 Global Step: 203250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:09:47,897-Speed 3198.79 samples/sec Loss 0.6165 Epoch: 12 Global Step: 203300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:10:03,816-Speed 3216.30 samples/sec Loss 0.5918 Epoch: 12 Global Step: 203350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:10:19,639-Speed 3235.96 samples/sec Loss 0.6113 Epoch: 12 Global Step: 203400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:10:35,843-Speed 3159.73 samples/sec Loss 0.6039 Epoch: 12 Global Step: 203450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:10:52,068-Speed 3155.64 samples/sec Loss 0.5971 Epoch: 12 Global Step: 203500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:11:08,182-Speed 3177.55 samples/sec Loss 0.5996 Epoch: 12 Global Step: 203550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:11:26,300-Speed 2825.88 samples/sec Loss 0.5897 Epoch: 12 Global Step: 203600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:11:42,279-Speed 3204.31 samples/sec Loss 0.6083 Epoch: 12 Global Step: 203650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:11:58,188-Speed 3218.44 samples/sec Loss 0.6161 Epoch: 12 Global Step: 203700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:12:14,139-Speed 3210.01 samples/sec Loss 0.6026 Epoch: 12 Global Step: 203750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:12:30,147-Speed 3198.51 samples/sec Loss 0.5877 Epoch: 12 Global Step: 203800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:12:45,970-Speed 3235.85 samples/sec Loss 0.6004 Epoch: 12 Global Step: 203850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:13:01,912-Speed 3211.58 samples/sec Loss 0.6082 Epoch: 12 Global Step: 203900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:13:17,717-Speed 3239.74 samples/sec Loss 0.5997 Epoch: 12 Global Step: 203950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:13:33,691-Speed 3205.15 samples/sec Loss 0.5904 Epoch: 12 Global Step: 204000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:14:26,982-[lfw][204000]XNorm: 22.247021 Training: 2021-03-16 23:14:26,983-[lfw][204000]Accuracy-Flip: 0.99783+-0.00259 Training: 2021-03-16 23:14:26,983-[lfw][204000]Accuracy-Highest: 0.99817 Training: 2021-03-16 23:15:28,857-[cfp_fp][204000]XNorm: 22.201364 Training: 2021-03-16 23:15:28,858-[cfp_fp][204000]Accuracy-Flip: 0.99014+-0.00529 Training: 2021-03-16 23:15:28,858-[cfp_fp][204000]Accuracy-Highest: 0.99186 Training: 2021-03-16 23:16:22,262-[agedb_30][204000]XNorm: 22.950226 Training: 2021-03-16 23:16:22,263-[agedb_30][204000]Accuracy-Flip: 0.98300+-0.00702 Training: 2021-03-16 23:16:22,263-[agedb_30][204000]Accuracy-Highest: 0.98433 Training: 2021-03-16 23:16:38,204-Speed 277.49 samples/sec Loss 0.6075 Epoch: 12 Global Step: 204050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:16:54,122-Speed 3216.58 samples/sec Loss 0.5910 Epoch: 12 Global Step: 204100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:17:09,948-Speed 3235.33 samples/sec Loss 0.6139 Epoch: 12 Global Step: 204150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:17:25,765-Speed 3236.95 samples/sec Loss 0.5946 Epoch: 12 Global Step: 204200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:17:41,550-Speed 3243.79 samples/sec Loss 0.6003 Epoch: 12 Global Step: 204250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:17:58,373-Speed 3043.61 samples/sec Loss 0.6063 Epoch: 12 Global Step: 204300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:18:14,261-Speed 3222.52 samples/sec Loss 0.6026 Epoch: 12 Global Step: 204350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:18:30,177-Speed 3216.95 samples/sec Loss 0.5909 Epoch: 12 Global Step: 204400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:18:46,078-Speed 3220.03 samples/sec Loss 0.5948 Epoch: 12 Global Step: 204450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:19:01,769-Speed 3263.24 samples/sec Loss 0.6057 Epoch: 12 Global Step: 204500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:19:17,776-Speed 3198.57 samples/sec Loss 0.6224 Epoch: 12 Global Step: 204550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:19:34,597-Speed 3044.00 samples/sec Loss 0.6043 Epoch: 12 Global Step: 204600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:19:50,413-Speed 3237.24 samples/sec Loss 0.5954 Epoch: 12 Global Step: 204650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:20:07,196-Speed 3050.86 samples/sec Loss 0.6122 Epoch: 12 Global Step: 204700 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:20:22,956-Speed 3248.67 samples/sec Loss 0.6114 Epoch: 12 Global Step: 204750 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:20:39,057-Speed 3180.19 samples/sec Loss 0.5887 Epoch: 12 Global Step: 204800 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:20:55,010-Speed 3209.54 samples/sec Loss 0.5979 Epoch: 12 Global Step: 204850 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:21:10,846-Speed 3233.22 samples/sec Loss 0.5902 Epoch: 12 Global Step: 204900 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:21:26,788-Speed 3211.60 samples/sec Loss 0.6085 Epoch: 12 Global Step: 204950 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:21:42,615-Speed 3235.18 samples/sec Loss 0.5883 Epoch: 12 Global Step: 205000 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:21:58,497-Speed 3223.90 samples/sec Loss 0.6071 Epoch: 12 Global Step: 205050 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:22:14,431-Speed 3213.40 samples/sec Loss 0.6107 Epoch: 12 Global Step: 205100 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:22:31,221-Speed 3049.45 samples/sec Loss 0.6055 Epoch: 12 Global Step: 205150 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:22:47,248-Speed 3194.76 samples/sec Loss 0.5946 Epoch: 12 Global Step: 205200 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:23:04,044-Speed 3048.28 samples/sec Loss 0.6165 Epoch: 12 Global Step: 205250 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:23:19,900-Speed 3229.32 samples/sec Loss 0.6021 Epoch: 12 Global Step: 205300 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:23:36,524-Speed 3079.81 samples/sec Loss 0.6037 Epoch: 12 Global Step: 205350 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:23:52,628-Speed 3179.49 samples/sec Loss 0.5976 Epoch: 12 Global Step: 205400 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:24:08,822-Speed 3161.84 samples/sec Loss 0.6027 Epoch: 12 Global Step: 205450 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:24:24,729-Speed 3218.72 samples/sec Loss 0.5977 Epoch: 12 Global Step: 205500 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:24:40,716-Speed 3202.75 samples/sec Loss 0.6051 Epoch: 12 Global Step: 205550 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:24:56,943-Speed 3155.25 samples/sec Loss 0.6067 Epoch: 12 Global Step: 205600 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:25:12,972-Speed 3194.27 samples/sec Loss 0.6065 Epoch: 12 Global Step: 205650 Fp16 Grad Scale: 16384 Required: 15 hours Training: 2021-03-16 23:25:28,914-Speed 3211.77 samples/sec Loss 0.6102 Epoch: 12 Global Step: 205700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:25:45,000-Speed 3182.98 samples/sec Loss 0.5783 Epoch: 12 Global Step: 205750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:26:02,624-Speed 2905.19 samples/sec Loss 0.5980 Epoch: 12 Global Step: 205800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:26:18,514-Speed 3222.27 samples/sec Loss 0.6116 Epoch: 12 Global Step: 205850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:26:34,727-Speed 3158.04 samples/sec Loss 0.5980 Epoch: 12 Global Step: 205900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:26:50,894-Speed 3166.99 samples/sec Loss 0.6097 Epoch: 12 Global Step: 205950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:27:06,854-Speed 3208.21 samples/sec Loss 0.5901 Epoch: 12 Global Step: 206000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:28:00,129-[lfw][206000]XNorm: 21.761901 Training: 2021-03-16 23:28:00,129-[lfw][206000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 23:28:00,129-[lfw][206000]Accuracy-Highest: 0.99817 Training: 2021-03-16 23:29:01,992-[cfp_fp][206000]XNorm: 22.009016 Training: 2021-03-16 23:29:01,992-[cfp_fp][206000]Accuracy-Flip: 0.99086+-0.00457 Training: 2021-03-16 23:29:01,992-[cfp_fp][206000]Accuracy-Highest: 0.99186 Training: 2021-03-16 23:29:55,591-[agedb_30][206000]XNorm: 22.596029 Training: 2021-03-16 23:29:55,591-[agedb_30][206000]Accuracy-Flip: 0.98233+-0.00672 Training: 2021-03-16 23:29:55,591-[agedb_30][206000]Accuracy-Highest: 0.98433 Training: 2021-03-16 23:30:11,513-Speed 277.27 samples/sec Loss 0.6018 Epoch: 12 Global Step: 206050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:30:27,689-Speed 3165.38 samples/sec Loss 0.5959 Epoch: 12 Global Step: 206100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:30:43,650-Speed 3207.84 samples/sec Loss 0.6027 Epoch: 12 Global Step: 206150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:30:59,558-Speed 3218.61 samples/sec Loss 0.6075 Epoch: 12 Global Step: 206200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:31:15,506-Speed 3210.54 samples/sec Loss 0.5966 Epoch: 12 Global Step: 206250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:31:31,889-Speed 3125.35 samples/sec Loss 0.5863 Epoch: 12 Global Step: 206300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:31:48,006-Speed 3176.70 samples/sec Loss 0.5952 Epoch: 12 Global Step: 206350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:32:04,002-Speed 3200.93 samples/sec Loss 0.6052 Epoch: 12 Global Step: 206400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:32:20,795-Speed 3048.97 samples/sec Loss 0.5970 Epoch: 12 Global Step: 206450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:32:36,709-Speed 3217.38 samples/sec Loss 0.5943 Epoch: 12 Global Step: 206500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:32:52,777-Speed 3186.63 samples/sec Loss 0.6008 Epoch: 12 Global Step: 206550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:33:08,762-Speed 3203.03 samples/sec Loss 0.6127 Epoch: 12 Global Step: 206600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:33:24,826-Speed 3187.39 samples/sec Loss 0.5912 Epoch: 12 Global Step: 206650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:33:40,766-Speed 3212.10 samples/sec Loss 0.6157 Epoch: 12 Global Step: 206700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:33:56,767-Speed 3200.02 samples/sec Loss 0.6178 Epoch: 12 Global Step: 206750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:34:12,886-Speed 3176.34 samples/sec Loss 0.5992 Epoch: 12 Global Step: 206800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:34:28,936-Speed 3190.08 samples/sec Loss 0.5986 Epoch: 12 Global Step: 206850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:34:45,717-Speed 3051.30 samples/sec Loss 0.6016 Epoch: 12 Global Step: 206900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:35:01,717-Speed 3200.08 samples/sec Loss 0.5972 Epoch: 12 Global Step: 206950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:35:18,619-Speed 3029.24 samples/sec Loss 0.6078 Epoch: 12 Global Step: 207000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:35:34,691-Speed 3185.67 samples/sec Loss 0.6125 Epoch: 12 Global Step: 207050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:35:50,717-Speed 3194.92 samples/sec Loss 0.6013 Epoch: 12 Global Step: 207100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:36:06,700-Speed 3203.48 samples/sec Loss 0.5995 Epoch: 12 Global Step: 207150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:36:22,676-Speed 3205.04 samples/sec Loss 0.6040 Epoch: 12 Global Step: 207200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:36:38,709-Speed 3193.49 samples/sec Loss 0.5997 Epoch: 12 Global Step: 207250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:36:54,574-Speed 3227.29 samples/sec Loss 0.6020 Epoch: 12 Global Step: 207300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:37:11,468-Speed 3030.79 samples/sec Loss 0.6010 Epoch: 12 Global Step: 207350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:37:27,454-Speed 3202.75 samples/sec Loss 0.6037 Epoch: 12 Global Step: 207400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:37:43,657-Speed 3160.01 samples/sec Loss 0.5877 Epoch: 12 Global Step: 207450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:38:00,123-Speed 3109.54 samples/sec Loss 0.5945 Epoch: 12 Global Step: 207500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:38:17,018-Speed 3030.69 samples/sec Loss 0.5932 Epoch: 12 Global Step: 207550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:38:33,063-Speed 3191.15 samples/sec Loss 0.6021 Epoch: 12 Global Step: 207600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:38:50,003-Speed 3022.55 samples/sec Loss 0.5920 Epoch: 12 Global Step: 207650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:39:06,309-Speed 3140.05 samples/sec Loss 0.6056 Epoch: 12 Global Step: 207700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:39:22,515-Speed 3159.33 samples/sec Loss 0.6011 Epoch: 12 Global Step: 207750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:39:38,450-Speed 3213.23 samples/sec Loss 0.6089 Epoch: 12 Global Step: 207800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:39:54,746-Speed 3141.95 samples/sec Loss 0.5986 Epoch: 12 Global Step: 207850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:40:10,788-Speed 3191.66 samples/sec Loss 0.5953 Epoch: 12 Global Step: 207900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:40:26,794-Speed 3198.86 samples/sec Loss 0.6113 Epoch: 12 Global Step: 207950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:40:44,680-Speed 2862.60 samples/sec Loss 0.5962 Epoch: 12 Global Step: 208000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:41:38,087-[lfw][208000]XNorm: 22.201682 Training: 2021-03-16 23:41:38,087-[lfw][208000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-16 23:41:38,087-[lfw][208000]Accuracy-Highest: 0.99817 Training: 2021-03-16 23:42:39,982-[cfp_fp][208000]XNorm: 22.216781 Training: 2021-03-16 23:42:39,982-[cfp_fp][208000]Accuracy-Flip: 0.99100+-0.00478 Training: 2021-03-16 23:42:39,982-[cfp_fp][208000]Accuracy-Highest: 0.99186 Training: 2021-03-16 23:43:33,139-[agedb_30][208000]XNorm: 22.891523 Training: 2021-03-16 23:43:33,139-[agedb_30][208000]Accuracy-Flip: 0.98350+-0.00677 Training: 2021-03-16 23:43:33,139-[agedb_30][208000]Accuracy-Highest: 0.98433 Training: 2021-03-16 23:43:49,304-Speed 277.32 samples/sec Loss 0.5971 Epoch: 12 Global Step: 208050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:44:05,351-Speed 3190.69 samples/sec Loss 0.6002 Epoch: 12 Global Step: 208100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:44:21,591-Speed 3152.98 samples/sec Loss 0.5932 Epoch: 12 Global Step: 208150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:44:37,804-Speed 3158.01 samples/sec Loss 0.5959 Epoch: 12 Global Step: 208200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:44:53,907-Speed 3179.51 samples/sec Loss 0.5942 Epoch: 12 Global Step: 208250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:45:09,900-Speed 3201.57 samples/sec Loss 0.5980 Epoch: 12 Global Step: 208300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:45:25,943-Speed 3191.53 samples/sec Loss 0.6090 Epoch: 12 Global Step: 208350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:45:41,957-Speed 3197.20 samples/sec Loss 0.6051 Epoch: 12 Global Step: 208400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:45:58,081-Speed 3175.66 samples/sec Loss 0.5973 Epoch: 12 Global Step: 208450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:46:14,085-Speed 3199.13 samples/sec Loss 0.6108 Epoch: 12 Global Step: 208500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:46:30,056-Speed 3205.93 samples/sec Loss 0.6003 Epoch: 12 Global Step: 208550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:46:46,096-Speed 3192.12 samples/sec Loss 0.6005 Epoch: 12 Global Step: 208600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:47:03,045-Speed 3020.93 samples/sec Loss 0.5918 Epoch: 12 Global Step: 208650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:47:19,148-Speed 3179.72 samples/sec Loss 0.6047 Epoch: 12 Global Step: 208700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:47:35,640-Speed 3104.46 samples/sec Loss 0.5834 Epoch: 12 Global Step: 208750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:47:51,747-Speed 3178.98 samples/sec Loss 0.5906 Epoch: 12 Global Step: 208800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:48:07,867-Speed 3176.19 samples/sec Loss 0.6052 Epoch: 12 Global Step: 208850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:48:23,922-Speed 3189.10 samples/sec Loss 0.6046 Epoch: 12 Global Step: 208900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:48:40,036-Speed 3177.50 samples/sec Loss 0.6093 Epoch: 12 Global Step: 208950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:48:56,029-Speed 3201.45 samples/sec Loss 0.5951 Epoch: 12 Global Step: 209000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:49:12,324-Speed 3142.28 samples/sec Loss 0.5980 Epoch: 12 Global Step: 209050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:49:29,294-Speed 3017.20 samples/sec Loss 0.5986 Epoch: 12 Global Step: 209100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:49:45,283-Speed 3202.27 samples/sec Loss 0.5885 Epoch: 12 Global Step: 209150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:50:02,098-Speed 3045.03 samples/sec Loss 0.5980 Epoch: 12 Global Step: 209200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:50:18,194-Speed 3180.87 samples/sec Loss 0.5918 Epoch: 12 Global Step: 209250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:50:34,389-Speed 3161.63 samples/sec Loss 0.5962 Epoch: 12 Global Step: 209300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:50:50,530-Speed 3172.14 samples/sec Loss 0.5925 Epoch: 12 Global Step: 209350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:51:06,642-Speed 3177.92 samples/sec Loss 0.6044 Epoch: 12 Global Step: 209400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:51:22,927-Speed 3144.07 samples/sec Loss 0.5971 Epoch: 12 Global Step: 209450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:51:39,102-Speed 3165.52 samples/sec Loss 0.5934 Epoch: 12 Global Step: 209500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:51:56,006-Speed 3028.82 samples/sec Loss 0.5930 Epoch: 12 Global Step: 209550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:52:12,059-Speed 3189.61 samples/sec Loss 0.6076 Epoch: 12 Global Step: 209600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:52:28,321-Speed 3148.46 samples/sec Loss 0.6073 Epoch: 12 Global Step: 209650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:52:45,373-Speed 3002.80 samples/sec Loss 0.6103 Epoch: 12 Global Step: 209700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:53:01,611-Speed 3153.13 samples/sec Loss 0.5909 Epoch: 12 Global Step: 209750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:53:17,687-Speed 3184.97 samples/sec Loss 0.6067 Epoch: 12 Global Step: 209800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:53:34,003-Speed 3138.06 samples/sec Loss 0.5933 Epoch: 12 Global Step: 209850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:53:50,043-Speed 3192.20 samples/sec Loss 0.5885 Epoch: 12 Global Step: 209900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:54:06,901-Speed 3037.15 samples/sec Loss 0.5957 Epoch: 12 Global Step: 209950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:54:23,074-Speed 3165.89 samples/sec Loss 0.5992 Epoch: 12 Global Step: 210000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:55:16,258-[lfw][210000]XNorm: 21.995100 Training: 2021-03-16 23:55:16,259-[lfw][210000]Accuracy-Flip: 0.99783+-0.00259 Training: 2021-03-16 23:55:16,259-[lfw][210000]Accuracy-Highest: 0.99817 Training: 2021-03-16 23:56:18,240-[cfp_fp][210000]XNorm: 22.150707 Training: 2021-03-16 23:56:18,241-[cfp_fp][210000]Accuracy-Flip: 0.99171+-0.00486 Training: 2021-03-16 23:56:18,241-[cfp_fp][210000]Accuracy-Highest: 0.99186 Training: 2021-03-16 23:57:12,209-[agedb_30][210000]XNorm: 22.925775 Training: 2021-03-16 23:57:12,209-[agedb_30][210000]Accuracy-Flip: 0.98267+-0.00646 Training: 2021-03-16 23:57:12,209-[agedb_30][210000]Accuracy-Highest: 0.98433 Training: 2021-03-16 23:57:28,340-Speed 276.36 samples/sec Loss 0.5922 Epoch: 12 Global Step: 210050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:57:44,381-Speed 3191.85 samples/sec Loss 0.5947 Epoch: 12 Global Step: 210100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:58:00,582-Speed 3160.55 samples/sec Loss 0.5847 Epoch: 12 Global Step: 210150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:58:17,696-Speed 2991.76 samples/sec Loss 0.6069 Epoch: 12 Global Step: 210200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:58:34,985-Speed 2961.44 samples/sec Loss 0.5924 Epoch: 12 Global Step: 210250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:58:51,151-Speed 3167.26 samples/sec Loss 0.5966 Epoch: 12 Global Step: 210300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:59:07,389-Speed 3153.21 samples/sec Loss 0.6018 Epoch: 12 Global Step: 210350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:59:23,568-Speed 3164.63 samples/sec Loss 0.5864 Epoch: 12 Global Step: 210400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:59:39,617-Speed 3190.32 samples/sec Loss 0.5937 Epoch: 12 Global Step: 210450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-16 23:59:55,709-Speed 3181.82 samples/sec Loss 0.6073 Epoch: 12 Global Step: 210500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:00:11,742-Speed 3193.53 samples/sec Loss 0.6080 Epoch: 12 Global Step: 210550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:00:27,922-Speed 3164.48 samples/sec Loss 0.6020 Epoch: 12 Global Step: 210600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:00:44,268-Speed 3132.41 samples/sec Loss 0.5856 Epoch: 12 Global Step: 210650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:01:00,512-Speed 3151.98 samples/sec Loss 0.5993 Epoch: 12 Global Step: 210700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:01:16,757-Speed 3151.81 samples/sec Loss 0.5955 Epoch: 12 Global Step: 210750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:01:33,066-Speed 3139.51 samples/sec Loss 0.5955 Epoch: 12 Global Step: 210800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:01:50,196-Speed 2989.04 samples/sec Loss 0.6184 Epoch: 12 Global Step: 210850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:02:06,384-Speed 3162.93 samples/sec Loss 0.5936 Epoch: 12 Global Step: 210900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:02:22,514-Speed 3174.14 samples/sec Loss 0.6093 Epoch: 12 Global Step: 210950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:02:38,662-Speed 3170.91 samples/sec Loss 0.6052 Epoch: 12 Global Step: 211000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:02:54,853-Speed 3162.19 samples/sec Loss 0.5986 Epoch: 12 Global Step: 211050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:03:11,257-Speed 3121.36 samples/sec Loss 0.6079 Epoch: 12 Global Step: 211100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:03:27,529-Speed 3146.63 samples/sec Loss 0.5866 Epoch: 12 Global Step: 211150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:03:43,799-Speed 3147.00 samples/sec Loss 0.5858 Epoch: 12 Global Step: 211200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:03:59,969-Speed 3166.49 samples/sec Loss 0.5961 Epoch: 12 Global Step: 211250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:04:16,945-Speed 3016.14 samples/sec Loss 0.5839 Epoch: 12 Global Step: 211300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:04:33,185-Speed 3152.72 samples/sec Loss 0.5954 Epoch: 12 Global Step: 211350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:04:50,242-Speed 3001.86 samples/sec Loss 0.6092 Epoch: 12 Global Step: 211400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:05:06,811-Speed 3090.07 samples/sec Loss 0.5996 Epoch: 12 Global Step: 211450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:05:23,068-Speed 3149.48 samples/sec Loss 0.5966 Epoch: 12 Global Step: 211500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:05:39,231-Speed 3167.80 samples/sec Loss 0.5955 Epoch: 12 Global Step: 211550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:05:55,807-Speed 3088.98 samples/sec Loss 0.5822 Epoch: 12 Global Step: 211600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:06:12,000-Speed 3162.01 samples/sec Loss 0.6008 Epoch: 12 Global Step: 211650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:06:28,116-Speed 3177.05 samples/sec Loss 0.5917 Epoch: 12 Global Step: 211700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:06:44,396-Speed 3144.94 samples/sec Loss 0.5972 Epoch: 12 Global Step: 211750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:07:00,669-Speed 3146.51 samples/sec Loss 0.5904 Epoch: 12 Global Step: 211800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:07:17,699-Speed 3006.46 samples/sec Loss 0.6090 Epoch: 12 Global Step: 211850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:07:33,980-Speed 3144.98 samples/sec Loss 0.6095 Epoch: 12 Global Step: 211900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:07:51,579-Speed 2909.39 samples/sec Loss 0.5985 Epoch: 12 Global Step: 211950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:08:07,992-Speed 3119.50 samples/sec Loss 0.6142 Epoch: 12 Global Step: 212000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:09:01,174-[lfw][212000]XNorm: 22.072165 Training: 2021-03-17 00:09:01,174-[lfw][212000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-17 00:09:01,174-[lfw][212000]Accuracy-Highest: 0.99817 Training: 2021-03-17 00:10:03,002-[cfp_fp][212000]XNorm: 22.207782 Training: 2021-03-17 00:10:03,003-[cfp_fp][212000]Accuracy-Flip: 0.99143+-0.00482 Training: 2021-03-17 00:10:03,003-[cfp_fp][212000]Accuracy-Highest: 0.99186 Training: 2021-03-17 00:10:56,189-[agedb_30][212000]XNorm: 22.988146 Training: 2021-03-17 00:10:56,189-[agedb_30][212000]Accuracy-Flip: 0.98317+-0.00693 Training: 2021-03-17 00:10:56,189-[agedb_30][212000]Accuracy-Highest: 0.98433 Training: 2021-03-17 00:11:12,475-Speed 277.53 samples/sec Loss 0.6092 Epoch: 12 Global Step: 212050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:11:28,716-Speed 3152.62 samples/sec Loss 0.6069 Epoch: 12 Global Step: 212100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:11:45,058-Speed 3133.10 samples/sec Loss 0.6064 Epoch: 12 Global Step: 212150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:12:02,104-Speed 3003.70 samples/sec Loss 0.5931 Epoch: 12 Global Step: 212200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:12:18,487-Speed 3125.43 samples/sec Loss 0.5993 Epoch: 12 Global Step: 212250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:12:34,654-Speed 3167.07 samples/sec Loss 0.5813 Epoch: 12 Global Step: 212300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:12:51,801-Speed 2986.03 samples/sec Loss 0.5908 Epoch: 12 Global Step: 212350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:13:08,061-Speed 3148.94 samples/sec Loss 0.6162 Epoch: 12 Global Step: 212400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:13:25,061-Speed 3011.81 samples/sec Loss 0.5928 Epoch: 12 Global Step: 212450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:13:41,266-Speed 3159.58 samples/sec Loss 0.5909 Epoch: 12 Global Step: 212500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:13:57,553-Speed 3143.56 samples/sec Loss 0.6064 Epoch: 12 Global Step: 212550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:14:13,845-Speed 3142.89 samples/sec Loss 0.6062 Epoch: 12 Global Step: 212600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:14:29,976-Speed 3174.00 samples/sec Loss 0.5887 Epoch: 12 Global Step: 212650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:14:46,274-Speed 3141.62 samples/sec Loss 0.5968 Epoch: 12 Global Step: 212700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:15:02,500-Speed 3155.58 samples/sec Loss 0.6086 Epoch: 12 Global Step: 212750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:15:18,817-Speed 3137.98 samples/sec Loss 0.6031 Epoch: 12 Global Step: 212800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:15:35,391-Speed 3089.10 samples/sec Loss 0.5957 Epoch: 12 Global Step: 212850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:15:51,572-Speed 3164.38 samples/sec Loss 0.5905 Epoch: 12 Global Step: 212900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:16:07,722-Speed 3170.39 samples/sec Loss 0.5973 Epoch: 12 Global Step: 212950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:16:24,895-Speed 2981.45 samples/sec Loss 0.5954 Epoch: 12 Global Step: 213000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:16:41,108-Speed 3158.10 samples/sec Loss 0.5920 Epoch: 12 Global Step: 213050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:16:57,572-Speed 3109.88 samples/sec Loss 0.6016 Epoch: 12 Global Step: 213100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:17:13,733-Speed 3168.22 samples/sec Loss 0.6018 Epoch: 12 Global Step: 213150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:17:30,332-Speed 3084.63 samples/sec Loss 0.5926 Epoch: 12 Global Step: 213200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:17:46,746-Speed 3119.37 samples/sec Loss 0.6029 Epoch: 12 Global Step: 213250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:18:03,045-Speed 3141.43 samples/sec Loss 0.5850 Epoch: 12 Global Step: 213300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:18:19,284-Speed 3153.10 samples/sec Loss 0.5904 Epoch: 12 Global Step: 213350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:18:35,558-Speed 3146.08 samples/sec Loss 0.5711 Epoch: 12 Global Step: 213400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:18:51,663-Speed 3179.30 samples/sec Loss 0.5918 Epoch: 12 Global Step: 213450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:19:07,815-Speed 3169.96 samples/sec Loss 0.6080 Epoch: 12 Global Step: 213500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:19:24,875-Speed 3001.36 samples/sec Loss 0.5910 Epoch: 12 Global Step: 213550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:19:41,134-Speed 3148.98 samples/sec Loss 0.6056 Epoch: 12 Global Step: 213600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:19:57,499-Speed 3128.86 samples/sec Loss 0.5968 Epoch: 12 Global Step: 213650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:20:14,689-Speed 2978.52 samples/sec Loss 0.5861 Epoch: 12 Global Step: 213700 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:20:31,188-Speed 3103.33 samples/sec Loss 0.5906 Epoch: 12 Global Step: 213750 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:20:47,504-Speed 3138.02 samples/sec Loss 0.5913 Epoch: 12 Global Step: 213800 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:21:03,941-Speed 3115.06 samples/sec Loss 0.5989 Epoch: 12 Global Step: 213850 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:21:20,289-Speed 3131.92 samples/sec Loss 0.5987 Epoch: 12 Global Step: 213900 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:21:37,975-Speed 2895.04 samples/sec Loss 0.5969 Epoch: 12 Global Step: 213950 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:21:54,236-Speed 3148.81 samples/sec Loss 0.5978 Epoch: 12 Global Step: 214000 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:22:47,439-[lfw][214000]XNorm: 21.355235 Training: 2021-03-17 00:22:47,439-[lfw][214000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 00:22:47,439-[lfw][214000]Accuracy-Highest: 0.99817 Training: 2021-03-17 00:23:49,687-[cfp_fp][214000]XNorm: 21.837195 Training: 2021-03-17 00:23:49,687-[cfp_fp][214000]Accuracy-Flip: 0.99171+-0.00534 Training: 2021-03-17 00:23:49,687-[cfp_fp][214000]Accuracy-Highest: 0.99186 Training: 2021-03-17 00:24:42,928-[agedb_30][214000]XNorm: 22.333690 Training: 2021-03-17 00:24:42,929-[agedb_30][214000]Accuracy-Flip: 0.98350+-0.00656 Training: 2021-03-17 00:24:42,929-[agedb_30][214000]Accuracy-Highest: 0.98433 Training: 2021-03-17 00:24:59,506-Speed 276.35 samples/sec Loss 0.6026 Epoch: 12 Global Step: 214050 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:25:16,531-Speed 3007.46 samples/sec Loss 0.5829 Epoch: 12 Global Step: 214100 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:25:32,849-Speed 3137.82 samples/sec Loss 0.5937 Epoch: 12 Global Step: 214150 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:25:49,076-Speed 3155.34 samples/sec Loss 0.6025 Epoch: 12 Global Step: 214200 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:26:05,146-Speed 3186.16 samples/sec Loss 0.5852 Epoch: 12 Global Step: 214250 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:26:21,338-Speed 3162.15 samples/sec Loss 0.5880 Epoch: 12 Global Step: 214300 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:26:37,666-Speed 3135.80 samples/sec Loss 0.6037 Epoch: 12 Global Step: 214350 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:26:54,036-Speed 3127.64 samples/sec Loss 0.5954 Epoch: 12 Global Step: 214400 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:27:10,320-Speed 3144.29 samples/sec Loss 0.5939 Epoch: 12 Global Step: 214450 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:27:26,556-Speed 3153.75 samples/sec Loss 0.5971 Epoch: 12 Global Step: 214500 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:27:44,440-Speed 2862.92 samples/sec Loss 0.5845 Epoch: 12 Global Step: 214550 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:28:00,581-Speed 3172.09 samples/sec Loss 0.5915 Epoch: 12 Global Step: 214600 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:28:17,619-Speed 3005.12 samples/sec Loss 0.5973 Epoch: 12 Global Step: 214650 Fp16 Grad Scale: 16384 Required: 14 hours Training: 2021-03-17 00:28:34,024-Speed 3121.09 samples/sec Loss 0.5888 Epoch: 12 Global Step: 214700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:28:50,247-Speed 3156.10 samples/sec Loss 0.5959 Epoch: 12 Global Step: 214750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:29:06,827-Speed 3088.29 samples/sec Loss 0.5949 Epoch: 12 Global Step: 214800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:29:22,977-Speed 3170.39 samples/sec Loss 0.5991 Epoch: 12 Global Step: 214850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:29:39,517-Speed 3095.51 samples/sec Loss 0.5990 Epoch: 12 Global Step: 214900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:29:55,944-Speed 3117.04 samples/sec Loss 0.5884 Epoch: 12 Global Step: 214950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:30:12,288-Speed 3132.71 samples/sec Loss 0.5880 Epoch: 12 Global Step: 215000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:30:28,513-Speed 3155.73 samples/sec Loss 0.5855 Epoch: 12 Global Step: 215050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:30:44,731-Speed 3157.09 samples/sec Loss 0.6068 Epoch: 12 Global Step: 215100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:31:00,964-Speed 3154.11 samples/sec Loss 0.5973 Epoch: 12 Global Step: 215150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:31:17,273-Speed 3139.39 samples/sec Loss 0.5745 Epoch: 12 Global Step: 215200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:31:33,568-Speed 3142.10 samples/sec Loss 0.5982 Epoch: 12 Global Step: 215250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:31:50,780-Speed 2974.85 samples/sec Loss 0.5955 Epoch: 12 Global Step: 215300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:32:07,069-Speed 3143.19 samples/sec Loss 0.5821 Epoch: 12 Global Step: 215350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:32:23,558-Speed 3105.32 samples/sec Loss 0.5929 Epoch: 12 Global Step: 215400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:32:39,804-Speed 3151.57 samples/sec Loss 0.6032 Epoch: 12 Global Step: 215450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:32:56,252-Speed 3112.96 samples/sec Loss 0.5882 Epoch: 12 Global Step: 215500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:33:12,622-Speed 3127.71 samples/sec Loss 0.6014 Epoch: 12 Global Step: 215550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:33:28,892-Speed 3146.95 samples/sec Loss 0.6090 Epoch: 12 Global Step: 215600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:33:45,134-Speed 3152.60 samples/sec Loss 0.5737 Epoch: 12 Global Step: 215650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:34:01,522-Speed 3124.19 samples/sec Loss 0.5980 Epoch: 12 Global Step: 215700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:34:18,784-Speed 2966.12 samples/sec Loss 0.5904 Epoch: 12 Global Step: 215750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:34:35,111-Speed 3136.10 samples/sec Loss 0.5831 Epoch: 12 Global Step: 215800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:34:51,533-Speed 3117.89 samples/sec Loss 0.6055 Epoch: 12 Global Step: 215850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:35:08,860-Speed 2954.91 samples/sec Loss 0.6030 Epoch: 12 Global Step: 215900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:35:25,386-Speed 3098.30 samples/sec Loss 0.5891 Epoch: 12 Global Step: 215950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:35:41,652-Speed 3147.73 samples/sec Loss 0.6002 Epoch: 12 Global Step: 216000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:36:34,990-[lfw][216000]XNorm: 22.028067 Training: 2021-03-17 00:36:34,991-[lfw][216000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 00:36:34,991-[lfw][216000]Accuracy-Highest: 0.99817 Training: 2021-03-17 00:37:36,855-[cfp_fp][216000]XNorm: 22.225613 Training: 2021-03-17 00:37:36,855-[cfp_fp][216000]Accuracy-Flip: 0.99171+-0.00464 Training: 2021-03-17 00:37:36,855-[cfp_fp][216000]Accuracy-Highest: 0.99186 Training: 2021-03-17 00:38:29,995-[agedb_30][216000]XNorm: 23.121382 Training: 2021-03-17 00:38:29,995-[agedb_30][216000]Accuracy-Flip: 0.98300+-0.00614 Training: 2021-03-17 00:38:29,995-[agedb_30][216000]Accuracy-Highest: 0.98433 Training: 2021-03-17 00:38:46,231-Speed 277.39 samples/sec Loss 0.5863 Epoch: 12 Global Step: 216050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:39:02,581-Speed 3131.50 samples/sec Loss 0.6001 Epoch: 12 Global Step: 216100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:39:19,653-Speed 2999.21 samples/sec Loss 0.5826 Epoch: 12 Global Step: 216150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:39:36,136-Speed 3106.41 samples/sec Loss 0.5882 Epoch: 12 Global Step: 216200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:39:53,373-Speed 2970.37 samples/sec Loss 0.6023 Epoch: 12 Global Step: 216250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:40:09,694-Speed 3137.22 samples/sec Loss 0.6070 Epoch: 12 Global Step: 216300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:40:26,014-Speed 3137.32 samples/sec Loss 0.5952 Epoch: 12 Global Step: 216350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:40:42,397-Speed 3125.19 samples/sec Loss 0.5926 Epoch: 12 Global Step: 216400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:40:58,852-Speed 3111.66 samples/sec Loss 0.5895 Epoch: 12 Global Step: 216450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:41:15,015-Speed 3167.79 samples/sec Loss 0.5875 Epoch: 12 Global Step: 216500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:41:31,365-Speed 3131.55 samples/sec Loss 0.5961 Epoch: 12 Global Step: 216550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:41:47,668-Speed 3140.78 samples/sec Loss 0.5861 Epoch: 12 Global Step: 216600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:42:03,898-Speed 3154.59 samples/sec Loss 0.6080 Epoch: 12 Global Step: 216650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:42:21,161-Speed 2966.01 samples/sec Loss 0.6004 Epoch: 12 Global Step: 216700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:42:38,310-Speed 2985.62 samples/sec Loss 0.5841 Epoch: 12 Global Step: 216750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:42:54,701-Speed 3123.93 samples/sec Loss 0.5782 Epoch: 12 Global Step: 216800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:43:11,082-Speed 3125.66 samples/sec Loss 0.6139 Epoch: 12 Global Step: 216850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:43:28,315-Speed 2971.09 samples/sec Loss 0.5781 Epoch: 12 Global Step: 216900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:43:44,544-Speed 3154.96 samples/sec Loss 0.5844 Epoch: 12 Global Step: 216950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:44:14,765-Speed 1694.18 samples/sec Loss 0.5803 Epoch: 13 Global Step: 217000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:44:31,166-Speed 3121.98 samples/sec Loss 0.5235 Epoch: 13 Global Step: 217050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:44:47,546-Speed 3125.87 samples/sec Loss 0.5323 Epoch: 13 Global Step: 217100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:45:03,705-Speed 3168.51 samples/sec Loss 0.5380 Epoch: 13 Global Step: 217150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:45:19,744-Speed 3192.46 samples/sec Loss 0.5226 Epoch: 13 Global Step: 217200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:45:36,041-Speed 3141.77 samples/sec Loss 0.5375 Epoch: 13 Global Step: 217250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:45:52,358-Speed 3137.83 samples/sec Loss 0.5282 Epoch: 13 Global Step: 217300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:46:08,888-Speed 3097.61 samples/sec Loss 0.5303 Epoch: 13 Global Step: 217350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:46:26,673-Speed 2878.81 samples/sec Loss 0.5312 Epoch: 13 Global Step: 217400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:46:42,762-Speed 3182.42 samples/sec Loss 0.5222 Epoch: 13 Global Step: 217450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:46:59,465-Speed 3065.49 samples/sec Loss 0.5304 Epoch: 13 Global Step: 217500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:47:15,987-Speed 3098.96 samples/sec Loss 0.5285 Epoch: 13 Global Step: 217550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:47:32,408-Speed 3118.15 samples/sec Loss 0.5251 Epoch: 13 Global Step: 217600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:47:48,661-Speed 3150.30 samples/sec Loss 0.5249 Epoch: 13 Global Step: 217650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:48:05,065-Speed 3121.13 samples/sec Loss 0.5487 Epoch: 13 Global Step: 217700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:48:21,367-Speed 3140.87 samples/sec Loss 0.5315 Epoch: 13 Global Step: 217750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:48:37,649-Speed 3144.71 samples/sec Loss 0.5320 Epoch: 13 Global Step: 217800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:48:54,024-Speed 3126.86 samples/sec Loss 0.5501 Epoch: 13 Global Step: 217850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:49:10,164-Speed 3172.26 samples/sec Loss 0.5219 Epoch: 13 Global Step: 217900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:49:26,624-Speed 3110.73 samples/sec Loss 0.5349 Epoch: 13 Global Step: 217950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:49:43,163-Speed 3095.76 samples/sec Loss 0.5361 Epoch: 13 Global Step: 218000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:50:36,438-[lfw][218000]XNorm: 21.801090 Training: 2021-03-17 00:50:36,438-[lfw][218000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 00:50:36,438-[lfw][218000]Accuracy-Highest: 0.99817 Training: 2021-03-17 00:51:38,420-[cfp_fp][218000]XNorm: 22.294330 Training: 2021-03-17 00:51:38,420-[cfp_fp][218000]Accuracy-Flip: 0.99100+-0.00531 Training: 2021-03-17 00:51:38,420-[cfp_fp][218000]Accuracy-Highest: 0.99186 Training: 2021-03-17 00:52:31,934-[agedb_30][218000]XNorm: 22.848956 Training: 2021-03-17 00:52:31,934-[agedb_30][218000]Accuracy-Flip: 0.98150+-0.00643 Training: 2021-03-17 00:52:31,935-[agedb_30][218000]Accuracy-Highest: 0.98433 Training: 2021-03-17 00:52:50,863-Speed 272.78 samples/sec Loss 0.5421 Epoch: 13 Global Step: 218050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:53:07,047-Speed 3163.86 samples/sec Loss 0.5336 Epoch: 13 Global Step: 218100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:53:23,355-Speed 3139.51 samples/sec Loss 0.5368 Epoch: 13 Global Step: 218150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:53:39,914-Speed 3092.23 samples/sec Loss 0.5421 Epoch: 13 Global Step: 218200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:53:56,348-Speed 3115.58 samples/sec Loss 0.5250 Epoch: 13 Global Step: 218250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:54:12,742-Speed 3123.10 samples/sec Loss 0.5427 Epoch: 13 Global Step: 218300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:54:29,020-Speed 3145.46 samples/sec Loss 0.5377 Epoch: 13 Global Step: 218350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:54:46,229-Speed 2975.18 samples/sec Loss 0.5336 Epoch: 13 Global Step: 218400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:55:02,601-Speed 3127.35 samples/sec Loss 0.5372 Epoch: 13 Global Step: 218450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:55:18,881-Speed 3145.16 samples/sec Loss 0.5362 Epoch: 13 Global Step: 218500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:55:35,291-Speed 3120.08 samples/sec Loss 0.5452 Epoch: 13 Global Step: 218550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:55:51,387-Speed 3181.12 samples/sec Loss 0.5240 Epoch: 13 Global Step: 218600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:56:07,618-Speed 3154.56 samples/sec Loss 0.5368 Epoch: 13 Global Step: 218650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:56:23,978-Speed 3129.65 samples/sec Loss 0.5384 Epoch: 13 Global Step: 218700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:56:40,447-Speed 3109.01 samples/sec Loss 0.5392 Epoch: 13 Global Step: 218750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:56:56,730-Speed 3144.45 samples/sec Loss 0.5455 Epoch: 13 Global Step: 218800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:57:13,070-Speed 3133.45 samples/sec Loss 0.5249 Epoch: 13 Global Step: 218850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:57:30,215-Speed 2986.30 samples/sec Loss 0.5344 Epoch: 13 Global Step: 218900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:57:46,582-Speed 3128.43 samples/sec Loss 0.5359 Epoch: 13 Global Step: 218950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:58:03,570-Speed 3014.04 samples/sec Loss 0.5391 Epoch: 13 Global Step: 219000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:58:19,970-Speed 3121.94 samples/sec Loss 0.5395 Epoch: 13 Global Step: 219050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:58:36,249-Speed 3145.18 samples/sec Loss 0.5455 Epoch: 13 Global Step: 219100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:58:53,605-Speed 2950.11 samples/sec Loss 0.5366 Epoch: 13 Global Step: 219150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:59:09,870-Speed 3148.07 samples/sec Loss 0.5288 Epoch: 13 Global Step: 219200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:59:26,337-Speed 3109.30 samples/sec Loss 0.5275 Epoch: 13 Global Step: 219250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:59:42,869-Speed 3097.04 samples/sec Loss 0.5344 Epoch: 13 Global Step: 219300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 00:59:59,232-Speed 3129.11 samples/sec Loss 0.5480 Epoch: 13 Global Step: 219350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:00:15,487-Speed 3149.99 samples/sec Loss 0.5440 Epoch: 13 Global Step: 219400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:00:31,710-Speed 3155.99 samples/sec Loss 0.5243 Epoch: 13 Global Step: 219450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:00:48,077-Speed 3128.50 samples/sec Loss 0.5366 Epoch: 13 Global Step: 219500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:01:04,456-Speed 3125.97 samples/sec Loss 0.5449 Epoch: 13 Global Step: 219550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:01:20,722-Speed 3147.75 samples/sec Loss 0.5390 Epoch: 13 Global Step: 219600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:01:37,802-Speed 2997.78 samples/sec Loss 0.5486 Epoch: 13 Global Step: 219650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:01:54,017-Speed 3157.63 samples/sec Loss 0.5528 Epoch: 13 Global Step: 219700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:02:10,184-Speed 3167.08 samples/sec Loss 0.5414 Epoch: 13 Global Step: 219750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:02:26,390-Speed 3159.35 samples/sec Loss 0.5449 Epoch: 13 Global Step: 219800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:02:42,724-Speed 3134.70 samples/sec Loss 0.5284 Epoch: 13 Global Step: 219850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:02:59,108-Speed 3125.12 samples/sec Loss 0.5389 Epoch: 13 Global Step: 219900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:03:15,452-Speed 3132.61 samples/sec Loss 0.5430 Epoch: 13 Global Step: 219950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:03:31,733-Speed 3144.97 samples/sec Loss 0.5248 Epoch: 13 Global Step: 220000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:04:24,645-[lfw][220000]XNorm: 22.095316 Training: 2021-03-17 01:04:24,645-[lfw][220000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 01:04:24,645-[lfw][220000]Accuracy-Highest: 0.99817 Training: 2021-03-17 01:05:26,351-[cfp_fp][220000]XNorm: 22.520383 Training: 2021-03-17 01:05:26,351-[cfp_fp][220000]Accuracy-Flip: 0.98986+-0.00577 Training: 2021-03-17 01:05:26,351-[cfp_fp][220000]Accuracy-Highest: 0.99186 Training: 2021-03-17 01:06:19,411-[agedb_30][220000]XNorm: 23.169479 Training: 2021-03-17 01:06:19,412-[agedb_30][220000]Accuracy-Flip: 0.98417+-0.00680 Training: 2021-03-17 01:06:19,412-[agedb_30][220000]Accuracy-Highest: 0.98433 Training: 2021-03-17 01:06:35,962-Speed 277.91 samples/sec Loss 0.5521 Epoch: 13 Global Step: 220050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:06:52,565-Speed 3083.86 samples/sec Loss 0.5419 Epoch: 13 Global Step: 220100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:07:09,088-Speed 3098.81 samples/sec Loss 0.5403 Epoch: 13 Global Step: 220150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:07:25,325-Speed 3153.41 samples/sec Loss 0.5425 Epoch: 13 Global Step: 220200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:07:42,581-Speed 2967.26 samples/sec Loss 0.5336 Epoch: 13 Global Step: 220250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:08:00,110-Speed 2920.97 samples/sec Loss 0.5334 Epoch: 13 Global Step: 220300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:08:16,363-Speed 3150.22 samples/sec Loss 0.5465 Epoch: 13 Global Step: 220350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:08:32,611-Speed 3151.30 samples/sec Loss 0.5381 Epoch: 13 Global Step: 220400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:08:48,826-Speed 3157.61 samples/sec Loss 0.5415 Epoch: 13 Global Step: 220450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:09:05,292-Speed 3109.44 samples/sec Loss 0.5344 Epoch: 13 Global Step: 220500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:09:21,554-Speed 3148.64 samples/sec Loss 0.5432 Epoch: 13 Global Step: 220550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:09:39,424-Speed 2865.17 samples/sec Loss 0.5367 Epoch: 13 Global Step: 220600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:09:55,715-Speed 3142.99 samples/sec Loss 0.5345 Epoch: 13 Global Step: 220650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:10:12,022-Speed 3139.94 samples/sec Loss 0.5509 Epoch: 13 Global Step: 220700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:10:28,278-Speed 3149.69 samples/sec Loss 0.5364 Epoch: 13 Global Step: 220750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:10:44,600-Speed 3136.89 samples/sec Loss 0.5347 Epoch: 13 Global Step: 220800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:11:00,969-Speed 3128.05 samples/sec Loss 0.5447 Epoch: 13 Global Step: 220850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:11:17,378-Speed 3120.19 samples/sec Loss 0.5279 Epoch: 13 Global Step: 220900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:11:33,699-Speed 3137.18 samples/sec Loss 0.5451 Epoch: 13 Global Step: 220950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:11:50,025-Speed 3136.29 samples/sec Loss 0.5410 Epoch: 13 Global Step: 221000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:12:06,404-Speed 3126.07 samples/sec Loss 0.5409 Epoch: 13 Global Step: 221050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:12:22,933-Speed 3097.54 samples/sec Loss 0.5491 Epoch: 13 Global Step: 221100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:12:40,798-Speed 2865.99 samples/sec Loss 0.5459 Epoch: 13 Global Step: 221150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:12:57,356-Speed 3092.27 samples/sec Loss 0.5472 Epoch: 13 Global Step: 221200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:13:13,827-Speed 3108.59 samples/sec Loss 0.5320 Epoch: 13 Global Step: 221250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:13:31,198-Speed 2947.55 samples/sec Loss 0.5635 Epoch: 13 Global Step: 221300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:13:47,448-Speed 3150.89 samples/sec Loss 0.5288 Epoch: 13 Global Step: 221350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:14:03,629-Speed 3164.23 samples/sec Loss 0.5314 Epoch: 13 Global Step: 221400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:14:20,055-Speed 3117.22 samples/sec Loss 0.5393 Epoch: 13 Global Step: 221450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:14:36,251-Speed 3161.30 samples/sec Loss 0.5434 Epoch: 13 Global Step: 221500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:14:52,585-Speed 3134.75 samples/sec Loss 0.5350 Epoch: 13 Global Step: 221550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:15:09,314-Speed 3060.64 samples/sec Loss 0.5467 Epoch: 13 Global Step: 221600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:15:26,154-Speed 3040.39 samples/sec Loss 0.5489 Epoch: 13 Global Step: 221650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:15:43,082-Speed 3024.65 samples/sec Loss 0.5488 Epoch: 13 Global Step: 221700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:15:59,507-Speed 3117.26 samples/sec Loss 0.5333 Epoch: 13 Global Step: 221750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:16:16,760-Speed 2967.69 samples/sec Loss 0.5321 Epoch: 13 Global Step: 221800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:16:33,011-Speed 3150.79 samples/sec Loss 0.5421 Epoch: 13 Global Step: 221850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:16:49,262-Speed 3150.50 samples/sec Loss 0.5510 Epoch: 13 Global Step: 221900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:17:05,493-Speed 3154.70 samples/sec Loss 0.5522 Epoch: 13 Global Step: 221950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:17:21,814-Speed 3137.15 samples/sec Loss 0.5574 Epoch: 13 Global Step: 222000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:18:14,920-[lfw][222000]XNorm: 21.927883 Training: 2021-03-17 01:18:14,921-[lfw][222000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 01:18:14,921-[lfw][222000]Accuracy-Highest: 0.99817 Training: 2021-03-17 01:19:16,771-[cfp_fp][222000]XNorm: 22.372789 Training: 2021-03-17 01:19:16,771-[cfp_fp][222000]Accuracy-Flip: 0.99157+-0.00529 Training: 2021-03-17 01:19:16,771-[cfp_fp][222000]Accuracy-Highest: 0.99186 Training: 2021-03-17 01:20:09,961-[agedb_30][222000]XNorm: 23.080170 Training: 2021-03-17 01:20:09,961-[agedb_30][222000]Accuracy-Flip: 0.98350+-0.00689 Training: 2021-03-17 01:20:09,961-[agedb_30][222000]Accuracy-Highest: 0.98433 Training: 2021-03-17 01:20:26,343-Speed 277.46 samples/sec Loss 0.5392 Epoch: 13 Global Step: 222050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:20:42,580-Speed 3153.43 samples/sec Loss 0.5437 Epoch: 13 Global Step: 222100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:20:58,881-Speed 3140.95 samples/sec Loss 0.5522 Epoch: 13 Global Step: 222150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:21:15,201-Speed 3137.43 samples/sec Loss 0.5408 Epoch: 13 Global Step: 222200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:21:31,446-Speed 3151.83 samples/sec Loss 0.5340 Epoch: 13 Global Step: 222250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:21:47,718-Speed 3146.66 samples/sec Loss 0.5402 Epoch: 13 Global Step: 222300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:22:03,991-Speed 3146.38 samples/sec Loss 0.5323 Epoch: 13 Global Step: 222350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:22:20,360-Speed 3127.90 samples/sec Loss 0.5376 Epoch: 13 Global Step: 222400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:22:36,544-Speed 3163.72 samples/sec Loss 0.5468 Epoch: 13 Global Step: 222450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:22:54,298-Speed 2883.83 samples/sec Loss 0.5647 Epoch: 13 Global Step: 222500 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:23:10,635-Speed 3134.17 samples/sec Loss 0.5462 Epoch: 13 Global Step: 222550 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:23:26,972-Speed 3134.08 samples/sec Loss 0.5481 Epoch: 13 Global Step: 222600 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:23:43,695-Speed 3061.70 samples/sec Loss 0.5449 Epoch: 13 Global Step: 222650 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:24:00,161-Speed 3109.55 samples/sec Loss 0.5582 Epoch: 13 Global Step: 222700 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:24:16,430-Speed 3147.12 samples/sec Loss 0.5455 Epoch: 13 Global Step: 222750 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:24:33,497-Speed 3000.08 samples/sec Loss 0.5321 Epoch: 13 Global Step: 222800 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:24:49,714-Speed 3157.30 samples/sec Loss 0.5317 Epoch: 13 Global Step: 222850 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:25:06,701-Speed 3014.15 samples/sec Loss 0.5403 Epoch: 13 Global Step: 222900 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:25:22,963-Speed 3148.63 samples/sec Loss 0.5485 Epoch: 13 Global Step: 222950 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:25:39,277-Speed 3138.42 samples/sec Loss 0.5562 Epoch: 13 Global Step: 223000 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:25:55,555-Speed 3145.41 samples/sec Loss 0.5551 Epoch: 13 Global Step: 223050 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:26:11,967-Speed 3119.74 samples/sec Loss 0.5559 Epoch: 13 Global Step: 223100 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:26:28,149-Speed 3164.12 samples/sec Loss 0.5357 Epoch: 13 Global Step: 223150 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:26:44,493-Speed 3132.76 samples/sec Loss 0.5457 Epoch: 13 Global Step: 223200 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:27:00,708-Speed 3157.69 samples/sec Loss 0.5436 Epoch: 13 Global Step: 223250 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:27:17,711-Speed 3011.22 samples/sec Loss 0.5451 Epoch: 13 Global Step: 223300 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:27:35,076-Speed 2948.63 samples/sec Loss 0.5568 Epoch: 13 Global Step: 223350 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:27:51,637-Speed 3091.68 samples/sec Loss 0.5520 Epoch: 13 Global Step: 223400 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:28:07,879-Speed 3152.40 samples/sec Loss 0.5300 Epoch: 13 Global Step: 223450 Fp16 Grad Scale: 16384 Required: 13 hours Training: 2021-03-17 01:28:24,966-Speed 2996.61 samples/sec Loss 0.5465 Epoch: 13 Global Step: 223500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:28:41,156-Speed 3162.53 samples/sec Loss 0.5433 Epoch: 13 Global Step: 223550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:28:57,758-Speed 3083.98 samples/sec Loss 0.5346 Epoch: 13 Global Step: 223600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:29:14,025-Speed 3147.53 samples/sec Loss 0.5380 Epoch: 13 Global Step: 223650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:29:30,613-Speed 3086.75 samples/sec Loss 0.5482 Epoch: 13 Global Step: 223700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:29:47,078-Speed 3109.71 samples/sec Loss 0.5523 Epoch: 13 Global Step: 223750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:30:03,326-Speed 3151.23 samples/sec Loss 0.5498 Epoch: 13 Global Step: 223800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:30:19,593-Speed 3147.45 samples/sec Loss 0.5551 Epoch: 13 Global Step: 223850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:30:35,846-Speed 3150.45 samples/sec Loss 0.5447 Epoch: 13 Global Step: 223900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:30:52,184-Speed 3133.73 samples/sec Loss 0.5509 Epoch: 13 Global Step: 223950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:31:09,220-Speed 3005.52 samples/sec Loss 0.5373 Epoch: 13 Global Step: 224000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:32:02,116-[lfw][224000]XNorm: 21.688503 Training: 2021-03-17 01:32:02,116-[lfw][224000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 01:32:02,116-[lfw][224000]Accuracy-Highest: 0.99817 Training: 2021-03-17 01:33:03,844-[cfp_fp][224000]XNorm: 22.071615 Training: 2021-03-17 01:33:03,844-[cfp_fp][224000]Accuracy-Flip: 0.99057+-0.00531 Training: 2021-03-17 01:33:03,844-[cfp_fp][224000]Accuracy-Highest: 0.99186 Training: 2021-03-17 01:33:56,870-[agedb_30][224000]XNorm: 22.772494 Training: 2021-03-17 01:33:56,870-[agedb_30][224000]Accuracy-Flip: 0.98217+-0.00683 Training: 2021-03-17 01:33:56,870-[agedb_30][224000]Accuracy-Highest: 0.98433 Training: 2021-03-17 01:34:13,095-Speed 278.45 samples/sec Loss 0.5391 Epoch: 13 Global Step: 224050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:34:29,322-Speed 3155.23 samples/sec Loss 0.5529 Epoch: 13 Global Step: 224100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:34:45,719-Speed 3122.68 samples/sec Loss 0.5535 Epoch: 13 Global Step: 224150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:35:02,017-Speed 3141.65 samples/sec Loss 0.5572 Epoch: 13 Global Step: 224200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:35:18,404-Speed 3124.40 samples/sec Loss 0.5538 Epoch: 13 Global Step: 224250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:35:34,679-Speed 3146.12 samples/sec Loss 0.5451 Epoch: 13 Global Step: 224300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:35:51,045-Speed 3128.41 samples/sec Loss 0.5488 Epoch: 13 Global Step: 224350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:36:07,327-Speed 3144.73 samples/sec Loss 0.5398 Epoch: 13 Global Step: 224400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:36:23,544-Speed 3157.19 samples/sec Loss 0.5390 Epoch: 13 Global Step: 224450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:36:39,904-Speed 3129.74 samples/sec Loss 0.5364 Epoch: 13 Global Step: 224500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:36:56,094-Speed 3162.61 samples/sec Loss 0.5590 Epoch: 13 Global Step: 224550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:37:12,395-Speed 3140.85 samples/sec Loss 0.5389 Epoch: 13 Global Step: 224600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:37:29,352-Speed 3019.57 samples/sec Loss 0.5416 Epoch: 13 Global Step: 224650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:37:46,642-Speed 2961.34 samples/sec Loss 0.5436 Epoch: 13 Global Step: 224700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:38:02,987-Speed 3132.60 samples/sec Loss 0.5327 Epoch: 13 Global Step: 224750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:38:19,303-Speed 3138.10 samples/sec Loss 0.5437 Epoch: 13 Global Step: 224800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:38:35,506-Speed 3159.94 samples/sec Loss 0.5466 Epoch: 13 Global Step: 224850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:38:51,826-Speed 3137.46 samples/sec Loss 0.5510 Epoch: 13 Global Step: 224900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:39:08,392-Speed 3090.71 samples/sec Loss 0.5506 Epoch: 13 Global Step: 224950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:39:25,379-Speed 3014.13 samples/sec Loss 0.5456 Epoch: 13 Global Step: 225000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:39:41,742-Speed 3129.17 samples/sec Loss 0.5417 Epoch: 13 Global Step: 225050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:39:58,786-Speed 3004.05 samples/sec Loss 0.5478 Epoch: 13 Global Step: 225100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:40:15,047-Speed 3148.74 samples/sec Loss 0.5429 Epoch: 13 Global Step: 225150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:40:31,456-Speed 3120.33 samples/sec Loss 0.5510 Epoch: 13 Global Step: 225200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:40:47,835-Speed 3126.07 samples/sec Loss 0.5356 Epoch: 13 Global Step: 225250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:41:04,304-Speed 3108.98 samples/sec Loss 0.5448 Epoch: 13 Global Step: 225300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:41:20,510-Speed 3159.27 samples/sec Loss 0.5465 Epoch: 13 Global Step: 225350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:41:36,883-Speed 3127.25 samples/sec Loss 0.5361 Epoch: 13 Global Step: 225400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:41:53,454-Speed 3089.77 samples/sec Loss 0.5384 Epoch: 13 Global Step: 225450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:42:10,622-Speed 2982.43 samples/sec Loss 0.5461 Epoch: 13 Global Step: 225500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:42:27,738-Speed 2991.42 samples/sec Loss 0.5550 Epoch: 13 Global Step: 225550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:42:44,118-Speed 3125.96 samples/sec Loss 0.5319 Epoch: 13 Global Step: 225600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:43:00,286-Speed 3166.71 samples/sec Loss 0.5453 Epoch: 13 Global Step: 225650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:43:17,601-Speed 2957.08 samples/sec Loss 0.5333 Epoch: 13 Global Step: 225700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:43:34,025-Speed 3117.55 samples/sec Loss 0.5451 Epoch: 13 Global Step: 225750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:43:50,451-Speed 3116.98 samples/sec Loss 0.5333 Epoch: 13 Global Step: 225800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:44:06,779-Speed 3135.97 samples/sec Loss 0.5509 Epoch: 13 Global Step: 225850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:44:22,999-Speed 3156.64 samples/sec Loss 0.5467 Epoch: 13 Global Step: 225900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:44:39,624-Speed 3079.79 samples/sec Loss 0.5427 Epoch: 13 Global Step: 225950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:44:56,023-Speed 3122.13 samples/sec Loss 0.5479 Epoch: 13 Global Step: 226000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:45:49,177-[lfw][226000]XNorm: 21.510746 Training: 2021-03-17 01:45:49,178-[lfw][226000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 01:45:49,178-[lfw][226000]Accuracy-Highest: 0.99817 Training: 2021-03-17 01:46:51,340-[cfp_fp][226000]XNorm: 21.818288 Training: 2021-03-17 01:46:51,340-[cfp_fp][226000]Accuracy-Flip: 0.99000+-0.00535 Training: 2021-03-17 01:46:51,340-[cfp_fp][226000]Accuracy-Highest: 0.99186 Training: 2021-03-17 01:47:44,519-[agedb_30][226000]XNorm: 22.476899 Training: 2021-03-17 01:47:44,520-[agedb_30][226000]Accuracy-Flip: 0.98150+-0.00697 Training: 2021-03-17 01:47:44,520-[agedb_30][226000]Accuracy-Highest: 0.98433 Training: 2021-03-17 01:48:00,716-Speed 277.22 samples/sec Loss 0.5458 Epoch: 13 Global Step: 226050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:48:16,899-Speed 3163.90 samples/sec Loss 0.5500 Epoch: 13 Global Step: 226100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:48:33,252-Speed 3130.98 samples/sec Loss 0.5460 Epoch: 13 Global Step: 226150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:48:50,510-Speed 2966.87 samples/sec Loss 0.5469 Epoch: 13 Global Step: 226200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:49:07,794-Speed 2962.34 samples/sec Loss 0.5519 Epoch: 13 Global Step: 226250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:49:24,076-Speed 3144.61 samples/sec Loss 0.5603 Epoch: 13 Global Step: 226300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:49:40,272-Speed 3161.42 samples/sec Loss 0.5417 Epoch: 13 Global Step: 226350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:49:56,554-Speed 3144.71 samples/sec Loss 0.5387 Epoch: 13 Global Step: 226400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:50:13,297-Speed 3057.97 samples/sec Loss 0.5430 Epoch: 13 Global Step: 226450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:50:29,740-Speed 3113.87 samples/sec Loss 0.5389 Epoch: 13 Global Step: 226500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:50:46,044-Speed 3140.54 samples/sec Loss 0.5475 Epoch: 13 Global Step: 226550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:51:02,444-Speed 3122.10 samples/sec Loss 0.5487 Epoch: 13 Global Step: 226600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:51:18,602-Speed 3168.77 samples/sec Loss 0.5467 Epoch: 13 Global Step: 226650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:51:34,840-Speed 3153.10 samples/sec Loss 0.5512 Epoch: 13 Global Step: 226700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:51:51,497-Speed 3073.88 samples/sec Loss 0.5429 Epoch: 13 Global Step: 226750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:52:08,135-Speed 3077.46 samples/sec Loss 0.5574 Epoch: 13 Global Step: 226800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:52:25,263-Speed 2989.33 samples/sec Loss 0.5525 Epoch: 13 Global Step: 226850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:52:41,775-Speed 3100.94 samples/sec Loss 0.5397 Epoch: 13 Global Step: 226900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:52:59,031-Speed 2967.11 samples/sec Loss 0.5494 Epoch: 13 Global Step: 226950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:53:15,610-Speed 3088.30 samples/sec Loss 0.5520 Epoch: 13 Global Step: 227000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:53:31,888-Speed 3145.45 samples/sec Loss 0.5406 Epoch: 13 Global Step: 227050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:53:48,175-Speed 3143.74 samples/sec Loss 0.5483 Epoch: 13 Global Step: 227100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:54:04,337-Speed 3168.08 samples/sec Loss 0.5521 Epoch: 13 Global Step: 227150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:54:21,392-Speed 3002.02 samples/sec Loss 0.5492 Epoch: 13 Global Step: 227200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:54:37,659-Speed 3147.65 samples/sec Loss 0.5449 Epoch: 13 Global Step: 227250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:54:54,684-Speed 3007.38 samples/sec Loss 0.5479 Epoch: 13 Global Step: 227300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:55:11,198-Speed 3100.65 samples/sec Loss 0.5445 Epoch: 13 Global Step: 227350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:55:27,677-Speed 3106.96 samples/sec Loss 0.5374 Epoch: 13 Global Step: 227400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:55:44,001-Speed 3136.64 samples/sec Loss 0.5574 Epoch: 13 Global Step: 227450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:56:00,302-Speed 3141.04 samples/sec Loss 0.5423 Epoch: 13 Global Step: 227500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:56:16,607-Speed 3140.25 samples/sec Loss 0.5378 Epoch: 13 Global Step: 227550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:56:33,021-Speed 3119.24 samples/sec Loss 0.5502 Epoch: 13 Global Step: 227600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:56:49,250-Speed 3154.97 samples/sec Loss 0.5627 Epoch: 13 Global Step: 227650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:57:06,399-Speed 2985.70 samples/sec Loss 0.5532 Epoch: 13 Global Step: 227700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:57:22,857-Speed 3110.99 samples/sec Loss 0.5559 Epoch: 13 Global Step: 227750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:57:39,099-Speed 3152.46 samples/sec Loss 0.5462 Epoch: 13 Global Step: 227800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:57:57,063-Speed 2850.28 samples/sec Loss 0.5451 Epoch: 13 Global Step: 227850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:58:13,369-Speed 3140.08 samples/sec Loss 0.5410 Epoch: 13 Global Step: 227900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:58:29,762-Speed 3123.26 samples/sec Loss 0.5563 Epoch: 13 Global Step: 227950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:58:46,093-Speed 3135.28 samples/sec Loss 0.5526 Epoch: 13 Global Step: 228000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 01:59:39,188-[lfw][228000]XNorm: 21.753766 Training: 2021-03-17 01:59:39,188-[lfw][228000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 01:59:39,188-[lfw][228000]Accuracy-Highest: 0.99817 Training: 2021-03-17 02:00:41,022-[cfp_fp][228000]XNorm: 22.085112 Training: 2021-03-17 02:00:41,022-[cfp_fp][228000]Accuracy-Flip: 0.99100+-0.00465 Training: 2021-03-17 02:00:41,022-[cfp_fp][228000]Accuracy-Highest: 0.99186 Training: 2021-03-17 02:01:34,158-[agedb_30][228000]XNorm: 22.741979 Training: 2021-03-17 02:01:34,158-[agedb_30][228000]Accuracy-Flip: 0.98367+-0.00690 Training: 2021-03-17 02:01:34,158-[agedb_30][228000]Accuracy-Highest: 0.98433 Training: 2021-03-17 02:01:50,327-Speed 277.91 samples/sec Loss 0.5462 Epoch: 13 Global Step: 228050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:02:06,542-Speed 3157.74 samples/sec Loss 0.5576 Epoch: 13 Global Step: 228100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:02:22,656-Speed 3177.50 samples/sec Loss 0.5647 Epoch: 13 Global Step: 228150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:02:38,920-Speed 3148.21 samples/sec Loss 0.5492 Epoch: 13 Global Step: 228200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:02:55,186-Speed 3147.71 samples/sec Loss 0.5623 Epoch: 13 Global Step: 228250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:03:11,626-Speed 3114.44 samples/sec Loss 0.5398 Epoch: 13 Global Step: 228300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:03:28,154-Speed 3097.93 samples/sec Loss 0.5569 Epoch: 13 Global Step: 228350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:03:44,539-Speed 3124.80 samples/sec Loss 0.5363 Epoch: 13 Global Step: 228400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:04:00,883-Speed 3132.87 samples/sec Loss 0.5474 Epoch: 13 Global Step: 228450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:04:18,020-Speed 2987.78 samples/sec Loss 0.5458 Epoch: 13 Global Step: 228500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:04:34,280-Speed 3148.85 samples/sec Loss 0.5443 Epoch: 13 Global Step: 228550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:04:50,626-Speed 3132.38 samples/sec Loss 0.5457 Epoch: 13 Global Step: 228600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:05:07,003-Speed 3126.41 samples/sec Loss 0.5474 Epoch: 13 Global Step: 228650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:05:23,497-Speed 3104.31 samples/sec Loss 0.5396 Epoch: 13 Global Step: 228700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:05:39,745-Speed 3151.20 samples/sec Loss 0.5534 Epoch: 13 Global Step: 228750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:05:56,091-Speed 3132.33 samples/sec Loss 0.5432 Epoch: 13 Global Step: 228800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:06:12,462-Speed 3127.50 samples/sec Loss 0.5515 Epoch: 13 Global Step: 228850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:06:28,951-Speed 3105.25 samples/sec Loss 0.5463 Epoch: 13 Global Step: 228900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:06:45,170-Speed 3156.87 samples/sec Loss 0.5412 Epoch: 13 Global Step: 228950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:07:01,348-Speed 3165.01 samples/sec Loss 0.5401 Epoch: 13 Global Step: 229000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:07:18,400-Speed 3002.62 samples/sec Loss 0.5405 Epoch: 13 Global Step: 229050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:07:34,874-Speed 3107.95 samples/sec Loss 0.5560 Epoch: 13 Global Step: 229100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:07:51,127-Speed 3150.38 samples/sec Loss 0.5565 Epoch: 13 Global Step: 229150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:08:07,390-Speed 3148.42 samples/sec Loss 0.5499 Epoch: 13 Global Step: 229200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:08:24,450-Speed 3001.21 samples/sec Loss 0.5380 Epoch: 13 Global Step: 229250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:08:40,754-Speed 3140.41 samples/sec Loss 0.5498 Epoch: 13 Global Step: 229300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:08:57,216-Speed 3110.32 samples/sec Loss 0.5508 Epoch: 13 Global Step: 229350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:09:13,460-Speed 3151.99 samples/sec Loss 0.5501 Epoch: 13 Global Step: 229400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:09:29,928-Speed 3109.11 samples/sec Loss 0.5442 Epoch: 13 Global Step: 229450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:09:47,930-Speed 2844.24 samples/sec Loss 0.5514 Epoch: 13 Global Step: 229500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:10:04,223-Speed 3142.68 samples/sec Loss 0.5678 Epoch: 13 Global Step: 229550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:10:20,370-Speed 3170.78 samples/sec Loss 0.5454 Epoch: 13 Global Step: 229600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:10:36,626-Speed 3149.73 samples/sec Loss 0.5434 Epoch: 13 Global Step: 229650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:10:52,881-Speed 3149.93 samples/sec Loss 0.5564 Epoch: 13 Global Step: 229700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:11:09,178-Speed 3141.84 samples/sec Loss 0.5544 Epoch: 13 Global Step: 229750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:11:25,381-Speed 3160.06 samples/sec Loss 0.5541 Epoch: 13 Global Step: 229800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:11:41,896-Speed 3100.14 samples/sec Loss 0.5353 Epoch: 13 Global Step: 229850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:11:59,274-Speed 2946.35 samples/sec Loss 0.5454 Epoch: 13 Global Step: 229900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:12:15,708-Speed 3115.72 samples/sec Loss 0.5481 Epoch: 13 Global Step: 229950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:12:31,894-Speed 3163.34 samples/sec Loss 0.5457 Epoch: 13 Global Step: 230000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:13:24,933-[lfw][230000]XNorm: 22.200635 Training: 2021-03-17 02:13:24,933-[lfw][230000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 02:13:24,933-[lfw][230000]Accuracy-Highest: 0.99817 Training: 2021-03-17 02:14:26,712-[cfp_fp][230000]XNorm: 22.254741 Training: 2021-03-17 02:14:26,712-[cfp_fp][230000]Accuracy-Flip: 0.99071+-0.00551 Training: 2021-03-17 02:14:26,712-[cfp_fp][230000]Accuracy-Highest: 0.99186 Training: 2021-03-17 02:15:19,828-[agedb_30][230000]XNorm: 23.021315 Training: 2021-03-17 02:15:19,829-[agedb_30][230000]Accuracy-Flip: 0.98333+-0.00675 Training: 2021-03-17 02:15:19,829-[agedb_30][230000]Accuracy-Highest: 0.98433 Training: 2021-03-17 02:15:36,154-Speed 277.87 samples/sec Loss 0.5385 Epoch: 13 Global Step: 230050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:15:54,043-Speed 2862.16 samples/sec Loss 0.5392 Epoch: 13 Global Step: 230100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:16:10,528-Speed 3106.01 samples/sec Loss 0.5441 Epoch: 13 Global Step: 230150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:16:26,789-Speed 3148.62 samples/sec Loss 0.5558 Epoch: 13 Global Step: 230200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:16:43,183-Speed 3123.14 samples/sec Loss 0.5455 Epoch: 13 Global Step: 230250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:16:59,294-Speed 3178.11 samples/sec Loss 0.5535 Epoch: 13 Global Step: 230300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:17:15,536-Speed 3152.37 samples/sec Loss 0.5425 Epoch: 13 Global Step: 230350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:17:31,706-Speed 3166.50 samples/sec Loss 0.5500 Epoch: 13 Global Step: 230400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:17:48,224-Speed 3099.81 samples/sec Loss 0.5493 Epoch: 13 Global Step: 230450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:18:04,731-Speed 3101.68 samples/sec Loss 0.5601 Epoch: 13 Global Step: 230500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:18:21,032-Speed 3141.05 samples/sec Loss 0.5553 Epoch: 13 Global Step: 230550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:18:37,298-Speed 3147.72 samples/sec Loss 0.5480 Epoch: 13 Global Step: 230600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:18:54,725-Speed 2938.09 samples/sec Loss 0.5510 Epoch: 13 Global Step: 230650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:19:10,992-Speed 3147.53 samples/sec Loss 0.5599 Epoch: 13 Global Step: 230700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:19:27,411-Speed 3118.45 samples/sec Loss 0.5513 Epoch: 13 Global Step: 230750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:19:43,600-Speed 3162.68 samples/sec Loss 0.5638 Epoch: 13 Global Step: 230800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:19:59,869-Speed 3147.31 samples/sec Loss 0.5501 Epoch: 13 Global Step: 230850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:20:16,256-Speed 3124.49 samples/sec Loss 0.5589 Epoch: 13 Global Step: 230900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:20:32,748-Speed 3104.59 samples/sec Loss 0.5469 Epoch: 13 Global Step: 230950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:20:49,300-Speed 3093.44 samples/sec Loss 0.5536 Epoch: 13 Global Step: 231000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:21:05,692-Speed 3123.58 samples/sec Loss 0.5561 Epoch: 13 Global Step: 231050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:21:22,118-Speed 3116.93 samples/sec Loss 0.5437 Epoch: 13 Global Step: 231100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:21:38,447-Speed 3135.78 samples/sec Loss 0.5653 Epoch: 13 Global Step: 231150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:21:55,594-Speed 2985.98 samples/sec Loss 0.5357 Epoch: 13 Global Step: 231200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:22:12,328-Speed 3059.79 samples/sec Loss 0.5436 Epoch: 13 Global Step: 231250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:22:28,747-Speed 3118.44 samples/sec Loss 0.5574 Epoch: 13 Global Step: 231300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:22:45,040-Speed 3142.52 samples/sec Loss 0.5432 Epoch: 13 Global Step: 231350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:23:01,711-Speed 3071.21 samples/sec Loss 0.5421 Epoch: 13 Global Step: 231400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:23:18,215-Speed 3102.50 samples/sec Loss 0.5567 Epoch: 13 Global Step: 231450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:23:35,373-Speed 2984.01 samples/sec Loss 0.5538 Epoch: 13 Global Step: 231500 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:23:51,632-Speed 3150.38 samples/sec Loss 0.5449 Epoch: 13 Global Step: 231550 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:24:07,888-Speed 3149.76 samples/sec Loss 0.5531 Epoch: 13 Global Step: 231600 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:24:24,108-Speed 3156.68 samples/sec Loss 0.5527 Epoch: 13 Global Step: 231650 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:24:42,008-Speed 2860.29 samples/sec Loss 0.5548 Epoch: 13 Global Step: 231700 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:24:58,301-Speed 3142.66 samples/sec Loss 0.5450 Epoch: 13 Global Step: 231750 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:25:14,591-Speed 3143.01 samples/sec Loss 0.5481 Epoch: 13 Global Step: 231800 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:25:30,863-Speed 3146.73 samples/sec Loss 0.5511 Epoch: 13 Global Step: 231850 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:25:47,042-Speed 3164.69 samples/sec Loss 0.5614 Epoch: 13 Global Step: 231900 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:26:03,294-Speed 3150.32 samples/sec Loss 0.5398 Epoch: 13 Global Step: 231950 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:26:20,147-Speed 3038.14 samples/sec Loss 0.5658 Epoch: 13 Global Step: 232000 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:27:13,298-[lfw][232000]XNorm: 21.170846 Training: 2021-03-17 02:27:13,299-[lfw][232000]Accuracy-Flip: 0.99783+-0.00259 Training: 2021-03-17 02:27:13,299-[lfw][232000]Accuracy-Highest: 0.99817 Training: 2021-03-17 02:28:15,252-[cfp_fp][232000]XNorm: 21.574145 Training: 2021-03-17 02:28:15,253-[cfp_fp][232000]Accuracy-Flip: 0.99143+-0.00474 Training: 2021-03-17 02:28:15,253-[cfp_fp][232000]Accuracy-Highest: 0.99186 Training: 2021-03-17 02:29:08,415-[agedb_30][232000]XNorm: 22.006224 Training: 2021-03-17 02:29:08,415-[agedb_30][232000]Accuracy-Flip: 0.98367+-0.00726 Training: 2021-03-17 02:29:08,416-[agedb_30][232000]Accuracy-Highest: 0.98433 Training: 2021-03-17 02:29:25,928-Speed 275.59 samples/sec Loss 0.5558 Epoch: 13 Global Step: 232050 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:29:42,402-Speed 3108.00 samples/sec Loss 0.5402 Epoch: 13 Global Step: 232100 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:29:58,585-Speed 3163.95 samples/sec Loss 0.5664 Epoch: 13 Global Step: 232150 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:30:15,433-Speed 3038.96 samples/sec Loss 0.5496 Epoch: 13 Global Step: 232200 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:30:32,562-Speed 2989.16 samples/sec Loss 0.5451 Epoch: 13 Global Step: 232250 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:30:48,844-Speed 3144.68 samples/sec Loss 0.5585 Epoch: 13 Global Step: 232300 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:31:05,068-Speed 3155.87 samples/sec Loss 0.5443 Epoch: 13 Global Step: 232350 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:31:22,058-Speed 3013.66 samples/sec Loss 0.5454 Epoch: 13 Global Step: 232400 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:31:38,306-Speed 3151.38 samples/sec Loss 0.5668 Epoch: 13 Global Step: 232450 Fp16 Grad Scale: 16384 Required: 12 hours Training: 2021-03-17 02:31:54,759-Speed 3111.84 samples/sec Loss 0.5601 Epoch: 13 Global Step: 232500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:32:11,260-Speed 3103.07 samples/sec Loss 0.5460 Epoch: 13 Global Step: 232550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:32:27,655-Speed 3122.86 samples/sec Loss 0.5466 Epoch: 13 Global Step: 232600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:32:44,237-Speed 3087.85 samples/sec Loss 0.5497 Epoch: 13 Global Step: 232650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:33:00,699-Speed 3110.37 samples/sec Loss 0.5498 Epoch: 13 Global Step: 232700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:33:16,988-Speed 3143.27 samples/sec Loss 0.5555 Epoch: 13 Global Step: 232750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:33:34,002-Speed 3009.43 samples/sec Loss 0.5488 Epoch: 13 Global Step: 232800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:33:50,180-Speed 3164.81 samples/sec Loss 0.5537 Epoch: 13 Global Step: 232850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:34:06,356-Speed 3165.32 samples/sec Loss 0.5467 Epoch: 13 Global Step: 232900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:34:22,704-Speed 3131.91 samples/sec Loss 0.5380 Epoch: 13 Global Step: 232950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:34:39,181-Speed 3107.56 samples/sec Loss 0.5508 Epoch: 13 Global Step: 233000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:34:55,684-Speed 3102.47 samples/sec Loss 0.5620 Epoch: 13 Global Step: 233050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:35:12,084-Speed 3122.05 samples/sec Loss 0.5572 Epoch: 13 Global Step: 233100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:35:28,263-Speed 3164.79 samples/sec Loss 0.5357 Epoch: 13 Global Step: 233150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:35:44,553-Speed 3143.12 samples/sec Loss 0.5520 Epoch: 13 Global Step: 233200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:36:00,731-Speed 3164.84 samples/sec Loss 0.5644 Epoch: 13 Global Step: 233250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:36:17,009-Speed 3145.42 samples/sec Loss 0.5460 Epoch: 13 Global Step: 233300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:36:33,237-Speed 3155.17 samples/sec Loss 0.5546 Epoch: 13 Global Step: 233350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:36:50,283-Speed 3003.69 samples/sec Loss 0.5435 Epoch: 13 Global Step: 233400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:37:06,563-Speed 3145.20 samples/sec Loss 0.5584 Epoch: 13 Global Step: 233450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:37:23,131-Speed 3090.32 samples/sec Loss 0.5506 Epoch: 13 Global Step: 233500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:37:39,413-Speed 3144.68 samples/sec Loss 0.5594 Epoch: 13 Global Step: 233550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:37:55,591-Speed 3164.95 samples/sec Loss 0.5522 Epoch: 13 Global Step: 233600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:38:11,979-Speed 3124.24 samples/sec Loss 0.5414 Epoch: 13 Global Step: 233650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:38:43,141-Speed 1643.07 samples/sec Loss 0.5202 Epoch: 14 Global Step: 233700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:38:59,535-Speed 3123.09 samples/sec Loss 0.4680 Epoch: 14 Global Step: 233750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:39:16,055-Speed 3099.47 samples/sec Loss 0.4629 Epoch: 14 Global Step: 233800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:39:33,632-Speed 2912.93 samples/sec Loss 0.4631 Epoch: 14 Global Step: 233850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:39:50,007-Speed 3126.81 samples/sec Loss 0.4611 Epoch: 14 Global Step: 233900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:40:06,363-Speed 3130.63 samples/sec Loss 0.4634 Epoch: 14 Global Step: 233950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:40:23,482-Speed 2990.83 samples/sec Loss 0.4679 Epoch: 14 Global Step: 234000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:41:16,886-[lfw][234000]XNorm: 21.740858 Training: 2021-03-17 02:41:16,887-[lfw][234000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 02:41:16,887-[lfw][234000]Accuracy-Highest: 0.99817 Training: 2021-03-17 02:42:18,933-[cfp_fp][234000]XNorm: 22.226979 Training: 2021-03-17 02:42:18,933-[cfp_fp][234000]Accuracy-Flip: 0.99186+-0.00564 Training: 2021-03-17 02:42:18,933-[cfp_fp][234000]Accuracy-Highest: 0.99186 Training: 2021-03-17 02:43:12,249-[agedb_30][234000]XNorm: 22.707456 Training: 2021-03-17 02:43:12,250-[agedb_30][234000]Accuracy-Flip: 0.98450+-0.00615 Training: 2021-03-17 02:43:12,250-[agedb_30][234000]Accuracy-Highest: 0.98450 Training: 2021-03-17 02:43:28,593-Speed 276.59 samples/sec Loss 0.4546 Epoch: 14 Global Step: 234050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:43:44,848-Speed 3150.05 samples/sec Loss 0.4535 Epoch: 14 Global Step: 234100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:44:01,110-Speed 3148.40 samples/sec Loss 0.4486 Epoch: 14 Global Step: 234150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:44:17,499-Speed 3124.13 samples/sec Loss 0.4425 Epoch: 14 Global Step: 234200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:44:33,976-Speed 3107.51 samples/sec Loss 0.4560 Epoch: 14 Global Step: 234250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:44:50,129-Speed 3169.87 samples/sec Loss 0.4668 Epoch: 14 Global Step: 234300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:45:06,433-Speed 3140.37 samples/sec Loss 0.4414 Epoch: 14 Global Step: 234350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:45:23,568-Speed 2988.15 samples/sec Loss 0.4560 Epoch: 14 Global Step: 234400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:45:39,815-Speed 3151.40 samples/sec Loss 0.4410 Epoch: 14 Global Step: 234450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:45:57,440-Speed 2905.12 samples/sec Loss 0.4497 Epoch: 14 Global Step: 234500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:46:13,625-Speed 3163.43 samples/sec Loss 0.4396 Epoch: 14 Global Step: 234550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:46:29,936-Speed 3139.08 samples/sec Loss 0.4287 Epoch: 14 Global Step: 234600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:46:47,011-Speed 2998.61 samples/sec Loss 0.4329 Epoch: 14 Global Step: 234650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:47:03,259-Speed 3151.24 samples/sec Loss 0.4385 Epoch: 14 Global Step: 234700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:47:19,398-Speed 3172.59 samples/sec Loss 0.4354 Epoch: 14 Global Step: 234750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:47:35,644-Speed 3151.69 samples/sec Loss 0.4461 Epoch: 14 Global Step: 234800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:47:52,112-Speed 3109.17 samples/sec Loss 0.4474 Epoch: 14 Global Step: 234850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:48:08,443-Speed 3135.22 samples/sec Loss 0.4350 Epoch: 14 Global Step: 234900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:48:24,748-Speed 3140.07 samples/sec Loss 0.4297 Epoch: 14 Global Step: 234950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:48:41,143-Speed 3123.14 samples/sec Loss 0.4367 Epoch: 14 Global Step: 235000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:48:58,459-Speed 2956.88 samples/sec Loss 0.4429 Epoch: 14 Global Step: 235050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:49:15,360-Speed 3029.47 samples/sec Loss 0.4358 Epoch: 14 Global Step: 235100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:49:31,608-Speed 3151.15 samples/sec Loss 0.4442 Epoch: 14 Global Step: 235150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:49:47,904-Speed 3142.10 samples/sec Loss 0.4315 Epoch: 14 Global Step: 235200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:50:04,347-Speed 3113.90 samples/sec Loss 0.4300 Epoch: 14 Global Step: 235250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:50:20,658-Speed 3138.97 samples/sec Loss 0.4322 Epoch: 14 Global Step: 235300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:50:36,917-Speed 3149.15 samples/sec Loss 0.4342 Epoch: 14 Global Step: 235350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:50:53,208-Speed 3142.95 samples/sec Loss 0.4322 Epoch: 14 Global Step: 235400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:51:09,459-Speed 3150.59 samples/sec Loss 0.4487 Epoch: 14 Global Step: 235450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:51:25,605-Speed 3171.19 samples/sec Loss 0.4340 Epoch: 14 Global Step: 235500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:51:41,876-Speed 3146.79 samples/sec Loss 0.4310 Epoch: 14 Global Step: 235550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:51:59,624-Speed 2885.04 samples/sec Loss 0.4378 Epoch: 14 Global Step: 235600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:52:15,978-Speed 3130.68 samples/sec Loss 0.4360 Epoch: 14 Global Step: 235650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:52:32,504-Speed 3098.22 samples/sec Loss 0.4372 Epoch: 14 Global Step: 235700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:52:49,124-Speed 3080.80 samples/sec Loss 0.4382 Epoch: 14 Global Step: 235750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:53:05,318-Speed 3161.62 samples/sec Loss 0.4390 Epoch: 14 Global Step: 235800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:53:21,605-Speed 3143.74 samples/sec Loss 0.4286 Epoch: 14 Global Step: 235850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:53:38,780-Speed 2981.23 samples/sec Loss 0.4311 Epoch: 14 Global Step: 235900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:53:55,087-Speed 3139.91 samples/sec Loss 0.4335 Epoch: 14 Global Step: 235950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:54:11,355-Speed 3147.26 samples/sec Loss 0.4302 Epoch: 14 Global Step: 236000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:55:04,424-[lfw][236000]XNorm: 21.529125 Training: 2021-03-17 02:55:04,424-[lfw][236000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 02:55:04,424-[lfw][236000]Accuracy-Highest: 0.99817 Training: 2021-03-17 02:56:06,766-[cfp_fp][236000]XNorm: 22.133103 Training: 2021-03-17 02:56:06,766-[cfp_fp][236000]Accuracy-Flip: 0.99143+-0.00609 Training: 2021-03-17 02:56:06,766-[cfp_fp][236000]Accuracy-Highest: 0.99186 Training: 2021-03-17 02:57:00,218-[agedb_30][236000]XNorm: 22.596207 Training: 2021-03-17 02:57:00,218-[agedb_30][236000]Accuracy-Flip: 0.98467+-0.00623 Training: 2021-03-17 02:57:00,218-[agedb_30][236000]Accuracy-Highest: 0.98467 Training: 2021-03-17 02:57:17,719-Speed 274.73 samples/sec Loss 0.4329 Epoch: 14 Global Step: 236050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:57:34,260-Speed 3095.45 samples/sec Loss 0.4306 Epoch: 14 Global Step: 236100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:57:50,509-Speed 3150.92 samples/sec Loss 0.4452 Epoch: 14 Global Step: 236150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:58:07,925-Speed 2939.96 samples/sec Loss 0.4282 Epoch: 14 Global Step: 236200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:58:24,282-Speed 3130.27 samples/sec Loss 0.4443 Epoch: 14 Global Step: 236250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:58:40,568-Speed 3143.80 samples/sec Loss 0.4256 Epoch: 14 Global Step: 236300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:58:57,101-Speed 3096.94 samples/sec Loss 0.4271 Epoch: 14 Global Step: 236350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:59:13,387-Speed 3143.97 samples/sec Loss 0.4311 Epoch: 14 Global Step: 236400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:59:29,587-Speed 3160.63 samples/sec Loss 0.4247 Epoch: 14 Global Step: 236450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 02:59:45,832-Speed 3151.71 samples/sec Loss 0.4304 Epoch: 14 Global Step: 236500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:00:02,032-Speed 3160.60 samples/sec Loss 0.4261 Epoch: 14 Global Step: 236550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:00:19,145-Speed 2991.95 samples/sec Loss 0.4212 Epoch: 14 Global Step: 236600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:00:35,461-Speed 3138.22 samples/sec Loss 0.4198 Epoch: 14 Global Step: 236650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:00:51,674-Speed 3158.03 samples/sec Loss 0.4349 Epoch: 14 Global Step: 236700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:01:07,985-Speed 3139.11 samples/sec Loss 0.4360 Epoch: 14 Global Step: 236750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:01:25,398-Speed 2940.42 samples/sec Loss 0.4167 Epoch: 14 Global Step: 236800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:01:42,603-Speed 2975.95 samples/sec Loss 0.4291 Epoch: 14 Global Step: 236850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:01:59,000-Speed 3122.59 samples/sec Loss 0.4211 Epoch: 14 Global Step: 236900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:02:15,309-Speed 3139.57 samples/sec Loss 0.4365 Epoch: 14 Global Step: 236950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:02:31,962-Speed 3074.57 samples/sec Loss 0.4199 Epoch: 14 Global Step: 237000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:02:48,262-Speed 3141.12 samples/sec Loss 0.4226 Epoch: 14 Global Step: 237050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:03:04,684-Speed 3117.95 samples/sec Loss 0.4200 Epoch: 14 Global Step: 237100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:03:21,013-Speed 3135.47 samples/sec Loss 0.4353 Epoch: 14 Global Step: 237150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:03:38,242-Speed 2971.80 samples/sec Loss 0.4338 Epoch: 14 Global Step: 237200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:03:54,610-Speed 3128.22 samples/sec Loss 0.4228 Epoch: 14 Global Step: 237250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:04:10,920-Speed 3139.21 samples/sec Loss 0.4126 Epoch: 14 Global Step: 237300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:04:27,175-Speed 3149.97 samples/sec Loss 0.4253 Epoch: 14 Global Step: 237350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:04:43,556-Speed 3125.58 samples/sec Loss 0.4188 Epoch: 14 Global Step: 237400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:04:59,892-Speed 3134.44 samples/sec Loss 0.4433 Epoch: 14 Global Step: 237450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:05:16,429-Speed 3096.16 samples/sec Loss 0.4262 Epoch: 14 Global Step: 237500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:05:32,991-Speed 3091.53 samples/sec Loss 0.4258 Epoch: 14 Global Step: 237550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:05:49,189-Speed 3160.93 samples/sec Loss 0.4147 Epoch: 14 Global Step: 237600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:06:05,484-Speed 3142.06 samples/sec Loss 0.4270 Epoch: 14 Global Step: 237650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:06:21,796-Speed 3138.96 samples/sec Loss 0.4161 Epoch: 14 Global Step: 237700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:06:39,508-Speed 2890.84 samples/sec Loss 0.4175 Epoch: 14 Global Step: 237750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:06:55,771-Speed 3148.33 samples/sec Loss 0.4252 Epoch: 14 Global Step: 237800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:07:12,472-Speed 3065.72 samples/sec Loss 0.4205 Epoch: 14 Global Step: 237850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:07:28,868-Speed 3122.73 samples/sec Loss 0.4162 Epoch: 14 Global Step: 237900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:07:45,439-Speed 3089.80 samples/sec Loss 0.4184 Epoch: 14 Global Step: 237950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:08:01,737-Speed 3141.71 samples/sec Loss 0.4292 Epoch: 14 Global Step: 238000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:08:55,257-[lfw][238000]XNorm: 21.727448 Training: 2021-03-17 03:08:55,258-[lfw][238000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 03:08:55,258-[lfw][238000]Accuracy-Highest: 0.99817 Training: 2021-03-17 03:09:57,521-[cfp_fp][238000]XNorm: 22.339574 Training: 2021-03-17 03:09:57,521-[cfp_fp][238000]Accuracy-Flip: 0.99143+-0.00613 Training: 2021-03-17 03:09:57,521-[cfp_fp][238000]Accuracy-Highest: 0.99186 Training: 2021-03-17 03:10:51,133-[agedb_30][238000]XNorm: 22.836581 Training: 2021-03-17 03:10:51,133-[agedb_30][238000]Accuracy-Flip: 0.98333+-0.00610 Training: 2021-03-17 03:10:51,133-[agedb_30][238000]Accuracy-Highest: 0.98467 Training: 2021-03-17 03:11:07,410-Speed 275.75 samples/sec Loss 0.4186 Epoch: 14 Global Step: 238050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:11:23,749-Speed 3133.70 samples/sec Loss 0.4182 Epoch: 14 Global Step: 238100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:11:41,101-Speed 2950.62 samples/sec Loss 0.4215 Epoch: 14 Global Step: 238150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:11:57,815-Speed 3063.46 samples/sec Loss 0.4243 Epoch: 14 Global Step: 238200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:12:15,222-Speed 2941.42 samples/sec Loss 0.4326 Epoch: 14 Global Step: 238250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:12:31,600-Speed 3126.17 samples/sec Loss 0.4205 Epoch: 14 Global Step: 238300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:12:47,805-Speed 3159.74 samples/sec Loss 0.4212 Epoch: 14 Global Step: 238350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:13:05,024-Speed 2973.56 samples/sec Loss 0.4132 Epoch: 14 Global Step: 238400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:13:21,387-Speed 3129.13 samples/sec Loss 0.4187 Epoch: 14 Global Step: 238450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:13:37,940-Speed 3093.08 samples/sec Loss 0.4174 Epoch: 14 Global Step: 238500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:13:54,140-Speed 3160.75 samples/sec Loss 0.4338 Epoch: 14 Global Step: 238550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:14:10,544-Speed 3121.23 samples/sec Loss 0.4187 Epoch: 14 Global Step: 238600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:14:27,239-Speed 3066.85 samples/sec Loss 0.4167 Epoch: 14 Global Step: 238650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:14:43,608-Speed 3127.99 samples/sec Loss 0.4256 Epoch: 14 Global Step: 238700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:15:00,005-Speed 3122.47 samples/sec Loss 0.4248 Epoch: 14 Global Step: 238750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:15:17,262-Speed 2967.08 samples/sec Loss 0.4260 Epoch: 14 Global Step: 238800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:15:34,039-Speed 3051.86 samples/sec Loss 0.4159 Epoch: 14 Global Step: 238850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:15:50,382-Speed 3132.95 samples/sec Loss 0.4208 Epoch: 14 Global Step: 238900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:16:06,638-Speed 3149.79 samples/sec Loss 0.4197 Epoch: 14 Global Step: 238950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:16:24,144-Speed 2924.80 samples/sec Loss 0.4231 Epoch: 14 Global Step: 239000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:16:41,179-Speed 3005.66 samples/sec Loss 0.4270 Epoch: 14 Global Step: 239050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:16:57,436-Speed 3149.44 samples/sec Loss 0.4254 Epoch: 14 Global Step: 239100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:17:13,868-Speed 3115.98 samples/sec Loss 0.4217 Epoch: 14 Global Step: 239150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:17:30,307-Speed 3114.51 samples/sec Loss 0.4286 Epoch: 14 Global Step: 239200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:17:46,563-Speed 3149.71 samples/sec Loss 0.4216 Epoch: 14 Global Step: 239250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:18:02,812-Speed 3151.07 samples/sec Loss 0.4199 Epoch: 14 Global Step: 239300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:18:19,191-Speed 3126.18 samples/sec Loss 0.4224 Epoch: 14 Global Step: 239350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:18:36,281-Speed 2995.84 samples/sec Loss 0.4217 Epoch: 14 Global Step: 239400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:18:52,629-Speed 3132.09 samples/sec Loss 0.4206 Epoch: 14 Global Step: 239450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:19:09,211-Speed 3087.77 samples/sec Loss 0.4185 Epoch: 14 Global Step: 239500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:19:25,576-Speed 3128.75 samples/sec Loss 0.4216 Epoch: 14 Global Step: 239550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:19:42,251-Speed 3070.40 samples/sec Loss 0.4250 Epoch: 14 Global Step: 239600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:19:58,588-Speed 3134.21 samples/sec Loss 0.4187 Epoch: 14 Global Step: 239650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:20:14,960-Speed 3127.34 samples/sec Loss 0.4124 Epoch: 14 Global Step: 239700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:20:31,651-Speed 3067.52 samples/sec Loss 0.4188 Epoch: 14 Global Step: 239750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:20:48,203-Speed 3093.43 samples/sec Loss 0.4146 Epoch: 14 Global Step: 239800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:21:04,635-Speed 3115.94 samples/sec Loss 0.4138 Epoch: 14 Global Step: 239850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:21:21,020-Speed 3125.05 samples/sec Loss 0.4174 Epoch: 14 Global Step: 239900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:21:38,171-Speed 2985.18 samples/sec Loss 0.4242 Epoch: 14 Global Step: 239950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:21:54,406-Speed 3153.79 samples/sec Loss 0.4129 Epoch: 14 Global Step: 240000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:22:47,449-[lfw][240000]XNorm: 21.701669 Training: 2021-03-17 03:22:47,450-[lfw][240000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 03:22:47,450-[lfw][240000]Accuracy-Highest: 0.99817 Training: 2021-03-17 03:23:49,155-[cfp_fp][240000]XNorm: 22.286647 Training: 2021-03-17 03:23:49,155-[cfp_fp][240000]Accuracy-Flip: 0.99200+-0.00504 Training: 2021-03-17 03:23:49,155-[cfp_fp][240000]Accuracy-Highest: 0.99200 Training: 2021-03-17 03:24:42,751-[agedb_30][240000]XNorm: 22.779900 Training: 2021-03-17 03:24:42,751-[agedb_30][240000]Accuracy-Flip: 0.98467+-0.00623 Training: 2021-03-17 03:24:42,751-[agedb_30][240000]Accuracy-Highest: 0.98467 Training: 2021-03-17 03:24:59,048-Speed 277.29 samples/sec Loss 0.4210 Epoch: 14 Global Step: 240050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:25:15,250-Speed 3160.16 samples/sec Loss 0.4232 Epoch: 14 Global Step: 240100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:25:31,525-Speed 3146.00 samples/sec Loss 0.4223 Epoch: 14 Global Step: 240150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:25:48,060-Speed 3096.71 samples/sec Loss 0.4187 Epoch: 14 Global Step: 240200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:26:04,358-Speed 3141.55 samples/sec Loss 0.4270 Epoch: 14 Global Step: 240250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:26:21,708-Speed 2951.06 samples/sec Loss 0.4033 Epoch: 14 Global Step: 240300 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:26:37,962-Speed 3150.16 samples/sec Loss 0.4122 Epoch: 14 Global Step: 240350 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:26:54,474-Speed 3100.86 samples/sec Loss 0.4100 Epoch: 14 Global Step: 240400 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:27:11,887-Speed 2940.40 samples/sec Loss 0.4117 Epoch: 14 Global Step: 240450 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:27:28,309-Speed 3117.85 samples/sec Loss 0.4212 Epoch: 14 Global Step: 240500 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:27:44,648-Speed 3133.60 samples/sec Loss 0.4214 Epoch: 14 Global Step: 240550 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:28:00,988-Speed 3133.51 samples/sec Loss 0.4172 Epoch: 14 Global Step: 240600 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:28:18,145-Speed 2984.41 samples/sec Loss 0.4186 Epoch: 14 Global Step: 240650 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:28:34,428-Speed 3144.40 samples/sec Loss 0.4102 Epoch: 14 Global Step: 240700 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:28:50,922-Speed 3104.25 samples/sec Loss 0.4067 Epoch: 14 Global Step: 240750 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:29:07,108-Speed 3163.37 samples/sec Loss 0.4145 Epoch: 14 Global Step: 240800 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:29:23,342-Speed 3153.99 samples/sec Loss 0.4222 Epoch: 14 Global Step: 240850 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:29:39,604-Speed 3148.40 samples/sec Loss 0.4201 Epoch: 14 Global Step: 240900 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:29:57,217-Speed 2907.07 samples/sec Loss 0.4077 Epoch: 14 Global Step: 240950 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:30:13,492-Speed 3146.05 samples/sec Loss 0.4151 Epoch: 14 Global Step: 241000 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:30:30,245-Speed 3056.18 samples/sec Loss 0.4203 Epoch: 14 Global Step: 241050 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:30:46,534-Speed 3143.27 samples/sec Loss 0.4139 Epoch: 14 Global Step: 241100 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:31:02,704-Speed 3166.54 samples/sec Loss 0.4069 Epoch: 14 Global Step: 241150 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:31:19,236-Speed 3097.18 samples/sec Loss 0.4195 Epoch: 14 Global Step: 241200 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:31:36,446-Speed 2975.09 samples/sec Loss 0.4243 Epoch: 14 Global Step: 241250 Fp16 Grad Scale: 16384 Required: 11 hours Training: 2021-03-17 03:31:52,696-Speed 3150.76 samples/sec Loss 0.4105 Epoch: 14 Global Step: 241300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:32:09,711-Speed 3009.18 samples/sec Loss 0.4189 Epoch: 14 Global Step: 241350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:32:26,055-Speed 3132.68 samples/sec Loss 0.4213 Epoch: 14 Global Step: 241400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:32:42,374-Speed 3137.52 samples/sec Loss 0.4343 Epoch: 14 Global Step: 241450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:32:58,657-Speed 3144.47 samples/sec Loss 0.4137 Epoch: 14 Global Step: 241500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:33:14,909-Speed 3150.54 samples/sec Loss 0.4162 Epoch: 14 Global Step: 241550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:33:31,188-Speed 3145.32 samples/sec Loss 0.4160 Epoch: 14 Global Step: 241600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:33:47,875-Speed 3068.22 samples/sec Loss 0.4274 Epoch: 14 Global Step: 241650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:34:05,023-Speed 2985.93 samples/sec Loss 0.4141 Epoch: 14 Global Step: 241700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:34:21,371-Speed 3131.86 samples/sec Loss 0.4227 Epoch: 14 Global Step: 241750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:34:37,810-Speed 3114.71 samples/sec Loss 0.4235 Epoch: 14 Global Step: 241800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:34:54,128-Speed 3137.72 samples/sec Loss 0.4302 Epoch: 14 Global Step: 241850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:35:10,551-Speed 3117.73 samples/sec Loss 0.4239 Epoch: 14 Global Step: 241900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:35:26,991-Speed 3114.48 samples/sec Loss 0.4190 Epoch: 14 Global Step: 241950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:35:43,442-Speed 3112.33 samples/sec Loss 0.4153 Epoch: 14 Global Step: 242000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:36:36,620-[lfw][242000]XNorm: 21.711873 Training: 2021-03-17 03:36:36,621-[lfw][242000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 03:36:36,621-[lfw][242000]Accuracy-Highest: 0.99817 Training: 2021-03-17 03:37:38,296-[cfp_fp][242000]XNorm: 22.296622 Training: 2021-03-17 03:37:38,297-[cfp_fp][242000]Accuracy-Flip: 0.99143+-0.00531 Training: 2021-03-17 03:37:38,297-[cfp_fp][242000]Accuracy-Highest: 0.99200 Training: 2021-03-17 03:38:32,046-[agedb_30][242000]XNorm: 22.807961 Training: 2021-03-17 03:38:32,046-[agedb_30][242000]Accuracy-Flip: 0.98450+-0.00646 Training: 2021-03-17 03:38:32,046-[agedb_30][242000]Accuracy-Highest: 0.98467 Training: 2021-03-17 03:38:48,255-Speed 277.04 samples/sec Loss 0.4175 Epoch: 14 Global Step: 242050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:39:05,384-Speed 2989.17 samples/sec Loss 0.4073 Epoch: 14 Global Step: 242100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:39:21,711-Speed 3136.03 samples/sec Loss 0.4159 Epoch: 14 Global Step: 242150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:39:37,959-Speed 3151.26 samples/sec Loss 0.4241 Epoch: 14 Global Step: 242200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:39:54,238-Speed 3145.19 samples/sec Loss 0.4128 Epoch: 14 Global Step: 242250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:40:10,542-Speed 3140.40 samples/sec Loss 0.4127 Epoch: 14 Global Step: 242300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:40:26,706-Speed 3167.72 samples/sec Loss 0.4134 Epoch: 14 Global Step: 242350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:40:42,997-Speed 3142.97 samples/sec Loss 0.4170 Epoch: 14 Global Step: 242400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:41:00,285-Speed 2961.58 samples/sec Loss 0.4175 Epoch: 14 Global Step: 242450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:41:16,526-Speed 3152.72 samples/sec Loss 0.4234 Epoch: 14 Global Step: 242500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:41:32,947-Speed 3118.04 samples/sec Loss 0.4208 Epoch: 14 Global Step: 242550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:41:49,311-Speed 3128.89 samples/sec Loss 0.4147 Epoch: 14 Global Step: 242600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:42:06,948-Speed 2903.14 samples/sec Loss 0.4240 Epoch: 14 Global Step: 242650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:42:23,212-Speed 3148.04 samples/sec Loss 0.4155 Epoch: 14 Global Step: 242700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:42:39,453-Speed 3152.53 samples/sec Loss 0.4148 Epoch: 14 Global Step: 242750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:42:56,465-Speed 3009.79 samples/sec Loss 0.4051 Epoch: 14 Global Step: 242800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:43:12,946-Speed 3106.82 samples/sec Loss 0.4057 Epoch: 14 Global Step: 242850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:43:29,404-Speed 3111.05 samples/sec Loss 0.4143 Epoch: 14 Global Step: 242900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:43:45,702-Speed 3141.50 samples/sec Loss 0.4212 Epoch: 14 Global Step: 242950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:44:01,960-Speed 3149.22 samples/sec Loss 0.4042 Epoch: 14 Global Step: 243000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:44:18,162-Speed 3160.25 samples/sec Loss 0.4092 Epoch: 14 Global Step: 243050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:44:34,524-Speed 3129.29 samples/sec Loss 0.4216 Epoch: 14 Global Step: 243100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:44:52,049-Speed 2921.65 samples/sec Loss 0.4284 Epoch: 14 Global Step: 243150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:45:08,473-Speed 3117.39 samples/sec Loss 0.4133 Epoch: 14 Global Step: 243200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:45:24,700-Speed 3155.38 samples/sec Loss 0.4258 Epoch: 14 Global Step: 243250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:45:41,193-Speed 3104.54 samples/sec Loss 0.4238 Epoch: 14 Global Step: 243300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:45:57,587-Speed 3123.20 samples/sec Loss 0.4133 Epoch: 14 Global Step: 243350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:46:13,849-Speed 3148.47 samples/sec Loss 0.4132 Epoch: 14 Global Step: 243400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:46:30,951-Speed 2993.77 samples/sec Loss 0.4250 Epoch: 14 Global Step: 243450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:46:47,404-Speed 3112.05 samples/sec Loss 0.4165 Epoch: 14 Global Step: 243500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:47:04,596-Speed 2978.16 samples/sec Loss 0.4148 Epoch: 14 Global Step: 243550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:47:20,861-Speed 3147.96 samples/sec Loss 0.4239 Epoch: 14 Global Step: 243600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:47:37,427-Speed 3090.82 samples/sec Loss 0.4133 Epoch: 14 Global Step: 243650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:47:53,794-Speed 3128.37 samples/sec Loss 0.4119 Epoch: 14 Global Step: 243700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:48:10,042-Speed 3151.32 samples/sec Loss 0.4214 Epoch: 14 Global Step: 243750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:48:26,376-Speed 3134.50 samples/sec Loss 0.4172 Epoch: 14 Global Step: 243800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:48:42,731-Speed 3130.64 samples/sec Loss 0.4057 Epoch: 14 Global Step: 243850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:48:59,989-Speed 2966.90 samples/sec Loss 0.4140 Epoch: 14 Global Step: 243900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:49:16,390-Speed 3121.76 samples/sec Loss 0.4120 Epoch: 14 Global Step: 243950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:49:32,881-Speed 3104.81 samples/sec Loss 0.4220 Epoch: 14 Global Step: 244000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:50:26,293-[lfw][244000]XNorm: 21.567273 Training: 2021-03-17 03:50:26,293-[lfw][244000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-17 03:50:26,294-[lfw][244000]Accuracy-Highest: 0.99817 Training: 2021-03-17 03:51:28,495-[cfp_fp][244000]XNorm: 22.242632 Training: 2021-03-17 03:51:28,496-[cfp_fp][244000]Accuracy-Flip: 0.99171+-0.00502 Training: 2021-03-17 03:51:28,496-[cfp_fp][244000]Accuracy-Highest: 0.99200 Training: 2021-03-17 03:52:22,026-[agedb_30][244000]XNorm: 22.726410 Training: 2021-03-17 03:52:22,026-[agedb_30][244000]Accuracy-Flip: 0.98483+-0.00639 Training: 2021-03-17 03:52:22,026-[agedb_30][244000]Accuracy-Highest: 0.98483 Training: 2021-03-17 03:52:38,324-Speed 276.10 samples/sec Loss 0.4182 Epoch: 14 Global Step: 244050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:52:54,735-Speed 3119.83 samples/sec Loss 0.4242 Epoch: 14 Global Step: 244100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:53:11,391-Speed 3074.15 samples/sec Loss 0.4195 Epoch: 14 Global Step: 244150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:53:27,623-Speed 3154.34 samples/sec Loss 0.4057 Epoch: 14 Global Step: 244200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:53:44,822-Speed 2977.01 samples/sec Loss 0.4168 Epoch: 14 Global Step: 244250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:54:01,155-Speed 3134.82 samples/sec Loss 0.4168 Epoch: 14 Global Step: 244300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:54:17,480-Speed 3136.42 samples/sec Loss 0.4152 Epoch: 14 Global Step: 244350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:54:33,803-Speed 3136.87 samples/sec Loss 0.4238 Epoch: 14 Global Step: 244400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:54:50,261-Speed 3111.02 samples/sec Loss 0.4221 Epoch: 14 Global Step: 244450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:55:06,637-Speed 3126.64 samples/sec Loss 0.4104 Epoch: 14 Global Step: 244500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:55:23,022-Speed 3124.74 samples/sec Loss 0.4185 Epoch: 14 Global Step: 244550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:55:39,374-Speed 3131.31 samples/sec Loss 0.4132 Epoch: 14 Global Step: 244600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:55:55,679-Speed 3140.19 samples/sec Loss 0.4034 Epoch: 14 Global Step: 244650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:56:12,967-Speed 2961.75 samples/sec Loss 0.4075 Epoch: 14 Global Step: 244700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:56:29,274-Speed 3139.75 samples/sec Loss 0.4153 Epoch: 14 Global Step: 244750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:56:45,577-Speed 3140.66 samples/sec Loss 0.4162 Epoch: 14 Global Step: 244800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:57:02,778-Speed 2976.62 samples/sec Loss 0.4236 Epoch: 14 Global Step: 244850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:57:19,111-Speed 3134.91 samples/sec Loss 0.4126 Epoch: 14 Global Step: 244900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:57:35,408-Speed 3141.86 samples/sec Loss 0.4146 Epoch: 14 Global Step: 244950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:57:52,742-Speed 2953.82 samples/sec Loss 0.4196 Epoch: 14 Global Step: 245000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:58:09,160-Speed 3118.63 samples/sec Loss 0.4200 Epoch: 14 Global Step: 245050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:58:25,461-Speed 3140.98 samples/sec Loss 0.4080 Epoch: 14 Global Step: 245100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:58:41,813-Speed 3131.17 samples/sec Loss 0.4131 Epoch: 14 Global Step: 245150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:58:58,147-Speed 3134.60 samples/sec Loss 0.4170 Epoch: 14 Global Step: 245200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:59:14,424-Speed 3145.74 samples/sec Loss 0.4142 Epoch: 14 Global Step: 245250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:59:31,685-Speed 2966.16 samples/sec Loss 0.4142 Epoch: 14 Global Step: 245300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 03:59:48,114-Speed 3116.63 samples/sec Loss 0.4110 Epoch: 14 Global Step: 245350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:00:05,037-Speed 3025.48 samples/sec Loss 0.4148 Epoch: 14 Global Step: 245400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:00:21,664-Speed 3079.54 samples/sec Loss 0.4131 Epoch: 14 Global Step: 245450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:00:37,896-Speed 3154.27 samples/sec Loss 0.4180 Epoch: 14 Global Step: 245500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:00:54,188-Speed 3142.72 samples/sec Loss 0.4106 Epoch: 14 Global Step: 245550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:01:10,754-Speed 3090.88 samples/sec Loss 0.4161 Epoch: 14 Global Step: 245600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:01:27,156-Speed 3121.57 samples/sec Loss 0.4262 Epoch: 14 Global Step: 245650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:01:43,387-Speed 3154.61 samples/sec Loss 0.4089 Epoch: 14 Global Step: 245700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:01:59,931-Speed 3094.92 samples/sec Loss 0.4147 Epoch: 14 Global Step: 245750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:02:18,230-Speed 2797.98 samples/sec Loss 0.4088 Epoch: 14 Global Step: 245800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:02:34,618-Speed 3124.27 samples/sec Loss 0.4058 Epoch: 14 Global Step: 245850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:02:50,998-Speed 3126.01 samples/sec Loss 0.4078 Epoch: 14 Global Step: 245900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:03:07,461-Speed 3110.08 samples/sec Loss 0.4160 Epoch: 14 Global Step: 245950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:03:23,901-Speed 3114.34 samples/sec Loss 0.4180 Epoch: 14 Global Step: 246000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:04:17,129-[lfw][246000]XNorm: 21.640884 Training: 2021-03-17 04:04:17,130-[lfw][246000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 04:04:17,130-[lfw][246000]Accuracy-Highest: 0.99817 Training: 2021-03-17 04:05:18,782-[cfp_fp][246000]XNorm: 22.257712 Training: 2021-03-17 04:05:18,783-[cfp_fp][246000]Accuracy-Flip: 0.99229+-0.00508 Training: 2021-03-17 04:05:18,783-[cfp_fp][246000]Accuracy-Highest: 0.99229 Training: 2021-03-17 04:06:11,910-[agedb_30][246000]XNorm: 22.734604 Training: 2021-03-17 04:06:11,911-[agedb_30][246000]Accuracy-Flip: 0.98417+-0.00620 Training: 2021-03-17 04:06:11,911-[agedb_30][246000]Accuracy-Highest: 0.98483 Training: 2021-03-17 04:06:28,837-Speed 276.85 samples/sec Loss 0.4116 Epoch: 14 Global Step: 246050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:06:45,237-Speed 3122.07 samples/sec Loss 0.4100 Epoch: 14 Global Step: 246100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:07:01,517-Speed 3145.03 samples/sec Loss 0.4067 Epoch: 14 Global Step: 246150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:07:17,809-Speed 3142.83 samples/sec Loss 0.4166 Epoch: 14 Global Step: 246200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:07:34,092-Speed 3144.42 samples/sec Loss 0.4058 Epoch: 14 Global Step: 246250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:07:50,320-Speed 3155.26 samples/sec Loss 0.4244 Epoch: 14 Global Step: 246300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:08:06,650-Speed 3135.38 samples/sec Loss 0.4084 Epoch: 14 Global Step: 246350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:08:23,016-Speed 3128.51 samples/sec Loss 0.4129 Epoch: 14 Global Step: 246400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:08:39,393-Speed 3126.42 samples/sec Loss 0.4048 Epoch: 14 Global Step: 246450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:08:55,838-Speed 3113.55 samples/sec Loss 0.4073 Epoch: 14 Global Step: 246500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:09:12,121-Speed 3144.42 samples/sec Loss 0.4185 Epoch: 14 Global Step: 246550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:09:29,205-Speed 2996.99 samples/sec Loss 0.4112 Epoch: 14 Global Step: 246600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:09:45,597-Speed 3123.66 samples/sec Loss 0.4027 Epoch: 14 Global Step: 246650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:10:01,870-Speed 3146.38 samples/sec Loss 0.4077 Epoch: 14 Global Step: 246700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:10:18,092-Speed 3156.31 samples/sec Loss 0.4121 Epoch: 14 Global Step: 246750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:10:34,632-Speed 3095.55 samples/sec Loss 0.4106 Epoch: 14 Global Step: 246800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:10:50,987-Speed 3130.63 samples/sec Loss 0.4295 Epoch: 14 Global Step: 246850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:11:08,126-Speed 2987.36 samples/sec Loss 0.4161 Epoch: 14 Global Step: 246900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:11:24,486-Speed 3129.78 samples/sec Loss 0.4191 Epoch: 14 Global Step: 246950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:11:40,800-Speed 3138.44 samples/sec Loss 0.4159 Epoch: 14 Global Step: 247000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:11:58,147-Speed 2951.54 samples/sec Loss 0.4113 Epoch: 14 Global Step: 247050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:12:14,499-Speed 3131.30 samples/sec Loss 0.4149 Epoch: 14 Global Step: 247100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:12:30,885-Speed 3124.66 samples/sec Loss 0.4036 Epoch: 14 Global Step: 247150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:12:48,215-Speed 2954.59 samples/sec Loss 0.4195 Epoch: 14 Global Step: 247200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:13:04,576-Speed 3129.30 samples/sec Loss 0.4038 Epoch: 14 Global Step: 247250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:13:20,999-Speed 3117.78 samples/sec Loss 0.4138 Epoch: 14 Global Step: 247300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:13:37,166-Speed 3167.08 samples/sec Loss 0.4156 Epoch: 14 Global Step: 247350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:13:53,400-Speed 3153.94 samples/sec Loss 0.4138 Epoch: 14 Global Step: 247400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:14:09,699-Speed 3141.32 samples/sec Loss 0.4068 Epoch: 14 Global Step: 247450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:14:26,089-Speed 3123.89 samples/sec Loss 0.4227 Epoch: 14 Global Step: 247500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:14:43,549-Speed 2932.56 samples/sec Loss 0.4054 Epoch: 14 Global Step: 247550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:14:59,812-Speed 3148.37 samples/sec Loss 0.4070 Epoch: 14 Global Step: 247600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:15:16,109-Speed 3141.81 samples/sec Loss 0.4151 Epoch: 14 Global Step: 247650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:15:32,442-Speed 3134.82 samples/sec Loss 0.4213 Epoch: 14 Global Step: 247700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:15:48,698-Speed 3149.71 samples/sec Loss 0.4081 Epoch: 14 Global Step: 247750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:16:05,066-Speed 3128.14 samples/sec Loss 0.4164 Epoch: 14 Global Step: 247800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:16:21,625-Speed 3092.11 samples/sec Loss 0.4224 Epoch: 14 Global Step: 247850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:16:37,804-Speed 3164.58 samples/sec Loss 0.4016 Epoch: 14 Global Step: 247900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:16:54,240-Speed 3115.20 samples/sec Loss 0.4110 Epoch: 14 Global Step: 247950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:17:12,091-Speed 2868.31 samples/sec Loss 0.4126 Epoch: 14 Global Step: 248000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:18:05,017-[lfw][248000]XNorm: 21.536239 Training: 2021-03-17 04:18:05,017-[lfw][248000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 04:18:05,017-[lfw][248000]Accuracy-Highest: 0.99817 Training: 2021-03-17 04:19:07,017-[cfp_fp][248000]XNorm: 22.185437 Training: 2021-03-17 04:19:07,017-[cfp_fp][248000]Accuracy-Flip: 0.99129+-0.00569 Training: 2021-03-17 04:19:07,017-[cfp_fp][248000]Accuracy-Highest: 0.99229 Training: 2021-03-17 04:19:59,981-[agedb_30][248000]XNorm: 22.666235 Training: 2021-03-17 04:19:59,981-[agedb_30][248000]Accuracy-Flip: 0.98400+-0.00554 Training: 2021-03-17 04:19:59,981-[agedb_30][248000]Accuracy-Highest: 0.98483 Training: 2021-03-17 04:20:16,379-Speed 277.83 samples/sec Loss 0.4134 Epoch: 14 Global Step: 248050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:20:33,521-Speed 2986.87 samples/sec Loss 0.4078 Epoch: 14 Global Step: 248100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:20:49,978-Speed 3111.39 samples/sec Loss 0.4133 Epoch: 14 Global Step: 248150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:21:06,188-Speed 3158.52 samples/sec Loss 0.4168 Epoch: 14 Global Step: 248200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:21:23,462-Speed 2964.11 samples/sec Loss 0.4003 Epoch: 14 Global Step: 248250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:21:39,853-Speed 3123.79 samples/sec Loss 0.4146 Epoch: 14 Global Step: 248300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:21:56,154-Speed 3141.02 samples/sec Loss 0.4072 Epoch: 14 Global Step: 248350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:22:12,551-Speed 3122.58 samples/sec Loss 0.3971 Epoch: 14 Global Step: 248400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:22:28,861-Speed 3139.29 samples/sec Loss 0.4031 Epoch: 14 Global Step: 248450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:22:45,692-Speed 3042.06 samples/sec Loss 0.4119 Epoch: 14 Global Step: 248500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:23:01,892-Speed 3160.70 samples/sec Loss 0.4122 Epoch: 14 Global Step: 248550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:23:18,697-Speed 3046.70 samples/sec Loss 0.4186 Epoch: 14 Global Step: 248600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:23:34,931-Speed 3154.07 samples/sec Loss 0.4112 Epoch: 14 Global Step: 248650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:23:51,396-Speed 3109.56 samples/sec Loss 0.4194 Epoch: 14 Global Step: 248700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:24:07,562-Speed 3167.37 samples/sec Loss 0.4068 Epoch: 14 Global Step: 248750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:24:24,769-Speed 2975.56 samples/sec Loss 0.4140 Epoch: 14 Global Step: 248800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:24:41,155-Speed 3124.76 samples/sec Loss 0.4201 Epoch: 14 Global Step: 248850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:24:57,478-Speed 3136.80 samples/sec Loss 0.4077 Epoch: 14 Global Step: 248900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:25:13,776-Speed 3141.61 samples/sec Loss 0.4146 Epoch: 14 Global Step: 248950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:25:30,008-Speed 3154.26 samples/sec Loss 0.4211 Epoch: 14 Global Step: 249000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:25:47,148-Speed 2987.17 samples/sec Loss 0.4022 Epoch: 14 Global Step: 249050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:26:03,574-Speed 3117.16 samples/sec Loss 0.4113 Epoch: 14 Global Step: 249100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:26:20,010-Speed 3115.25 samples/sec Loss 0.4122 Epoch: 14 Global Step: 249150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:26:37,360-Speed 2951.10 samples/sec Loss 0.4098 Epoch: 14 Global Step: 249200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:26:53,680-Speed 3137.35 samples/sec Loss 0.4106 Epoch: 14 Global Step: 249250 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:27:10,184-Speed 3102.29 samples/sec Loss 0.4083 Epoch: 14 Global Step: 249300 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:27:26,562-Speed 3126.23 samples/sec Loss 0.4045 Epoch: 14 Global Step: 249350 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:27:43,735-Speed 2981.51 samples/sec Loss 0.4077 Epoch: 14 Global Step: 249400 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:28:00,394-Speed 3073.52 samples/sec Loss 0.4170 Epoch: 14 Global Step: 249450 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:28:16,816-Speed 3117.85 samples/sec Loss 0.4019 Epoch: 14 Global Step: 249500 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:28:33,046-Speed 3154.71 samples/sec Loss 0.4138 Epoch: 14 Global Step: 249550 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:28:49,382-Speed 3134.24 samples/sec Loss 0.4135 Epoch: 14 Global Step: 249600 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:29:05,945-Speed 3091.46 samples/sec Loss 0.4110 Epoch: 14 Global Step: 249650 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:29:22,470-Speed 3098.35 samples/sec Loss 0.4045 Epoch: 14 Global Step: 249700 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:29:39,681-Speed 2974.99 samples/sec Loss 0.4232 Epoch: 14 Global Step: 249750 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:29:55,958-Speed 3145.46 samples/sec Loss 0.4065 Epoch: 14 Global Step: 249800 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:30:12,319-Speed 3129.47 samples/sec Loss 0.4176 Epoch: 14 Global Step: 249850 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:30:28,657-Speed 3133.89 samples/sec Loss 0.4071 Epoch: 14 Global Step: 249900 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:30:45,064-Speed 3120.77 samples/sec Loss 0.4043 Epoch: 14 Global Step: 249950 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:31:01,395-Speed 3135.32 samples/sec Loss 0.4199 Epoch: 14 Global Step: 250000 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:31:54,610-[lfw][250000]XNorm: 21.546334 Training: 2021-03-17 04:31:54,611-[lfw][250000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 04:31:54,611-[lfw][250000]Accuracy-Highest: 0.99817 Training: 2021-03-17 04:32:56,585-[cfp_fp][250000]XNorm: 22.139523 Training: 2021-03-17 04:32:56,585-[cfp_fp][250000]Accuracy-Flip: 0.99157+-0.00569 Training: 2021-03-17 04:32:56,586-[cfp_fp][250000]Accuracy-Highest: 0.99229 Training: 2021-03-17 04:33:49,686-[agedb_30][250000]XNorm: 22.686319 Training: 2021-03-17 04:33:49,686-[agedb_30][250000]Accuracy-Flip: 0.98417+-0.00574 Training: 2021-03-17 04:33:49,686-[agedb_30][250000]Accuracy-Highest: 0.98483 Training: 2021-03-17 04:34:06,011-Speed 277.33 samples/sec Loss 0.4059 Epoch: 14 Global Step: 250050 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:34:22,470-Speed 3110.87 samples/sec Loss 0.4094 Epoch: 14 Global Step: 250100 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:34:38,764-Speed 3142.38 samples/sec Loss 0.4166 Epoch: 14 Global Step: 250150 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:34:55,868-Speed 2993.44 samples/sec Loss 0.4079 Epoch: 14 Global Step: 250200 Fp16 Grad Scale: 16384 Required: 10 hours Training: 2021-03-17 04:35:12,707-Speed 3040.69 samples/sec Loss 0.4104 Epoch: 14 Global Step: 250250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:35:28,998-Speed 3142.84 samples/sec Loss 0.4062 Epoch: 14 Global Step: 250300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:35:45,468-Speed 3108.77 samples/sec Loss 0.4102 Epoch: 14 Global Step: 250350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:36:16,851-Speed 1631.49 samples/sec Loss 0.3998 Epoch: 15 Global Step: 250400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:36:34,380-Speed 2920.90 samples/sec Loss 0.4022 Epoch: 15 Global Step: 250450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:36:50,604-Speed 3155.96 samples/sec Loss 0.3941 Epoch: 15 Global Step: 250500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:37:06,969-Speed 3128.87 samples/sec Loss 0.3946 Epoch: 15 Global Step: 250550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:37:23,241-Speed 3146.61 samples/sec Loss 0.3976 Epoch: 15 Global Step: 250600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:37:39,547-Speed 3139.93 samples/sec Loss 0.4023 Epoch: 15 Global Step: 250650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:37:55,844-Speed 3141.80 samples/sec Loss 0.4002 Epoch: 15 Global Step: 250700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:38:12,323-Speed 3107.04 samples/sec Loss 0.4033 Epoch: 15 Global Step: 250750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:38:29,151-Speed 3042.60 samples/sec Loss 0.4104 Epoch: 15 Global Step: 250800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:38:45,385-Speed 3154.12 samples/sec Loss 0.3960 Epoch: 15 Global Step: 250850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:39:01,660-Speed 3145.99 samples/sec Loss 0.4040 Epoch: 15 Global Step: 250900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:39:18,901-Speed 2969.77 samples/sec Loss 0.3994 Epoch: 15 Global Step: 250950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:39:35,097-Speed 3161.28 samples/sec Loss 0.3969 Epoch: 15 Global Step: 251000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:39:51,540-Speed 3113.91 samples/sec Loss 0.4013 Epoch: 15 Global Step: 251050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:40:08,030-Speed 3105.10 samples/sec Loss 0.4121 Epoch: 15 Global Step: 251100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:40:24,420-Speed 3123.90 samples/sec Loss 0.3979 Epoch: 15 Global Step: 251150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:40:40,704-Speed 3144.31 samples/sec Loss 0.4007 Epoch: 15 Global Step: 251200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:40:56,981-Speed 3145.70 samples/sec Loss 0.3966 Epoch: 15 Global Step: 251250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:41:14,423-Speed 2935.48 samples/sec Loss 0.3958 Epoch: 15 Global Step: 251300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:41:30,841-Speed 3118.64 samples/sec Loss 0.3990 Epoch: 15 Global Step: 251350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:41:47,895-Speed 3002.29 samples/sec Loss 0.4036 Epoch: 15 Global Step: 251400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:42:04,291-Speed 3122.75 samples/sec Loss 0.4025 Epoch: 15 Global Step: 251450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:42:20,588-Speed 3141.73 samples/sec Loss 0.4060 Epoch: 15 Global Step: 251500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:42:36,888-Speed 3141.37 samples/sec Loss 0.4015 Epoch: 15 Global Step: 251550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:42:54,096-Speed 2975.41 samples/sec Loss 0.4129 Epoch: 15 Global Step: 251600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:43:10,343-Speed 3151.36 samples/sec Loss 0.3949 Epoch: 15 Global Step: 251650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:43:26,598-Speed 3149.93 samples/sec Loss 0.3994 Epoch: 15 Global Step: 251700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:43:42,848-Speed 3150.87 samples/sec Loss 0.4016 Epoch: 15 Global Step: 251750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:43:59,274-Speed 3117.04 samples/sec Loss 0.3984 Epoch: 15 Global Step: 251800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:44:15,469-Speed 3161.61 samples/sec Loss 0.3959 Epoch: 15 Global Step: 251850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:44:32,959-Speed 2927.51 samples/sec Loss 0.3993 Epoch: 15 Global Step: 251900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:44:49,236-Speed 3145.64 samples/sec Loss 0.3995 Epoch: 15 Global Step: 251950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:45:05,537-Speed 3140.99 samples/sec Loss 0.3966 Epoch: 15 Global Step: 252000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:45:58,595-[lfw][252000]XNorm: 21.391004 Training: 2021-03-17 04:45:58,596-[lfw][252000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 04:45:58,596-[lfw][252000]Accuracy-Highest: 0.99817 Training: 2021-03-17 04:47:00,487-[cfp_fp][252000]XNorm: 22.176393 Training: 2021-03-17 04:47:00,487-[cfp_fp][252000]Accuracy-Flip: 0.99200+-0.00512 Training: 2021-03-17 04:47:00,487-[cfp_fp][252000]Accuracy-Highest: 0.99229 Training: 2021-03-17 04:47:53,812-[agedb_30][252000]XNorm: 22.659630 Training: 2021-03-17 04:47:53,812-[agedb_30][252000]Accuracy-Flip: 0.98367+-0.00567 Training: 2021-03-17 04:47:53,812-[agedb_30][252000]Accuracy-Highest: 0.98483 Training: 2021-03-17 04:48:10,040-Speed 277.50 samples/sec Loss 0.3917 Epoch: 15 Global Step: 252050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:48:26,263-Speed 3156.13 samples/sec Loss 0.4045 Epoch: 15 Global Step: 252100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:48:42,613-Speed 3131.53 samples/sec Loss 0.4097 Epoch: 15 Global Step: 252150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:48:58,882-Speed 3147.09 samples/sec Loss 0.3929 Epoch: 15 Global Step: 252200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:49:15,192-Speed 3139.28 samples/sec Loss 0.4074 Epoch: 15 Global Step: 252250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:49:31,427-Speed 3153.88 samples/sec Loss 0.3982 Epoch: 15 Global Step: 252300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:49:48,504-Speed 2998.22 samples/sec Loss 0.3992 Epoch: 15 Global Step: 252350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:50:04,772-Speed 3147.40 samples/sec Loss 0.3826 Epoch: 15 Global Step: 252400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:50:20,992-Speed 3156.66 samples/sec Loss 0.3905 Epoch: 15 Global Step: 252450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:50:37,490-Speed 3103.58 samples/sec Loss 0.3922 Epoch: 15 Global Step: 252500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:50:53,902-Speed 3119.72 samples/sec Loss 0.3977 Epoch: 15 Global Step: 252550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:51:11,182-Speed 2963.01 samples/sec Loss 0.3860 Epoch: 15 Global Step: 252600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:51:27,454-Speed 3146.63 samples/sec Loss 0.4040 Epoch: 15 Global Step: 252650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:51:43,804-Speed 3131.58 samples/sec Loss 0.3976 Epoch: 15 Global Step: 252700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:52:00,915-Speed 2992.30 samples/sec Loss 0.3984 Epoch: 15 Global Step: 252750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:52:17,269-Speed 3130.79 samples/sec Loss 0.4101 Epoch: 15 Global Step: 252800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:52:33,651-Speed 3125.59 samples/sec Loss 0.3952 Epoch: 15 Global Step: 252850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:52:49,965-Speed 3138.54 samples/sec Loss 0.3971 Epoch: 15 Global Step: 252900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:53:06,416-Speed 3112.32 samples/sec Loss 0.3985 Epoch: 15 Global Step: 252950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:53:23,047-Speed 3078.56 samples/sec Loss 0.4028 Epoch: 15 Global Step: 253000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:53:39,285-Speed 3153.21 samples/sec Loss 0.3921 Epoch: 15 Global Step: 253050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:53:56,501-Speed 2974.15 samples/sec Loss 0.3973 Epoch: 15 Global Step: 253100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:54:12,809-Speed 3139.60 samples/sec Loss 0.3946 Epoch: 15 Global Step: 253150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:54:29,073-Speed 3148.22 samples/sec Loss 0.3973 Epoch: 15 Global Step: 253200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:54:45,585-Speed 3100.74 samples/sec Loss 0.3974 Epoch: 15 Global Step: 253250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:55:01,844-Speed 3149.26 samples/sec Loss 0.4037 Epoch: 15 Global Step: 253300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:55:18,280-Speed 3115.21 samples/sec Loss 0.3903 Epoch: 15 Global Step: 253350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:55:34,560-Speed 3144.92 samples/sec Loss 0.3972 Epoch: 15 Global Step: 253400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:55:50,944-Speed 3125.06 samples/sec Loss 0.3934 Epoch: 15 Global Step: 253450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:56:07,359-Speed 3119.36 samples/sec Loss 0.3976 Epoch: 15 Global Step: 253500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:56:25,187-Speed 2871.88 samples/sec Loss 0.3985 Epoch: 15 Global Step: 253550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:56:41,626-Speed 3114.71 samples/sec Loss 0.4000 Epoch: 15 Global Step: 253600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:56:57,918-Speed 3142.71 samples/sec Loss 0.4040 Epoch: 15 Global Step: 253650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:57:14,199-Speed 3144.86 samples/sec Loss 0.4081 Epoch: 15 Global Step: 253700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:57:30,624-Speed 3117.26 samples/sec Loss 0.4135 Epoch: 15 Global Step: 253750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:57:47,885-Speed 2966.24 samples/sec Loss 0.3883 Epoch: 15 Global Step: 253800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:58:04,222-Speed 3134.26 samples/sec Loss 0.3918 Epoch: 15 Global Step: 253850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:58:20,492-Speed 3146.93 samples/sec Loss 0.4011 Epoch: 15 Global Step: 253900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:58:36,865-Speed 3127.20 samples/sec Loss 0.3990 Epoch: 15 Global Step: 253950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:58:53,115-Speed 3150.88 samples/sec Loss 0.3988 Epoch: 15 Global Step: 254000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 04:59:46,089-[lfw][254000]XNorm: 21.650944 Training: 2021-03-17 04:59:46,089-[lfw][254000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 04:59:46,090-[lfw][254000]Accuracy-Highest: 0.99817 Training: 2021-03-17 05:00:47,871-[cfp_fp][254000]XNorm: 22.327156 Training: 2021-03-17 05:00:47,872-[cfp_fp][254000]Accuracy-Flip: 0.99186+-0.00523 Training: 2021-03-17 05:00:47,872-[cfp_fp][254000]Accuracy-Highest: 0.99229 Training: 2021-03-17 05:01:41,019-[agedb_30][254000]XNorm: 22.828043 Training: 2021-03-17 05:01:41,020-[agedb_30][254000]Accuracy-Flip: 0.98467+-0.00623 Training: 2021-03-17 05:01:41,020-[agedb_30][254000]Accuracy-Highest: 0.98483 Training: 2021-03-17 05:01:57,212-Speed 278.11 samples/sec Loss 0.3973 Epoch: 15 Global Step: 254050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:02:14,602-Speed 2944.39 samples/sec Loss 0.4005 Epoch: 15 Global Step: 254100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:02:31,008-Speed 3120.90 samples/sec Loss 0.4033 Epoch: 15 Global Step: 254150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:02:47,245-Speed 3153.42 samples/sec Loss 0.4068 Epoch: 15 Global Step: 254200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:03:03,555-Speed 3139.30 samples/sec Loss 0.3999 Epoch: 15 Global Step: 254250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:03:19,914-Speed 3129.82 samples/sec Loss 0.4022 Epoch: 15 Global Step: 254300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:03:36,340-Speed 3117.08 samples/sec Loss 0.4006 Epoch: 15 Global Step: 254350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:03:52,612-Speed 3146.63 samples/sec Loss 0.3947 Epoch: 15 Global Step: 254400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:04:08,916-Speed 3140.32 samples/sec Loss 0.3935 Epoch: 15 Global Step: 254450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:04:25,202-Speed 3143.91 samples/sec Loss 0.3969 Epoch: 15 Global Step: 254500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:04:41,811-Speed 3082.84 samples/sec Loss 0.4085 Epoch: 15 Global Step: 254550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:04:58,923-Speed 2992.05 samples/sec Loss 0.4001 Epoch: 15 Global Step: 254600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:05:15,225-Speed 3140.89 samples/sec Loss 0.3891 Epoch: 15 Global Step: 254650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:05:31,726-Speed 3102.98 samples/sec Loss 0.4007 Epoch: 15 Global Step: 254700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:05:48,937-Speed 2974.85 samples/sec Loss 0.3975 Epoch: 15 Global Step: 254750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:06:05,717-Speed 3051.33 samples/sec Loss 0.3905 Epoch: 15 Global Step: 254800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:06:21,990-Speed 3146.51 samples/sec Loss 0.3962 Epoch: 15 Global Step: 254850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:06:38,191-Speed 3160.39 samples/sec Loss 0.3995 Epoch: 15 Global Step: 254900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:06:55,309-Speed 2991.03 samples/sec Loss 0.3986 Epoch: 15 Global Step: 254950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:07:11,641-Speed 3135.00 samples/sec Loss 0.3951 Epoch: 15 Global Step: 255000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:07:28,006-Speed 3128.68 samples/sec Loss 0.3996 Epoch: 15 Global Step: 255050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:07:44,307-Speed 3141.01 samples/sec Loss 0.3889 Epoch: 15 Global Step: 255100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:08:00,506-Speed 3160.90 samples/sec Loss 0.3891 Epoch: 15 Global Step: 255150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:08:16,891-Speed 3124.84 samples/sec Loss 0.4034 Epoch: 15 Global Step: 255200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:08:33,170-Speed 3145.33 samples/sec Loss 0.3957 Epoch: 15 Global Step: 255250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:08:49,396-Speed 3155.39 samples/sec Loss 0.3913 Epoch: 15 Global Step: 255300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:09:05,772-Speed 3126.71 samples/sec Loss 0.3981 Epoch: 15 Global Step: 255350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:09:22,908-Speed 2988.00 samples/sec Loss 0.3956 Epoch: 15 Global Step: 255400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:09:39,136-Speed 3155.00 samples/sec Loss 0.4074 Epoch: 15 Global Step: 255450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:09:55,652-Speed 3100.13 samples/sec Loss 0.3909 Epoch: 15 Global Step: 255500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:10:11,936-Speed 3144.35 samples/sec Loss 0.3923 Epoch: 15 Global Step: 255550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:10:28,239-Speed 3140.67 samples/sec Loss 0.3991 Epoch: 15 Global Step: 255600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:10:44,672-Speed 3115.69 samples/sec Loss 0.3905 Epoch: 15 Global Step: 255650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:11:01,032-Speed 3129.80 samples/sec Loss 0.3981 Epoch: 15 Global Step: 255700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:11:18,497-Speed 2931.65 samples/sec Loss 0.3985 Epoch: 15 Global Step: 255750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:11:34,761-Speed 3148.09 samples/sec Loss 0.3978 Epoch: 15 Global Step: 255800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:11:51,092-Speed 3135.28 samples/sec Loss 0.3962 Epoch: 15 Global Step: 255850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:12:08,397-Speed 2958.79 samples/sec Loss 0.4034 Epoch: 15 Global Step: 255900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:12:25,800-Speed 2942.09 samples/sec Loss 0.3951 Epoch: 15 Global Step: 255950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:12:42,417-Speed 3081.18 samples/sec Loss 0.3956 Epoch: 15 Global Step: 256000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:13:35,893-[lfw][256000]XNorm: 21.670358 Training: 2021-03-17 05:13:35,894-[lfw][256000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 05:13:35,894-[lfw][256000]Accuracy-Highest: 0.99817 Training: 2021-03-17 05:14:38,118-[cfp_fp][256000]XNorm: 22.315174 Training: 2021-03-17 05:14:38,118-[cfp_fp][256000]Accuracy-Flip: 0.99257+-0.00522 Training: 2021-03-17 05:14:38,118-[cfp_fp][256000]Accuracy-Highest: 0.99257 Training: 2021-03-17 05:15:31,500-[agedb_30][256000]XNorm: 22.939395 Training: 2021-03-17 05:15:31,501-[agedb_30][256000]Accuracy-Flip: 0.98433+-0.00638 Training: 2021-03-17 05:15:31,501-[agedb_30][256000]Accuracy-Highest: 0.98483 Training: 2021-03-17 05:15:47,820-Speed 276.16 samples/sec Loss 0.4166 Epoch: 15 Global Step: 256050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:16:04,040-Speed 3156.82 samples/sec Loss 0.3940 Epoch: 15 Global Step: 256100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:16:20,664-Speed 3080.01 samples/sec Loss 0.3985 Epoch: 15 Global Step: 256150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:16:36,939-Speed 3145.88 samples/sec Loss 0.4148 Epoch: 15 Global Step: 256200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:16:53,935-Speed 3012.59 samples/sec Loss 0.4086 Epoch: 15 Global Step: 256250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:17:10,440-Speed 3102.24 samples/sec Loss 0.4115 Epoch: 15 Global Step: 256300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:17:26,888-Speed 3112.97 samples/sec Loss 0.3852 Epoch: 15 Global Step: 256350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:17:43,355-Speed 3109.34 samples/sec Loss 0.4009 Epoch: 15 Global Step: 256400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:17:59,767-Speed 3119.73 samples/sec Loss 0.4019 Epoch: 15 Global Step: 256450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:18:16,472-Speed 3064.94 samples/sec Loss 0.3999 Epoch: 15 Global Step: 256500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:18:32,728-Speed 3149.83 samples/sec Loss 0.3978 Epoch: 15 Global Step: 256550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:18:49,473-Speed 3057.70 samples/sec Loss 0.4002 Epoch: 15 Global Step: 256600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:19:05,634-Speed 3168.09 samples/sec Loss 0.4039 Epoch: 15 Global Step: 256650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:19:21,916-Speed 3144.87 samples/sec Loss 0.3982 Epoch: 15 Global Step: 256700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:19:38,273-Speed 3130.13 samples/sec Loss 0.3926 Epoch: 15 Global Step: 256750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:19:54,551-Speed 3145.44 samples/sec Loss 0.3974 Epoch: 15 Global Step: 256800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:20:10,998-Speed 3113.16 samples/sec Loss 0.3957 Epoch: 15 Global Step: 256850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:20:28,207-Speed 2975.18 samples/sec Loss 0.3923 Epoch: 15 Global Step: 256900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:20:44,556-Speed 3131.92 samples/sec Loss 0.3998 Epoch: 15 Global Step: 256950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:21:00,771-Speed 3157.55 samples/sec Loss 0.3985 Epoch: 15 Global Step: 257000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:21:18,332-Speed 2915.66 samples/sec Loss 0.3998 Epoch: 15 Global Step: 257050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:21:34,811-Speed 3107.07 samples/sec Loss 0.3960 Epoch: 15 Global Step: 257100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:21:51,181-Speed 3127.66 samples/sec Loss 0.3951 Epoch: 15 Global Step: 257150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:22:08,432-Speed 2968.12 samples/sec Loss 0.3930 Epoch: 15 Global Step: 257200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:22:24,918-Speed 3105.69 samples/sec Loss 0.4054 Epoch: 15 Global Step: 257250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:22:41,324-Speed 3120.93 samples/sec Loss 0.3889 Epoch: 15 Global Step: 257300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:22:57,736-Speed 3119.84 samples/sec Loss 0.3980 Epoch: 15 Global Step: 257350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:23:14,028-Speed 3142.79 samples/sec Loss 0.3999 Epoch: 15 Global Step: 257400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:23:30,421-Speed 3123.30 samples/sec Loss 0.3934 Epoch: 15 Global Step: 257450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:23:46,694-Speed 3146.44 samples/sec Loss 0.4006 Epoch: 15 Global Step: 257500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:24:03,027-Speed 3134.72 samples/sec Loss 0.3975 Epoch: 15 Global Step: 257550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:24:20,218-Speed 2978.44 samples/sec Loss 0.4001 Epoch: 15 Global Step: 257600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:24:36,512-Speed 3142.45 samples/sec Loss 0.3949 Epoch: 15 Global Step: 257650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:24:53,015-Speed 3102.54 samples/sec Loss 0.3889 Epoch: 15 Global Step: 257700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:25:09,214-Speed 3160.79 samples/sec Loss 0.3996 Epoch: 15 Global Step: 257750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:25:25,459-Speed 3151.67 samples/sec Loss 0.3989 Epoch: 15 Global Step: 257800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:25:41,960-Speed 3102.95 samples/sec Loss 0.4051 Epoch: 15 Global Step: 257850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:25:58,245-Speed 3144.14 samples/sec Loss 0.4013 Epoch: 15 Global Step: 257900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:26:15,434-Speed 2978.69 samples/sec Loss 0.3899 Epoch: 15 Global Step: 257950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:26:31,723-Speed 3143.34 samples/sec Loss 0.3987 Epoch: 15 Global Step: 258000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:27:24,990-[lfw][258000]XNorm: 21.554345 Training: 2021-03-17 05:27:24,991-[lfw][258000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 05:27:24,991-[lfw][258000]Accuracy-Highest: 0.99817 Training: 2021-03-17 05:28:26,681-[cfp_fp][258000]XNorm: 22.237479 Training: 2021-03-17 05:28:26,682-[cfp_fp][258000]Accuracy-Flip: 0.99243+-0.00519 Training: 2021-03-17 05:28:26,682-[cfp_fp][258000]Accuracy-Highest: 0.99257 Training: 2021-03-17 05:29:19,678-[agedb_30][258000]XNorm: 22.744530 Training: 2021-03-17 05:29:19,678-[agedb_30][258000]Accuracy-Flip: 0.98383+-0.00573 Training: 2021-03-17 05:29:19,678-[agedb_30][258000]Accuracy-Highest: 0.98483 Training: 2021-03-17 05:29:36,090-Speed 277.71 samples/sec Loss 0.3995 Epoch: 15 Global Step: 258050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:29:53,207-Speed 2991.19 samples/sec Loss 0.3990 Epoch: 15 Global Step: 258100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:30:10,563-Speed 2950.07 samples/sec Loss 0.3996 Epoch: 15 Global Step: 258150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:30:26,720-Speed 3168.87 samples/sec Loss 0.3932 Epoch: 15 Global Step: 258200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:30:43,048-Speed 3135.84 samples/sec Loss 0.4025 Epoch: 15 Global Step: 258250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:30:59,394-Speed 3132.51 samples/sec Loss 0.3827 Epoch: 15 Global Step: 258300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:31:15,794-Speed 3121.93 samples/sec Loss 0.3999 Epoch: 15 Global Step: 258350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:31:32,083-Speed 3143.44 samples/sec Loss 0.3936 Epoch: 15 Global Step: 258400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:31:49,135-Speed 3002.66 samples/sec Loss 0.3932 Epoch: 15 Global Step: 258450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:32:05,405-Speed 3146.91 samples/sec Loss 0.3976 Epoch: 15 Global Step: 258500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:32:21,841-Speed 3115.17 samples/sec Loss 0.3977 Epoch: 15 Global Step: 258550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:32:38,163-Speed 3137.09 samples/sec Loss 0.3971 Epoch: 15 Global Step: 258600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:32:54,560-Speed 3122.53 samples/sec Loss 0.3992 Epoch: 15 Global Step: 258650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:33:11,094-Speed 3096.78 samples/sec Loss 0.4026 Epoch: 15 Global Step: 258700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:33:27,455-Speed 3129.47 samples/sec Loss 0.3973 Epoch: 15 Global Step: 258750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:33:43,810-Speed 3130.51 samples/sec Loss 0.3949 Epoch: 15 Global Step: 258800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:34:00,245-Speed 3115.51 samples/sec Loss 0.3891 Epoch: 15 Global Step: 258850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:34:16,442-Speed 3161.20 samples/sec Loss 0.4090 Epoch: 15 Global Step: 258900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:34:32,951-Speed 3101.32 samples/sec Loss 0.3924 Epoch: 15 Global Step: 258950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:34:49,371-Speed 3118.34 samples/sec Loss 0.3867 Epoch: 15 Global Step: 259000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-17 05:35:05,901-Speed 3097.39 samples/sec Loss 0.3960 Epoch: 15 Global Step: 259050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:35:22,140-Speed 3152.98 samples/sec Loss 0.4018 Epoch: 15 Global Step: 259100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:35:39,334-Speed 2977.90 samples/sec Loss 0.3937 Epoch: 15 Global Step: 259150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:35:56,495-Speed 2983.60 samples/sec Loss 0.3991 Epoch: 15 Global Step: 259200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:36:12,743-Speed 3151.16 samples/sec Loss 0.3961 Epoch: 15 Global Step: 259250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:36:29,089-Speed 3132.36 samples/sec Loss 0.4043 Epoch: 15 Global Step: 259300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:36:45,392-Speed 3140.58 samples/sec Loss 0.3919 Epoch: 15 Global Step: 259350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:37:01,668-Speed 3145.97 samples/sec Loss 0.3911 Epoch: 15 Global Step: 259400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:37:18,013-Speed 3132.50 samples/sec Loss 0.3934 Epoch: 15 Global Step: 259450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:37:35,290-Speed 2963.52 samples/sec Loss 0.3936 Epoch: 15 Global Step: 259500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:37:51,581-Speed 3142.98 samples/sec Loss 0.3967 Epoch: 15 Global Step: 259550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:38:07,865-Speed 3144.36 samples/sec Loss 0.4006 Epoch: 15 Global Step: 259600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:38:24,067-Speed 3160.23 samples/sec Loss 0.3977 Epoch: 15 Global Step: 259650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:38:40,595-Speed 3097.86 samples/sec Loss 0.3843 Epoch: 15 Global Step: 259700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:38:56,843-Speed 3151.07 samples/sec Loss 0.3997 Epoch: 15 Global Step: 259750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:39:14,230-Speed 2944.84 samples/sec Loss 0.3963 Epoch: 15 Global Step: 259800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:39:30,690-Speed 3110.72 samples/sec Loss 0.4047 Epoch: 15 Global Step: 259850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:39:46,926-Speed 3153.61 samples/sec Loss 0.3921 Epoch: 15 Global Step: 259900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:40:03,347-Speed 3118.07 samples/sec Loss 0.4060 Epoch: 15 Global Step: 259950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:40:19,866-Speed 3099.55 samples/sec Loss 0.3948 Epoch: 15 Global Step: 260000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:41:12,946-[lfw][260000]XNorm: 21.469901 Training: 2021-03-17 05:41:12,947-[lfw][260000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 05:41:12,947-[lfw][260000]Accuracy-Highest: 0.99817 Training: 2021-03-17 05:42:14,775-[cfp_fp][260000]XNorm: 22.283746 Training: 2021-03-17 05:42:14,775-[cfp_fp][260000]Accuracy-Flip: 0.99243+-0.00519 Training: 2021-03-17 05:42:14,775-[cfp_fp][260000]Accuracy-Highest: 0.99257 Training: 2021-03-17 05:43:07,943-[agedb_30][260000]XNorm: 22.701568 Training: 2021-03-17 05:43:07,943-[agedb_30][260000]Accuracy-Flip: 0.98383+-0.00578 Training: 2021-03-17 05:43:07,943-[agedb_30][260000]Accuracy-Highest: 0.98483 Training: 2021-03-17 05:43:24,401-Speed 277.45 samples/sec Loss 0.3988 Epoch: 15 Global Step: 260050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:43:41,386-Speed 3014.58 samples/sec Loss 0.4032 Epoch: 15 Global Step: 260100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:43:57,891-Speed 3102.16 samples/sec Loss 0.3930 Epoch: 15 Global Step: 260150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:44:14,638-Speed 3057.27 samples/sec Loss 0.3963 Epoch: 15 Global Step: 260200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:44:30,855-Speed 3157.40 samples/sec Loss 0.3838 Epoch: 15 Global Step: 260250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:44:48,019-Speed 2983.07 samples/sec Loss 0.4045 Epoch: 15 Global Step: 260300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:45:05,476-Speed 2933.03 samples/sec Loss 0.4005 Epoch: 15 Global Step: 260350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:45:21,862-Speed 3124.60 samples/sec Loss 0.3945 Epoch: 15 Global Step: 260400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:45:38,235-Speed 3127.24 samples/sec Loss 0.3918 Epoch: 15 Global Step: 260450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:45:54,439-Speed 3159.81 samples/sec Loss 0.3991 Epoch: 15 Global Step: 260500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:46:10,898-Speed 3110.81 samples/sec Loss 0.3882 Epoch: 15 Global Step: 260550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:46:28,005-Speed 2993.03 samples/sec Loss 0.4011 Epoch: 15 Global Step: 260600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:46:44,366-Speed 3129.41 samples/sec Loss 0.3940 Epoch: 15 Global Step: 260650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:47:00,738-Speed 3127.43 samples/sec Loss 0.3928 Epoch: 15 Global Step: 260700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:47:16,951-Speed 3157.96 samples/sec Loss 0.3957 Epoch: 15 Global Step: 260750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:47:33,121-Speed 3166.53 samples/sec Loss 0.3860 Epoch: 15 Global Step: 260800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:47:49,717-Speed 3085.15 samples/sec Loss 0.4006 Epoch: 15 Global Step: 260850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:48:06,121-Speed 3121.37 samples/sec Loss 0.4127 Epoch: 15 Global Step: 260900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:48:22,459-Speed 3133.88 samples/sec Loss 0.3922 Epoch: 15 Global Step: 260950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:48:38,998-Speed 3095.70 samples/sec Loss 0.3897 Epoch: 15 Global Step: 261000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:48:55,408-Speed 3120.14 samples/sec Loss 0.4029 Epoch: 15 Global Step: 261050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:49:11,775-Speed 3128.32 samples/sec Loss 0.4026 Epoch: 15 Global Step: 261100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:49:27,970-Speed 3161.59 samples/sec Loss 0.3886 Epoch: 15 Global Step: 261150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:49:44,352-Speed 3125.47 samples/sec Loss 0.3953 Epoch: 15 Global Step: 261200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:50:00,638-Speed 3143.90 samples/sec Loss 0.3980 Epoch: 15 Global Step: 261250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:50:16,988-Speed 3131.48 samples/sec Loss 0.4049 Epoch: 15 Global Step: 261300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:50:34,259-Speed 2964.74 samples/sec Loss 0.3958 Epoch: 15 Global Step: 261350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:50:51,509-Speed 2968.16 samples/sec Loss 0.3976 Epoch: 15 Global Step: 261400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:51:07,988-Speed 3107.07 samples/sec Loss 0.3991 Epoch: 15 Global Step: 261450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:51:24,220-Speed 3154.21 samples/sec Loss 0.3875 Epoch: 15 Global Step: 261500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:51:40,523-Speed 3140.65 samples/sec Loss 0.3948 Epoch: 15 Global Step: 261550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:51:57,012-Speed 3105.18 samples/sec Loss 0.4047 Epoch: 15 Global Step: 261600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:52:13,336-Speed 3136.70 samples/sec Loss 0.3951 Epoch: 15 Global Step: 261650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:52:29,689-Speed 3131.03 samples/sec Loss 0.4094 Epoch: 15 Global Step: 261700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:52:46,295-Speed 3083.19 samples/sec Loss 0.4023 Epoch: 15 Global Step: 261750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:53:03,716-Speed 2939.14 samples/sec Loss 0.3874 Epoch: 15 Global Step: 261800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:53:20,055-Speed 3133.80 samples/sec Loss 0.3959 Epoch: 15 Global Step: 261850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:53:36,933-Speed 3033.48 samples/sec Loss 0.3955 Epoch: 15 Global Step: 261900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:53:53,336-Speed 3121.56 samples/sec Loss 0.3817 Epoch: 15 Global Step: 261950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:54:10,536-Speed 2976.72 samples/sec Loss 0.4005 Epoch: 15 Global Step: 262000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:55:03,559-[lfw][262000]XNorm: 21.613552 Training: 2021-03-17 05:55:03,559-[lfw][262000]Accuracy-Flip: 0.99750+-0.00250 Training: 2021-03-17 05:55:03,559-[lfw][262000]Accuracy-Highest: 0.99817 Training: 2021-03-17 05:56:05,628-[cfp_fp][262000]XNorm: 22.324300 Training: 2021-03-17 05:56:05,628-[cfp_fp][262000]Accuracy-Flip: 0.99200+-0.00579 Training: 2021-03-17 05:56:05,628-[cfp_fp][262000]Accuracy-Highest: 0.99257 Training: 2021-03-17 05:56:59,208-[agedb_30][262000]XNorm: 22.780255 Training: 2021-03-17 05:56:59,209-[agedb_30][262000]Accuracy-Flip: 0.98433+-0.00638 Training: 2021-03-17 05:56:59,209-[agedb_30][262000]Accuracy-Highest: 0.98483 Training: 2021-03-17 05:57:15,389-Speed 276.98 samples/sec Loss 0.4066 Epoch: 15 Global Step: 262050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:57:31,720-Speed 3135.24 samples/sec Loss 0.3938 Epoch: 15 Global Step: 262100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:57:48,139-Speed 3118.42 samples/sec Loss 0.3951 Epoch: 15 Global Step: 262150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:58:04,335-Speed 3161.38 samples/sec Loss 0.3893 Epoch: 15 Global Step: 262200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:58:20,764-Speed 3116.47 samples/sec Loss 0.3874 Epoch: 15 Global Step: 262250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:58:37,752-Speed 3014.04 samples/sec Loss 0.3979 Epoch: 15 Global Step: 262300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:58:54,087-Speed 3134.41 samples/sec Loss 0.3911 Epoch: 15 Global Step: 262350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:59:10,339-Speed 3150.43 samples/sec Loss 0.4081 Epoch: 15 Global Step: 262400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:59:26,714-Speed 3126.95 samples/sec Loss 0.3936 Epoch: 15 Global Step: 262450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:59:42,977-Speed 3148.23 samples/sec Loss 0.4045 Epoch: 15 Global Step: 262500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 05:59:59,252-Speed 3145.98 samples/sec Loss 0.3933 Epoch: 15 Global Step: 262550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:00:16,378-Speed 2989.81 samples/sec Loss 0.4018 Epoch: 15 Global Step: 262600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:00:33,478-Speed 2994.13 samples/sec Loss 0.3973 Epoch: 15 Global Step: 262650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:00:49,903-Speed 3117.28 samples/sec Loss 0.4005 Epoch: 15 Global Step: 262700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:01:06,226-Speed 3136.90 samples/sec Loss 0.3993 Epoch: 15 Global Step: 262750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:01:22,562-Speed 3134.25 samples/sec Loss 0.4007 Epoch: 15 Global Step: 262800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:01:39,635-Speed 2998.94 samples/sec Loss 0.3958 Epoch: 15 Global Step: 262850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:01:55,902-Speed 3147.66 samples/sec Loss 0.3976 Epoch: 15 Global Step: 262900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:02:12,728-Speed 3042.97 samples/sec Loss 0.4022 Epoch: 15 Global Step: 262950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:02:29,139-Speed 3119.88 samples/sec Loss 0.3854 Epoch: 15 Global Step: 263000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:02:45,605-Speed 3109.62 samples/sec Loss 0.3966 Epoch: 15 Global Step: 263050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:03:01,853-Speed 3151.07 samples/sec Loss 0.3947 Epoch: 15 Global Step: 263100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:03:18,025-Speed 3166.21 samples/sec Loss 0.3996 Epoch: 15 Global Step: 263150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:03:34,459-Speed 3115.51 samples/sec Loss 0.3941 Epoch: 15 Global Step: 263200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:03:51,036-Speed 3088.70 samples/sec Loss 0.3902 Epoch: 15 Global Step: 263250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:04:07,370-Speed 3134.63 samples/sec Loss 0.4076 Epoch: 15 Global Step: 263300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:04:23,886-Speed 3100.09 samples/sec Loss 0.3943 Epoch: 15 Global Step: 263350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:04:40,323-Speed 3115.16 samples/sec Loss 0.3926 Epoch: 15 Global Step: 263400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:04:56,493-Speed 3166.42 samples/sec Loss 0.3928 Epoch: 15 Global Step: 263450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:05:12,795-Speed 3140.78 samples/sec Loss 0.4060 Epoch: 15 Global Step: 263500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:05:29,905-Speed 2992.44 samples/sec Loss 0.4013 Epoch: 15 Global Step: 263550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:05:46,339-Speed 3115.62 samples/sec Loss 0.3918 Epoch: 15 Global Step: 263600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:06:03,585-Speed 2968.85 samples/sec Loss 0.3948 Epoch: 15 Global Step: 263650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:06:19,915-Speed 3135.42 samples/sec Loss 0.4006 Epoch: 15 Global Step: 263700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:06:36,263-Speed 3132.06 samples/sec Loss 0.4017 Epoch: 15 Global Step: 263750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:06:52,465-Speed 3160.05 samples/sec Loss 0.3984 Epoch: 15 Global Step: 263800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:07:08,750-Speed 3144.22 samples/sec Loss 0.3918 Epoch: 15 Global Step: 263850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:07:25,125-Speed 3126.84 samples/sec Loss 0.3923 Epoch: 15 Global Step: 263900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:07:41,467-Speed 3133.02 samples/sec Loss 0.3952 Epoch: 15 Global Step: 263950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:07:58,852-Speed 2945.21 samples/sec Loss 0.3959 Epoch: 15 Global Step: 264000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:08:51,789-[lfw][264000]XNorm: 21.658509 Training: 2021-03-17 06:08:51,789-[lfw][264000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 06:08:51,789-[lfw][264000]Accuracy-Highest: 0.99817 Training: 2021-03-17 06:09:54,106-[cfp_fp][264000]XNorm: 22.266170 Training: 2021-03-17 06:09:54,107-[cfp_fp][264000]Accuracy-Flip: 0.99200+-0.00524 Training: 2021-03-17 06:09:54,107-[cfp_fp][264000]Accuracy-Highest: 0.99257 Training: 2021-03-17 06:10:47,184-[agedb_30][264000]XNorm: 22.815070 Training: 2021-03-17 06:10:47,184-[agedb_30][264000]Accuracy-Flip: 0.98467+-0.00627 Training: 2021-03-17 06:10:47,184-[agedb_30][264000]Accuracy-Highest: 0.98483 Training: 2021-03-17 06:11:03,505-Speed 277.28 samples/sec Loss 0.3995 Epoch: 15 Global Step: 264050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:11:19,920-Speed 3119.12 samples/sec Loss 0.3902 Epoch: 15 Global Step: 264100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:11:36,179-Speed 3149.20 samples/sec Loss 0.3879 Epoch: 15 Global Step: 264150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:11:52,803-Speed 3079.96 samples/sec Loss 0.3919 Epoch: 15 Global Step: 264200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:12:09,059-Speed 3149.64 samples/sec Loss 0.4018 Epoch: 15 Global Step: 264250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:12:26,242-Speed 2979.80 samples/sec Loss 0.4055 Epoch: 15 Global Step: 264300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:12:42,899-Speed 3074.00 samples/sec Loss 0.4084 Epoch: 15 Global Step: 264350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:12:59,099-Speed 3160.46 samples/sec Loss 0.3939 Epoch: 15 Global Step: 264400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:13:15,618-Speed 3099.62 samples/sec Loss 0.3971 Epoch: 15 Global Step: 264450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:13:31,804-Speed 3163.23 samples/sec Loss 0.3973 Epoch: 15 Global Step: 264500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:13:49,169-Speed 2948.69 samples/sec Loss 0.3961 Epoch: 15 Global Step: 264550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:14:05,573-Speed 3121.19 samples/sec Loss 0.3976 Epoch: 15 Global Step: 264600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:14:21,765-Speed 3162.12 samples/sec Loss 0.3985 Epoch: 15 Global Step: 264650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:14:38,049-Speed 3144.27 samples/sec Loss 0.3897 Epoch: 15 Global Step: 264700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:14:54,484-Speed 3115.35 samples/sec Loss 0.4040 Epoch: 15 Global Step: 264750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:15:11,571-Speed 2996.59 samples/sec Loss 0.3973 Epoch: 15 Global Step: 264800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:15:27,993-Speed 3117.89 samples/sec Loss 0.3974 Epoch: 15 Global Step: 264850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:15:44,257-Speed 3148.18 samples/sec Loss 0.4015 Epoch: 15 Global Step: 264900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:16:01,328-Speed 2999.30 samples/sec Loss 0.3935 Epoch: 15 Global Step: 264950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:16:18,606-Speed 2963.34 samples/sec Loss 0.3951 Epoch: 15 Global Step: 265000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:16:34,888-Speed 3144.57 samples/sec Loss 0.3990 Epoch: 15 Global Step: 265050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:16:51,350-Speed 3110.38 samples/sec Loss 0.4032 Epoch: 15 Global Step: 265100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:17:07,876-Speed 3098.18 samples/sec Loss 0.3922 Epoch: 15 Global Step: 265150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:17:24,143-Speed 3147.56 samples/sec Loss 0.3986 Epoch: 15 Global Step: 265200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:17:40,469-Speed 3136.20 samples/sec Loss 0.3866 Epoch: 15 Global Step: 265250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:17:56,852-Speed 3125.34 samples/sec Loss 0.3961 Epoch: 15 Global Step: 265300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:18:13,269-Speed 3118.77 samples/sec Loss 0.3972 Epoch: 15 Global Step: 265350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:18:29,999-Speed 3060.59 samples/sec Loss 0.3948 Epoch: 15 Global Step: 265400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:18:46,296-Speed 3141.62 samples/sec Loss 0.3826 Epoch: 15 Global Step: 265450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:19:02,537-Speed 3152.78 samples/sec Loss 0.3902 Epoch: 15 Global Step: 265500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:19:18,885-Speed 3131.95 samples/sec Loss 0.3981 Epoch: 15 Global Step: 265550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:19:35,101-Speed 3157.46 samples/sec Loss 0.3980 Epoch: 15 Global Step: 265600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:19:51,603-Speed 3102.76 samples/sec Loss 0.3917 Epoch: 15 Global Step: 265650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:20:08,049-Speed 3113.28 samples/sec Loss 0.4021 Epoch: 15 Global Step: 265700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:20:25,222-Speed 2981.44 samples/sec Loss 0.3916 Epoch: 15 Global Step: 265750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:20:41,697-Speed 3107.92 samples/sec Loss 0.3868 Epoch: 15 Global Step: 265800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:20:58,006-Speed 3139.36 samples/sec Loss 0.3866 Epoch: 15 Global Step: 265850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:21:14,403-Speed 3122.75 samples/sec Loss 0.3985 Epoch: 15 Global Step: 265900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:21:30,644-Speed 3152.56 samples/sec Loss 0.3963 Epoch: 15 Global Step: 265950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:21:47,833-Speed 2978.73 samples/sec Loss 0.3890 Epoch: 15 Global Step: 266000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:22:41,102-[lfw][266000]XNorm: 21.623312 Training: 2021-03-17 06:22:41,103-[lfw][266000]Accuracy-Flip: 0.99767+-0.00260 Training: 2021-03-17 06:22:41,103-[lfw][266000]Accuracy-Highest: 0.99817 Training: 2021-03-17 06:23:43,003-[cfp_fp][266000]XNorm: 22.337101 Training: 2021-03-17 06:23:43,003-[cfp_fp][266000]Accuracy-Flip: 0.99243+-0.00519 Training: 2021-03-17 06:23:43,003-[cfp_fp][266000]Accuracy-Highest: 0.99257 Training: 2021-03-17 06:24:36,316-[agedb_30][266000]XNorm: 22.800980 Training: 2021-03-17 06:24:36,316-[agedb_30][266000]Accuracy-Flip: 0.98383+-0.00615 Training: 2021-03-17 06:24:36,317-[agedb_30][266000]Accuracy-Highest: 0.98483 Training: 2021-03-17 06:24:52,525-Speed 277.22 samples/sec Loss 0.3925 Epoch: 15 Global Step: 266050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:25:08,785-Speed 3149.02 samples/sec Loss 0.3881 Epoch: 15 Global Step: 266100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:25:25,021-Speed 3153.59 samples/sec Loss 0.3814 Epoch: 15 Global Step: 266150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:25:41,453-Speed 3115.85 samples/sec Loss 0.3824 Epoch: 15 Global Step: 266200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:25:58,682-Speed 2971.86 samples/sec Loss 0.3904 Epoch: 15 Global Step: 266250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:26:14,900-Speed 3157.01 samples/sec Loss 0.3847 Epoch: 15 Global Step: 266300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:26:31,193-Speed 3142.62 samples/sec Loss 0.3863 Epoch: 15 Global Step: 266350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:26:47,481-Speed 3143.54 samples/sec Loss 0.3905 Epoch: 15 Global Step: 266400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:27:03,733-Speed 3150.43 samples/sec Loss 0.3967 Epoch: 15 Global Step: 266450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:27:20,953-Speed 2973.43 samples/sec Loss 0.3872 Epoch: 15 Global Step: 266500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:27:37,306-Speed 3131.02 samples/sec Loss 0.4001 Epoch: 15 Global Step: 266550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:27:53,629-Speed 3136.74 samples/sec Loss 0.3947 Epoch: 15 Global Step: 266600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:28:10,031-Speed 3121.64 samples/sec Loss 0.3983 Epoch: 15 Global Step: 266650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:28:26,459-Speed 3116.64 samples/sec Loss 0.3998 Epoch: 15 Global Step: 266700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:28:43,578-Speed 2990.96 samples/sec Loss 0.3915 Epoch: 15 Global Step: 266750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:28:59,940-Speed 3129.35 samples/sec Loss 0.4023 Epoch: 15 Global Step: 266800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:29:16,190-Speed 3150.78 samples/sec Loss 0.3895 Epoch: 15 Global Step: 266850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:29:32,478-Speed 3143.53 samples/sec Loss 0.3940 Epoch: 15 Global Step: 266900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:29:48,819-Speed 3133.41 samples/sec Loss 0.3921 Epoch: 15 Global Step: 266950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:30:06,042-Speed 2972.85 samples/sec Loss 0.3983 Epoch: 15 Global Step: 267000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:30:23,026-Speed 3014.63 samples/sec Loss 0.3981 Epoch: 15 Global Step: 267050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:30:54,342-Speed 1634.98 samples/sec Loss 0.3849 Epoch: 16 Global Step: 267100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:31:12,355-Speed 2842.41 samples/sec Loss 0.3907 Epoch: 16 Global Step: 267150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:31:28,644-Speed 3143.46 samples/sec Loss 0.3869 Epoch: 16 Global Step: 267200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:31:45,059-Speed 3119.07 samples/sec Loss 0.3775 Epoch: 16 Global Step: 267250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:32:01,498-Speed 3114.66 samples/sec Loss 0.3772 Epoch: 16 Global Step: 267300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:32:17,840-Speed 3133.30 samples/sec Loss 0.3859 Epoch: 16 Global Step: 267350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:32:34,398-Speed 3092.26 samples/sec Loss 0.3756 Epoch: 16 Global Step: 267400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:32:50,571-Speed 3165.84 samples/sec Loss 0.3860 Epoch: 16 Global Step: 267450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:33:06,690-Speed 3176.35 samples/sec Loss 0.3744 Epoch: 16 Global Step: 267500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:33:22,867-Speed 3165.10 samples/sec Loss 0.3893 Epoch: 16 Global Step: 267550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:33:39,454-Speed 3086.91 samples/sec Loss 0.3848 Epoch: 16 Global Step: 267600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:33:55,942-Speed 3105.36 samples/sec Loss 0.3926 Epoch: 16 Global Step: 267650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:34:12,214-Speed 3146.55 samples/sec Loss 0.3949 Epoch: 16 Global Step: 267700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:34:28,438-Speed 3155.94 samples/sec Loss 0.3696 Epoch: 16 Global Step: 267750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:34:44,665-Speed 3155.44 samples/sec Loss 0.3866 Epoch: 16 Global Step: 267800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-17 06:35:00,893-Speed 3155.21 samples/sec Loss 0.3890 Epoch: 16 Global Step: 267850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:35:17,307-Speed 3119.38 samples/sec Loss 0.3784 Epoch: 16 Global Step: 267900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:35:34,464-Speed 2984.32 samples/sec Loss 0.3787 Epoch: 16 Global Step: 267950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:35:50,695-Speed 3154.46 samples/sec Loss 0.3826 Epoch: 16 Global Step: 268000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:36:43,950-[lfw][268000]XNorm: 21.679034 Training: 2021-03-17 06:36:43,951-[lfw][268000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 06:36:43,951-[lfw][268000]Accuracy-Highest: 0.99817 Training: 2021-03-17 06:37:46,145-[cfp_fp][268000]XNorm: 22.350572 Training: 2021-03-17 06:37:46,146-[cfp_fp][268000]Accuracy-Flip: 0.99243+-0.00519 Training: 2021-03-17 06:37:46,146-[cfp_fp][268000]Accuracy-Highest: 0.99257 Training: 2021-03-17 06:38:39,668-[agedb_30][268000]XNorm: 22.760098 Training: 2021-03-17 06:38:39,669-[agedb_30][268000]Accuracy-Flip: 0.98433+-0.00638 Training: 2021-03-17 06:38:39,669-[agedb_30][268000]Accuracy-Highest: 0.98483 Training: 2021-03-17 06:38:55,953-Speed 276.37 samples/sec Loss 0.3914 Epoch: 16 Global Step: 268050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:39:12,351-Speed 3122.48 samples/sec Loss 0.3785 Epoch: 16 Global Step: 268100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:39:29,842-Speed 2927.23 samples/sec Loss 0.3763 Epoch: 16 Global Step: 268150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:39:46,487-Speed 3076.14 samples/sec Loss 0.3768 Epoch: 16 Global Step: 268200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:40:02,673-Speed 3163.23 samples/sec Loss 0.3795 Epoch: 16 Global Step: 268250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:40:18,757-Speed 3183.45 samples/sec Loss 0.3869 Epoch: 16 Global Step: 268300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:40:34,963-Speed 3159.42 samples/sec Loss 0.3873 Epoch: 16 Global Step: 268350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:40:51,926-Speed 3018.45 samples/sec Loss 0.3852 Epoch: 16 Global Step: 268400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:41:08,415-Speed 3105.18 samples/sec Loss 0.3826 Epoch: 16 Global Step: 268450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:41:24,743-Speed 3135.85 samples/sec Loss 0.3767 Epoch: 16 Global Step: 268500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:41:40,869-Speed 3175.12 samples/sec Loss 0.3887 Epoch: 16 Global Step: 268550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:41:57,300-Speed 3116.08 samples/sec Loss 0.3817 Epoch: 16 Global Step: 268600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:42:13,442-Speed 3171.87 samples/sec Loss 0.3811 Epoch: 16 Global Step: 268650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:42:30,443-Speed 3011.68 samples/sec Loss 0.3896 Epoch: 16 Global Step: 268700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:42:46,793-Speed 3131.70 samples/sec Loss 0.3794 Epoch: 16 Global Step: 268750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:43:03,058-Speed 3147.88 samples/sec Loss 0.3772 Epoch: 16 Global Step: 268800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:43:19,348-Speed 3143.12 samples/sec Loss 0.3823 Epoch: 16 Global Step: 268850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:43:35,552-Speed 3159.90 samples/sec Loss 0.3848 Epoch: 16 Global Step: 268900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:43:52,545-Speed 3013.00 samples/sec Loss 0.3812 Epoch: 16 Global Step: 268950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:44:08,842-Speed 3141.85 samples/sec Loss 0.3812 Epoch: 16 Global Step: 269000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:44:24,870-Speed 3194.61 samples/sec Loss 0.3838 Epoch: 16 Global Step: 269050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:44:41,106-Speed 3153.63 samples/sec Loss 0.3869 Epoch: 16 Global Step: 269100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:44:57,586-Speed 3106.80 samples/sec Loss 0.3838 Epoch: 16 Global Step: 269150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:45:13,829-Speed 3152.17 samples/sec Loss 0.3905 Epoch: 16 Global Step: 269200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:45:31,203-Speed 2947.03 samples/sec Loss 0.3835 Epoch: 16 Global Step: 269250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:45:48,316-Speed 2991.90 samples/sec Loss 0.3800 Epoch: 16 Global Step: 269300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:46:05,731-Speed 2940.22 samples/sec Loss 0.3830 Epoch: 16 Global Step: 269350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:46:21,937-Speed 3159.43 samples/sec Loss 0.3803 Epoch: 16 Global Step: 269400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:46:38,235-Speed 3141.46 samples/sec Loss 0.3855 Epoch: 16 Global Step: 269450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:46:54,531-Speed 3141.99 samples/sec Loss 0.3827 Epoch: 16 Global Step: 269500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:47:10,608-Speed 3184.77 samples/sec Loss 0.3899 Epoch: 16 Global Step: 269550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:47:26,743-Speed 3173.43 samples/sec Loss 0.3934 Epoch: 16 Global Step: 269600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:47:43,432-Speed 3067.98 samples/sec Loss 0.3880 Epoch: 16 Global Step: 269650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:47:59,891-Speed 3110.91 samples/sec Loss 0.3950 Epoch: 16 Global Step: 269700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:48:16,319-Speed 3116.65 samples/sec Loss 0.3862 Epoch: 16 Global Step: 269750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:48:32,478-Speed 3168.64 samples/sec Loss 0.3969 Epoch: 16 Global Step: 269800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:48:48,649-Speed 3166.14 samples/sec Loss 0.3915 Epoch: 16 Global Step: 269850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:49:04,862-Speed 3158.22 samples/sec Loss 0.3866 Epoch: 16 Global Step: 269900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:49:21,017-Speed 3169.26 samples/sec Loss 0.3829 Epoch: 16 Global Step: 269950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:49:37,154-Speed 3172.98 samples/sec Loss 0.3947 Epoch: 16 Global Step: 270000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:50:30,406-[lfw][270000]XNorm: 21.388239 Training: 2021-03-17 06:50:30,406-[lfw][270000]Accuracy-Flip: 0.99767+-0.00260 Training: 2021-03-17 06:50:30,406-[lfw][270000]Accuracy-Highest: 0.99817 Training: 2021-03-17 06:51:32,124-[cfp_fp][270000]XNorm: 22.091093 Training: 2021-03-17 06:51:32,124-[cfp_fp][270000]Accuracy-Flip: 0.99271+-0.00517 Training: 2021-03-17 06:51:32,124-[cfp_fp][270000]Accuracy-Highest: 0.99271 Training: 2021-03-17 06:52:25,615-[agedb_30][270000]XNorm: 22.546373 Training: 2021-03-17 06:52:25,615-[agedb_30][270000]Accuracy-Flip: 0.98483+-0.00647 Training: 2021-03-17 06:52:25,616-[agedb_30][270000]Accuracy-Highest: 0.98483 Training: 2021-03-17 06:52:41,671-Speed 277.48 samples/sec Loss 0.3884 Epoch: 16 Global Step: 270050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:52:57,833-Speed 3167.87 samples/sec Loss 0.3738 Epoch: 16 Global Step: 270100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:53:15,006-Speed 2981.58 samples/sec Loss 0.3749 Epoch: 16 Global Step: 270150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:53:31,363-Speed 3130.32 samples/sec Loss 0.3897 Epoch: 16 Global Step: 270200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:53:47,789-Speed 3117.04 samples/sec Loss 0.3837 Epoch: 16 Global Step: 270250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:54:04,208-Speed 3118.43 samples/sec Loss 0.3736 Epoch: 16 Global Step: 270300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:54:21,169-Speed 3018.75 samples/sec Loss 0.3792 Epoch: 16 Global Step: 270350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:54:37,386-Speed 3157.25 samples/sec Loss 0.3951 Epoch: 16 Global Step: 270400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:54:53,674-Speed 3143.66 samples/sec Loss 0.3841 Epoch: 16 Global Step: 270450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:55:09,837-Speed 3167.79 samples/sec Loss 0.3901 Epoch: 16 Global Step: 270500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:55:25,929-Speed 3181.74 samples/sec Loss 0.3802 Epoch: 16 Global Step: 270550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:55:42,094-Speed 3167.54 samples/sec Loss 0.3863 Epoch: 16 Global Step: 270600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:55:58,515-Speed 3118.00 samples/sec Loss 0.3794 Epoch: 16 Global Step: 270650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:56:15,787-Speed 2964.33 samples/sec Loss 0.3848 Epoch: 16 Global Step: 270700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:56:31,886-Speed 3180.43 samples/sec Loss 0.3860 Epoch: 16 Global Step: 270750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:56:48,239-Speed 3131.16 samples/sec Loss 0.3849 Epoch: 16 Global Step: 270800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:57:04,486-Speed 3151.30 samples/sec Loss 0.3872 Epoch: 16 Global Step: 270850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:57:20,842-Speed 3130.46 samples/sec Loss 0.3748 Epoch: 16 Global Step: 270900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:57:37,173-Speed 3135.36 samples/sec Loss 0.3875 Epoch: 16 Global Step: 270950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:57:54,174-Speed 3011.56 samples/sec Loss 0.3783 Epoch: 16 Global Step: 271000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:58:10,568-Speed 3123.23 samples/sec Loss 0.3879 Epoch: 16 Global Step: 271050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:58:26,739-Speed 3166.25 samples/sec Loss 0.3892 Epoch: 16 Global Step: 271100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:58:43,697-Speed 3019.30 samples/sec Loss 0.3892 Epoch: 16 Global Step: 271150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:59:00,058-Speed 3129.44 samples/sec Loss 0.3865 Epoch: 16 Global Step: 271200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:59:16,194-Speed 3173.17 samples/sec Loss 0.3971 Epoch: 16 Global Step: 271250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:59:32,738-Speed 3094.81 samples/sec Loss 0.3796 Epoch: 16 Global Step: 271300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 06:59:49,064-Speed 3136.33 samples/sec Loss 0.3887 Epoch: 16 Global Step: 271350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:00:05,400-Speed 3134.18 samples/sec Loss 0.3826 Epoch: 16 Global Step: 271400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:00:22,665-Speed 2965.62 samples/sec Loss 0.3900 Epoch: 16 Global Step: 271450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:00:40,411-Speed 2885.35 samples/sec Loss 0.3841 Epoch: 16 Global Step: 271500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:00:56,822-Speed 3119.79 samples/sec Loss 0.3795 Epoch: 16 Global Step: 271550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:01:12,969-Speed 3171.12 samples/sec Loss 0.3811 Epoch: 16 Global Step: 271600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:01:29,133-Speed 3167.59 samples/sec Loss 0.3823 Epoch: 16 Global Step: 271650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:01:45,347-Speed 3157.84 samples/sec Loss 0.3813 Epoch: 16 Global Step: 271700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:02:01,543-Speed 3161.43 samples/sec Loss 0.3961 Epoch: 16 Global Step: 271750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:02:17,917-Speed 3127.03 samples/sec Loss 0.3955 Epoch: 16 Global Step: 271800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:02:34,194-Speed 3145.49 samples/sec Loss 0.3909 Epoch: 16 Global Step: 271850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:02:50,522-Speed 3135.88 samples/sec Loss 0.3886 Epoch: 16 Global Step: 271900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:03:06,659-Speed 3172.96 samples/sec Loss 0.3821 Epoch: 16 Global Step: 271950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:03:22,819-Speed 3168.46 samples/sec Loss 0.3861 Epoch: 16 Global Step: 272000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:04:16,022-[lfw][272000]XNorm: 21.631197 Training: 2021-03-17 07:04:16,022-[lfw][272000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-17 07:04:16,022-[lfw][272000]Accuracy-Highest: 0.99817 Training: 2021-03-17 07:05:17,902-[cfp_fp][272000]XNorm: 22.358109 Training: 2021-03-17 07:05:17,903-[cfp_fp][272000]Accuracy-Flip: 0.99171+-0.00510 Training: 2021-03-17 07:05:17,903-[cfp_fp][272000]Accuracy-Highest: 0.99271 Training: 2021-03-17 07:06:11,238-[agedb_30][272000]XNorm: 22.792518 Training: 2021-03-17 07:06:11,238-[agedb_30][272000]Accuracy-Flip: 0.98467+-0.00649 Training: 2021-03-17 07:06:11,239-[agedb_30][272000]Accuracy-Highest: 0.98483 Training: 2021-03-17 07:06:27,367-Speed 277.44 samples/sec Loss 0.3878 Epoch: 16 Global Step: 272050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:06:43,423-Speed 3188.90 samples/sec Loss 0.3881 Epoch: 16 Global Step: 272100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:06:59,547-Speed 3175.44 samples/sec Loss 0.3798 Epoch: 16 Global Step: 272150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:07:15,738-Speed 3162.31 samples/sec Loss 0.3869 Epoch: 16 Global Step: 272200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:07:32,299-Speed 3091.71 samples/sec Loss 0.4024 Epoch: 16 Global Step: 272250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:07:48,559-Speed 3148.88 samples/sec Loss 0.3725 Epoch: 16 Global Step: 272300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:08:05,817-Speed 2966.90 samples/sec Loss 0.3861 Epoch: 16 Global Step: 272350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:08:22,067-Speed 3150.98 samples/sec Loss 0.3924 Epoch: 16 Global Step: 272400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:08:38,446-Speed 3126.02 samples/sec Loss 0.3874 Epoch: 16 Global Step: 272450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:08:55,681-Speed 2970.80 samples/sec Loss 0.3821 Epoch: 16 Global Step: 272500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:09:11,862-Speed 3164.25 samples/sec Loss 0.3843 Epoch: 16 Global Step: 272550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:09:28,134-Speed 3146.54 samples/sec Loss 0.3829 Epoch: 16 Global Step: 272600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:09:44,323-Speed 3162.90 samples/sec Loss 0.3901 Epoch: 16 Global Step: 272650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:10:00,700-Speed 3126.34 samples/sec Loss 0.3865 Epoch: 16 Global Step: 272700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:10:16,933-Speed 3154.12 samples/sec Loss 0.3838 Epoch: 16 Global Step: 272750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:10:33,453-Speed 3099.41 samples/sec Loss 0.3897 Epoch: 16 Global Step: 272800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:10:50,624-Speed 2981.80 samples/sec Loss 0.3878 Epoch: 16 Global Step: 272850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:11:06,830-Speed 3159.39 samples/sec Loss 0.3806 Epoch: 16 Global Step: 272900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:11:23,136-Speed 3140.10 samples/sec Loss 0.3850 Epoch: 16 Global Step: 272950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:11:39,401-Speed 3147.96 samples/sec Loss 0.3993 Epoch: 16 Global Step: 273000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:11:55,712-Speed 3139.20 samples/sec Loss 0.3913 Epoch: 16 Global Step: 273050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:12:11,817-Speed 3179.14 samples/sec Loss 0.3863 Epoch: 16 Global Step: 273100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:12:28,794-Speed 3015.94 samples/sec Loss 0.3762 Epoch: 16 Global Step: 273150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:12:45,045-Speed 3150.77 samples/sec Loss 0.3938 Epoch: 16 Global Step: 273200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:13:01,155-Speed 3178.12 samples/sec Loss 0.3852 Epoch: 16 Global Step: 273250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:13:18,704-Speed 2917.74 samples/sec Loss 0.3922 Epoch: 16 Global Step: 273300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:13:34,916-Speed 3158.07 samples/sec Loss 0.3772 Epoch: 16 Global Step: 273350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:13:51,124-Speed 3159.17 samples/sec Loss 0.3876 Epoch: 16 Global Step: 273400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:14:07,571-Speed 3113.10 samples/sec Loss 0.3886 Epoch: 16 Global Step: 273450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:14:24,105-Speed 3096.69 samples/sec Loss 0.3905 Epoch: 16 Global Step: 273500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:14:40,462-Speed 3130.21 samples/sec Loss 0.3866 Epoch: 16 Global Step: 273550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:14:56,839-Speed 3126.59 samples/sec Loss 0.3924 Epoch: 16 Global Step: 273600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:15:14,020-Speed 2980.07 samples/sec Loss 0.3765 Epoch: 16 Global Step: 273650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:15:32,229-Speed 2811.79 samples/sec Loss 0.3869 Epoch: 16 Global Step: 273700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:15:48,365-Speed 3173.29 samples/sec Loss 0.3895 Epoch: 16 Global Step: 273750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:16:04,859-Speed 3104.14 samples/sec Loss 0.3831 Epoch: 16 Global Step: 273800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:16:21,273-Speed 3119.38 samples/sec Loss 0.3743 Epoch: 16 Global Step: 273850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:16:37,540-Speed 3147.60 samples/sec Loss 0.3890 Epoch: 16 Global Step: 273900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:16:54,067-Speed 3098.01 samples/sec Loss 0.3859 Epoch: 16 Global Step: 273950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:17:10,505-Speed 3114.91 samples/sec Loss 0.3812 Epoch: 16 Global Step: 274000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:18:03,721-[lfw][274000]XNorm: 21.688083 Training: 2021-03-17 07:18:03,721-[lfw][274000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 07:18:03,722-[lfw][274000]Accuracy-Highest: 0.99817 Training: 2021-03-17 07:19:05,877-[cfp_fp][274000]XNorm: 22.430791 Training: 2021-03-17 07:19:05,877-[cfp_fp][274000]Accuracy-Flip: 0.99229+-0.00535 Training: 2021-03-17 07:19:05,877-[cfp_fp][274000]Accuracy-Highest: 0.99271 Training: 2021-03-17 07:19:59,208-[agedb_30][274000]XNorm: 22.881862 Training: 2021-03-17 07:19:59,209-[agedb_30][274000]Accuracy-Flip: 0.98400+-0.00663 Training: 2021-03-17 07:19:59,209-[agedb_30][274000]Accuracy-Highest: 0.98483 Training: 2021-03-17 07:20:15,627-Speed 276.58 samples/sec Loss 0.3861 Epoch: 16 Global Step: 274050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:20:31,962-Speed 3134.43 samples/sec Loss 0.3880 Epoch: 16 Global Step: 274100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:20:48,357-Speed 3122.89 samples/sec Loss 0.3904 Epoch: 16 Global Step: 274150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:21:04,805-Speed 3112.94 samples/sec Loss 0.3835 Epoch: 16 Global Step: 274200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:21:20,985-Speed 3164.63 samples/sec Loss 0.3828 Epoch: 16 Global Step: 274250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:21:37,426-Speed 3114.18 samples/sec Loss 0.3947 Epoch: 16 Global Step: 274300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:21:53,902-Speed 3107.58 samples/sec Loss 0.3896 Epoch: 16 Global Step: 274350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:22:10,403-Speed 3103.06 samples/sec Loss 0.3860 Epoch: 16 Global Step: 274400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:22:27,142-Speed 3058.81 samples/sec Loss 0.3832 Epoch: 16 Global Step: 274450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:22:43,296-Speed 3169.51 samples/sec Loss 0.3857 Epoch: 16 Global Step: 274500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:22:59,657-Speed 3129.51 samples/sec Loss 0.3797 Epoch: 16 Global Step: 274550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:23:16,981-Speed 2955.48 samples/sec Loss 0.3812 Epoch: 16 Global Step: 274600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:23:33,163-Speed 3164.12 samples/sec Loss 0.3895 Epoch: 16 Global Step: 274650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:23:49,202-Speed 3192.40 samples/sec Loss 0.3882 Epoch: 16 Global Step: 274700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:24:06,358-Speed 2984.47 samples/sec Loss 0.3886 Epoch: 16 Global Step: 274750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:24:22,691-Speed 3134.84 samples/sec Loss 0.3846 Epoch: 16 Global Step: 274800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:24:39,170-Speed 3107.03 samples/sec Loss 0.3883 Epoch: 16 Global Step: 274850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:24:55,772-Speed 3084.11 samples/sec Loss 0.3859 Epoch: 16 Global Step: 274900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:25:12,314-Speed 3095.19 samples/sec Loss 0.3801 Epoch: 16 Global Step: 274950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:25:28,531-Speed 3157.29 samples/sec Loss 0.3977 Epoch: 16 Global Step: 275000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:25:45,795-Speed 2965.79 samples/sec Loss 0.3921 Epoch: 16 Global Step: 275050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:26:02,089-Speed 3142.39 samples/sec Loss 0.3797 Epoch: 16 Global Step: 275100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:26:18,480-Speed 3123.70 samples/sec Loss 0.3880 Epoch: 16 Global Step: 275150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:26:34,821-Speed 3133.35 samples/sec Loss 0.3822 Epoch: 16 Global Step: 275200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:26:50,986-Speed 3167.38 samples/sec Loss 0.3845 Epoch: 16 Global Step: 275250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:27:07,352-Speed 3128.43 samples/sec Loss 0.3834 Epoch: 16 Global Step: 275300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:27:24,237-Speed 3032.53 samples/sec Loss 0.3902 Epoch: 16 Global Step: 275350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:27:40,841-Speed 3083.68 samples/sec Loss 0.3769 Epoch: 16 Global Step: 275400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:27:57,190-Speed 3131.68 samples/sec Loss 0.3809 Epoch: 16 Global Step: 275450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:28:14,339-Speed 2985.62 samples/sec Loss 0.3873 Epoch: 16 Global Step: 275500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:28:31,032-Speed 3067.32 samples/sec Loss 0.3916 Epoch: 16 Global Step: 275550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:28:47,326-Speed 3142.36 samples/sec Loss 0.3876 Epoch: 16 Global Step: 275600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:29:03,544-Speed 3157.11 samples/sec Loss 0.3988 Epoch: 16 Global Step: 275650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:29:19,912-Speed 3128.12 samples/sec Loss 0.3977 Epoch: 16 Global Step: 275700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:29:36,323-Speed 3120.00 samples/sec Loss 0.3760 Epoch: 16 Global Step: 275750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:29:52,660-Speed 3134.09 samples/sec Loss 0.3882 Epoch: 16 Global Step: 275800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:30:10,628-Speed 2849.57 samples/sec Loss 0.3951 Epoch: 16 Global Step: 275850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:30:27,969-Speed 2952.60 samples/sec Loss 0.3786 Epoch: 16 Global Step: 275900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:30:44,504-Speed 3096.63 samples/sec Loss 0.3867 Epoch: 16 Global Step: 275950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:31:01,091-Speed 3086.81 samples/sec Loss 0.3916 Epoch: 16 Global Step: 276000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:31:54,251-[lfw][276000]XNorm: 21.466982 Training: 2021-03-17 07:31:54,252-[lfw][276000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 07:31:54,252-[lfw][276000]Accuracy-Highest: 0.99817 Training: 2021-03-17 07:32:56,106-[cfp_fp][276000]XNorm: 22.271137 Training: 2021-03-17 07:32:56,106-[cfp_fp][276000]Accuracy-Flip: 0.99186+-0.00561 Training: 2021-03-17 07:32:56,106-[cfp_fp][276000]Accuracy-Highest: 0.99271 Training: 2021-03-17 07:33:49,436-[agedb_30][276000]XNorm: 22.682531 Training: 2021-03-17 07:33:49,437-[agedb_30][276000]Accuracy-Flip: 0.98350+-0.00580 Training: 2021-03-17 07:33:49,438-[agedb_30][276000]Accuracy-Highest: 0.98483 Training: 2021-03-17 07:34:06,000-Speed 276.89 samples/sec Loss 0.3836 Epoch: 16 Global Step: 276050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:34:22,873-Speed 3034.57 samples/sec Loss 0.3894 Epoch: 16 Global Step: 276100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:34:39,117-Speed 3151.89 samples/sec Loss 0.3851 Epoch: 16 Global Step: 276150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:34:55,429-Speed 3138.91 samples/sec Loss 0.3781 Epoch: 16 Global Step: 276200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:35:11,679-Speed 3150.92 samples/sec Loss 0.3800 Epoch: 16 Global Step: 276250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:35:27,825-Speed 3171.21 samples/sec Loss 0.3832 Epoch: 16 Global Step: 276300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:35:44,359-Speed 3096.71 samples/sec Loss 0.3774 Epoch: 16 Global Step: 276350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:36:00,742-Speed 3125.25 samples/sec Loss 0.3839 Epoch: 16 Global Step: 276400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:36:17,069-Speed 3136.03 samples/sec Loss 0.3785 Epoch: 16 Global Step: 276450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:36:33,460-Speed 3123.69 samples/sec Loss 0.3780 Epoch: 16 Global Step: 276500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:36:49,758-Speed 3141.72 samples/sec Loss 0.3766 Epoch: 16 Global Step: 276550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:37:06,105-Speed 3132.00 samples/sec Loss 0.3821 Epoch: 16 Global Step: 276600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:37:22,388-Speed 3144.57 samples/sec Loss 0.3924 Epoch: 16 Global Step: 276650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-17 07:37:38,824-Speed 3115.13 samples/sec Loss 0.3847 Epoch: 16 Global Step: 276700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:37:55,460-Speed 3077.81 samples/sec Loss 0.3878 Epoch: 16 Global Step: 276750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:38:12,885-Speed 2938.44 samples/sec Loss 0.3921 Epoch: 16 Global Step: 276800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:38:29,485-Speed 3084.27 samples/sec Loss 0.3848 Epoch: 16 Global Step: 276850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:38:45,802-Speed 3138.04 samples/sec Loss 0.3921 Epoch: 16 Global Step: 276900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:39:03,024-Speed 2972.90 samples/sec Loss 0.3792 Epoch: 16 Global Step: 276950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:39:19,631-Speed 3083.12 samples/sec Loss 0.3744 Epoch: 16 Global Step: 277000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:39:35,832-Speed 3160.46 samples/sec Loss 0.3879 Epoch: 16 Global Step: 277050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:39:52,169-Speed 3134.03 samples/sec Loss 0.3753 Epoch: 16 Global Step: 277100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:40:08,568-Speed 3122.22 samples/sec Loss 0.3836 Epoch: 16 Global Step: 277150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:40:25,801-Speed 2971.24 samples/sec Loss 0.3914 Epoch: 16 Global Step: 277200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:40:42,060-Speed 3149.17 samples/sec Loss 0.3810 Epoch: 16 Global Step: 277250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:40:58,431-Speed 3127.53 samples/sec Loss 0.3802 Epoch: 16 Global Step: 277300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:41:14,764-Speed 3134.80 samples/sec Loss 0.3788 Epoch: 16 Global Step: 277350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:41:30,989-Speed 3155.68 samples/sec Loss 0.3874 Epoch: 16 Global Step: 277400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:41:47,288-Speed 3141.39 samples/sec Loss 0.3849 Epoch: 16 Global Step: 277450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:42:03,627-Speed 3133.72 samples/sec Loss 0.3801 Epoch: 16 Global Step: 277500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:42:20,890-Speed 2966.00 samples/sec Loss 0.3955 Epoch: 16 Global Step: 277550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:42:37,228-Speed 3133.84 samples/sec Loss 0.3815 Epoch: 16 Global Step: 277600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:42:53,639-Speed 3120.04 samples/sec Loss 0.3868 Epoch: 16 Global Step: 277650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:43:10,918-Speed 2963.13 samples/sec Loss 0.3797 Epoch: 16 Global Step: 277700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:43:27,124-Speed 3159.51 samples/sec Loss 0.3850 Epoch: 16 Global Step: 277750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:43:43,668-Speed 3094.80 samples/sec Loss 0.3947 Epoch: 16 Global Step: 277800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:44:00,095-Speed 3116.89 samples/sec Loss 0.3881 Epoch: 16 Global Step: 277850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:44:16,449-Speed 3130.85 samples/sec Loss 0.3958 Epoch: 16 Global Step: 277900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:44:32,845-Speed 3122.94 samples/sec Loss 0.3744 Epoch: 16 Global Step: 277950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:44:49,453-Speed 3082.89 samples/sec Loss 0.3904 Epoch: 16 Global Step: 278000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:45:42,572-[lfw][278000]XNorm: 21.599701 Training: 2021-03-17 07:45:42,572-[lfw][278000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 07:45:42,572-[lfw][278000]Accuracy-Highest: 0.99817 Training: 2021-03-17 07:46:44,369-[cfp_fp][278000]XNorm: 22.256892 Training: 2021-03-17 07:46:44,370-[cfp_fp][278000]Accuracy-Flip: 0.99229+-0.00516 Training: 2021-03-17 07:46:44,370-[cfp_fp][278000]Accuracy-Highest: 0.99271 Training: 2021-03-17 07:47:37,449-[agedb_30][278000]XNorm: 22.759164 Training: 2021-03-17 07:47:37,449-[agedb_30][278000]Accuracy-Flip: 0.98367+-0.00595 Training: 2021-03-17 07:47:37,449-[agedb_30][278000]Accuracy-Highest: 0.98483 Training: 2021-03-17 07:47:53,825-Speed 277.70 samples/sec Loss 0.3793 Epoch: 16 Global Step: 278050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:48:12,774-Speed 2702.04 samples/sec Loss 0.3952 Epoch: 16 Global Step: 278100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:48:29,218-Speed 3113.68 samples/sec Loss 0.3894 Epoch: 16 Global Step: 278150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:48:45,617-Speed 3122.29 samples/sec Loss 0.3887 Epoch: 16 Global Step: 278200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:49:01,773-Speed 3169.07 samples/sec Loss 0.3838 Epoch: 16 Global Step: 278250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:49:18,476-Speed 3065.48 samples/sec Loss 0.3822 Epoch: 16 Global Step: 278300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:49:35,078-Speed 3083.98 samples/sec Loss 0.3757 Epoch: 16 Global Step: 278350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:49:51,377-Speed 3141.34 samples/sec Loss 0.3828 Epoch: 16 Global Step: 278400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:50:08,011-Speed 3078.26 samples/sec Loss 0.3922 Epoch: 16 Global Step: 278450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:50:24,940-Speed 3024.43 samples/sec Loss 0.3795 Epoch: 16 Global Step: 278500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:50:41,134-Speed 3161.87 samples/sec Loss 0.3893 Epoch: 16 Global Step: 278550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:50:57,646-Speed 3100.77 samples/sec Loss 0.3856 Epoch: 16 Global Step: 278600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:51:14,115-Speed 3109.08 samples/sec Loss 0.3825 Epoch: 16 Global Step: 278650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:51:30,693-Speed 3088.58 samples/sec Loss 0.3854 Epoch: 16 Global Step: 278700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:51:46,915-Speed 3156.15 samples/sec Loss 0.3791 Epoch: 16 Global Step: 278750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:52:03,262-Speed 3132.27 samples/sec Loss 0.3882 Epoch: 16 Global Step: 278800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:52:19,553-Speed 3142.96 samples/sec Loss 0.3910 Epoch: 16 Global Step: 278850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:52:35,667-Speed 3177.38 samples/sec Loss 0.3785 Epoch: 16 Global Step: 278900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:52:52,229-Speed 3091.43 samples/sec Loss 0.3859 Epoch: 16 Global Step: 278950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:53:08,745-Speed 3100.09 samples/sec Loss 0.3864 Epoch: 16 Global Step: 279000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:53:25,155-Speed 3120.18 samples/sec Loss 0.3914 Epoch: 16 Global Step: 279050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:53:42,561-Speed 2941.63 samples/sec Loss 0.3871 Epoch: 16 Global Step: 279100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:53:58,936-Speed 3126.82 samples/sec Loss 0.3833 Epoch: 16 Global Step: 279150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:54:16,440-Speed 2925.06 samples/sec Loss 0.3818 Epoch: 16 Global Step: 279200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:54:32,783-Speed 3132.97 samples/sec Loss 0.3837 Epoch: 16 Global Step: 279250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:54:49,340-Speed 3092.57 samples/sec Loss 0.3870 Epoch: 16 Global Step: 279300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:55:05,554-Speed 3157.82 samples/sec Loss 0.3847 Epoch: 16 Global Step: 279350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:55:22,014-Speed 3110.62 samples/sec Loss 0.3798 Epoch: 16 Global Step: 279400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:55:38,998-Speed 3014.77 samples/sec Loss 0.3815 Epoch: 16 Global Step: 279450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:55:55,636-Speed 3077.28 samples/sec Loss 0.3842 Epoch: 16 Global Step: 279500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:56:11,938-Speed 3140.93 samples/sec Loss 0.3865 Epoch: 16 Global Step: 279550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:56:28,475-Speed 3096.19 samples/sec Loss 0.3822 Epoch: 16 Global Step: 279600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:56:44,974-Speed 3103.32 samples/sec Loss 0.3897 Epoch: 16 Global Step: 279650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:57:01,277-Speed 3140.46 samples/sec Loss 0.3742 Epoch: 16 Global Step: 279700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:57:17,544-Speed 3147.68 samples/sec Loss 0.3945 Epoch: 16 Global Step: 279750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:57:33,848-Speed 3140.35 samples/sec Loss 0.3829 Epoch: 16 Global Step: 279800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:57:51,050-Speed 2976.47 samples/sec Loss 0.4000 Epoch: 16 Global Step: 279850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:58:08,161-Speed 2992.31 samples/sec Loss 0.3906 Epoch: 16 Global Step: 279900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:58:24,329-Speed 3167.03 samples/sec Loss 0.3872 Epoch: 16 Global Step: 279950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:58:40,757-Speed 3116.58 samples/sec Loss 0.3806 Epoch: 16 Global Step: 280000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 07:59:34,437-[lfw][280000]XNorm: 21.703750 Training: 2021-03-17 07:59:34,437-[lfw][280000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 07:59:34,437-[lfw][280000]Accuracy-Highest: 0.99817 Training: 2021-03-17 08:00:36,417-[cfp_fp][280000]XNorm: 22.273528 Training: 2021-03-17 08:00:36,417-[cfp_fp][280000]Accuracy-Flip: 0.99214+-0.00504 Training: 2021-03-17 08:00:36,417-[cfp_fp][280000]Accuracy-Highest: 0.99271 Training: 2021-03-17 08:01:29,819-[agedb_30][280000]XNorm: 22.830263 Training: 2021-03-17 08:01:29,819-[agedb_30][280000]Accuracy-Flip: 0.98383+-0.00646 Training: 2021-03-17 08:01:29,819-[agedb_30][280000]Accuracy-Highest: 0.98483 Training: 2021-03-17 08:01:46,256-Speed 276.01 samples/sec Loss 0.3931 Epoch: 16 Global Step: 280050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:02:02,820-Speed 3091.03 samples/sec Loss 0.3798 Epoch: 16 Global Step: 280100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:02:19,227-Speed 3120.71 samples/sec Loss 0.3813 Epoch: 16 Global Step: 280150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:02:35,313-Speed 3183.11 samples/sec Loss 0.3819 Epoch: 16 Global Step: 280200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:02:52,482-Speed 2982.21 samples/sec Loss 0.3806 Epoch: 16 Global Step: 280250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:03:09,647-Speed 2982.83 samples/sec Loss 0.3834 Epoch: 16 Global Step: 280300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:03:26,794-Speed 2985.95 samples/sec Loss 0.3801 Epoch: 16 Global Step: 280350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:03:43,380-Speed 3087.17 samples/sec Loss 0.3906 Epoch: 16 Global Step: 280400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:03:59,873-Speed 3104.35 samples/sec Loss 0.3863 Epoch: 16 Global Step: 280450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:04:16,290-Speed 3118.85 samples/sec Loss 0.3840 Epoch: 16 Global Step: 280500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:04:32,659-Speed 3128.01 samples/sec Loss 0.3860 Epoch: 16 Global Step: 280550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:04:49,018-Speed 3129.78 samples/sec Loss 0.3871 Epoch: 16 Global Step: 280600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:05:05,221-Speed 3159.94 samples/sec Loss 0.3754 Epoch: 16 Global Step: 280650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:05:21,530-Speed 3139.57 samples/sec Loss 0.3823 Epoch: 16 Global Step: 280700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:05:37,873-Speed 3132.96 samples/sec Loss 0.3852 Epoch: 16 Global Step: 280750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:05:53,992-Speed 3176.48 samples/sec Loss 0.3870 Epoch: 16 Global Step: 280800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:06:10,331-Speed 3133.71 samples/sec Loss 0.3770 Epoch: 16 Global Step: 280850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:06:26,753-Speed 3117.90 samples/sec Loss 0.3861 Epoch: 16 Global Step: 280900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:06:43,036-Speed 3144.33 samples/sec Loss 0.3870 Epoch: 16 Global Step: 280950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:06:59,296-Speed 3148.95 samples/sec Loss 0.3808 Epoch: 16 Global Step: 281000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:07:15,793-Speed 3103.77 samples/sec Loss 0.3797 Epoch: 16 Global Step: 281050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:07:32,216-Speed 3117.62 samples/sec Loss 0.3829 Epoch: 16 Global Step: 281100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:07:48,620-Speed 3121.27 samples/sec Loss 0.3862 Epoch: 16 Global Step: 281150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:08:05,018-Speed 3122.41 samples/sec Loss 0.3760 Epoch: 16 Global Step: 281200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:08:21,434-Speed 3119.08 samples/sec Loss 0.3820 Epoch: 16 Global Step: 281250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:08:38,538-Speed 2993.54 samples/sec Loss 0.3829 Epoch: 16 Global Step: 281300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:08:55,478-Speed 3022.42 samples/sec Loss 0.3838 Epoch: 16 Global Step: 281350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:09:11,994-Speed 3100.06 samples/sec Loss 0.3945 Epoch: 16 Global Step: 281400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:09:28,147-Speed 3169.95 samples/sec Loss 0.3877 Epoch: 16 Global Step: 281450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:09:44,722-Speed 3089.04 samples/sec Loss 0.3799 Epoch: 16 Global Step: 281500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:10:01,074-Speed 3131.27 samples/sec Loss 0.3819 Epoch: 16 Global Step: 281550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:10:17,367-Speed 3142.49 samples/sec Loss 0.3824 Epoch: 16 Global Step: 281600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:10:33,709-Speed 3133.08 samples/sec Loss 0.3795 Epoch: 16 Global Step: 281650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:10:50,780-Speed 2999.39 samples/sec Loss 0.3885 Epoch: 16 Global Step: 281700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:11:07,113-Speed 3134.84 samples/sec Loss 0.3809 Epoch: 16 Global Step: 281750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:11:23,501-Speed 3124.31 samples/sec Loss 0.3807 Epoch: 16 Global Step: 281800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:11:40,085-Speed 3087.46 samples/sec Loss 0.3803 Epoch: 16 Global Step: 281850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:11:56,427-Speed 3133.11 samples/sec Loss 0.3891 Epoch: 16 Global Step: 281900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:12:12,743-Speed 3138.13 samples/sec Loss 0.3858 Epoch: 16 Global Step: 281950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:12:29,249-Speed 3102.01 samples/sec Loss 0.3817 Epoch: 16 Global Step: 282000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:13:22,435-[lfw][282000]XNorm: 21.630128 Training: 2021-03-17 08:13:22,436-[lfw][282000]Accuracy-Flip: 0.99767+-0.00260 Training: 2021-03-17 08:13:22,436-[lfw][282000]Accuracy-Highest: 0.99817 Training: 2021-03-17 08:14:24,127-[cfp_fp][282000]XNorm: 22.259111 Training: 2021-03-17 08:14:24,128-[cfp_fp][282000]Accuracy-Flip: 0.99229+-0.00547 Training: 2021-03-17 08:14:24,128-[cfp_fp][282000]Accuracy-Highest: 0.99271 Training: 2021-03-17 08:15:17,497-[agedb_30][282000]XNorm: 22.812639 Training: 2021-03-17 08:15:17,497-[agedb_30][282000]Accuracy-Flip: 0.98383+-0.00615 Training: 2021-03-17 08:15:17,497-[agedb_30][282000]Accuracy-Highest: 0.98483 Training: 2021-03-17 08:15:34,064-Speed 277.03 samples/sec Loss 0.3770 Epoch: 16 Global Step: 282050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:15:51,935-Speed 2865.16 samples/sec Loss 0.3851 Epoch: 16 Global Step: 282100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:16:08,656-Speed 3062.00 samples/sec Loss 0.3782 Epoch: 16 Global Step: 282150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:16:25,477-Speed 3043.99 samples/sec Loss 0.3863 Epoch: 16 Global Step: 282200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:16:41,840-Speed 3129.11 samples/sec Loss 0.3938 Epoch: 16 Global Step: 282250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:16:58,099-Speed 3149.10 samples/sec Loss 0.3839 Epoch: 16 Global Step: 282300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:17:14,239-Speed 3172.37 samples/sec Loss 0.3867 Epoch: 16 Global Step: 282350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:17:31,224-Speed 3014.48 samples/sec Loss 0.3823 Epoch: 16 Global Step: 282400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:17:47,404-Speed 3164.56 samples/sec Loss 0.3801 Epoch: 16 Global Step: 282450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:18:04,494-Speed 2995.87 samples/sec Loss 0.3758 Epoch: 16 Global Step: 282500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:18:22,007-Speed 2923.64 samples/sec Loss 0.3786 Epoch: 16 Global Step: 282550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:18:38,348-Speed 3133.49 samples/sec Loss 0.3876 Epoch: 16 Global Step: 282600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:18:54,765-Speed 3118.66 samples/sec Loss 0.3834 Epoch: 16 Global Step: 282650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:19:11,155-Speed 3124.00 samples/sec Loss 0.3818 Epoch: 16 Global Step: 282700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:19:27,278-Speed 3175.74 samples/sec Loss 0.3766 Epoch: 16 Global Step: 282750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:19:43,639-Speed 3129.46 samples/sec Loss 0.3868 Epoch: 16 Global Step: 282800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:20:00,089-Speed 3112.54 samples/sec Loss 0.3821 Epoch: 16 Global Step: 282850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:20:16,333-Speed 3152.09 samples/sec Loss 0.3909 Epoch: 16 Global Step: 282900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:20:32,566-Speed 3154.12 samples/sec Loss 0.3794 Epoch: 16 Global Step: 282950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:20:48,893-Speed 3135.97 samples/sec Loss 0.3810 Epoch: 16 Global Step: 283000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:21:05,025-Speed 3173.92 samples/sec Loss 0.3883 Epoch: 16 Global Step: 283050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:21:21,409-Speed 3125.21 samples/sec Loss 0.3847 Epoch: 16 Global Step: 283100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:21:37,706-Speed 3141.71 samples/sec Loss 0.3875 Epoch: 16 Global Step: 283150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:21:54,056-Speed 3131.58 samples/sec Loss 0.3902 Epoch: 16 Global Step: 283200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:22:10,602-Speed 3094.50 samples/sec Loss 0.3843 Epoch: 16 Global Step: 283250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:22:26,845-Speed 3152.18 samples/sec Loss 0.3809 Epoch: 16 Global Step: 283300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:22:43,255-Speed 3120.24 samples/sec Loss 0.3793 Epoch: 16 Global Step: 283350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:22:59,466-Speed 3158.29 samples/sec Loss 0.3868 Epoch: 16 Global Step: 283400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:23:15,875-Speed 3120.33 samples/sec Loss 0.3839 Epoch: 16 Global Step: 283450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:23:33,008-Speed 2988.51 samples/sec Loss 0.3848 Epoch: 16 Global Step: 283500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:23:50,031-Speed 3007.88 samples/sec Loss 0.3916 Epoch: 16 Global Step: 283550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:24:06,819-Speed 3049.84 samples/sec Loss 0.3764 Epoch: 16 Global Step: 283600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:24:23,164-Speed 3132.58 samples/sec Loss 0.3766 Epoch: 16 Global Step: 283650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:24:39,304-Speed 3172.36 samples/sec Loss 0.3806 Epoch: 16 Global Step: 283700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:25:09,314-Speed 1706.13 samples/sec Loss 0.3860 Epoch: 17 Global Step: 283750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:25:26,070-Speed 3055.76 samples/sec Loss 0.3670 Epoch: 17 Global Step: 283800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:25:43,792-Speed 2889.06 samples/sec Loss 0.3691 Epoch: 17 Global Step: 283850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:26:00,564-Speed 3052.89 samples/sec Loss 0.3703 Epoch: 17 Global Step: 283900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:26:16,803-Speed 3152.97 samples/sec Loss 0.3756 Epoch: 17 Global Step: 283950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:26:33,159-Speed 3130.46 samples/sec Loss 0.3702 Epoch: 17 Global Step: 284000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:27:26,437-[lfw][284000]XNorm: 21.444544 Training: 2021-03-17 08:27:26,438-[lfw][284000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 08:27:26,438-[lfw][284000]Accuracy-Highest: 0.99817 Training: 2021-03-17 08:28:28,271-[cfp_fp][284000]XNorm: 22.198128 Training: 2021-03-17 08:28:28,271-[cfp_fp][284000]Accuracy-Flip: 0.99186+-0.00527 Training: 2021-03-17 08:28:28,271-[cfp_fp][284000]Accuracy-Highest: 0.99271 Training: 2021-03-17 08:29:21,710-[agedb_30][284000]XNorm: 22.647252 Training: 2021-03-17 08:29:21,711-[agedb_30][284000]Accuracy-Flip: 0.98333+-0.00601 Training: 2021-03-17 08:29:21,711-[agedb_30][284000]Accuracy-Highest: 0.98483 Training: 2021-03-17 08:29:38,028-Speed 276.95 samples/sec Loss 0.3630 Epoch: 17 Global Step: 284050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:29:54,370-Speed 3133.15 samples/sec Loss 0.3788 Epoch: 17 Global Step: 284100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:30:10,842-Speed 3108.35 samples/sec Loss 0.3708 Epoch: 17 Global Step: 284150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:30:27,322-Speed 3106.87 samples/sec Loss 0.3748 Epoch: 17 Global Step: 284200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:30:43,698-Speed 3126.79 samples/sec Loss 0.3752 Epoch: 17 Global Step: 284250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:31:01,271-Speed 2913.59 samples/sec Loss 0.3719 Epoch: 17 Global Step: 284300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:31:18,325-Speed 3002.30 samples/sec Loss 0.3739 Epoch: 17 Global Step: 284350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:31:34,553-Speed 3155.20 samples/sec Loss 0.3690 Epoch: 17 Global Step: 284400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:31:50,678-Speed 3175.20 samples/sec Loss 0.3756 Epoch: 17 Global Step: 284450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:32:07,111-Speed 3115.88 samples/sec Loss 0.3679 Epoch: 17 Global Step: 284500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:32:23,375-Speed 3148.11 samples/sec Loss 0.3760 Epoch: 17 Global Step: 284550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:32:40,504-Speed 2989.13 samples/sec Loss 0.3681 Epoch: 17 Global Step: 284600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:32:57,792-Speed 2961.73 samples/sec Loss 0.3718 Epoch: 17 Global Step: 284650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:33:15,257-Speed 2931.53 samples/sec Loss 0.3649 Epoch: 17 Global Step: 284700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:33:31,489-Speed 3154.48 samples/sec Loss 0.3742 Epoch: 17 Global Step: 284750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:33:47,840-Speed 3131.38 samples/sec Loss 0.3755 Epoch: 17 Global Step: 284800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:34:04,142-Speed 3140.75 samples/sec Loss 0.3677 Epoch: 17 Global Step: 284850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:34:20,352-Speed 3158.79 samples/sec Loss 0.3618 Epoch: 17 Global Step: 284900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:34:36,702-Speed 3131.50 samples/sec Loss 0.3649 Epoch: 17 Global Step: 284950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:34:53,096-Speed 3123.18 samples/sec Loss 0.3793 Epoch: 17 Global Step: 285000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:35:09,530-Speed 3115.61 samples/sec Loss 0.3717 Epoch: 17 Global Step: 285050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:35:25,995-Speed 3109.75 samples/sec Loss 0.3703 Epoch: 17 Global Step: 285100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:35:42,520-Speed 3098.35 samples/sec Loss 0.3709 Epoch: 17 Global Step: 285150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:35:59,229-Speed 3064.34 samples/sec Loss 0.3781 Epoch: 17 Global Step: 285200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:36:15,513-Speed 3144.40 samples/sec Loss 0.3685 Epoch: 17 Global Step: 285250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:36:31,993-Speed 3106.92 samples/sec Loss 0.3730 Epoch: 17 Global Step: 285300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:36:48,335-Speed 3133.17 samples/sec Loss 0.3826 Epoch: 17 Global Step: 285350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:37:04,498-Speed 3167.73 samples/sec Loss 0.3580 Epoch: 17 Global Step: 285400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:37:20,876-Speed 3126.33 samples/sec Loss 0.3781 Epoch: 17 Global Step: 285450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-17 08:37:37,087-Speed 3158.32 samples/sec Loss 0.3674 Epoch: 17 Global Step: 285500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:37:53,320-Speed 3154.24 samples/sec Loss 0.3679 Epoch: 17 Global Step: 285550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:38:09,600-Speed 3145.06 samples/sec Loss 0.3702 Epoch: 17 Global Step: 285600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:38:26,137-Speed 3096.22 samples/sec Loss 0.3661 Epoch: 17 Global Step: 285650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:38:44,683-Speed 2760.78 samples/sec Loss 0.3634 Epoch: 17 Global Step: 285700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:39:01,236-Speed 3093.09 samples/sec Loss 0.3758 Epoch: 17 Global Step: 285750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:39:17,869-Speed 3078.38 samples/sec Loss 0.3755 Epoch: 17 Global Step: 285800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:39:34,132-Speed 3148.31 samples/sec Loss 0.3628 Epoch: 17 Global Step: 285850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:39:50,454-Speed 3136.88 samples/sec Loss 0.3757 Epoch: 17 Global Step: 285900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:40:07,408-Speed 3020.16 samples/sec Loss 0.3763 Epoch: 17 Global Step: 285950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:40:23,666-Speed 3149.18 samples/sec Loss 0.3684 Epoch: 17 Global Step: 286000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:41:17,100-[lfw][286000]XNorm: 21.547386 Training: 2021-03-17 08:41:17,100-[lfw][286000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 08:41:17,100-[lfw][286000]Accuracy-Highest: 0.99817 Training: 2021-03-17 08:42:19,012-[cfp_fp][286000]XNorm: 22.276830 Training: 2021-03-17 08:42:19,012-[cfp_fp][286000]Accuracy-Flip: 0.99171+-0.00541 Training: 2021-03-17 08:42:19,012-[cfp_fp][286000]Accuracy-Highest: 0.99271 Training: 2021-03-17 08:43:12,205-[agedb_30][286000]XNorm: 22.731708 Training: 2021-03-17 08:43:12,206-[agedb_30][286000]Accuracy-Flip: 0.98400+-0.00597 Training: 2021-03-17 08:43:12,206-[agedb_30][286000]Accuracy-Highest: 0.98483 Training: 2021-03-17 08:43:29,264-Speed 275.87 samples/sec Loss 0.3797 Epoch: 17 Global Step: 286050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:43:45,712-Speed 3112.96 samples/sec Loss 0.3662 Epoch: 17 Global Step: 286100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:44:02,148-Speed 3115.18 samples/sec Loss 0.3671 Epoch: 17 Global Step: 286150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:44:18,359-Speed 3158.49 samples/sec Loss 0.3688 Epoch: 17 Global Step: 286200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:44:34,831-Speed 3108.27 samples/sec Loss 0.3709 Epoch: 17 Global Step: 286250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:44:51,840-Speed 3010.41 samples/sec Loss 0.3681 Epoch: 17 Global Step: 286300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:45:08,174-Speed 3134.54 samples/sec Loss 0.3705 Epoch: 17 Global Step: 286350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:45:24,589-Speed 3119.29 samples/sec Loss 0.3658 Epoch: 17 Global Step: 286400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:45:41,353-Speed 3054.21 samples/sec Loss 0.3720 Epoch: 17 Global Step: 286450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:45:58,638-Speed 2962.23 samples/sec Loss 0.3680 Epoch: 17 Global Step: 286500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:46:15,520-Speed 3032.99 samples/sec Loss 0.3684 Epoch: 17 Global Step: 286550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:46:31,890-Speed 3127.68 samples/sec Loss 0.3669 Epoch: 17 Global Step: 286600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:46:48,137-Speed 3151.37 samples/sec Loss 0.3631 Epoch: 17 Global Step: 286650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:47:04,676-Speed 3095.93 samples/sec Loss 0.3646 Epoch: 17 Global Step: 286700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:47:20,989-Speed 3138.74 samples/sec Loss 0.3715 Epoch: 17 Global Step: 286750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:47:38,312-Speed 2955.57 samples/sec Loss 0.3835 Epoch: 17 Global Step: 286800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:47:55,721-Speed 2941.09 samples/sec Loss 0.3736 Epoch: 17 Global Step: 286850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:48:13,086-Speed 2948.67 samples/sec Loss 0.3631 Epoch: 17 Global Step: 286900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:48:29,204-Speed 3176.58 samples/sec Loss 0.3641 Epoch: 17 Global Step: 286950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:48:45,391-Speed 3163.06 samples/sec Loss 0.3698 Epoch: 17 Global Step: 287000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:49:01,693-Speed 3140.93 samples/sec Loss 0.3656 Epoch: 17 Global Step: 287050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:49:18,044-Speed 3131.41 samples/sec Loss 0.3819 Epoch: 17 Global Step: 287100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:49:34,580-Speed 3096.23 samples/sec Loss 0.3660 Epoch: 17 Global Step: 287150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:49:50,734-Speed 3169.72 samples/sec Loss 0.3785 Epoch: 17 Global Step: 287200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:50:07,307-Speed 3089.49 samples/sec Loss 0.3732 Epoch: 17 Global Step: 287250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:50:23,735-Speed 3116.70 samples/sec Loss 0.3806 Epoch: 17 Global Step: 287300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:50:39,946-Speed 3158.49 samples/sec Loss 0.3670 Epoch: 17 Global Step: 287350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:50:56,129-Speed 3163.95 samples/sec Loss 0.3646 Epoch: 17 Global Step: 287400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:51:12,372-Speed 3152.09 samples/sec Loss 0.3658 Epoch: 17 Global Step: 287450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:51:28,604-Speed 3154.41 samples/sec Loss 0.3755 Epoch: 17 Global Step: 287500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:51:44,816-Speed 3158.36 samples/sec Loss 0.3711 Epoch: 17 Global Step: 287550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:52:01,493-Speed 3070.02 samples/sec Loss 0.3606 Epoch: 17 Global Step: 287600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:52:17,700-Speed 3159.31 samples/sec Loss 0.3647 Epoch: 17 Global Step: 287650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:52:33,884-Speed 3163.79 samples/sec Loss 0.3730 Epoch: 17 Global Step: 287700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:52:50,299-Speed 3119.21 samples/sec Loss 0.3743 Epoch: 17 Global Step: 287750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:53:06,968-Speed 3071.52 samples/sec Loss 0.3710 Epoch: 17 Global Step: 287800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:53:24,207-Speed 2970.17 samples/sec Loss 0.3779 Epoch: 17 Global Step: 287850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:53:41,671-Speed 2931.72 samples/sec Loss 0.3655 Epoch: 17 Global Step: 287900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:53:58,191-Speed 3099.34 samples/sec Loss 0.3671 Epoch: 17 Global Step: 287950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:54:14,589-Speed 3122.46 samples/sec Loss 0.3711 Epoch: 17 Global Step: 288000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:55:07,705-[lfw][288000]XNorm: 21.434876 Training: 2021-03-17 08:55:07,706-[lfw][288000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 08:55:07,706-[lfw][288000]Accuracy-Highest: 0.99817 Training: 2021-03-17 08:56:09,516-[cfp_fp][288000]XNorm: 22.198657 Training: 2021-03-17 08:56:09,517-[cfp_fp][288000]Accuracy-Flip: 0.99229+-0.00496 Training: 2021-03-17 08:56:09,517-[cfp_fp][288000]Accuracy-Highest: 0.99271 Training: 2021-03-17 08:57:02,679-[agedb_30][288000]XNorm: 22.659898 Training: 2021-03-17 08:57:02,680-[agedb_30][288000]Accuracy-Flip: 0.98383+-0.00606 Training: 2021-03-17 08:57:02,680-[agedb_30][288000]Accuracy-Highest: 0.98483 Training: 2021-03-17 08:57:19,078-Speed 277.52 samples/sec Loss 0.3625 Epoch: 17 Global Step: 288050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:57:35,542-Speed 3109.87 samples/sec Loss 0.3582 Epoch: 17 Global Step: 288100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:57:51,611-Speed 3186.40 samples/sec Loss 0.3601 Epoch: 17 Global Step: 288150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:58:08,592-Speed 3015.29 samples/sec Loss 0.3687 Epoch: 17 Global Step: 288200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:58:25,239-Speed 3075.74 samples/sec Loss 0.3697 Epoch: 17 Global Step: 288250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:58:41,558-Speed 3137.54 samples/sec Loss 0.3745 Epoch: 17 Global Step: 288300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:58:57,904-Speed 3132.18 samples/sec Loss 0.3557 Epoch: 17 Global Step: 288350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:59:14,534-Speed 3078.87 samples/sec Loss 0.3678 Epoch: 17 Global Step: 288400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:59:31,019-Speed 3106.10 samples/sec Loss 0.3716 Epoch: 17 Global Step: 288450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 08:59:47,108-Speed 3182.33 samples/sec Loss 0.3607 Epoch: 17 Global Step: 288500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:00:03,487-Speed 3126.11 samples/sec Loss 0.3729 Epoch: 17 Global Step: 288550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:00:20,003-Speed 3100.16 samples/sec Loss 0.3721 Epoch: 17 Global Step: 288600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:00:36,180-Speed 3165.06 samples/sec Loss 0.3753 Epoch: 17 Global Step: 288650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:00:53,319-Speed 2987.32 samples/sec Loss 0.3677 Epoch: 17 Global Step: 288700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:01:10,703-Speed 2945.42 samples/sec Loss 0.3670 Epoch: 17 Global Step: 288750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:01:26,951-Speed 3151.15 samples/sec Loss 0.3681 Epoch: 17 Global Step: 288800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:01:43,741-Speed 3049.52 samples/sec Loss 0.3639 Epoch: 17 Global Step: 288850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:02:00,100-Speed 3129.89 samples/sec Loss 0.3536 Epoch: 17 Global Step: 288900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:02:16,265-Speed 3167.52 samples/sec Loss 0.3662 Epoch: 17 Global Step: 288950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:02:33,464-Speed 2976.95 samples/sec Loss 0.3742 Epoch: 17 Global Step: 289000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:02:51,336-Speed 2864.90 samples/sec Loss 0.3649 Epoch: 17 Global Step: 289050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:03:07,735-Speed 3122.20 samples/sec Loss 0.3732 Epoch: 17 Global Step: 289100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:03:24,067-Speed 3135.05 samples/sec Loss 0.3719 Epoch: 17 Global Step: 289150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:03:40,593-Speed 3098.22 samples/sec Loss 0.3583 Epoch: 17 Global Step: 289200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:03:57,193-Speed 3084.49 samples/sec Loss 0.3714 Epoch: 17 Global Step: 289250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:04:13,469-Speed 3145.78 samples/sec Loss 0.3611 Epoch: 17 Global Step: 289300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:04:30,258-Speed 3049.67 samples/sec Loss 0.3689 Epoch: 17 Global Step: 289350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:04:46,443-Speed 3163.55 samples/sec Loss 0.3648 Epoch: 17 Global Step: 289400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:05:02,525-Speed 3183.77 samples/sec Loss 0.3756 Epoch: 17 Global Step: 289450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:05:19,198-Speed 3071.06 samples/sec Loss 0.3612 Epoch: 17 Global Step: 289500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:05:35,435-Speed 3153.33 samples/sec Loss 0.3744 Epoch: 17 Global Step: 289550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:05:51,720-Speed 3144.11 samples/sec Loss 0.3696 Epoch: 17 Global Step: 289600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:06:08,083-Speed 3129.13 samples/sec Loss 0.3712 Epoch: 17 Global Step: 289650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:06:24,349-Speed 3147.80 samples/sec Loss 0.3653 Epoch: 17 Global Step: 289700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:06:40,790-Speed 3114.10 samples/sec Loss 0.3695 Epoch: 17 Global Step: 289750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:06:56,970-Speed 3164.49 samples/sec Loss 0.3692 Epoch: 17 Global Step: 289800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:07:13,350-Speed 3125.86 samples/sec Loss 0.3675 Epoch: 17 Global Step: 289850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:07:29,854-Speed 3102.46 samples/sec Loss 0.3588 Epoch: 17 Global Step: 289900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:07:46,151-Speed 3141.70 samples/sec Loss 0.3699 Epoch: 17 Global Step: 289950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:08:02,719-Speed 3090.58 samples/sec Loss 0.3749 Epoch: 17 Global Step: 290000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:08:55,839-[lfw][290000]XNorm: 21.556387 Training: 2021-03-17 09:08:55,839-[lfw][290000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 09:08:55,839-[lfw][290000]Accuracy-Highest: 0.99817 Training: 2021-03-17 09:09:58,108-[cfp_fp][290000]XNorm: 22.254425 Training: 2021-03-17 09:09:58,108-[cfp_fp][290000]Accuracy-Flip: 0.99200+-0.00492 Training: 2021-03-17 09:09:58,108-[cfp_fp][290000]Accuracy-Highest: 0.99271 Training: 2021-03-17 09:10:51,612-[agedb_30][290000]XNorm: 22.727778 Training: 2021-03-17 09:10:51,612-[agedb_30][290000]Accuracy-Flip: 0.98433+-0.00638 Training: 2021-03-17 09:10:51,612-[agedb_30][290000]Accuracy-Highest: 0.98483 Training: 2021-03-17 09:11:07,710-Speed 276.77 samples/sec Loss 0.3683 Epoch: 17 Global Step: 290050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:11:24,643-Speed 3023.70 samples/sec Loss 0.3685 Epoch: 17 Global Step: 290100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:11:41,771-Speed 2989.34 samples/sec Loss 0.3684 Epoch: 17 Global Step: 290150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:11:58,188-Speed 3118.81 samples/sec Loss 0.3747 Epoch: 17 Global Step: 290200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:12:14,526-Speed 3133.82 samples/sec Loss 0.3815 Epoch: 17 Global Step: 290250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:12:30,994-Speed 3109.14 samples/sec Loss 0.3687 Epoch: 17 Global Step: 290300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:12:47,493-Speed 3103.44 samples/sec Loss 0.3623 Epoch: 17 Global Step: 290350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:13:03,586-Speed 3181.56 samples/sec Loss 0.3713 Epoch: 17 Global Step: 290400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:13:19,954-Speed 3128.19 samples/sec Loss 0.3733 Epoch: 17 Global Step: 290450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:13:37,099-Speed 2986.34 samples/sec Loss 0.3678 Epoch: 17 Global Step: 290500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:13:53,495-Speed 3122.80 samples/sec Loss 0.3779 Epoch: 17 Global Step: 290550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:14:09,835-Speed 3133.50 samples/sec Loss 0.3694 Epoch: 17 Global Step: 290600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:14:26,137-Speed 3140.94 samples/sec Loss 0.3665 Epoch: 17 Global Step: 290650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:14:42,560-Speed 3117.61 samples/sec Loss 0.3493 Epoch: 17 Global Step: 290700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:14:59,049-Speed 3105.09 samples/sec Loss 0.3664 Epoch: 17 Global Step: 290750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:15:15,288-Speed 3153.15 samples/sec Loss 0.3727 Epoch: 17 Global Step: 290800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:15:32,665-Speed 2946.52 samples/sec Loss 0.3773 Epoch: 17 Global Step: 290850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:15:49,047-Speed 3125.49 samples/sec Loss 0.3640 Epoch: 17 Global Step: 290900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:16:06,111-Speed 3000.47 samples/sec Loss 0.3700 Epoch: 17 Global Step: 290950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:16:22,452-Speed 3133.35 samples/sec Loss 0.3688 Epoch: 17 Global Step: 291000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:16:38,615-Speed 3167.87 samples/sec Loss 0.3702 Epoch: 17 Global Step: 291050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:16:54,870-Speed 3149.90 samples/sec Loss 0.3715 Epoch: 17 Global Step: 291100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:17:11,345-Speed 3107.72 samples/sec Loss 0.3597 Epoch: 17 Global Step: 291150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:17:28,319-Speed 3016.53 samples/sec Loss 0.3621 Epoch: 17 Global Step: 291200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:17:44,781-Speed 3110.39 samples/sec Loss 0.3659 Epoch: 17 Global Step: 291250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:18:01,856-Speed 2998.51 samples/sec Loss 0.3676 Epoch: 17 Global Step: 291300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:18:18,944-Speed 2996.30 samples/sec Loss 0.3649 Epoch: 17 Global Step: 291350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:18:35,208-Speed 3148.20 samples/sec Loss 0.3657 Epoch: 17 Global Step: 291400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:18:51,906-Speed 3066.34 samples/sec Loss 0.3653 Epoch: 17 Global Step: 291450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:19:08,105-Speed 3160.84 samples/sec Loss 0.3773 Epoch: 17 Global Step: 291500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:19:24,235-Speed 3174.29 samples/sec Loss 0.3708 Epoch: 17 Global Step: 291550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:19:40,647-Speed 3119.67 samples/sec Loss 0.3648 Epoch: 17 Global Step: 291600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:19:56,985-Speed 3133.96 samples/sec Loss 0.3681 Epoch: 17 Global Step: 291650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:20:13,165-Speed 3164.57 samples/sec Loss 0.3636 Epoch: 17 Global Step: 291700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:20:29,415-Speed 3150.77 samples/sec Loss 0.3767 Epoch: 17 Global Step: 291750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:20:45,582-Speed 3167.07 samples/sec Loss 0.3624 Epoch: 17 Global Step: 291800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:21:01,954-Speed 3127.38 samples/sec Loss 0.3694 Epoch: 17 Global Step: 291850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:21:18,901-Speed 3021.33 samples/sec Loss 0.3767 Epoch: 17 Global Step: 291900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:21:35,324-Speed 3117.59 samples/sec Loss 0.3586 Epoch: 17 Global Step: 291950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:21:51,740-Speed 3119.12 samples/sec Loss 0.3564 Epoch: 17 Global Step: 292000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:22:44,738-[lfw][292000]XNorm: 21.575208 Training: 2021-03-17 09:22:44,739-[lfw][292000]Accuracy-Flip: 0.99767+-0.00260 Training: 2021-03-17 09:22:44,739-[lfw][292000]Accuracy-Highest: 0.99817 Training: 2021-03-17 09:23:46,270-[cfp_fp][292000]XNorm: 22.283520 Training: 2021-03-17 09:23:46,270-[cfp_fp][292000]Accuracy-Flip: 0.99257+-0.00477 Training: 2021-03-17 09:23:46,270-[cfp_fp][292000]Accuracy-Highest: 0.99271 Training: 2021-03-17 09:24:39,783-[agedb_30][292000]XNorm: 22.760149 Training: 2021-03-17 09:24:39,783-[agedb_30][292000]Accuracy-Flip: 0.98433+-0.00638 Training: 2021-03-17 09:24:39,783-[agedb_30][292000]Accuracy-Highest: 0.98483 Training: 2021-03-17 09:24:56,429-Speed 277.22 samples/sec Loss 0.3682 Epoch: 17 Global Step: 292050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:25:13,057-Speed 3079.32 samples/sec Loss 0.3754 Epoch: 17 Global Step: 292100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:25:29,402-Speed 3132.57 samples/sec Loss 0.3700 Epoch: 17 Global Step: 292150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:25:45,681-Speed 3145.24 samples/sec Loss 0.3644 Epoch: 17 Global Step: 292200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:26:02,086-Speed 3121.17 samples/sec Loss 0.3738 Epoch: 17 Global Step: 292250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:26:18,505-Speed 3118.43 samples/sec Loss 0.3746 Epoch: 17 Global Step: 292300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:26:34,809-Speed 3140.40 samples/sec Loss 0.3674 Epoch: 17 Global Step: 292350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:26:52,047-Speed 2970.23 samples/sec Loss 0.3731 Epoch: 17 Global Step: 292400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:27:09,498-Speed 2934.01 samples/sec Loss 0.3588 Epoch: 17 Global Step: 292450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:27:25,884-Speed 3124.77 samples/sec Loss 0.3752 Epoch: 17 Global Step: 292500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:27:42,357-Speed 3108.10 samples/sec Loss 0.3617 Epoch: 17 Global Step: 292550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:27:58,935-Speed 3088.55 samples/sec Loss 0.3737 Epoch: 17 Global Step: 292600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:28:14,987-Speed 3189.70 samples/sec Loss 0.3751 Epoch: 17 Global Step: 292650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:28:32,086-Speed 2994.45 samples/sec Loss 0.3691 Epoch: 17 Global Step: 292700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:28:49,029-Speed 3022.07 samples/sec Loss 0.3741 Epoch: 17 Global Step: 292750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:29:05,175-Speed 3171.14 samples/sec Loss 0.3604 Epoch: 17 Global Step: 292800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:29:21,515-Speed 3133.50 samples/sec Loss 0.3738 Epoch: 17 Global Step: 292850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:29:38,074-Speed 3091.94 samples/sec Loss 0.3701 Epoch: 17 Global Step: 292900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:29:54,401-Speed 3136.14 samples/sec Loss 0.3659 Epoch: 17 Global Step: 292950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:30:11,646-Speed 2969.11 samples/sec Loss 0.3668 Epoch: 17 Global Step: 293000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:30:27,947-Speed 3141.03 samples/sec Loss 0.3628 Epoch: 17 Global Step: 293050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:30:45,073-Speed 2989.68 samples/sec Loss 0.3666 Epoch: 17 Global Step: 293100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:31:01,317-Speed 3151.98 samples/sec Loss 0.3723 Epoch: 17 Global Step: 293150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:31:17,816-Speed 3103.34 samples/sec Loss 0.3713 Epoch: 17 Global Step: 293200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:31:33,962-Speed 3171.08 samples/sec Loss 0.3664 Epoch: 17 Global Step: 293250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:31:50,092-Speed 3174.32 samples/sec Loss 0.3652 Epoch: 17 Global Step: 293300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:32:06,407-Speed 3138.26 samples/sec Loss 0.3711 Epoch: 17 Global Step: 293350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:32:22,629-Speed 3156.35 samples/sec Loss 0.3702 Epoch: 17 Global Step: 293400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:32:39,833-Speed 2976.24 samples/sec Loss 0.3663 Epoch: 17 Global Step: 293450 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:32:56,140-Speed 3139.78 samples/sec Loss 0.3703 Epoch: 17 Global Step: 293500 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:33:13,419-Speed 2963.17 samples/sec Loss 0.3756 Epoch: 17 Global Step: 293550 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:33:29,659-Speed 3152.92 samples/sec Loss 0.3655 Epoch: 17 Global Step: 293600 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:33:46,802-Speed 2986.68 samples/sec Loss 0.3764 Epoch: 17 Global Step: 293650 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:34:02,941-Speed 3172.53 samples/sec Loss 0.3622 Epoch: 17 Global Step: 293700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:34:19,505-Speed 3091.17 samples/sec Loss 0.3760 Epoch: 17 Global Step: 293750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:34:35,853-Speed 3131.91 samples/sec Loss 0.3683 Epoch: 17 Global Step: 293800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:34:52,205-Speed 3131.25 samples/sec Loss 0.3639 Epoch: 17 Global Step: 293850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:35:08,394-Speed 3162.68 samples/sec Loss 0.3725 Epoch: 17 Global Step: 293900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:35:24,525-Speed 3174.19 samples/sec Loss 0.3656 Epoch: 17 Global Step: 293950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:35:41,273-Speed 3057.26 samples/sec Loss 0.3688 Epoch: 17 Global Step: 294000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:36:34,349-[lfw][294000]XNorm: 21.464371 Training: 2021-03-17 09:36:34,349-[lfw][294000]Accuracy-Flip: 0.99750+-0.00250 Training: 2021-03-17 09:36:34,349-[lfw][294000]Accuracy-Highest: 0.99817 Training: 2021-03-17 09:37:35,988-[cfp_fp][294000]XNorm: 22.182617 Training: 2021-03-17 09:37:35,988-[cfp_fp][294000]Accuracy-Flip: 0.99243+-0.00519 Training: 2021-03-17 09:37:35,988-[cfp_fp][294000]Accuracy-Highest: 0.99271 Training: 2021-03-17 09:38:28,931-[agedb_30][294000]XNorm: 22.659760 Training: 2021-03-17 09:38:28,931-[agedb_30][294000]Accuracy-Flip: 0.98283+-0.00610 Training: 2021-03-17 09:38:28,931-[agedb_30][294000]Accuracy-Highest: 0.98483 Training: 2021-03-17 09:38:45,514-Speed 277.90 samples/sec Loss 0.3648 Epoch: 17 Global Step: 294050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:39:01,686-Speed 3166.08 samples/sec Loss 0.3666 Epoch: 17 Global Step: 294100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:39:18,056-Speed 3127.72 samples/sec Loss 0.3687 Epoch: 17 Global Step: 294150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:39:34,272-Speed 3157.53 samples/sec Loss 0.3681 Epoch: 17 Global Step: 294200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:39:50,432-Speed 3168.36 samples/sec Loss 0.3900 Epoch: 17 Global Step: 294250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:40:06,804-Speed 3127.42 samples/sec Loss 0.3635 Epoch: 17 Global Step: 294300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-17 09:40:23,215-Speed 3119.87 samples/sec Loss 0.3624 Epoch: 17 Global Step: 294350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:40:39,393-Speed 3164.84 samples/sec Loss 0.3811 Epoch: 17 Global Step: 294400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:40:55,672-Speed 3145.33 samples/sec Loss 0.3562 Epoch: 17 Global Step: 294450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:41:12,271-Speed 3084.67 samples/sec Loss 0.3651 Epoch: 17 Global Step: 294500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:41:28,459-Speed 3162.90 samples/sec Loss 0.3687 Epoch: 17 Global Step: 294550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:41:45,804-Speed 2951.82 samples/sec Loss 0.3625 Epoch: 17 Global Step: 294600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:42:02,175-Speed 3127.72 samples/sec Loss 0.3619 Epoch: 17 Global Step: 294650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:42:18,594-Speed 3118.49 samples/sec Loss 0.3680 Epoch: 17 Global Step: 294700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:42:35,638-Speed 3004.03 samples/sec Loss 0.3644 Epoch: 17 Global Step: 294750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:42:51,999-Speed 3129.43 samples/sec Loss 0.3654 Epoch: 17 Global Step: 294800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:43:08,224-Speed 3155.70 samples/sec Loss 0.3581 Epoch: 17 Global Step: 294850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:43:24,500-Speed 3145.91 samples/sec Loss 0.3654 Epoch: 17 Global Step: 294900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:43:41,626-Speed 2989.65 samples/sec Loss 0.3612 Epoch: 17 Global Step: 294950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:43:57,803-Speed 3165.04 samples/sec Loss 0.3624 Epoch: 17 Global Step: 295000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:44:14,123-Speed 3137.37 samples/sec Loss 0.3675 Epoch: 17 Global Step: 295050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:44:30,378-Speed 3149.93 samples/sec Loss 0.3695 Epoch: 17 Global Step: 295100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:44:47,027-Speed 3075.44 samples/sec Loss 0.3601 Epoch: 17 Global Step: 295150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:45:03,417-Speed 3123.88 samples/sec Loss 0.3648 Epoch: 17 Global Step: 295200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:45:21,105-Speed 2894.69 samples/sec Loss 0.3630 Epoch: 17 Global Step: 295250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:45:38,292-Speed 2979.13 samples/sec Loss 0.3689 Epoch: 17 Global Step: 295300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:45:54,942-Speed 3075.24 samples/sec Loss 0.3730 Epoch: 17 Global Step: 295350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:46:11,069-Speed 3174.84 samples/sec Loss 0.3653 Epoch: 17 Global Step: 295400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:46:27,419-Speed 3131.55 samples/sec Loss 0.3736 Epoch: 17 Global Step: 295450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:46:43,874-Speed 3111.78 samples/sec Loss 0.3670 Epoch: 17 Global Step: 295500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:47:00,126-Speed 3150.49 samples/sec Loss 0.3582 Epoch: 17 Global Step: 295550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:47:16,754-Speed 3079.08 samples/sec Loss 0.3675 Epoch: 17 Global Step: 295600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:47:33,207-Speed 3112.05 samples/sec Loss 0.3722 Epoch: 17 Global Step: 295650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:47:50,316-Speed 2992.68 samples/sec Loss 0.3697 Epoch: 17 Global Step: 295700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:48:06,651-Speed 3134.43 samples/sec Loss 0.3629 Epoch: 17 Global Step: 295750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:48:23,750-Speed 2994.32 samples/sec Loss 0.3655 Epoch: 17 Global Step: 295800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:48:39,818-Speed 3186.56 samples/sec Loss 0.3710 Epoch: 17 Global Step: 295850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:48:57,284-Speed 2931.59 samples/sec Loss 0.3657 Epoch: 17 Global Step: 295900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:49:13,518-Speed 3153.93 samples/sec Loss 0.3638 Epoch: 17 Global Step: 295950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:49:29,928-Speed 3120.21 samples/sec Loss 0.3651 Epoch: 17 Global Step: 296000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:50:22,719-[lfw][296000]XNorm: 21.528291 Training: 2021-03-17 09:50:22,719-[lfw][296000]Accuracy-Flip: 0.99817+-0.00273 Training: 2021-03-17 09:50:22,719-[lfw][296000]Accuracy-Highest: 0.99817 Training: 2021-03-17 09:51:24,412-[cfp_fp][296000]XNorm: 22.254177 Training: 2021-03-17 09:51:24,412-[cfp_fp][296000]Accuracy-Flip: 0.99243+-0.00483 Training: 2021-03-17 09:51:24,412-[cfp_fp][296000]Accuracy-Highest: 0.99271 Training: 2021-03-17 09:52:17,474-[agedb_30][296000]XNorm: 22.732000 Training: 2021-03-17 09:52:17,474-[agedb_30][296000]Accuracy-Flip: 0.98367+-0.00595 Training: 2021-03-17 09:52:17,476-[agedb_30][296000]Accuracy-Highest: 0.98483 Training: 2021-03-17 09:52:33,908-Speed 278.29 samples/sec Loss 0.3762 Epoch: 17 Global Step: 296050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:52:50,585-Speed 3070.23 samples/sec Loss 0.3613 Epoch: 17 Global Step: 296100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:53:06,922-Speed 3134.15 samples/sec Loss 0.3736 Epoch: 17 Global Step: 296150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:53:23,461-Speed 3095.88 samples/sec Loss 0.3546 Epoch: 17 Global Step: 296200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:53:40,116-Speed 3074.17 samples/sec Loss 0.3642 Epoch: 17 Global Step: 296250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:53:56,298-Speed 3164.08 samples/sec Loss 0.3740 Epoch: 17 Global Step: 296300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:54:12,664-Speed 3128.58 samples/sec Loss 0.3714 Epoch: 17 Global Step: 296350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:54:29,556-Speed 3031.11 samples/sec Loss 0.3608 Epoch: 17 Global Step: 296400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:54:46,250-Speed 3066.93 samples/sec Loss 0.3574 Epoch: 17 Global Step: 296450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:55:02,572-Speed 3137.03 samples/sec Loss 0.3674 Epoch: 17 Global Step: 296500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:55:19,312-Speed 3058.66 samples/sec Loss 0.3662 Epoch: 17 Global Step: 296550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:55:35,454-Speed 3171.84 samples/sec Loss 0.3660 Epoch: 17 Global Step: 296600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:55:51,939-Speed 3105.94 samples/sec Loss 0.3692 Epoch: 17 Global Step: 296650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:56:08,337-Speed 3122.58 samples/sec Loss 0.3705 Epoch: 17 Global Step: 296700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:56:24,627-Speed 3143.12 samples/sec Loss 0.3717 Epoch: 17 Global Step: 296750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:56:41,836-Speed 2975.15 samples/sec Loss 0.3761 Epoch: 17 Global Step: 296800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:56:58,035-Speed 3160.79 samples/sec Loss 0.3645 Epoch: 17 Global Step: 296850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:57:15,393-Speed 2949.80 samples/sec Loss 0.3727 Epoch: 17 Global Step: 296900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:57:32,052-Speed 3073.53 samples/sec Loss 0.3687 Epoch: 17 Global Step: 296950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:57:48,467-Speed 3119.10 samples/sec Loss 0.3612 Epoch: 17 Global Step: 297000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:58:04,905-Speed 3114.87 samples/sec Loss 0.3693 Epoch: 17 Global Step: 297050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:58:21,487-Speed 3087.68 samples/sec Loss 0.3714 Epoch: 17 Global Step: 297100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:58:38,513-Speed 3007.31 samples/sec Loss 0.3731 Epoch: 17 Global Step: 297150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:58:55,254-Speed 3058.52 samples/sec Loss 0.3695 Epoch: 17 Global Step: 297200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:59:11,713-Speed 3110.85 samples/sec Loss 0.3726 Epoch: 17 Global Step: 297250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:59:27,954-Speed 3152.59 samples/sec Loss 0.3756 Epoch: 17 Global Step: 297300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 09:59:44,494-Speed 3095.62 samples/sec Loss 0.3685 Epoch: 17 Global Step: 297350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:00:01,858-Speed 2948.73 samples/sec Loss 0.3747 Epoch: 17 Global Step: 297400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:00:17,918-Speed 3188.17 samples/sec Loss 0.3618 Epoch: 17 Global Step: 297450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:00:34,861-Speed 3021.99 samples/sec Loss 0.3654 Epoch: 17 Global Step: 297500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:00:51,342-Speed 3106.60 samples/sec Loss 0.3735 Epoch: 17 Global Step: 297550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:01:07,980-Speed 3077.49 samples/sec Loss 0.3638 Epoch: 17 Global Step: 297600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:01:24,426-Speed 3113.37 samples/sec Loss 0.3814 Epoch: 17 Global Step: 297650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:01:40,977-Speed 3093.62 samples/sec Loss 0.3691 Epoch: 17 Global Step: 297700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:01:57,510-Speed 3096.86 samples/sec Loss 0.3527 Epoch: 17 Global Step: 297750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:02:13,825-Speed 3138.32 samples/sec Loss 0.3737 Epoch: 17 Global Step: 297800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:02:30,288-Speed 3110.12 samples/sec Loss 0.3661 Epoch: 17 Global Step: 297850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:02:46,533-Speed 3151.75 samples/sec Loss 0.3756 Epoch: 17 Global Step: 297900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:03:04,405-Speed 2864.91 samples/sec Loss 0.3591 Epoch: 17 Global Step: 297950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:03:20,615-Speed 3158.76 samples/sec Loss 0.3832 Epoch: 17 Global Step: 298000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:04:14,171-[lfw][298000]XNorm: 21.437372 Training: 2021-03-17 10:04:14,171-[lfw][298000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 10:04:14,171-[lfw][298000]Accuracy-Highest: 0.99817 Training: 2021-03-17 10:05:16,254-[cfp_fp][298000]XNorm: 22.214548 Training: 2021-03-17 10:05:16,255-[cfp_fp][298000]Accuracy-Flip: 0.99214+-0.00475 Training: 2021-03-17 10:05:16,255-[cfp_fp][298000]Accuracy-Highest: 0.99271 Training: 2021-03-17 10:06:09,687-[agedb_30][298000]XNorm: 22.678029 Training: 2021-03-17 10:06:09,688-[agedb_30][298000]Accuracy-Flip: 0.98433+-0.00638 Training: 2021-03-17 10:06:09,688-[agedb_30][298000]Accuracy-Highest: 0.98483 Training: 2021-03-17 10:06:26,227-Speed 275.84 samples/sec Loss 0.3679 Epoch: 17 Global Step: 298050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:06:43,419-Speed 2978.21 samples/sec Loss 0.3656 Epoch: 17 Global Step: 298100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:06:59,599-Speed 3164.51 samples/sec Loss 0.3659 Epoch: 17 Global Step: 298150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:07:15,839-Speed 3152.77 samples/sec Loss 0.3693 Epoch: 17 Global Step: 298200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:07:32,040-Speed 3160.37 samples/sec Loss 0.3680 Epoch: 17 Global Step: 298250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:07:48,137-Speed 3180.84 samples/sec Loss 0.3574 Epoch: 17 Global Step: 298300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:08:04,317-Speed 3164.51 samples/sec Loss 0.3714 Epoch: 17 Global Step: 298350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:08:20,830-Speed 3100.67 samples/sec Loss 0.3742 Epoch: 17 Global Step: 298400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:08:37,213-Speed 3125.17 samples/sec Loss 0.3706 Epoch: 17 Global Step: 298450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:08:53,457-Speed 3152.21 samples/sec Loss 0.3588 Epoch: 17 Global Step: 298500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:09:09,813-Speed 3130.29 samples/sec Loss 0.3719 Epoch: 17 Global Step: 298550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:09:25,884-Speed 3186.03 samples/sec Loss 0.3631 Epoch: 17 Global Step: 298600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:09:42,277-Speed 3123.32 samples/sec Loss 0.3719 Epoch: 17 Global Step: 298650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:09:58,640-Speed 3129.26 samples/sec Loss 0.3666 Epoch: 17 Global Step: 298700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:10:15,018-Speed 3126.11 samples/sec Loss 0.3573 Epoch: 17 Global Step: 298750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:10:31,250-Speed 3154.33 samples/sec Loss 0.3805 Epoch: 17 Global Step: 298800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:10:47,488-Speed 3153.39 samples/sec Loss 0.3705 Epoch: 17 Global Step: 298850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:11:03,638-Speed 3170.18 samples/sec Loss 0.3627 Epoch: 17 Global Step: 298900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:11:19,967-Speed 3135.69 samples/sec Loss 0.3702 Epoch: 17 Global Step: 298950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:11:37,132-Speed 2982.96 samples/sec Loss 0.3725 Epoch: 17 Global Step: 299000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:11:53,494-Speed 3129.31 samples/sec Loss 0.3728 Epoch: 17 Global Step: 299050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:12:10,651-Speed 2984.22 samples/sec Loss 0.3582 Epoch: 17 Global Step: 299100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:12:26,894-Speed 3152.31 samples/sec Loss 0.3698 Epoch: 17 Global Step: 299150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:12:43,290-Speed 3122.69 samples/sec Loss 0.3619 Epoch: 17 Global Step: 299200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:12:59,649-Speed 3129.86 samples/sec Loss 0.3625 Epoch: 17 Global Step: 299250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:13:16,168-Speed 3099.69 samples/sec Loss 0.3628 Epoch: 17 Global Step: 299300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:13:33,242-Speed 2998.76 samples/sec Loss 0.3684 Epoch: 17 Global Step: 299350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:13:49,428-Speed 3163.30 samples/sec Loss 0.3707 Epoch: 17 Global Step: 299400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:14:05,705-Speed 3145.76 samples/sec Loss 0.3697 Epoch: 17 Global Step: 299450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:14:22,160-Speed 3111.62 samples/sec Loss 0.3725 Epoch: 17 Global Step: 299500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:14:38,405-Speed 3151.81 samples/sec Loss 0.3821 Epoch: 17 Global Step: 299550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:14:54,888-Speed 3106.17 samples/sec Loss 0.3581 Epoch: 17 Global Step: 299600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:15:12,875-Speed 2846.61 samples/sec Loss 0.3688 Epoch: 17 Global Step: 299650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:15:29,387-Speed 3100.84 samples/sec Loss 0.3769 Epoch: 17 Global Step: 299700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:15:45,715-Speed 3135.83 samples/sec Loss 0.3652 Epoch: 17 Global Step: 299750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:16:02,284-Speed 3090.17 samples/sec Loss 0.3610 Epoch: 17 Global Step: 299800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:16:18,653-Speed 3128.01 samples/sec Loss 0.3674 Epoch: 17 Global Step: 299850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:16:34,953-Speed 3141.29 samples/sec Loss 0.3675 Epoch: 17 Global Step: 299900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:16:51,082-Speed 3174.51 samples/sec Loss 0.3611 Epoch: 17 Global Step: 299950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:17:07,470-Speed 3124.29 samples/sec Loss 0.3580 Epoch: 17 Global Step: 300000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:18:00,621-[lfw][300000]XNorm: 21.452889 Training: 2021-03-17 10:18:00,621-[lfw][300000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 10:18:00,621-[lfw][300000]Accuracy-Highest: 0.99817 Training: 2021-03-17 10:19:02,680-[cfp_fp][300000]XNorm: 22.199860 Training: 2021-03-17 10:19:02,681-[cfp_fp][300000]Accuracy-Flip: 0.99257+-0.00486 Training: 2021-03-17 10:19:02,681-[cfp_fp][300000]Accuracy-Highest: 0.99271 Training: 2021-03-17 10:19:55,843-[agedb_30][300000]XNorm: 22.660164 Training: 2021-03-17 10:19:55,843-[agedb_30][300000]Accuracy-Flip: 0.98400+-0.00597 Training: 2021-03-17 10:19:55,843-[agedb_30][300000]Accuracy-Highest: 0.98483 Training: 2021-03-17 10:20:12,042-Speed 277.40 samples/sec Loss 0.3697 Epoch: 17 Global Step: 300050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:20:29,032-Speed 3013.61 samples/sec Loss 0.3783 Epoch: 17 Global Step: 300100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:20:46,215-Speed 2979.88 samples/sec Loss 0.3719 Epoch: 17 Global Step: 300150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:21:02,526-Speed 3139.15 samples/sec Loss 0.3668 Epoch: 17 Global Step: 300200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:21:18,899-Speed 3127.22 samples/sec Loss 0.3726 Epoch: 17 Global Step: 300250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:21:36,043-Speed 2986.52 samples/sec Loss 0.3732 Epoch: 17 Global Step: 300300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:21:52,355-Speed 3138.94 samples/sec Loss 0.3731 Epoch: 17 Global Step: 300350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:22:08,636-Speed 3144.90 samples/sec Loss 0.3568 Epoch: 17 Global Step: 300400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:22:38,995-Speed 1686.46 samples/sec Loss 0.3702 Epoch: 18 Global Step: 300450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:22:55,385-Speed 3124.03 samples/sec Loss 0.3650 Epoch: 18 Global Step: 300500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:23:12,190-Speed 3046.85 samples/sec Loss 0.3840 Epoch: 18 Global Step: 300550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:23:28,390-Speed 3160.59 samples/sec Loss 0.3733 Epoch: 18 Global Step: 300600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:23:44,527-Speed 3173.01 samples/sec Loss 0.3661 Epoch: 18 Global Step: 300650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:24:00,969-Speed 3114.11 samples/sec Loss 0.3635 Epoch: 18 Global Step: 300700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:24:17,182-Speed 3158.04 samples/sec Loss 0.3681 Epoch: 18 Global Step: 300750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:24:33,446-Speed 3148.10 samples/sec Loss 0.3728 Epoch: 18 Global Step: 300800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:24:50,038-Speed 3085.96 samples/sec Loss 0.3743 Epoch: 18 Global Step: 300850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:25:06,384-Speed 3132.31 samples/sec Loss 0.3660 Epoch: 18 Global Step: 300900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:25:22,459-Speed 3185.29 samples/sec Loss 0.3700 Epoch: 18 Global Step: 300950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:25:38,772-Speed 3138.72 samples/sec Loss 0.3553 Epoch: 18 Global Step: 301000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:25:55,066-Speed 3142.33 samples/sec Loss 0.3622 Epoch: 18 Global Step: 301050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:26:11,593-Speed 3097.99 samples/sec Loss 0.3636 Epoch: 18 Global Step: 301100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:26:28,127-Speed 3096.80 samples/sec Loss 0.3659 Epoch: 18 Global Step: 301150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:26:45,285-Speed 2984.05 samples/sec Loss 0.3630 Epoch: 18 Global Step: 301200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:27:01,713-Speed 3116.66 samples/sec Loss 0.3620 Epoch: 18 Global Step: 301250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:27:19,261-Speed 2917.88 samples/sec Loss 0.3649 Epoch: 18 Global Step: 301300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:27:35,476-Speed 3157.65 samples/sec Loss 0.3748 Epoch: 18 Global Step: 301350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:27:51,657-Speed 3164.26 samples/sec Loss 0.3615 Epoch: 18 Global Step: 301400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:28:08,173-Speed 3100.22 samples/sec Loss 0.3654 Epoch: 18 Global Step: 301450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:28:25,447-Speed 2964.02 samples/sec Loss 0.3654 Epoch: 18 Global Step: 301500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:28:41,791-Speed 3132.85 samples/sec Loss 0.3674 Epoch: 18 Global Step: 301550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:28:58,100-Speed 3139.50 samples/sec Loss 0.3652 Epoch: 18 Global Step: 301600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:29:14,521-Speed 3118.06 samples/sec Loss 0.3668 Epoch: 18 Global Step: 301650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:29:31,208-Speed 3068.29 samples/sec Loss 0.3644 Epoch: 18 Global Step: 301700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:29:47,719-Speed 3100.99 samples/sec Loss 0.3585 Epoch: 18 Global Step: 301750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:30:04,127-Speed 3120.61 samples/sec Loss 0.3649 Epoch: 18 Global Step: 301800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:30:22,663-Speed 2762.24 samples/sec Loss 0.3722 Epoch: 18 Global Step: 301850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:30:38,921-Speed 3149.30 samples/sec Loss 0.3655 Epoch: 18 Global Step: 301900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:30:55,094-Speed 3165.93 samples/sec Loss 0.3696 Epoch: 18 Global Step: 301950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:31:11,634-Speed 3095.57 samples/sec Loss 0.3669 Epoch: 18 Global Step: 302000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:32:04,707-[lfw][302000]XNorm: 21.509505 Training: 2021-03-17 10:32:04,707-[lfw][302000]Accuracy-Flip: 0.99750+-0.00250 Training: 2021-03-17 10:32:04,707-[lfw][302000]Accuracy-Highest: 0.99817 Training: 2021-03-17 10:33:06,766-[cfp_fp][302000]XNorm: 22.239657 Training: 2021-03-17 10:33:06,766-[cfp_fp][302000]Accuracy-Flip: 0.99257+-0.00494 Training: 2021-03-17 10:33:06,766-[cfp_fp][302000]Accuracy-Highest: 0.99271 Training: 2021-03-17 10:33:59,968-[agedb_30][302000]XNorm: 22.707457 Training: 2021-03-17 10:33:59,968-[agedb_30][302000]Accuracy-Flip: 0.98333+-0.00601 Training: 2021-03-17 10:33:59,968-[agedb_30][302000]Accuracy-Highest: 0.98483 Training: 2021-03-17 10:34:16,167-Speed 277.46 samples/sec Loss 0.3719 Epoch: 18 Global Step: 302050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:34:32,467-Speed 3141.32 samples/sec Loss 0.3631 Epoch: 18 Global Step: 302100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:34:49,101-Speed 3078.06 samples/sec Loss 0.3627 Epoch: 18 Global Step: 302150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:35:05,551-Speed 3112.50 samples/sec Loss 0.3684 Epoch: 18 Global Step: 302200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:35:21,805-Speed 3150.07 samples/sec Loss 0.3603 Epoch: 18 Global Step: 302250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:35:39,356-Speed 2917.33 samples/sec Loss 0.3629 Epoch: 18 Global Step: 302300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:35:56,211-Speed 3037.82 samples/sec Loss 0.3677 Epoch: 18 Global Step: 302350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:36:13,559-Speed 2951.43 samples/sec Loss 0.3608 Epoch: 18 Global Step: 302400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:36:29,854-Speed 3142.03 samples/sec Loss 0.3738 Epoch: 18 Global Step: 302450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:36:46,173-Speed 3137.67 samples/sec Loss 0.3693 Epoch: 18 Global Step: 302500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:37:02,258-Speed 3183.12 samples/sec Loss 0.3644 Epoch: 18 Global Step: 302550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:37:19,564-Speed 2958.66 samples/sec Loss 0.3635 Epoch: 18 Global Step: 302600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:37:36,042-Speed 3107.30 samples/sec Loss 0.3648 Epoch: 18 Global Step: 302650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:37:52,543-Speed 3102.76 samples/sec Loss 0.3666 Epoch: 18 Global Step: 302700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:38:08,910-Speed 3128.35 samples/sec Loss 0.3623 Epoch: 18 Global Step: 302750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:38:25,551-Speed 3076.97 samples/sec Loss 0.3674 Epoch: 18 Global Step: 302800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:38:41,980-Speed 3116.47 samples/sec Loss 0.3665 Epoch: 18 Global Step: 302850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:38:58,314-Speed 3134.63 samples/sec Loss 0.3650 Epoch: 18 Global Step: 302900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:39:14,761-Speed 3113.09 samples/sec Loss 0.3680 Epoch: 18 Global Step: 302950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:39:31,114-Speed 3131.02 samples/sec Loss 0.3547 Epoch: 18 Global Step: 303000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:39:47,399-Speed 3144.22 samples/sec Loss 0.3608 Epoch: 18 Global Step: 303050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-17 10:40:03,781-Speed 3125.34 samples/sec Loss 0.3677 Epoch: 18 Global Step: 303100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:40:20,197-Speed 3119.16 samples/sec Loss 0.3553 Epoch: 18 Global Step: 303150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:40:36,308-Speed 3177.89 samples/sec Loss 0.3593 Epoch: 18 Global Step: 303200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:40:52,915-Speed 3083.14 samples/sec Loss 0.3578 Epoch: 18 Global Step: 303250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:41:09,220-Speed 3140.36 samples/sec Loss 0.3716 Epoch: 18 Global Step: 303300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:41:25,546-Speed 3136.24 samples/sec Loss 0.3638 Epoch: 18 Global Step: 303350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:41:42,054-Speed 3101.58 samples/sec Loss 0.3530 Epoch: 18 Global Step: 303400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:41:59,272-Speed 2973.61 samples/sec Loss 0.3673 Epoch: 18 Global Step: 303450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:42:16,505-Speed 2971.18 samples/sec Loss 0.3640 Epoch: 18 Global Step: 303500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:42:32,619-Speed 3177.55 samples/sec Loss 0.3630 Epoch: 18 Global Step: 303550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:42:49,071-Speed 3112.04 samples/sec Loss 0.3658 Epoch: 18 Global Step: 303600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:43:05,633-Speed 3091.60 samples/sec Loss 0.3522 Epoch: 18 Global Step: 303650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:43:22,605-Speed 3016.79 samples/sec Loss 0.3748 Epoch: 18 Global Step: 303700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:43:39,107-Speed 3102.68 samples/sec Loss 0.3744 Epoch: 18 Global Step: 303750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:43:55,998-Speed 3031.33 samples/sec Loss 0.3745 Epoch: 18 Global Step: 303800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:44:12,423-Speed 3117.31 samples/sec Loss 0.3656 Epoch: 18 Global Step: 303850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:44:29,050-Speed 3079.35 samples/sec Loss 0.3679 Epoch: 18 Global Step: 303900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:44:45,687-Speed 3077.57 samples/sec Loss 0.3588 Epoch: 18 Global Step: 303950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:45:02,034-Speed 3132.25 samples/sec Loss 0.3669 Epoch: 18 Global Step: 304000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:45:55,160-[lfw][304000]XNorm: 21.523245 Training: 2021-03-17 10:45:55,160-[lfw][304000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 10:45:55,160-[lfw][304000]Accuracy-Highest: 0.99817 Training: 2021-03-17 10:46:57,162-[cfp_fp][304000]XNorm: 22.263873 Training: 2021-03-17 10:46:57,163-[cfp_fp][304000]Accuracy-Flip: 0.99243+-0.00499 Training: 2021-03-17 10:46:57,163-[cfp_fp][304000]Accuracy-Highest: 0.99271 Training: 2021-03-17 10:47:50,311-[agedb_30][304000]XNorm: 22.728164 Training: 2021-03-17 10:47:50,311-[agedb_30][304000]Accuracy-Flip: 0.98417+-0.00539 Training: 2021-03-17 10:47:50,311-[agedb_30][304000]Accuracy-Highest: 0.98483 Training: 2021-03-17 10:48:06,867-Speed 277.01 samples/sec Loss 0.3613 Epoch: 18 Global Step: 304050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:48:24,458-Speed 2910.63 samples/sec Loss 0.3567 Epoch: 18 Global Step: 304100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:48:41,676-Speed 2973.59 samples/sec Loss 0.3700 Epoch: 18 Global Step: 304150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:48:57,845-Speed 3166.75 samples/sec Loss 0.3638 Epoch: 18 Global Step: 304200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:49:14,271-Speed 3117.04 samples/sec Loss 0.3643 Epoch: 18 Global Step: 304250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:49:31,000-Speed 3060.68 samples/sec Loss 0.3607 Epoch: 18 Global Step: 304300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:49:47,506-Speed 3101.99 samples/sec Loss 0.3631 Epoch: 18 Global Step: 304350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:50:03,770-Speed 3148.18 samples/sec Loss 0.3650 Epoch: 18 Global Step: 304400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:50:20,375-Speed 3083.60 samples/sec Loss 0.3627 Epoch: 18 Global Step: 304450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:50:37,699-Speed 2955.52 samples/sec Loss 0.3680 Epoch: 18 Global Step: 304500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:50:54,091-Speed 3123.54 samples/sec Loss 0.3740 Epoch: 18 Global Step: 304550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:51:11,401-Speed 2957.95 samples/sec Loss 0.3701 Epoch: 18 Global Step: 304600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:51:27,644-Speed 3152.13 samples/sec Loss 0.3653 Epoch: 18 Global Step: 304650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:51:43,901-Speed 3149.58 samples/sec Loss 0.3640 Epoch: 18 Global Step: 304700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:52:00,188-Speed 3143.61 samples/sec Loss 0.3709 Epoch: 18 Global Step: 304750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:52:17,731-Speed 2918.70 samples/sec Loss 0.3749 Epoch: 18 Global Step: 304800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:52:34,090-Speed 3129.85 samples/sec Loss 0.3785 Epoch: 18 Global Step: 304850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:52:50,325-Speed 3153.83 samples/sec Loss 0.3704 Epoch: 18 Global Step: 304900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:53:06,781-Speed 3111.32 samples/sec Loss 0.3659 Epoch: 18 Global Step: 304950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:53:23,181-Speed 3122.17 samples/sec Loss 0.3652 Epoch: 18 Global Step: 305000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:53:39,344-Speed 3167.84 samples/sec Loss 0.3547 Epoch: 18 Global Step: 305050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:53:55,650-Speed 3139.95 samples/sec Loss 0.3653 Epoch: 18 Global Step: 305100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:54:12,051-Speed 3121.79 samples/sec Loss 0.3718 Epoch: 18 Global Step: 305150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:54:28,478-Speed 3116.91 samples/sec Loss 0.3617 Epoch: 18 Global Step: 305200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:54:44,665-Speed 3163.14 samples/sec Loss 0.3590 Epoch: 18 Global Step: 305250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:55:01,056-Speed 3123.82 samples/sec Loss 0.3631 Epoch: 18 Global Step: 305300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:55:17,615-Speed 3092.13 samples/sec Loss 0.3491 Epoch: 18 Global Step: 305350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:55:33,800-Speed 3163.54 samples/sec Loss 0.3672 Epoch: 18 Global Step: 305400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:55:50,179-Speed 3126.02 samples/sec Loss 0.3674 Epoch: 18 Global Step: 305450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:56:06,508-Speed 3135.57 samples/sec Loss 0.3623 Epoch: 18 Global Step: 305500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:56:22,651-Speed 3171.79 samples/sec Loss 0.3629 Epoch: 18 Global Step: 305550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:56:39,792-Speed 2987.11 samples/sec Loss 0.3737 Epoch: 18 Global Step: 305600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:56:56,123-Speed 3135.10 samples/sec Loss 0.3688 Epoch: 18 Global Step: 305650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:57:13,394-Speed 2964.58 samples/sec Loss 0.3618 Epoch: 18 Global Step: 305700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:57:29,529-Speed 3173.38 samples/sec Loss 0.3655 Epoch: 18 Global Step: 305750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:57:45,848-Speed 3137.66 samples/sec Loss 0.3762 Epoch: 18 Global Step: 305800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:58:02,366-Speed 3099.63 samples/sec Loss 0.3680 Epoch: 18 Global Step: 305850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:58:18,966-Speed 3084.40 samples/sec Loss 0.3634 Epoch: 18 Global Step: 305900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:58:35,595-Speed 3079.06 samples/sec Loss 0.3652 Epoch: 18 Global Step: 305950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:58:52,001-Speed 3120.97 samples/sec Loss 0.3643 Epoch: 18 Global Step: 306000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 10:59:45,001-[lfw][306000]XNorm: 21.618735 Training: 2021-03-17 10:59:45,001-[lfw][306000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 10:59:45,001-[lfw][306000]Accuracy-Highest: 0.99817 Training: 2021-03-17 11:00:46,884-[cfp_fp][306000]XNorm: 22.302105 Training: 2021-03-17 11:00:46,884-[cfp_fp][306000]Accuracy-Flip: 0.99257+-0.00494 Training: 2021-03-17 11:00:46,884-[cfp_fp][306000]Accuracy-Highest: 0.99271 Training: 2021-03-17 11:01:39,951-[agedb_30][306000]XNorm: 22.775047 Training: 2021-03-17 11:01:39,951-[agedb_30][306000]Accuracy-Flip: 0.98400+-0.00597 Training: 2021-03-17 11:01:39,951-[agedb_30][306000]Accuracy-Highest: 0.98483 Training: 2021-03-17 11:01:57,073-Speed 276.65 samples/sec Loss 0.3687 Epoch: 18 Global Step: 306050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:02:13,294-Speed 3156.39 samples/sec Loss 0.3655 Epoch: 18 Global Step: 306100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:02:29,618-Speed 3136.58 samples/sec Loss 0.3674 Epoch: 18 Global Step: 306150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:02:45,954-Speed 3134.24 samples/sec Loss 0.3709 Epoch: 18 Global Step: 306200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:03:02,077-Speed 3175.76 samples/sec Loss 0.3627 Epoch: 18 Global Step: 306250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:03:19,585-Speed 2924.53 samples/sec Loss 0.3676 Epoch: 18 Global Step: 306300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:03:37,130-Speed 2918.16 samples/sec Loss 0.3616 Epoch: 18 Global Step: 306350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:03:53,497-Speed 3128.39 samples/sec Loss 0.3724 Epoch: 18 Global Step: 306400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:04:09,893-Speed 3122.78 samples/sec Loss 0.3702 Epoch: 18 Global Step: 306450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:04:26,253-Speed 3129.77 samples/sec Loss 0.3579 Epoch: 18 Global Step: 306500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:04:42,504-Speed 3150.65 samples/sec Loss 0.3581 Epoch: 18 Global Step: 306550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:04:58,909-Speed 3120.98 samples/sec Loss 0.3693 Epoch: 18 Global Step: 306600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:05:15,257-Speed 3132.12 samples/sec Loss 0.3642 Epoch: 18 Global Step: 306650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:05:31,484-Speed 3155.27 samples/sec Loss 0.3602 Epoch: 18 Global Step: 306700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:05:47,716-Speed 3154.35 samples/sec Loss 0.3687 Epoch: 18 Global Step: 306750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:06:05,775-Speed 2835.24 samples/sec Loss 0.3752 Epoch: 18 Global Step: 306800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:06:22,061-Speed 3143.93 samples/sec Loss 0.3550 Epoch: 18 Global Step: 306850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:06:38,492-Speed 3116.08 samples/sec Loss 0.3752 Epoch: 18 Global Step: 306900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:06:55,202-Speed 3064.21 samples/sec Loss 0.3682 Epoch: 18 Global Step: 306950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:07:11,937-Speed 3059.56 samples/sec Loss 0.3662 Epoch: 18 Global Step: 307000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:07:28,675-Speed 3059.02 samples/sec Loss 0.3743 Epoch: 18 Global Step: 307050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:07:45,893-Speed 2973.64 samples/sec Loss 0.3635 Epoch: 18 Global Step: 307100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:08:02,003-Speed 3178.16 samples/sec Loss 0.3600 Epoch: 18 Global Step: 307150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:08:18,360-Speed 3130.29 samples/sec Loss 0.3699 Epoch: 18 Global Step: 307200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:08:34,482-Speed 3175.95 samples/sec Loss 0.3645 Epoch: 18 Global Step: 307250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:08:51,115-Speed 3078.32 samples/sec Loss 0.3674 Epoch: 18 Global Step: 307300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:09:07,465-Speed 3131.61 samples/sec Loss 0.3694 Epoch: 18 Global Step: 307350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:09:23,689-Speed 3155.91 samples/sec Loss 0.3634 Epoch: 18 Global Step: 307400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:09:40,204-Speed 3100.21 samples/sec Loss 0.3561 Epoch: 18 Global Step: 307450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:09:56,357-Speed 3169.76 samples/sec Loss 0.3602 Epoch: 18 Global Step: 307500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:10:12,947-Speed 3086.43 samples/sec Loss 0.3773 Epoch: 18 Global Step: 307550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:10:29,231-Speed 3144.24 samples/sec Loss 0.3643 Epoch: 18 Global Step: 307600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:10:45,432-Speed 3160.32 samples/sec Loss 0.3642 Epoch: 18 Global Step: 307650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:11:01,588-Speed 3169.34 samples/sec Loss 0.3705 Epoch: 18 Global Step: 307700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:11:18,544-Speed 3019.61 samples/sec Loss 0.3610 Epoch: 18 Global Step: 307750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:11:34,747-Speed 3160.03 samples/sec Loss 0.3589 Epoch: 18 Global Step: 307800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:11:51,805-Speed 3001.62 samples/sec Loss 0.3672 Epoch: 18 Global Step: 307850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:12:09,253-Speed 2934.39 samples/sec Loss 0.3676 Epoch: 18 Global Step: 307900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:12:25,596-Speed 3133.13 samples/sec Loss 0.3753 Epoch: 18 Global Step: 307950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:12:41,811-Speed 3157.49 samples/sec Loss 0.3679 Epoch: 18 Global Step: 308000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:13:35,062-[lfw][308000]XNorm: 21.433106 Training: 2021-03-17 11:13:35,062-[lfw][308000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 11:13:35,062-[lfw][308000]Accuracy-Highest: 0.99817 Training: 2021-03-17 11:14:36,941-[cfp_fp][308000]XNorm: 22.222741 Training: 2021-03-17 11:14:36,942-[cfp_fp][308000]Accuracy-Flip: 0.99200+-0.00565 Training: 2021-03-17 11:14:36,942-[cfp_fp][308000]Accuracy-Highest: 0.99271 Training: 2021-03-17 11:15:30,159-[agedb_30][308000]XNorm: 22.688855 Training: 2021-03-17 11:15:30,159-[agedb_30][308000]Accuracy-Flip: 0.98383+-0.00650 Training: 2021-03-17 11:15:30,159-[agedb_30][308000]Accuracy-Highest: 0.98483 Training: 2021-03-17 11:15:46,774-Speed 276.81 samples/sec Loss 0.3618 Epoch: 18 Global Step: 308050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:16:03,007-Speed 3154.07 samples/sec Loss 0.3688 Epoch: 18 Global Step: 308100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:16:19,378-Speed 3127.64 samples/sec Loss 0.3624 Epoch: 18 Global Step: 308150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:16:35,817-Speed 3114.48 samples/sec Loss 0.3655 Epoch: 18 Global Step: 308200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:16:51,940-Speed 3175.74 samples/sec Loss 0.3664 Epoch: 18 Global Step: 308250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:17:08,464-Speed 3098.64 samples/sec Loss 0.3669 Epoch: 18 Global Step: 308300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:17:25,894-Speed 2937.61 samples/sec Loss 0.3605 Epoch: 18 Global Step: 308350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:17:42,537-Speed 3076.45 samples/sec Loss 0.3613 Epoch: 18 Global Step: 308400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:17:59,365-Speed 3042.51 samples/sec Loss 0.3683 Epoch: 18 Global Step: 308450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:18:15,627-Speed 3148.69 samples/sec Loss 0.3678 Epoch: 18 Global Step: 308500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:18:33,363-Speed 2886.79 samples/sec Loss 0.3618 Epoch: 18 Global Step: 308550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:18:50,444-Speed 2997.63 samples/sec Loss 0.3673 Epoch: 18 Global Step: 308600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:19:06,662-Speed 3156.95 samples/sec Loss 0.3609 Epoch: 18 Global Step: 308650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:19:23,328-Speed 3072.36 samples/sec Loss 0.3600 Epoch: 18 Global Step: 308700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:19:39,533-Speed 3159.52 samples/sec Loss 0.3550 Epoch: 18 Global Step: 308750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:19:56,409-Speed 3034.04 samples/sec Loss 0.3604 Epoch: 18 Global Step: 308800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:20:13,028-Speed 3080.91 samples/sec Loss 0.3577 Epoch: 18 Global Step: 308850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:20:29,490-Speed 3110.31 samples/sec Loss 0.3712 Epoch: 18 Global Step: 308900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:20:46,276-Speed 3050.28 samples/sec Loss 0.3612 Epoch: 18 Global Step: 308950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:21:04,372-Speed 2829.31 samples/sec Loss 0.3600 Epoch: 18 Global Step: 309000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:21:20,859-Speed 3105.56 samples/sec Loss 0.3687 Epoch: 18 Global Step: 309050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:21:37,415-Speed 3092.67 samples/sec Loss 0.3631 Epoch: 18 Global Step: 309100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:21:53,734-Speed 3137.48 samples/sec Loss 0.3730 Epoch: 18 Global Step: 309150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:22:10,242-Speed 3101.67 samples/sec Loss 0.3591 Epoch: 18 Global Step: 309200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:22:26,333-Speed 3182.01 samples/sec Loss 0.3728 Epoch: 18 Global Step: 309250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:22:43,715-Speed 2945.74 samples/sec Loss 0.3635 Epoch: 18 Global Step: 309300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:23:00,120-Speed 3121.04 samples/sec Loss 0.3753 Epoch: 18 Global Step: 309350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:23:16,768-Speed 3075.46 samples/sec Loss 0.3745 Epoch: 18 Global Step: 309400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:23:33,321-Speed 3093.20 samples/sec Loss 0.3644 Epoch: 18 Global Step: 309450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:23:50,362-Speed 3004.68 samples/sec Loss 0.3671 Epoch: 18 Global Step: 309500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:24:07,015-Speed 3074.52 samples/sec Loss 0.3629 Epoch: 18 Global Step: 309550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:24:23,425-Speed 3120.14 samples/sec Loss 0.3660 Epoch: 18 Global Step: 309600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:24:39,690-Speed 3148.06 samples/sec Loss 0.3653 Epoch: 18 Global Step: 309650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:24:55,968-Speed 3145.39 samples/sec Loss 0.3598 Epoch: 18 Global Step: 309700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:25:12,411-Speed 3113.89 samples/sec Loss 0.3612 Epoch: 18 Global Step: 309750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:25:28,870-Speed 3110.88 samples/sec Loss 0.3668 Epoch: 18 Global Step: 309800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:25:45,265-Speed 3122.96 samples/sec Loss 0.3585 Epoch: 18 Global Step: 309850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:26:01,688-Speed 3117.65 samples/sec Loss 0.3585 Epoch: 18 Global Step: 309900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:26:17,938-Speed 3150.93 samples/sec Loss 0.3658 Epoch: 18 Global Step: 309950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:26:34,410-Speed 3108.38 samples/sec Loss 0.3617 Epoch: 18 Global Step: 310000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:27:27,618-[lfw][310000]XNorm: 21.494296 Training: 2021-03-17 11:27:27,618-[lfw][310000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 11:27:27,618-[lfw][310000]Accuracy-Highest: 0.99817 Training: 2021-03-17 11:28:29,602-[cfp_fp][310000]XNorm: 22.257713 Training: 2021-03-17 11:28:29,603-[cfp_fp][310000]Accuracy-Flip: 0.99229+-0.00479 Training: 2021-03-17 11:28:29,603-[cfp_fp][310000]Accuracy-Highest: 0.99271 Training: 2021-03-17 11:29:23,068-[agedb_30][310000]XNorm: 22.722266 Training: 2021-03-17 11:29:23,068-[agedb_30][310000]Accuracy-Flip: 0.98417+-0.00607 Training: 2021-03-17 11:29:23,068-[agedb_30][310000]Accuracy-Highest: 0.98483 Training: 2021-03-17 11:29:40,218-Speed 275.55 samples/sec Loss 0.3624 Epoch: 18 Global Step: 310050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:29:57,400-Speed 2979.94 samples/sec Loss 0.3737 Epoch: 18 Global Step: 310100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:30:14,181-Speed 3051.15 samples/sec Loss 0.3730 Epoch: 18 Global Step: 310150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:30:30,782-Speed 3084.35 samples/sec Loss 0.3610 Epoch: 18 Global Step: 310200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:30:47,588-Speed 3046.60 samples/sec Loss 0.3680 Epoch: 18 Global Step: 310250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:31:04,015-Speed 3117.03 samples/sec Loss 0.3765 Epoch: 18 Global Step: 310300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:31:20,365-Speed 3131.44 samples/sec Loss 0.3488 Epoch: 18 Global Step: 310350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:31:36,940-Speed 3089.20 samples/sec Loss 0.3610 Epoch: 18 Global Step: 310400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:31:53,321-Speed 3125.61 samples/sec Loss 0.3663 Epoch: 18 Global Step: 310450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:32:10,604-Speed 2962.46 samples/sec Loss 0.3667 Epoch: 18 Global Step: 310500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:32:27,153-Speed 3093.97 samples/sec Loss 0.3753 Epoch: 18 Global Step: 310550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:32:43,796-Speed 3076.61 samples/sec Loss 0.3773 Epoch: 18 Global Step: 310600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:33:00,003-Speed 3159.18 samples/sec Loss 0.3680 Epoch: 18 Global Step: 310650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:33:16,362-Speed 3129.90 samples/sec Loss 0.3725 Epoch: 18 Global Step: 310700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:33:33,418-Speed 3001.90 samples/sec Loss 0.3707 Epoch: 18 Global Step: 310750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:33:50,515-Speed 2994.74 samples/sec Loss 0.3688 Epoch: 18 Global Step: 310800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:34:06,954-Speed 3114.62 samples/sec Loss 0.3745 Epoch: 18 Global Step: 310850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:34:23,434-Speed 3106.85 samples/sec Loss 0.3674 Epoch: 18 Global Step: 310900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:34:39,883-Speed 3112.75 samples/sec Loss 0.3714 Epoch: 18 Global Step: 310950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:34:56,377-Speed 3104.41 samples/sec Loss 0.3648 Epoch: 18 Global Step: 311000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:35:12,756-Speed 3125.91 samples/sec Loss 0.3667 Epoch: 18 Global Step: 311050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:35:28,988-Speed 3154.51 samples/sec Loss 0.3736 Epoch: 18 Global Step: 311100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:35:45,261-Speed 3146.35 samples/sec Loss 0.3713 Epoch: 18 Global Step: 311150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:36:03,482-Speed 2809.98 samples/sec Loss 0.3594 Epoch: 18 Global Step: 311200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:36:19,700-Speed 3157.26 samples/sec Loss 0.3658 Epoch: 18 Global Step: 311250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:36:36,010-Speed 3139.13 samples/sec Loss 0.3569 Epoch: 18 Global Step: 311300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:36:52,884-Speed 3034.32 samples/sec Loss 0.3630 Epoch: 18 Global Step: 311350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:37:09,504-Speed 3080.87 samples/sec Loss 0.3671 Epoch: 18 Global Step: 311400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:37:26,509-Speed 3010.82 samples/sec Loss 0.3657 Epoch: 18 Global Step: 311450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:37:42,957-Speed 3113.08 samples/sec Loss 0.3663 Epoch: 18 Global Step: 311500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:37:59,462-Speed 3102.09 samples/sec Loss 0.3658 Epoch: 18 Global Step: 311550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:38:16,917-Speed 2933.31 samples/sec Loss 0.3758 Epoch: 18 Global Step: 311600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:38:33,398-Speed 3106.72 samples/sec Loss 0.3535 Epoch: 18 Global Step: 311650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:38:49,723-Speed 3136.38 samples/sec Loss 0.3688 Epoch: 18 Global Step: 311700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:39:06,044-Speed 3137.25 samples/sec Loss 0.3687 Epoch: 18 Global Step: 311750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:39:22,286-Speed 3152.36 samples/sec Loss 0.3734 Epoch: 18 Global Step: 311800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:39:38,725-Speed 3114.75 samples/sec Loss 0.3635 Epoch: 18 Global Step: 311850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-17 11:39:55,031-Speed 3140.02 samples/sec Loss 0.3619 Epoch: 18 Global Step: 311900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:40:11,091-Speed 3188.00 samples/sec Loss 0.3721 Epoch: 18 Global Step: 311950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:40:27,321-Speed 3154.86 samples/sec Loss 0.3649 Epoch: 18 Global Step: 312000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:41:20,354-[lfw][312000]XNorm: 21.643619 Training: 2021-03-17 11:41:20,354-[lfw][312000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 11:41:20,355-[lfw][312000]Accuracy-Highest: 0.99817 Training: 2021-03-17 11:42:22,188-[cfp_fp][312000]XNorm: 22.350833 Training: 2021-03-17 11:42:22,188-[cfp_fp][312000]Accuracy-Flip: 0.99286+-0.00469 Training: 2021-03-17 11:42:22,188-[cfp_fp][312000]Accuracy-Highest: 0.99286 Training: 2021-03-17 11:43:15,405-[agedb_30][312000]XNorm: 22.840422 Training: 2021-03-17 11:43:15,405-[agedb_30][312000]Accuracy-Flip: 0.98400+-0.00588 Training: 2021-03-17 11:43:15,405-[agedb_30][312000]Accuracy-Highest: 0.98483 Training: 2021-03-17 11:43:31,984-Speed 277.26 samples/sec Loss 0.3705 Epoch: 18 Global Step: 312050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:43:48,845-Speed 3036.62 samples/sec Loss 0.3688 Epoch: 18 Global Step: 312100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:44:05,393-Speed 3094.12 samples/sec Loss 0.3577 Epoch: 18 Global Step: 312150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:44:21,701-Speed 3139.72 samples/sec Loss 0.3690 Epoch: 18 Global Step: 312200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:44:37,879-Speed 3164.81 samples/sec Loss 0.3577 Epoch: 18 Global Step: 312250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:44:55,122-Speed 2969.40 samples/sec Loss 0.3614 Epoch: 18 Global Step: 312300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:45:11,781-Speed 3073.43 samples/sec Loss 0.3678 Epoch: 18 Global Step: 312350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:45:29,119-Speed 2953.22 samples/sec Loss 0.3711 Epoch: 18 Global Step: 312400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:45:45,294-Speed 3165.50 samples/sec Loss 0.3632 Epoch: 18 Global Step: 312450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:46:01,675-Speed 3125.72 samples/sec Loss 0.3639 Epoch: 18 Global Step: 312500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:46:18,065-Speed 3123.82 samples/sec Loss 0.3781 Epoch: 18 Global Step: 312550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:46:34,536-Speed 3108.54 samples/sec Loss 0.3583 Epoch: 18 Global Step: 312600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:46:50,760-Speed 3156.03 samples/sec Loss 0.3580 Epoch: 18 Global Step: 312650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:47:07,046-Speed 3143.91 samples/sec Loss 0.3662 Epoch: 18 Global Step: 312700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:47:24,465-Speed 2939.38 samples/sec Loss 0.3766 Epoch: 18 Global Step: 312750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:47:40,933-Speed 3109.13 samples/sec Loss 0.3664 Epoch: 18 Global Step: 312800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:47:57,258-Speed 3136.44 samples/sec Loss 0.3718 Epoch: 18 Global Step: 312850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:48:13,468-Speed 3158.71 samples/sec Loss 0.3591 Epoch: 18 Global Step: 312900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:48:30,751-Speed 2962.36 samples/sec Loss 0.3652 Epoch: 18 Global Step: 312950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:48:48,312-Speed 2915.78 samples/sec Loss 0.3680 Epoch: 18 Global Step: 313000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:49:04,626-Speed 3138.53 samples/sec Loss 0.3664 Epoch: 18 Global Step: 313050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:49:20,917-Speed 3142.93 samples/sec Loss 0.3701 Epoch: 18 Global Step: 313100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:49:37,479-Speed 3091.35 samples/sec Loss 0.3530 Epoch: 18 Global Step: 313150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:49:53,949-Speed 3108.86 samples/sec Loss 0.3626 Epoch: 18 Global Step: 313200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:50:10,331-Speed 3125.43 samples/sec Loss 0.3712 Epoch: 18 Global Step: 313250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:50:26,571-Speed 3152.81 samples/sec Loss 0.3712 Epoch: 18 Global Step: 313300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:50:43,165-Speed 3085.65 samples/sec Loss 0.3632 Epoch: 18 Global Step: 313350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:50:59,628-Speed 3110.09 samples/sec Loss 0.3623 Epoch: 18 Global Step: 313400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:51:16,895-Speed 2965.31 samples/sec Loss 0.3678 Epoch: 18 Global Step: 313450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:51:33,373-Speed 3107.16 samples/sec Loss 0.3619 Epoch: 18 Global Step: 313500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:51:50,454-Speed 2997.59 samples/sec Loss 0.3565 Epoch: 18 Global Step: 313550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:52:06,943-Speed 3105.17 samples/sec Loss 0.3699 Epoch: 18 Global Step: 313600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:52:23,104-Speed 3168.22 samples/sec Loss 0.3685 Epoch: 18 Global Step: 313650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:52:39,608-Speed 3102.49 samples/sec Loss 0.3684 Epoch: 18 Global Step: 313700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:52:55,994-Speed 3124.53 samples/sec Loss 0.3630 Epoch: 18 Global Step: 313750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:53:13,054-Speed 3001.38 samples/sec Loss 0.3777 Epoch: 18 Global Step: 313800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:53:29,530-Speed 3107.62 samples/sec Loss 0.3749 Epoch: 18 Global Step: 313850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:53:45,943-Speed 3119.58 samples/sec Loss 0.3717 Epoch: 18 Global Step: 313900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:54:02,822-Speed 3033.36 samples/sec Loss 0.3615 Epoch: 18 Global Step: 313950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:54:19,245-Speed 3117.77 samples/sec Loss 0.3697 Epoch: 18 Global Step: 314000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:55:12,416-[lfw][314000]XNorm: 21.336900 Training: 2021-03-17 11:55:12,416-[lfw][314000]Accuracy-Flip: 0.99750+-0.00250 Training: 2021-03-17 11:55:12,421-[lfw][314000]Accuracy-Highest: 0.99817 Training: 2021-03-17 11:56:14,262-[cfp_fp][314000]XNorm: 22.111694 Training: 2021-03-17 11:56:14,263-[cfp_fp][314000]Accuracy-Flip: 0.99257+-0.00494 Training: 2021-03-17 11:56:14,263-[cfp_fp][314000]Accuracy-Highest: 0.99286 Training: 2021-03-17 11:57:07,584-[agedb_30][314000]XNorm: 22.559419 Training: 2021-03-17 11:57:07,585-[agedb_30][314000]Accuracy-Flip: 0.98400+-0.00597 Training: 2021-03-17 11:57:07,585-[agedb_30][314000]Accuracy-Highest: 0.98483 Training: 2021-03-17 11:57:23,812-Speed 277.41 samples/sec Loss 0.3656 Epoch: 18 Global Step: 314050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:57:40,356-Speed 3094.89 samples/sec Loss 0.3679 Epoch: 18 Global Step: 314100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:57:57,046-Speed 3067.94 samples/sec Loss 0.3658 Epoch: 18 Global Step: 314150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:58:13,332-Speed 3143.75 samples/sec Loss 0.3666 Epoch: 18 Global Step: 314200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:58:29,519-Speed 3163.21 samples/sec Loss 0.3727 Epoch: 18 Global Step: 314250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:58:45,838-Speed 3137.47 samples/sec Loss 0.3721 Epoch: 18 Global Step: 314300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:59:02,643-Speed 3046.92 samples/sec Loss 0.3707 Epoch: 18 Global Step: 314350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:59:19,224-Speed 3087.92 samples/sec Loss 0.3631 Epoch: 18 Global Step: 314400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:59:35,854-Speed 3078.85 samples/sec Loss 0.3671 Epoch: 18 Global Step: 314450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 11:59:53,089-Speed 2970.84 samples/sec Loss 0.3702 Epoch: 18 Global Step: 314500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:00:09,470-Speed 3125.58 samples/sec Loss 0.3718 Epoch: 18 Global Step: 314550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:00:25,797-Speed 3136.04 samples/sec Loss 0.3757 Epoch: 18 Global Step: 314600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:00:42,165-Speed 3128.05 samples/sec Loss 0.3555 Epoch: 18 Global Step: 314650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:00:59,367-Speed 2976.56 samples/sec Loss 0.3600 Epoch: 18 Global Step: 314700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:01:15,995-Speed 3079.28 samples/sec Loss 0.3720 Epoch: 18 Global Step: 314750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:01:32,533-Speed 3095.94 samples/sec Loss 0.3724 Epoch: 18 Global Step: 314800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:01:49,161-Speed 3079.22 samples/sec Loss 0.3705 Epoch: 18 Global Step: 314850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:02:05,853-Speed 3067.48 samples/sec Loss 0.3627 Epoch: 18 Global Step: 314900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:02:23,500-Speed 2901.41 samples/sec Loss 0.3664 Epoch: 18 Global Step: 314950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:02:39,691-Speed 3162.26 samples/sec Loss 0.3644 Epoch: 18 Global Step: 315000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:02:55,992-Speed 3140.97 samples/sec Loss 0.3646 Epoch: 18 Global Step: 315050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:03:12,652-Speed 3073.43 samples/sec Loss 0.3648 Epoch: 18 Global Step: 315100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:03:30,281-Speed 2904.32 samples/sec Loss 0.3674 Epoch: 18 Global Step: 315150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:03:47,651-Speed 2947.79 samples/sec Loss 0.3719 Epoch: 18 Global Step: 315200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:04:04,110-Speed 3110.82 samples/sec Loss 0.3663 Epoch: 18 Global Step: 315250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:04:20,482-Speed 3127.36 samples/sec Loss 0.3688 Epoch: 18 Global Step: 315300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:04:36,873-Speed 3123.72 samples/sec Loss 0.3655 Epoch: 18 Global Step: 315350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:04:53,336-Speed 3110.05 samples/sec Loss 0.3653 Epoch: 18 Global Step: 315400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:05:09,746-Speed 3120.23 samples/sec Loss 0.3745 Epoch: 18 Global Step: 315450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:05:25,973-Speed 3155.30 samples/sec Loss 0.3691 Epoch: 18 Global Step: 315500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:05:42,355-Speed 3125.52 samples/sec Loss 0.3658 Epoch: 18 Global Step: 315550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:05:58,805-Speed 3112.47 samples/sec Loss 0.3715 Epoch: 18 Global Step: 315600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:06:16,225-Speed 2939.29 samples/sec Loss 0.3666 Epoch: 18 Global Step: 315650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:06:33,051-Speed 3042.89 samples/sec Loss 0.3701 Epoch: 18 Global Step: 315700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:06:50,396-Speed 2951.95 samples/sec Loss 0.3631 Epoch: 18 Global Step: 315750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:07:07,045-Speed 3075.45 samples/sec Loss 0.3664 Epoch: 18 Global Step: 315800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:07:23,231-Speed 3163.27 samples/sec Loss 0.3615 Epoch: 18 Global Step: 315850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:07:39,550-Speed 3137.59 samples/sec Loss 0.3628 Epoch: 18 Global Step: 315900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:07:55,818-Speed 3147.37 samples/sec Loss 0.3587 Epoch: 18 Global Step: 315950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:08:13,461-Speed 2902.06 samples/sec Loss 0.3732 Epoch: 18 Global Step: 316000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:09:06,751-[lfw][316000]XNorm: 21.426326 Training: 2021-03-17 12:09:06,751-[lfw][316000]Accuracy-Flip: 0.99767+-0.00260 Training: 2021-03-17 12:09:06,751-[lfw][316000]Accuracy-Highest: 0.99817 Training: 2021-03-17 12:10:08,540-[cfp_fp][316000]XNorm: 22.200001 Training: 2021-03-17 12:10:08,540-[cfp_fp][316000]Accuracy-Flip: 0.99229+-0.00547 Training: 2021-03-17 12:10:08,540-[cfp_fp][316000]Accuracy-Highest: 0.99286 Training: 2021-03-17 12:11:01,696-[agedb_30][316000]XNorm: 22.669203 Training: 2021-03-17 12:11:01,697-[agedb_30][316000]Accuracy-Flip: 0.98317+-0.00643 Training: 2021-03-17 12:11:01,697-[agedb_30][316000]Accuracy-Highest: 0.98483 Training: 2021-03-17 12:11:18,063-Speed 277.35 samples/sec Loss 0.3558 Epoch: 18 Global Step: 316050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:11:34,442-Speed 3126.08 samples/sec Loss 0.3745 Epoch: 18 Global Step: 316100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:11:51,044-Speed 3084.08 samples/sec Loss 0.3649 Epoch: 18 Global Step: 316150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:12:07,720-Speed 3070.34 samples/sec Loss 0.3744 Epoch: 18 Global Step: 316200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:12:24,472-Speed 3056.37 samples/sec Loss 0.3565 Epoch: 18 Global Step: 316250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:12:40,564-Speed 3181.91 samples/sec Loss 0.3641 Epoch: 18 Global Step: 316300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:12:57,433-Speed 3035.18 samples/sec Loss 0.3629 Epoch: 18 Global Step: 316350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:13:13,844-Speed 3120.02 samples/sec Loss 0.3653 Epoch: 18 Global Step: 316400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:13:30,308-Speed 3109.93 samples/sec Loss 0.3648 Epoch: 18 Global Step: 316450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:13:46,518-Speed 3158.65 samples/sec Loss 0.3647 Epoch: 18 Global Step: 316500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:14:02,907-Speed 3124.07 samples/sec Loss 0.3751 Epoch: 18 Global Step: 316550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:14:19,193-Speed 3143.95 samples/sec Loss 0.3655 Epoch: 18 Global Step: 316600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:14:35,520-Speed 3135.95 samples/sec Loss 0.3808 Epoch: 18 Global Step: 316650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:14:52,654-Speed 2988.35 samples/sec Loss 0.3613 Epoch: 18 Global Step: 316700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:15:09,004-Speed 3131.64 samples/sec Loss 0.3587 Epoch: 18 Global Step: 316750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:15:25,285-Speed 3144.76 samples/sec Loss 0.3617 Epoch: 18 Global Step: 316800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:15:41,768-Speed 3106.28 samples/sec Loss 0.3625 Epoch: 18 Global Step: 316850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:15:59,057-Speed 2961.64 samples/sec Loss 0.3769 Epoch: 18 Global Step: 316900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:16:15,545-Speed 3105.37 samples/sec Loss 0.3650 Epoch: 18 Global Step: 316950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:16:31,844-Speed 3141.41 samples/sec Loss 0.3581 Epoch: 18 Global Step: 317000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:16:48,469-Speed 3079.68 samples/sec Loss 0.3693 Epoch: 18 Global Step: 317050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:17:04,616-Speed 3171.11 samples/sec Loss 0.3634 Epoch: 18 Global Step: 317100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:17:35,785-Speed 1642.68 samples/sec Loss 0.3649 Epoch: 19 Global Step: 317150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:17:52,576-Speed 3049.31 samples/sec Loss 0.3691 Epoch: 19 Global Step: 317200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:18:09,435-Speed 3037.06 samples/sec Loss 0.3577 Epoch: 19 Global Step: 317250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:18:25,943-Speed 3101.74 samples/sec Loss 0.3557 Epoch: 19 Global Step: 317300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:18:42,517-Speed 3089.21 samples/sec Loss 0.3637 Epoch: 19 Global Step: 317350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:19:00,750-Speed 2808.21 samples/sec Loss 0.3603 Epoch: 19 Global Step: 317400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:19:17,369-Speed 3080.92 samples/sec Loss 0.3682 Epoch: 19 Global Step: 317450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:19:33,834-Speed 3109.61 samples/sec Loss 0.3654 Epoch: 19 Global Step: 317500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:19:50,275-Speed 3114.33 samples/sec Loss 0.3684 Epoch: 19 Global Step: 317550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:20:06,637-Speed 3129.23 samples/sec Loss 0.3752 Epoch: 19 Global Step: 317600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:20:23,354-Speed 3062.80 samples/sec Loss 0.3708 Epoch: 19 Global Step: 317650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:20:39,946-Speed 3086.05 samples/sec Loss 0.3550 Epoch: 19 Global Step: 317700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:20:56,461-Speed 3100.35 samples/sec Loss 0.3661 Epoch: 19 Global Step: 317750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:21:12,810-Speed 3131.61 samples/sec Loss 0.3677 Epoch: 19 Global Step: 317800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:21:30,703-Speed 2861.60 samples/sec Loss 0.3698 Epoch: 19 Global Step: 317850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:21:48,403-Speed 2892.72 samples/sec Loss 0.3642 Epoch: 19 Global Step: 317900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:22:05,062-Speed 3073.49 samples/sec Loss 0.3605 Epoch: 19 Global Step: 317950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:22:21,548-Speed 3105.94 samples/sec Loss 0.3696 Epoch: 19 Global Step: 318000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:23:14,869-[lfw][318000]XNorm: 21.526139 Training: 2021-03-17 12:23:14,870-[lfw][318000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 12:23:14,870-[lfw][318000]Accuracy-Highest: 0.99817 Training: 2021-03-17 12:24:16,605-[cfp_fp][318000]XNorm: 22.283637 Training: 2021-03-17 12:24:16,605-[cfp_fp][318000]Accuracy-Flip: 0.99200+-0.00500 Training: 2021-03-17 12:24:16,605-[cfp_fp][318000]Accuracy-Highest: 0.99286 Training: 2021-03-17 12:25:09,883-[agedb_30][318000]XNorm: 22.755627 Training: 2021-03-17 12:25:09,883-[agedb_30][318000]Accuracy-Flip: 0.98300+-0.00623 Training: 2021-03-17 12:25:09,884-[agedb_30][318000]Accuracy-Highest: 0.98483 Training: 2021-03-17 12:25:26,263-Speed 277.18 samples/sec Loss 0.3675 Epoch: 19 Global Step: 318050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:25:42,651-Speed 3124.34 samples/sec Loss 0.3647 Epoch: 19 Global Step: 318100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:25:59,312-Speed 3073.04 samples/sec Loss 0.3620 Epoch: 19 Global Step: 318150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:26:16,019-Speed 3064.64 samples/sec Loss 0.3562 Epoch: 19 Global Step: 318200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:26:33,617-Speed 2909.63 samples/sec Loss 0.3692 Epoch: 19 Global Step: 318250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:26:50,067-Speed 3112.59 samples/sec Loss 0.3659 Epoch: 19 Global Step: 318300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:27:06,369-Speed 3140.80 samples/sec Loss 0.3652 Epoch: 19 Global Step: 318350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:27:22,779-Speed 3120.10 samples/sec Loss 0.3614 Epoch: 19 Global Step: 318400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:27:39,151-Speed 3127.37 samples/sec Loss 0.3601 Epoch: 19 Global Step: 318450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:27:55,628-Speed 3107.36 samples/sec Loss 0.3626 Epoch: 19 Global Step: 318500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:28:11,926-Speed 3141.71 samples/sec Loss 0.3726 Epoch: 19 Global Step: 318550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:28:28,354-Speed 3116.75 samples/sec Loss 0.3615 Epoch: 19 Global Step: 318600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:28:44,913-Speed 3091.95 samples/sec Loss 0.3719 Epoch: 19 Global Step: 318650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:29:01,367-Speed 3111.76 samples/sec Loss 0.3675 Epoch: 19 Global Step: 318700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:29:17,828-Speed 3110.60 samples/sec Loss 0.3695 Epoch: 19 Global Step: 318750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:29:34,360-Speed 3097.03 samples/sec Loss 0.3628 Epoch: 19 Global Step: 318800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:29:51,512-Speed 2985.16 samples/sec Loss 0.3620 Epoch: 19 Global Step: 318850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:30:08,020-Speed 3101.69 samples/sec Loss 0.3686 Epoch: 19 Global Step: 318900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:30:24,468-Speed 3112.74 samples/sec Loss 0.3651 Epoch: 19 Global Step: 318950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:30:40,867-Speed 3122.41 samples/sec Loss 0.3610 Epoch: 19 Global Step: 319000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:30:57,340-Speed 3108.20 samples/sec Loss 0.3690 Epoch: 19 Global Step: 319050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:31:15,199-Speed 2866.94 samples/sec Loss 0.3702 Epoch: 19 Global Step: 319100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:31:31,631-Speed 3115.99 samples/sec Loss 0.3532 Epoch: 19 Global Step: 319150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:31:48,262-Speed 3078.63 samples/sec Loss 0.3595 Epoch: 19 Global Step: 319200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:32:04,703-Speed 3114.47 samples/sec Loss 0.3690 Epoch: 19 Global Step: 319250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:32:21,018-Speed 3138.27 samples/sec Loss 0.3709 Epoch: 19 Global Step: 319300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:32:38,447-Speed 2937.70 samples/sec Loss 0.3714 Epoch: 19 Global Step: 319350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:32:55,006-Speed 3092.03 samples/sec Loss 0.3648 Epoch: 19 Global Step: 319400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:33:11,428-Speed 3117.82 samples/sec Loss 0.3603 Epoch: 19 Global Step: 319450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:33:27,794-Speed 3128.48 samples/sec Loss 0.3691 Epoch: 19 Global Step: 319500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:33:44,229-Speed 3115.45 samples/sec Loss 0.3657 Epoch: 19 Global Step: 319550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:34:01,765-Speed 2919.84 samples/sec Loss 0.3543 Epoch: 19 Global Step: 319600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:34:18,046-Speed 3144.85 samples/sec Loss 0.3617 Epoch: 19 Global Step: 319650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:34:35,744-Speed 2892.97 samples/sec Loss 0.3640 Epoch: 19 Global Step: 319700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:34:52,248-Speed 3102.47 samples/sec Loss 0.3680 Epoch: 19 Global Step: 319750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:35:08,639-Speed 3123.61 samples/sec Loss 0.3596 Epoch: 19 Global Step: 319800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:35:25,086-Speed 3113.18 samples/sec Loss 0.3611 Epoch: 19 Global Step: 319850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:35:41,459-Speed 3127.25 samples/sec Loss 0.3668 Epoch: 19 Global Step: 319900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:35:57,731-Speed 3146.64 samples/sec Loss 0.3573 Epoch: 19 Global Step: 319950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:36:14,949-Speed 2973.62 samples/sec Loss 0.3723 Epoch: 19 Global Step: 320000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:37:08,175-[lfw][320000]XNorm: 21.482503 Training: 2021-03-17 12:37:08,175-[lfw][320000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 12:37:08,175-[lfw][320000]Accuracy-Highest: 0.99817 Training: 2021-03-17 12:38:10,221-[cfp_fp][320000]XNorm: 22.242099 Training: 2021-03-17 12:38:10,222-[cfp_fp][320000]Accuracy-Flip: 0.99271+-0.00449 Training: 2021-03-17 12:38:10,222-[cfp_fp][320000]Accuracy-Highest: 0.99286 Training: 2021-03-17 12:39:03,754-[agedb_30][320000]XNorm: 22.679101 Training: 2021-03-17 12:39:03,754-[agedb_30][320000]Accuracy-Flip: 0.98400+-0.00597 Training: 2021-03-17 12:39:03,754-[agedb_30][320000]Accuracy-Highest: 0.98483 Training: 2021-03-17 12:39:20,303-Speed 276.23 samples/sec Loss 0.3596 Epoch: 19 Global Step: 320050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:39:36,811-Speed 3101.68 samples/sec Loss 0.3629 Epoch: 19 Global Step: 320100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:39:54,205-Speed 2943.62 samples/sec Loss 0.3553 Epoch: 19 Global Step: 320150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:40:10,841-Speed 3077.69 samples/sec Loss 0.3781 Epoch: 19 Global Step: 320200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:40:27,357-Speed 3100.14 samples/sec Loss 0.3709 Epoch: 19 Global Step: 320250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:40:43,691-Speed 3134.60 samples/sec Loss 0.3623 Epoch: 19 Global Step: 320300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:41:00,106-Speed 3119.31 samples/sec Loss 0.3547 Epoch: 19 Global Step: 320350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:41:16,527-Speed 3118.01 samples/sec Loss 0.3611 Epoch: 19 Global Step: 320400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:41:32,755-Speed 3155.08 samples/sec Loss 0.3592 Epoch: 19 Global Step: 320450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:41:49,837-Speed 2997.34 samples/sec Loss 0.3617 Epoch: 19 Global Step: 320500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:42:06,974-Speed 2987.82 samples/sec Loss 0.3616 Epoch: 19 Global Step: 320550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:42:23,500-Speed 3098.28 samples/sec Loss 0.3564 Epoch: 19 Global Step: 320600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:42:39,912-Speed 3119.70 samples/sec Loss 0.3606 Epoch: 19 Global Step: 320650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-17 12:42:56,066-Speed 3169.64 samples/sec Loss 0.3591 Epoch: 19 Global Step: 320700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:43:12,786-Speed 3062.24 samples/sec Loss 0.3613 Epoch: 19 Global Step: 320750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:43:29,289-Speed 3102.67 samples/sec Loss 0.3620 Epoch: 19 Global Step: 320800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:43:45,839-Speed 3093.74 samples/sec Loss 0.3554 Epoch: 19 Global Step: 320850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:44:02,313-Speed 3107.99 samples/sec Loss 0.3662 Epoch: 19 Global Step: 320900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:44:18,907-Speed 3085.60 samples/sec Loss 0.3704 Epoch: 19 Global Step: 320950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:44:35,365-Speed 3110.88 samples/sec Loss 0.3635 Epoch: 19 Global Step: 321000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:44:52,822-Speed 2933.07 samples/sec Loss 0.3648 Epoch: 19 Global Step: 321050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:45:09,476-Speed 3074.39 samples/sec Loss 0.3596 Epoch: 19 Global Step: 321100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:45:25,805-Speed 3135.68 samples/sec Loss 0.3605 Epoch: 19 Global Step: 321150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:45:42,147-Speed 3133.14 samples/sec Loss 0.3585 Epoch: 19 Global Step: 321200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:45:59,958-Speed 2874.76 samples/sec Loss 0.3643 Epoch: 19 Global Step: 321250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:46:16,498-Speed 3095.55 samples/sec Loss 0.3660 Epoch: 19 Global Step: 321300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:46:33,119-Speed 3080.61 samples/sec Loss 0.3534 Epoch: 19 Global Step: 321350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:46:49,789-Speed 3071.37 samples/sec Loss 0.3618 Epoch: 19 Global Step: 321400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:47:06,347-Speed 3092.25 samples/sec Loss 0.3618 Epoch: 19 Global Step: 321450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:47:22,699-Speed 3131.31 samples/sec Loss 0.3626 Epoch: 19 Global Step: 321500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:47:39,899-Speed 2976.86 samples/sec Loss 0.3588 Epoch: 19 Global Step: 321550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:47:56,432-Speed 3096.95 samples/sec Loss 0.3625 Epoch: 19 Global Step: 321600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:48:12,719-Speed 3143.69 samples/sec Loss 0.3544 Epoch: 19 Global Step: 321650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:48:29,053-Speed 3134.58 samples/sec Loss 0.3609 Epoch: 19 Global Step: 321700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:48:45,458-Speed 3121.18 samples/sec Loss 0.3565 Epoch: 19 Global Step: 321750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:49:02,071-Speed 3081.99 samples/sec Loss 0.3638 Epoch: 19 Global Step: 321800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:49:18,739-Speed 3071.90 samples/sec Loss 0.3678 Epoch: 19 Global Step: 321850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:49:36,755-Speed 2841.91 samples/sec Loss 0.3678 Epoch: 19 Global Step: 321900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:49:53,185-Speed 3116.43 samples/sec Loss 0.3646 Epoch: 19 Global Step: 321950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:50:09,853-Speed 3071.84 samples/sec Loss 0.3584 Epoch: 19 Global Step: 322000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:51:03,355-[lfw][322000]XNorm: 21.462009 Training: 2021-03-17 12:51:03,355-[lfw][322000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 12:51:03,355-[lfw][322000]Accuracy-Highest: 0.99817 Training: 2021-03-17 12:52:05,569-[cfp_fp][322000]XNorm: 22.219722 Training: 2021-03-17 12:52:05,570-[cfp_fp][322000]Accuracy-Flip: 0.99243+-0.00465 Training: 2021-03-17 12:52:05,570-[cfp_fp][322000]Accuracy-Highest: 0.99286 Training: 2021-03-17 12:52:59,114-[agedb_30][322000]XNorm: 22.691015 Training: 2021-03-17 12:52:59,114-[agedb_30][322000]Accuracy-Flip: 0.98350+-0.00617 Training: 2021-03-17 12:52:59,114-[agedb_30][322000]Accuracy-Highest: 0.98483 Training: 2021-03-17 12:53:15,506-Speed 275.78 samples/sec Loss 0.3548 Epoch: 19 Global Step: 322050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:53:31,791-Speed 3144.12 samples/sec Loss 0.3694 Epoch: 19 Global Step: 322100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:53:48,276-Speed 3105.94 samples/sec Loss 0.3599 Epoch: 19 Global Step: 322150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:54:04,939-Speed 3072.80 samples/sec Loss 0.3603 Epoch: 19 Global Step: 322200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:54:22,201-Speed 2966.22 samples/sec Loss 0.3672 Epoch: 19 Global Step: 322250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:54:38,602-Speed 3121.93 samples/sec Loss 0.3752 Epoch: 19 Global Step: 322300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:54:56,213-Speed 2907.30 samples/sec Loss 0.3620 Epoch: 19 Global Step: 322350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:55:12,713-Speed 3103.18 samples/sec Loss 0.3616 Epoch: 19 Global Step: 322400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:55:29,460-Speed 3057.40 samples/sec Loss 0.3644 Epoch: 19 Global Step: 322450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:55:46,006-Speed 3094.48 samples/sec Loss 0.3615 Epoch: 19 Global Step: 322500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:56:02,383-Speed 3126.48 samples/sec Loss 0.3641 Epoch: 19 Global Step: 322550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:56:18,832-Speed 3112.66 samples/sec Loss 0.3574 Epoch: 19 Global Step: 322600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:56:35,329-Speed 3103.77 samples/sec Loss 0.3686 Epoch: 19 Global Step: 322650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:56:51,756-Speed 3116.89 samples/sec Loss 0.3587 Epoch: 19 Global Step: 322700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:57:08,235-Speed 3107.05 samples/sec Loss 0.3648 Epoch: 19 Global Step: 322750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:57:24,496-Speed 3148.73 samples/sec Loss 0.3736 Epoch: 19 Global Step: 322800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:57:41,737-Speed 2969.71 samples/sec Loss 0.3665 Epoch: 19 Global Step: 322850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:57:58,204-Speed 3109.42 samples/sec Loss 0.3599 Epoch: 19 Global Step: 322900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:58:14,682-Speed 3107.27 samples/sec Loss 0.3555 Epoch: 19 Global Step: 322950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:58:31,196-Speed 3100.38 samples/sec Loss 0.3713 Epoch: 19 Global Step: 323000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:58:47,919-Speed 3061.91 samples/sec Loss 0.3582 Epoch: 19 Global Step: 323050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:59:04,296-Speed 3126.41 samples/sec Loss 0.3568 Epoch: 19 Global Step: 323100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:59:20,685-Speed 3124.17 samples/sec Loss 0.3710 Epoch: 19 Global Step: 323150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:59:37,190-Speed 3102.11 samples/sec Loss 0.3693 Epoch: 19 Global Step: 323200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 12:59:54,538-Speed 2951.45 samples/sec Loss 0.3631 Epoch: 19 Global Step: 323250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:00:11,240-Speed 3065.53 samples/sec Loss 0.3648 Epoch: 19 Global Step: 323300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:00:27,762-Speed 3099.03 samples/sec Loss 0.3662 Epoch: 19 Global Step: 323350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:00:44,667-Speed 3028.72 samples/sec Loss 0.3682 Epoch: 19 Global Step: 323400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:01:02,341-Speed 2897.02 samples/sec Loss 0.3571 Epoch: 19 Global Step: 323450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:01:18,924-Speed 3087.59 samples/sec Loss 0.3685 Epoch: 19 Global Step: 323500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:01:35,580-Speed 3074.03 samples/sec Loss 0.3580 Epoch: 19 Global Step: 323550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:01:52,017-Speed 3115.12 samples/sec Loss 0.3548 Epoch: 19 Global Step: 323600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:02:08,369-Speed 3131.18 samples/sec Loss 0.3695 Epoch: 19 Global Step: 323650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:02:25,579-Speed 2975.01 samples/sec Loss 0.3620 Epoch: 19 Global Step: 323700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:02:41,967-Speed 3124.35 samples/sec Loss 0.3607 Epoch: 19 Global Step: 323750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:02:58,506-Speed 3095.77 samples/sec Loss 0.3554 Epoch: 19 Global Step: 323800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:03:15,096-Speed 3086.40 samples/sec Loss 0.3590 Epoch: 19 Global Step: 323850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:03:31,888-Speed 3049.01 samples/sec Loss 0.3607 Epoch: 19 Global Step: 323900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:03:48,272-Speed 3125.26 samples/sec Loss 0.3629 Epoch: 19 Global Step: 323950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:04:04,713-Speed 3114.25 samples/sec Loss 0.3561 Epoch: 19 Global Step: 324000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:04:58,067-[lfw][324000]XNorm: 21.470476 Training: 2021-03-17 13:04:58,067-[lfw][324000]Accuracy-Flip: 0.99750+-0.00250 Training: 2021-03-17 13:04:58,067-[lfw][324000]Accuracy-Highest: 0.99817 Training: 2021-03-17 13:06:00,268-[cfp_fp][324000]XNorm: 22.248879 Training: 2021-03-17 13:06:00,269-[cfp_fp][324000]Accuracy-Flip: 0.99257+-0.00506 Training: 2021-03-17 13:06:00,269-[cfp_fp][324000]Accuracy-Highest: 0.99286 Training: 2021-03-17 13:06:53,859-[agedb_30][324000]XNorm: 22.683525 Training: 2021-03-17 13:06:53,859-[agedb_30][324000]Accuracy-Flip: 0.98383+-0.00615 Training: 2021-03-17 13:06:53,859-[agedb_30][324000]Accuracy-Highest: 0.98483 Training: 2021-03-17 13:07:10,151-Speed 276.10 samples/sec Loss 0.3599 Epoch: 19 Global Step: 324050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:07:26,374-Speed 3156.13 samples/sec Loss 0.3676 Epoch: 19 Global Step: 324100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:07:44,524-Speed 2820.98 samples/sec Loss 0.3620 Epoch: 19 Global Step: 324150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:08:00,866-Speed 3133.06 samples/sec Loss 0.3701 Epoch: 19 Global Step: 324200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:08:17,181-Speed 3138.47 samples/sec Loss 0.3623 Epoch: 19 Global Step: 324250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:08:33,481-Speed 3141.06 samples/sec Loss 0.3605 Epoch: 19 Global Step: 324300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:08:50,009-Speed 3097.88 samples/sec Loss 0.3699 Epoch: 19 Global Step: 324350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:09:06,696-Speed 3068.29 samples/sec Loss 0.3577 Epoch: 19 Global Step: 324400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:09:24,265-Speed 2914.36 samples/sec Loss 0.3628 Epoch: 19 Global Step: 324450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:09:40,756-Speed 3104.85 samples/sec Loss 0.3654 Epoch: 19 Global Step: 324500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:09:57,469-Speed 3063.53 samples/sec Loss 0.3565 Epoch: 19 Global Step: 324550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:10:14,645-Speed 2981.09 samples/sec Loss 0.3632 Epoch: 19 Global Step: 324600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:10:31,010-Speed 3128.67 samples/sec Loss 0.3728 Epoch: 19 Global Step: 324650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:10:47,374-Speed 3128.94 samples/sec Loss 0.3704 Epoch: 19 Global Step: 324700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:11:03,830-Speed 3111.37 samples/sec Loss 0.3647 Epoch: 19 Global Step: 324750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:11:20,085-Speed 3150.01 samples/sec Loss 0.3664 Epoch: 19 Global Step: 324800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:11:36,830-Speed 3057.71 samples/sec Loss 0.3689 Epoch: 19 Global Step: 324850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:11:53,321-Speed 3104.87 samples/sec Loss 0.3633 Epoch: 19 Global Step: 324900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:12:09,986-Speed 3072.27 samples/sec Loss 0.3761 Epoch: 19 Global Step: 324950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:12:26,314-Speed 3135.84 samples/sec Loss 0.3689 Epoch: 19 Global Step: 325000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:12:42,686-Speed 3127.36 samples/sec Loss 0.3631 Epoch: 19 Global Step: 325050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:12:59,274-Speed 3086.81 samples/sec Loss 0.3642 Epoch: 19 Global Step: 325100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:13:16,509-Speed 2970.72 samples/sec Loss 0.3744 Epoch: 19 Global Step: 325150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:13:33,340-Speed 3042.09 samples/sec Loss 0.3614 Epoch: 19 Global Step: 325200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:13:49,635-Speed 3142.19 samples/sec Loss 0.3653 Epoch: 19 Global Step: 325250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:14:06,251-Speed 3081.48 samples/sec Loss 0.3663 Epoch: 19 Global Step: 325300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:14:22,674-Speed 3117.76 samples/sec Loss 0.3615 Epoch: 19 Global Step: 325350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:14:39,665-Speed 3013.30 samples/sec Loss 0.3699 Epoch: 19 Global Step: 325400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:14:57,109-Speed 2935.22 samples/sec Loss 0.3544 Epoch: 19 Global Step: 325450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:15:13,478-Speed 3128.00 samples/sec Loss 0.3677 Epoch: 19 Global Step: 325500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:15:29,836-Speed 3130.04 samples/sec Loss 0.3619 Epoch: 19 Global Step: 325550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:15:46,328-Speed 3104.68 samples/sec Loss 0.3694 Epoch: 19 Global Step: 325600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:16:03,636-Speed 2958.37 samples/sec Loss 0.3682 Epoch: 19 Global Step: 325650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:16:20,184-Speed 3094.10 samples/sec Loss 0.3709 Epoch: 19 Global Step: 325700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:16:36,521-Speed 3134.11 samples/sec Loss 0.3673 Epoch: 19 Global Step: 325750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:16:52,793-Speed 3146.65 samples/sec Loss 0.3697 Epoch: 19 Global Step: 325800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:17:09,326-Speed 3096.93 samples/sec Loss 0.3711 Epoch: 19 Global Step: 325850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:17:26,916-Speed 2910.72 samples/sec Loss 0.3654 Epoch: 19 Global Step: 325900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:17:43,506-Speed 3086.35 samples/sec Loss 0.3608 Epoch: 19 Global Step: 325950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:18:00,006-Speed 3103.03 samples/sec Loss 0.3573 Epoch: 19 Global Step: 326000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:18:53,395-[lfw][326000]XNorm: 21.490808 Training: 2021-03-17 13:18:53,396-[lfw][326000]Accuracy-Flip: 0.99750+-0.00250 Training: 2021-03-17 13:18:53,396-[lfw][326000]Accuracy-Highest: 0.99817 Training: 2021-03-17 13:19:55,386-[cfp_fp][326000]XNorm: 22.245787 Training: 2021-03-17 13:19:55,387-[cfp_fp][326000]Accuracy-Flip: 0.99229+-0.00492 Training: 2021-03-17 13:19:55,387-[cfp_fp][326000]Accuracy-Highest: 0.99286 Training: 2021-03-17 13:20:48,627-[agedb_30][326000]XNorm: 22.704488 Training: 2021-03-17 13:20:48,627-[agedb_30][326000]Accuracy-Flip: 0.98383+-0.00578 Training: 2021-03-17 13:20:48,627-[agedb_30][326000]Accuracy-Highest: 0.98483 Training: 2021-03-17 13:21:04,891-Speed 276.93 samples/sec Loss 0.3767 Epoch: 19 Global Step: 326050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:21:21,407-Speed 3100.16 samples/sec Loss 0.3721 Epoch: 19 Global Step: 326100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:21:37,767-Speed 3129.74 samples/sec Loss 0.3573 Epoch: 19 Global Step: 326150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:21:54,191-Speed 3117.52 samples/sec Loss 0.3637 Epoch: 19 Global Step: 326200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:22:10,657-Speed 3109.41 samples/sec Loss 0.3662 Epoch: 19 Global Step: 326250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:22:27,244-Speed 3086.84 samples/sec Loss 0.3638 Epoch: 19 Global Step: 326300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:22:44,565-Speed 2956.14 samples/sec Loss 0.3652 Epoch: 19 Global Step: 326350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:23:00,924-Speed 3129.77 samples/sec Loss 0.3609 Epoch: 19 Global Step: 326400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:23:18,633-Speed 2891.27 samples/sec Loss 0.3696 Epoch: 19 Global Step: 326450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:23:34,833-Speed 3160.66 samples/sec Loss 0.3501 Epoch: 19 Global Step: 326500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:23:51,323-Speed 3104.99 samples/sec Loss 0.3609 Epoch: 19 Global Step: 326550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:24:07,651-Speed 3135.81 samples/sec Loss 0.3654 Epoch: 19 Global Step: 326600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:24:24,872-Speed 2973.19 samples/sec Loss 0.3628 Epoch: 19 Global Step: 326650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:24:41,705-Speed 3041.77 samples/sec Loss 0.3570 Epoch: 19 Global Step: 326700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:24:58,064-Speed 3129.82 samples/sec Loss 0.3600 Epoch: 19 Global Step: 326750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:25:14,500-Speed 3115.22 samples/sec Loss 0.3649 Epoch: 19 Global Step: 326800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:25:30,854-Speed 3130.80 samples/sec Loss 0.3602 Epoch: 19 Global Step: 326850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:25:48,247-Speed 2943.77 samples/sec Loss 0.3695 Epoch: 19 Global Step: 326900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:26:04,945-Speed 3066.39 samples/sec Loss 0.3639 Epoch: 19 Global Step: 326950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:26:22,015-Speed 2999.54 samples/sec Loss 0.3658 Epoch: 19 Global Step: 327000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:26:38,701-Speed 3068.52 samples/sec Loss 0.3631 Epoch: 19 Global Step: 327050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:26:55,316-Speed 3081.71 samples/sec Loss 0.3517 Epoch: 19 Global Step: 327100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:27:11,677-Speed 3129.42 samples/sec Loss 0.3744 Epoch: 19 Global Step: 327150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:27:28,110-Speed 3115.75 samples/sec Loss 0.3803 Epoch: 19 Global Step: 327200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:27:44,800-Speed 3067.80 samples/sec Loss 0.3685 Epoch: 19 Global Step: 327250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:28:01,137-Speed 3134.20 samples/sec Loss 0.3687 Epoch: 19 Global Step: 327300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:28:17,492-Speed 3130.57 samples/sec Loss 0.3649 Epoch: 19 Global Step: 327350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:28:34,995-Speed 2925.21 samples/sec Loss 0.3567 Epoch: 19 Global Step: 327400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:28:51,433-Speed 3114.94 samples/sec Loss 0.3696 Epoch: 19 Global Step: 327450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:29:07,992-Speed 3092.02 samples/sec Loss 0.3654 Epoch: 19 Global Step: 327500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:29:24,649-Speed 3073.78 samples/sec Loss 0.3771 Epoch: 19 Global Step: 327550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:29:41,111-Speed 3110.37 samples/sec Loss 0.3654 Epoch: 19 Global Step: 327600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:29:57,788-Speed 3070.21 samples/sec Loss 0.3596 Epoch: 19 Global Step: 327650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:30:14,162-Speed 3126.90 samples/sec Loss 0.3618 Epoch: 19 Global Step: 327700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:30:31,565-Speed 2942.19 samples/sec Loss 0.3710 Epoch: 19 Global Step: 327750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:30:47,879-Speed 3138.47 samples/sec Loss 0.3666 Epoch: 19 Global Step: 327800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:31:04,504-Speed 3079.80 samples/sec Loss 0.3613 Epoch: 19 Global Step: 327850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:31:22,031-Speed 2921.33 samples/sec Loss 0.3660 Epoch: 19 Global Step: 327900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:31:38,556-Speed 3098.45 samples/sec Loss 0.3596 Epoch: 19 Global Step: 327950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:31:55,101-Speed 3094.62 samples/sec Loss 0.3688 Epoch: 19 Global Step: 328000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:32:49,042-[lfw][328000]XNorm: 21.662490 Training: 2021-03-17 13:32:49,043-[lfw][328000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 13:32:49,043-[lfw][328000]Accuracy-Highest: 0.99817 Training: 2021-03-17 13:33:51,069-[cfp_fp][328000]XNorm: 22.360862 Training: 2021-03-17 13:33:51,070-[cfp_fp][328000]Accuracy-Flip: 0.99229+-0.00500 Training: 2021-03-17 13:33:51,070-[cfp_fp][328000]Accuracy-Highest: 0.99286 Training: 2021-03-17 13:34:44,209-[agedb_30][328000]XNorm: 22.861406 Training: 2021-03-17 13:34:44,209-[agedb_30][328000]Accuracy-Flip: 0.98433+-0.00638 Training: 2021-03-17 13:34:44,209-[agedb_30][328000]Accuracy-Highest: 0.98483 Training: 2021-03-17 13:35:01,689-Speed 274.40 samples/sec Loss 0.3612 Epoch: 19 Global Step: 328050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:35:17,990-Speed 3140.95 samples/sec Loss 0.3616 Epoch: 19 Global Step: 328100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:35:34,366-Speed 3126.80 samples/sec Loss 0.3630 Epoch: 19 Global Step: 328150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:35:50,619-Speed 3150.19 samples/sec Loss 0.3584 Epoch: 19 Global Step: 328200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:36:06,959-Speed 3133.47 samples/sec Loss 0.3522 Epoch: 19 Global Step: 328250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:36:23,379-Speed 3118.20 samples/sec Loss 0.3588 Epoch: 19 Global Step: 328300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:36:39,696-Speed 3138.05 samples/sec Loss 0.3753 Epoch: 19 Global Step: 328350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:36:56,247-Speed 3093.46 samples/sec Loss 0.3710 Epoch: 19 Global Step: 328400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:37:12,757-Speed 3101.32 samples/sec Loss 0.3574 Epoch: 19 Global Step: 328450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:37:29,499-Speed 3058.19 samples/sec Loss 0.3531 Epoch: 19 Global Step: 328500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:37:45,755-Speed 3149.85 samples/sec Loss 0.3651 Epoch: 19 Global Step: 328550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:38:02,177-Speed 3117.90 samples/sec Loss 0.3703 Epoch: 19 Global Step: 328600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:38:20,486-Speed 2796.40 samples/sec Loss 0.3594 Epoch: 19 Global Step: 328650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:38:37,024-Speed 3096.07 samples/sec Loss 0.3581 Epoch: 19 Global Step: 328700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:38:53,363-Speed 3133.60 samples/sec Loss 0.3661 Epoch: 19 Global Step: 328750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:39:10,528-Speed 2983.03 samples/sec Loss 0.3629 Epoch: 19 Global Step: 328800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:39:26,917-Speed 3124.13 samples/sec Loss 0.3594 Epoch: 19 Global Step: 328850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:39:43,430-Speed 3100.61 samples/sec Loss 0.3656 Epoch: 19 Global Step: 328900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:40:00,008-Speed 3088.51 samples/sec Loss 0.3594 Epoch: 19 Global Step: 328950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:40:17,077-Speed 2999.67 samples/sec Loss 0.3706 Epoch: 19 Global Step: 329000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:40:34,310-Speed 2971.21 samples/sec Loss 0.3621 Epoch: 19 Global Step: 329050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:40:50,896-Speed 3086.88 samples/sec Loss 0.3655 Epoch: 19 Global Step: 329100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:41:07,416-Speed 3099.54 samples/sec Loss 0.3665 Epoch: 19 Global Step: 329150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:41:23,858-Speed 3113.97 samples/sec Loss 0.3593 Epoch: 19 Global Step: 329200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:41:40,210-Speed 3131.18 samples/sec Loss 0.3661 Epoch: 19 Global Step: 329250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:41:56,544-Speed 3134.62 samples/sec Loss 0.3657 Epoch: 19 Global Step: 329300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:42:12,867-Speed 3136.87 samples/sec Loss 0.3656 Epoch: 19 Global Step: 329350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:42:29,118-Speed 3150.66 samples/sec Loss 0.3668 Epoch: 19 Global Step: 329400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-17 13:42:45,834-Speed 3063.07 samples/sec Loss 0.3669 Epoch: 19 Global Step: 329450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:43:02,448-Speed 3081.84 samples/sec Loss 0.3686 Epoch: 19 Global Step: 329500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:43:18,883-Speed 3115.36 samples/sec Loss 0.3701 Epoch: 19 Global Step: 329550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:43:35,993-Speed 2992.48 samples/sec Loss 0.3661 Epoch: 19 Global Step: 329600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:43:52,530-Speed 3096.15 samples/sec Loss 0.3601 Epoch: 19 Global Step: 329650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:44:09,016-Speed 3105.85 samples/sec Loss 0.3555 Epoch: 19 Global Step: 329700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:44:25,315-Speed 3141.36 samples/sec Loss 0.3678 Epoch: 19 Global Step: 329750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:44:41,704-Speed 3124.26 samples/sec Loss 0.3632 Epoch: 19 Global Step: 329800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:44:58,026-Speed 3136.98 samples/sec Loss 0.3612 Epoch: 19 Global Step: 329850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:45:14,375-Speed 3131.73 samples/sec Loss 0.3702 Epoch: 19 Global Step: 329900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:45:30,958-Speed 3087.55 samples/sec Loss 0.3607 Epoch: 19 Global Step: 329950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:45:48,248-Speed 2961.25 samples/sec Loss 0.3644 Epoch: 19 Global Step: 330000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:46:41,588-[lfw][330000]XNorm: 21.593398 Training: 2021-03-17 13:46:41,589-[lfw][330000]Accuracy-Flip: 0.99800+-0.00267 Training: 2021-03-17 13:46:41,589-[lfw][330000]Accuracy-Highest: 0.99817 Training: 2021-03-17 13:47:43,629-[cfp_fp][330000]XNorm: 22.318690 Training: 2021-03-17 13:47:43,630-[cfp_fp][330000]Accuracy-Flip: 0.99229+-0.00470 Training: 2021-03-17 13:47:43,630-[cfp_fp][330000]Accuracy-Highest: 0.99286 Training: 2021-03-17 13:48:36,794-[agedb_30][330000]XNorm: 22.806306 Training: 2021-03-17 13:48:36,795-[agedb_30][330000]Accuracy-Flip: 0.98450+-0.00646 Training: 2021-03-17 13:48:36,795-[agedb_30][330000]Accuracy-Highest: 0.98483 Training: 2021-03-17 13:48:54,094-Speed 275.50 samples/sec Loss 0.3658 Epoch: 19 Global Step: 330050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:49:10,515-Speed 3118.24 samples/sec Loss 0.3588 Epoch: 19 Global Step: 330100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:49:26,808-Speed 3142.41 samples/sec Loss 0.3689 Epoch: 19 Global Step: 330150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:49:43,301-Speed 3104.48 samples/sec Loss 0.3588 Epoch: 19 Global Step: 330200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:49:59,653-Speed 3131.29 samples/sec Loss 0.3674 Epoch: 19 Global Step: 330250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:50:16,936-Speed 2962.52 samples/sec Loss 0.3591 Epoch: 19 Global Step: 330300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:50:33,368-Speed 3115.92 samples/sec Loss 0.3673 Epoch: 19 Global Step: 330350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:50:49,550-Speed 3164.10 samples/sec Loss 0.3556 Epoch: 19 Global Step: 330400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:51:06,079-Speed 3097.71 samples/sec Loss 0.3706 Epoch: 19 Global Step: 330450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:51:22,736-Speed 3073.85 samples/sec Loss 0.3585 Epoch: 19 Global Step: 330500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:51:39,049-Speed 3138.79 samples/sec Loss 0.3686 Epoch: 19 Global Step: 330550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:51:55,720-Speed 3071.28 samples/sec Loss 0.3625 Epoch: 19 Global Step: 330600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:52:12,170-Speed 3112.52 samples/sec Loss 0.3590 Epoch: 19 Global Step: 330650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:52:28,636-Speed 3109.49 samples/sec Loss 0.3626 Epoch: 19 Global Step: 330700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:52:45,079-Speed 3113.87 samples/sec Loss 0.3723 Epoch: 19 Global Step: 330750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:53:01,745-Speed 3072.21 samples/sec Loss 0.3591 Epoch: 19 Global Step: 330800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:53:18,174-Speed 3116.55 samples/sec Loss 0.3633 Epoch: 19 Global Step: 330850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:53:35,253-Speed 2997.86 samples/sec Loss 0.3612 Epoch: 19 Global Step: 330900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:53:51,546-Speed 3142.70 samples/sec Loss 0.3760 Epoch: 19 Global Step: 330950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:54:09,103-Speed 2916.33 samples/sec Loss 0.3725 Epoch: 19 Global Step: 331000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:54:25,725-Speed 3080.25 samples/sec Loss 0.3631 Epoch: 19 Global Step: 331050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:54:43,802-Speed 2832.51 samples/sec Loss 0.3673 Epoch: 19 Global Step: 331100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:55:00,339-Speed 3096.06 samples/sec Loss 0.3603 Epoch: 19 Global Step: 331150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:55:16,867-Speed 3097.90 samples/sec Loss 0.3598 Epoch: 19 Global Step: 331200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:55:33,371-Speed 3102.44 samples/sec Loss 0.3633 Epoch: 19 Global Step: 331250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:55:49,863-Speed 3104.52 samples/sec Loss 0.3625 Epoch: 19 Global Step: 331300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:56:06,381-Speed 3099.71 samples/sec Loss 0.3675 Epoch: 19 Global Step: 331350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:56:23,793-Speed 2940.62 samples/sec Loss 0.3676 Epoch: 19 Global Step: 331400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:56:40,198-Speed 3121.10 samples/sec Loss 0.3679 Epoch: 19 Global Step: 331450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:56:56,764-Speed 3090.82 samples/sec Loss 0.3714 Epoch: 19 Global Step: 331500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:57:13,504-Speed 3058.66 samples/sec Loss 0.3568 Epoch: 19 Global Step: 331550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:57:29,875-Speed 3127.45 samples/sec Loss 0.3665 Epoch: 19 Global Step: 331600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:57:46,487-Speed 3082.30 samples/sec Loss 0.3532 Epoch: 19 Global Step: 331650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:58:02,975-Speed 3105.44 samples/sec Loss 0.3721 Epoch: 19 Global Step: 331700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:58:19,454-Speed 3107.04 samples/sec Loss 0.3644 Epoch: 19 Global Step: 331750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:58:35,796-Speed 3133.11 samples/sec Loss 0.3698 Epoch: 19 Global Step: 331800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:58:53,133-Speed 2953.26 samples/sec Loss 0.3623 Epoch: 19 Global Step: 331850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:59:09,522-Speed 3124.20 samples/sec Loss 0.3674 Epoch: 19 Global Step: 331900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:59:25,973-Speed 3112.41 samples/sec Loss 0.3643 Epoch: 19 Global Step: 331950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 13:59:42,429-Speed 3111.26 samples/sec Loss 0.3553 Epoch: 19 Global Step: 332000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:00:36,742-[lfw][332000]XNorm: 21.468532 Training: 2021-03-17 14:00:36,743-[lfw][332000]Accuracy-Flip: 0.99783+-0.00259 Training: 2021-03-17 14:00:36,743-[lfw][332000]Accuracy-Highest: 0.99817 Training: 2021-03-17 14:01:39,172-[cfp_fp][332000]XNorm: 22.219699 Training: 2021-03-17 14:01:39,172-[cfp_fp][332000]Accuracy-Flip: 0.99271+-0.00480 Training: 2021-03-17 14:01:39,172-[cfp_fp][332000]Accuracy-Highest: 0.99286 Training: 2021-03-17 14:02:32,601-[agedb_30][332000]XNorm: 22.707016 Training: 2021-03-17 14:02:32,601-[agedb_30][332000]Accuracy-Flip: 0.98433+-0.00638 Training: 2021-03-17 14:02:32,601-[agedb_30][332000]Accuracy-Highest: 0.98483 Training: 2021-03-17 14:02:49,244-Speed 274.07 samples/sec Loss 0.3605 Epoch: 19 Global Step: 332050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:03:05,751-Speed 3101.78 samples/sec Loss 0.3616 Epoch: 19 Global Step: 332100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:03:23,269-Speed 2922.79 samples/sec Loss 0.3619 Epoch: 19 Global Step: 332150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:03:39,605-Speed 3134.39 samples/sec Loss 0.3700 Epoch: 19 Global Step: 332200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:03:56,813-Speed 2975.48 samples/sec Loss 0.3657 Epoch: 19 Global Step: 332250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:04:13,214-Speed 3121.79 samples/sec Loss 0.3614 Epoch: 19 Global Step: 332300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:04:29,791-Speed 3088.77 samples/sec Loss 0.3642 Epoch: 19 Global Step: 332350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:04:46,130-Speed 3133.64 samples/sec Loss 0.3676 Epoch: 19 Global Step: 332400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:05:02,442-Speed 3138.90 samples/sec Loss 0.3675 Epoch: 19 Global Step: 332450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:05:19,657-Speed 2974.33 samples/sec Loss 0.3605 Epoch: 19 Global Step: 332500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:05:36,095-Speed 3114.72 samples/sec Loss 0.3605 Epoch: 19 Global Step: 332550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:05:53,135-Speed 3004.79 samples/sec Loss 0.3616 Epoch: 19 Global Step: 332600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:06:09,601-Speed 3109.53 samples/sec Loss 0.3606 Epoch: 19 Global Step: 332650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:06:26,546-Speed 3021.76 samples/sec Loss 0.3690 Epoch: 19 Global Step: 332700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:06:42,833-Speed 3143.60 samples/sec Loss 0.3596 Epoch: 19 Global Step: 332750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:06:59,261-Speed 3116.71 samples/sec Loss 0.3669 Epoch: 19 Global Step: 332800 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:07:16,039-Speed 3051.87 samples/sec Loss 0.3620 Epoch: 19 Global Step: 332850 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:07:32,330-Speed 3142.78 samples/sec Loss 0.3638 Epoch: 19 Global Step: 332900 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:07:48,658-Speed 3135.85 samples/sec Loss 0.3666 Epoch: 19 Global Step: 332950 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:08:05,298-Speed 3077.03 samples/sec Loss 0.3647 Epoch: 19 Global Step: 333000 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:08:21,752-Speed 3111.77 samples/sec Loss 0.3642 Epoch: 19 Global Step: 333050 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:08:39,267-Speed 2923.26 samples/sec Loss 0.3653 Epoch: 19 Global Step: 333100 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:08:55,894-Speed 3079.45 samples/sec Loss 0.3592 Epoch: 19 Global Step: 333150 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:09:12,272-Speed 3126.31 samples/sec Loss 0.3720 Epoch: 19 Global Step: 333200 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:09:28,699-Speed 3116.76 samples/sec Loss 0.3775 Epoch: 19 Global Step: 333250 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:09:46,840-Speed 2822.47 samples/sec Loss 0.3618 Epoch: 19 Global Step: 333300 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:10:03,148-Speed 3139.62 samples/sec Loss 0.3669 Epoch: 19 Global Step: 333350 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:10:19,579-Speed 3116.22 samples/sec Loss 0.3542 Epoch: 19 Global Step: 333400 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:10:36,081-Speed 3102.73 samples/sec Loss 0.3670 Epoch: 19 Global Step: 333450 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:10:52,384-Speed 3140.60 samples/sec Loss 0.3582 Epoch: 19 Global Step: 333500 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:11:09,020-Speed 3077.79 samples/sec Loss 0.3751 Epoch: 19 Global Step: 333550 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:11:25,664-Speed 3076.24 samples/sec Loss 0.3642 Epoch: 19 Global Step: 333600 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:11:42,361-Speed 3066.45 samples/sec Loss 0.3705 Epoch: 19 Global Step: 333650 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:11:59,510-Speed 2985.74 samples/sec Loss 0.3557 Epoch: 19 Global Step: 333700 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:12:15,891-Speed 3125.58 samples/sec Loss 0.3626 Epoch: 19 Global Step: 333750 Fp16 Grad Scale: 16384 Required: 0 hours Training: 2021-03-17 14:12:32,685-Speed 3048.90 samples/sec Loss 0.3631 Epoch: 19 Global Step: 333800 Fp16 Grad Scale: 16384 Required: 0 hours