* [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism @ 2021-11-30 12:36 Huang Rui 2021-11-30 12:36 ` [PATCH v5 01/22] x86/cpufeatures: add AMD Collaborative Processor Performance Control feature flag Huang Rui ` (21 more replies) 0 siblings, 22 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Hi all, We would like to introduce a new AMD CPU frequency control mechanism as the "amd-pstate" driver for modern AMD Zen based CPU series in Linux Kernel. The new mechanism is based on Collaborative processor performance control (CPPC) which is finer grain frequency management than legacy ACPI hardware P-States. Current AMD CPU platforms are using the ACPI P-states driver to manage CPU frequency and clocks with switching only in 3 P-states. AMD P-States is to replace the ACPI P-states controls, allows a flexible, low-latency interface for the Linux kernel to directly communicate the performance hints to hardware. "amd-pstate" leverages the Linux kernel governors such as *schedutil*, *ondemand*, etc. to manage the performance hints which are provided by CPPC hardware functionality. The first version for amd-pstate is to support one of the Zen3 processors, and we will support more in future after we verify the hardware and SBIOS functionalities. There are two types of hardware implementations for amd-pstate: one is full MSR support and another is shared memory support. It can use X86_FEATURE_CPPC feature flag to distinguish the different types. Using the new AMD P-States method + kernel governors (*schedutil*, *ondemand*, ...) to manage the frequency update is the most appropriate bridge between AMD Zen based hardware processor and Linux kernel, the processor is able to adjust to the most efficiency frequency according to the kernel scheduler loading. Please check the detailed CPU feature and MSR register description in Processor Programming Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors: https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip Performance Per Watt (PPW) Calculation: We use the RAPL interface with "perf" tool to get the energy data of the package power. The data comparisons between amd-pstate and acpi-freq module are tested on AMD Cezanne processor (mobile CPU): 1) TBench CPU benchmark: +----------------------------------------------------------------------------------------------+ | | | TBench4 (Performance Per Watt) | | Higher is better | +-------------------+------------------------+------------------------+------------------------+ | | Performance Per Watt | Performance Per Watt | Performance Per Watt | | Kernel Module | (Schedutil) | (Ondemand) | (Performance) | | | Unit: MB / J | Unit: MB / J | Unit: MB / J | +-------------------+------------------------+------------------------+------------------------+ | | | | | | acpi-cpufreq | 48.56 | 48.89 | 47.81 | | | | | | +-------------------+------------------------+------------------------+------------------------+ | | | | | | amd-pstate | 48.38 | 47.38 | 48.77 | | | | | | +-------------------+------------------------+------------------------+------------------------+ Note: The previous data was based on TBench2, as align with Suse, we use TBench4 to re-test it. The PPW is very closed to acpi-cpufreq. And we are still re-runing other tests. Steam Game Demo on Ryzen 5900X (12 core 24 threads): The picture to compare acpi-cpufreq vs amd-pstate: https://drive.google.com/file/d/1PvSduykJn9U5MMOhzFWycnbmGmznalM3/view?usp=sharing Two videos: https://drive.google.com/file/d/1nQQEteL-v_zQxnOJpyW8JqvRW2FFDN2Z/view?usp=sharing https://drive.google.com/file/d/1heuPgFG71SQHvGb6wfedrQciBfE2rhnu/view?usp=sharing Actually, the amd-pstate driver doesn't change the physical maximum frequency capacity in the processor. But it's able to provide the finer grain performance control range instead of legacy 3 P-States. It has a better performance and power efficiency than before. We will continue optimize amd-pstate function on kernel governors to support different types of processors such as mobile latop, performance desktop, and etc. See patch series in below git repo: V1: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=amd-pstate-dev-v1 V2: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=amd-pstate-dev-v2 V3: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=amd-pstate-dev-v3 V4: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=amd-pstate-dev-v4 V5: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=amd-pstate-dev-v5 For details introduction, please see the patch 22. Changes from V1 -> V2: - cpufreq: - - Add detailed description in the commit log. - - Clean up the "extension" postfix in the x86 feature flag. - - Revise cppc_set_enable helper. - - Add a fix to check online cpus in cppc_acpi. - - Use static calls to avoid retpolines. - - Revise the comment style. - - Remove amd_pstate_boost_supported() function. - - Revise the return value in sysfs attribute functions. - cpupower: - - Refine the commit log for cpupower patches. - - Expose a function to get the sysfs value from specific table. - - Move amd-pstate sysfs definitions and functions into amd helper file. - - Move the boost init function into amd helper file and explain the details in the commit log. - - Remove the amd_pstate_get_data in the lib/cpufreq.c to keep the lib as common operations. - - Move print_speed function into misc helper file. - - Add amd_pstate_show_perf_and_freq() function in amd helper for cpufreq-info print. Changes from V2 -> V3: - cpufreq: - - Add a patch from Steven to add systemio register in cppc lib. (Thanks to verify the driver in his platform) - - Update online cpu mask to present cpu. - - Enhance cppc_set_enable to cover all valid use cases. - - Add more description in the Kconfig definition. - - Clean up some redundance functions and data members. - - Revise amd-pstate trace event prints. - - Move the amd-pstate traces into power trace system and set the driver as build-in instead of module. - - Clean up the duplicated sysfs with core cpufreq driver. - - Revise the amd-pstate RST documentation. - cpupower: - - Revise the cpupower_amd_pstate_enabled() function to use cpufreq_get_driver helper instead of read sysfs. - - Clean up the amd-pstate max/min frequency APIs, because they are actually the same with cpufreq info sysfs. Changes from V3 -> V4: - cpufreq: - - Rebase the whole series to Rafael's pm branch (5.15) - - Fix the typo in the commit message and comment. - - Clean up function implementation. - - Clean up freq&perf sysfs APIs. - - Fall back to move amd-pstate traces out of power trace system, because it's flexible to debug and fine tune processors with the shared memory solution. - - Add a kernel param to disable shared memory on amd-pstate, it can be enabled manually. - cpupower: - - Introduce acpi cppc library support. - - Clean up the duplicated amd specific perf/frequency. We received two issues that reported on the processors with shared memory solution from the community. First one: https://lore.kernel.org/linux-pm/a0e932477e9b826c0781dda1d0d2953e57f904cc.camel@suse.cz/ Thanks Giovanni's support and suggestions, we corrected the calculation method and duplicated the performance issue in a 64 cores / 128 threads threadripper which is similiar with EPYC 7713. Check with firmware guy, the sample rate is about 1 ms that the hardware responds the frequency change. We are working to simulate the acpi pstates with cppc api and checking where is the gap with acpi-cpufreq. Will share you the status later. Second one: https://lore.kernel.org/linux-pm/f9323c6fddd4a55d8ca4191a9539ebd056221045.camel@gmail.com/ Thanks Matt to help us reproduce this issue in our side with Ryzen 5900X. Below is the video to compare with amd-pstate and acpi-cpufreq on "Control" steam games. The FPS can be increased as well. It is probably something wrong in the SBIOS. https://lore.kernel.org/linux-pm/DM4PR12MB5136084E5DC1809FBB578075F1959@DM4PR12MB5136.namprd12.prod.outlook.com/ Due to some issues on shared memory solution, so we fallback to build as a module and add a parameter "shared_mem" to disable amd-pstate on processors with shared memory solution for the moment till we complete tunning on this function. Changes from V4 -> V5: - cpufreq: - - Fix subject typo. - - Fix Clang-13 build warning. - - Update RST documentation for the latest update. - cpupower: - - Fix the table check condition at cpufreq_get_sysfs_value_from_table. Thanks, Ray Huang Rui (19): x86/cpufeatures: add AMD Collaborative Processor Performance Control feature flag x86/msr: add AMD CPPC MSR definitions cpufreq: amd: introduce a new amd pstate driver to support future processors cpufreq: amd: add fast switch function for amd-pstate cpufreq: amd: introduce the support for the processors with shared memory solution cpufreq: amd: add trace for amd-pstate module cpufreq: amd: add boost mode support for amd-pstate cpufreq: amd: add amd-pstate frequencies attributes cpufreq: amd: add amd-pstate performance attributes cpupower: add AMD P-state capability flag cpupower: add the function to check amd-pstate enabled cpupower: initial AMD P-state capability cpupower: add the function to get the sysfs value from specific table cpupower: introduce acpi cppc library cpupower: add amd-pstate sysfs definition and access helper cpupower: enable boost state support for amd-pstate module cpupower: move print_speed function into misc helper cpupower: print amd-pstate information on cpupower Documentation: amd-pstate: add amd-pstate driver introduction Jinzhou Su (1): ACPI: CPPC: add cppc enable register function Mario Limonciello (1): ACPI: CPPC: Check present CPUs for determining _CPC is valid Steven Noonan (1): ACPI: CPPC: implement support for SystemIO registers Documentation/admin-guide/acpi/cppc_sysfs.rst | 2 + Documentation/admin-guide/pm/amd-pstate.rst | 383 +++++++++++ .../admin-guide/pm/working-state.rst | 1 + arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/msr-index.h | 17 + drivers/acpi/cppc_acpi.c | 93 ++- drivers/cpufreq/Kconfig.x86 | 17 + drivers/cpufreq/Makefile | 5 + drivers/cpufreq/amd-pstate-trace.c | 2 + drivers/cpufreq/amd-pstate-trace.h | 77 +++ drivers/cpufreq/amd-pstate.c | 609 ++++++++++++++++++ include/acpi/cppc_acpi.h | 5 + tools/power/cpupower/Makefile | 6 +- tools/power/cpupower/lib/acpi_cppc.c | 59 ++ tools/power/cpupower/lib/acpi_cppc.h | 21 + tools/power/cpupower/lib/cpufreq.c | 21 +- tools/power/cpupower/lib/cpufreq.h | 12 + tools/power/cpupower/utils/cpufreq-info.c | 68 +- tools/power/cpupower/utils/helpers/amd.c | 76 +++ tools/power/cpupower/utils/helpers/cpuid.c | 13 + tools/power/cpupower/utils/helpers/helpers.h | 22 + tools/power/cpupower/utils/helpers/misc.c | 62 ++ 22 files changed, 1508 insertions(+), 64 deletions(-) create mode 100644 Documentation/admin-guide/pm/amd-pstate.rst create mode 100644 drivers/cpufreq/amd-pstate-trace.c create mode 100644 drivers/cpufreq/amd-pstate-trace.h create mode 100644 drivers/cpufreq/amd-pstate.c create mode 100644 tools/power/cpupower/lib/acpi_cppc.c create mode 100644 tools/power/cpupower/lib/acpi_cppc.h -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 01/22] x86/cpufeatures: add AMD Collaborative Processor Performance Control feature flag 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 02/22] x86/msr: add AMD CPPC MSR definitions Huang Rui ` (20 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Add Collaborative Processor Performance Control feature flag for AMD processors. This feature flag will be used on the following amd-pstate driver. The amd-pstate driver has two approaches to implement the frequency control behavior. That depends on the CPU hardware implementation. One is "Full MSR Support" and another is "Shared Memory Support". The feature flag indicates the current processors with "Full MSR Support". Acked-by: Borislav Petkov Signed-off-by: Huang Rui --- arch/x86/include/asm/cpufeatures.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index d5b5f2ab87a0..18de5f76f198 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -315,6 +315,7 @@ #define X86_FEATURE_AMD_SSBD (13*32+24) /* "" Speculative Store Bypass Disable */ #define X86_FEATURE_VIRT_SSBD (13*32+25) /* Virtualized Speculative Store Bypass Disable */ #define X86_FEATURE_AMD_SSB_NO (13*32+26) /* "" Speculative Store Bypass is fixed in hardware. */ +#define X86_FEATURE_CPPC (13*32+27) /* Collaborative Processor Performance Control */ /* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */ #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */ -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 02/22] x86/msr: add AMD CPPC MSR definitions 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui 2021-11-30 12:36 ` [PATCH v5 01/22] x86/cpufeatures: add AMD Collaborative Processor Performance Control feature flag Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 03/22] ACPI: CPPC: implement support for SystemIO registers Huang Rui ` (19 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui AMD CPPC (Collaborative Processor Performance Control) function uses MSR registers to manage the performance hints. So add the MSR register macro here. Signed-off-by: Huang Rui --- arch/x86/include/asm/msr-index.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 01e2650b9585..e7945ef6a8df 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -486,6 +486,23 @@ #define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f +/* AMD Collaborative Processor Performance Control MSRs */ +#define MSR_AMD_CPPC_CAP1 0xc00102b0 +#define MSR_AMD_CPPC_ENABLE 0xc00102b1 +#define MSR_AMD_CPPC_CAP2 0xc00102b2 +#define MSR_AMD_CPPC_REQ 0xc00102b3 +#define MSR_AMD_CPPC_STATUS 0xc00102b4 + +#define CAP1_LOWEST_PERF(x) (((x) >> 0) & 0xff) +#define CAP1_LOWNONLIN_PERF(x) (((x) >> 8) & 0xff) +#define CAP1_NOMINAL_PERF(x) (((x) >> 16) & 0xff) +#define CAP1_HIGHEST_PERF(x) (((x) >> 24) & 0xff) + +#define REQ_MAX_PERF(x) (((x) & 0xff) << 0) +#define REQ_MIN_PERF(x) (((x) & 0xff) << 8) +#define REQ_DES_PERF(x) (((x) & 0xff) << 16) +#define REQ_ENERGY_PERF_PREF(x) (((x) & 0xff) << 24) + /* Fam 17h MSRs */ #define MSR_F17H_IRPERF 0xc00000e9 -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 03/22] ACPI: CPPC: implement support for SystemIO registers 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui 2021-11-30 12:36 ` [PATCH v5 01/22] x86/cpufeatures: add AMD Collaborative Processor Performance Control feature flag Huang Rui 2021-11-30 12:36 ` [PATCH v5 02/22] x86/msr: add AMD CPPC MSR definitions Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 04/22] ACPI: CPPC: Check present CPUs for determining _CPC is valid Huang Rui ` (18 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui From: Steven Noonan According to the ACPI v6.2 (and later) specification, SystemIO can be used for _CPC registers. This teaches cppc_acpi how to handle such registers. This patch was tested using the amd_pstate driver on my Zephyrus G15 (model GA503QS) using the current version 410 BIOS, which uses a SystemIO register for the HighestPerformance element in _CPC. Signed-off-by: Steven Noonan Signed-off-by: Huang Rui --- drivers/acpi/cppc_acpi.c | 46 +++++++++++++++++++++++++++++++++++++--- 1 file changed, 43 insertions(+), 3 deletions(-) diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index a85c351589be..ca62c3dc9899 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -746,9 +746,24 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr) goto out_free; cpc_ptr->cpc_regs[i-2].sys_mem_vaddr = addr; } + } else if (gas_t->space_id == ACPI_ADR_SPACE_SYSTEM_IO) { + if (gas_t->access_width < 1 || gas_t->access_width > 3) { + /* 1 = 8-bit, 2 = 16-bit, and 3 = 32-bit. SystemIO doesn't + * implement 64-bit registers. + */ + pr_debug("Invalid access width %d for SystemIO register\n", + gas_t->access_width); + goto out_free; + } + if (gas_t->address & ~0xFFFFULL) { + /* SystemIO registers use 16-bit integer addresses */ + pr_debug("Invalid IO port %llu for SystemIO register\n", + gas_t->address); + goto out_free; + } } else { if (gas_t->space_id != ACPI_ADR_SPACE_FIXED_HARDWARE || !cpc_ffh_supported()) { - /* Support only PCC ,SYS MEM and FFH type regs */ + /* Support only PCC, SystemMemory, SystemIO, and FFH type regs. */ pr_debug("Unsupported register type: %d\n", gas_t->space_id); goto out_free; } @@ -923,7 +938,20 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val) } *val = 0; - if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) + + if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) { + u32 width = 8 << (reg->access_width - 1); + acpi_status status; + + status = acpi_os_read_port((acpi_io_address)reg->address, (u32 *)val, width); + + if (status != AE_OK) { + pr_debug("Error: Failed to read SystemIO port %llx\n", reg->address); + return -EFAULT; + } + + return 0; + } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id); else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) vaddr = reg_res->sys_mem_vaddr; @@ -962,7 +990,19 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val) int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu); struct cpc_reg *reg = ®_res->cpc_entry.reg; - if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) + if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) { + u32 width = 8 << (reg->access_width - 1); + acpi_status status; + + status = acpi_os_write_port((acpi_io_address)reg->address, (u32)val, width); + + if (status != AE_OK) { + pr_debug("Error: Failed to write SystemIO port %llx\n", reg->address); + return -EFAULT; + } + + return 0; + } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id); else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) vaddr = reg_res->sys_mem_vaddr; -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 04/22] ACPI: CPPC: Check present CPUs for determining _CPC is valid 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (2 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 03/22] ACPI: CPPC: implement support for SystemIO registers Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 05/22] ACPI: CPPC: add cppc enable register function Huang Rui ` (17 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui From: Mario Limonciello As this is a static check, it should be based upon what is currently present on the system. This makes probeing more deterministic. While local APIC flags field (lapic_flags) of cpu core in MADT table is 0, then the cpu core won't be enabled. In this case, _CPC won't be found in this core, and return back to _CPC invalid with walking through possible cpus (include disable cpus). This is not expected, so switch to check present CPUs instead. Reported-by: Jinzhou Su Signed-off-by: Mario Limonciello Signed-off-by: Huang Rui --- drivers/acpi/cppc_acpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index ca62c3dc9899..a46f227dc254 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -411,7 +411,7 @@ bool acpi_cpc_valid(void) struct cpc_desc *cpc_ptr; int cpu; - for_each_possible_cpu(cpu) { + for_each_present_cpu(cpu) { cpc_ptr = per_cpu(cpc_desc_ptr, cpu); if (!cpc_ptr) return false; -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 05/22] ACPI: CPPC: add cppc enable register function 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (3 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 04/22] ACPI: CPPC: Check present CPUs for determining _CPC is valid Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 06/22] cpufreq: amd: introduce a new amd pstate driver to support future processors Huang Rui ` (16 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui From: Jinzhou Su Add a new function to enable CPPC feature. This function will write Continuous Performance Control package EnableRegister field on the processor. CPPC EnableRegister register described in section 8.4.7.1 of ACPI 6.4: This element is optional. If supported, contains a resource descriptor with a single Register() descriptor that describes a register to which OSPM writes a One to enable CPPC on this processor. Before this register is set, the processor will be controlled by legacy mechanisms (ACPI Pstates, firmware, etc.). This register will be used for AMD processors to enable amd-pstate function instead of legacy ACPI P-States. Signed-off-by: Jinzhou Su Signed-off-by: Huang Rui --- drivers/acpi/cppc_acpi.c | 45 ++++++++++++++++++++++++++++++++++++++++ include/acpi/cppc_acpi.h | 5 +++++ 2 files changed, 50 insertions(+) diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index a46f227dc254..003df9fba122 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -1262,6 +1262,51 @@ int cppc_get_perf_ctrs(int cpunum, struct cppc_perf_fb_ctrs *perf_fb_ctrs) } EXPORT_SYMBOL_GPL(cppc_get_perf_ctrs); +/** + * cppc_set_enable - Set to enable CPPC on the processor by writing the + * Continuous Performance Control package EnableRegister field. + * @cpu: CPU for which to enable CPPC register. + * @enable: 0 - disable, 1 - enable CPPC feature on the processor. + * + * Return: 0 for success, -ERRNO or -EIO otherwise. + */ +int cppc_set_enable(int cpu, bool enable) +{ + int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu); + struct cpc_register_resource *enable_reg; + struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu); + struct cppc_pcc_data *pcc_ss_data = NULL; + int ret = -EINVAL; + + if (!cpc_desc) { + pr_debug("No CPC descriptor for CPU:%d\n", cpu); + return -EINVAL; + } + + enable_reg = &cpc_desc->cpc_regs[ENABLE]; + + if (CPC_IN_PCC(enable_reg)) { + + if (pcc_ss_id < 0) + return -EIO; + + ret = cpc_write(cpu, enable_reg, enable); + if (ret) + return ret; + + pcc_ss_data = pcc_data[pcc_ss_id]; + + down_write(&pcc_ss_data->pcc_lock); + /* after writing CPC, transfer the ownership of PCC to platfrom */ + ret = send_pcc_cmd(pcc_ss_id, CMD_WRITE); + up_write(&pcc_ss_data->pcc_lock); + return ret; + } + + return cpc_write(cpu, enable_reg, enable); +} +EXPORT_SYMBOL_GPL(cppc_set_enable); + /** * cppc_set_perf - Set a CPU's performance controls. * @cpu: CPU for which to set performance controls. diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h index bc159a9b4a73..92b7ea8d8f5e 100644 --- a/include/acpi/cppc_acpi.h +++ b/include/acpi/cppc_acpi.h @@ -138,6 +138,7 @@ extern int cppc_get_desired_perf(int cpunum, u64 *desired_perf); extern int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf); extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs); extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls); +extern int cppc_set_enable(int cpu, bool enable); extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps); extern bool acpi_cpc_valid(void); extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data); @@ -162,6 +163,10 @@ static inline int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls) { return -ENOTSUPP; } +static inline int cppc_set_enable(int cpu, bool enable) +{ + return -ENOTSUPP; +} static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps) { return -ENOTSUPP; -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 06/22] cpufreq: amd: introduce a new amd pstate driver to support future processors 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (4 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 05/22] ACPI: CPPC: add cppc enable register function Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 07/22] cpufreq: amd: add fast switch function for amd-pstate Huang Rui ` (15 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui amd-pstate is the AMD CPU performance scaling driver that introduces a new CPU frequency control mechanism on AMD Zen based CPU series in Linux kernel. The new mechanism is based on Collaborative processor performance control (CPPC) which is finer grain frequency management than legacy ACPI hardware P-States. Current AMD CPU platforms are using the ACPI P-states driver to manage CPU frequency and clocks with switching only in 3 P-states. AMD P-States is to replace the ACPI P-states controls, allows a flexible, low-latency interface for the Linux kernel to directly communicate the performance hints to hardware. "amd-pstate" leverages the Linux kernel governors such as *schedutil*, *ondemand*, etc. to manage the performance hints which are provided by CPPC hardware functionality. The first version for amd-pstate is to support one of the Zen3 processors, and we will support more in future after we verify the hardware and SBIOS functionalities. There are two types of hardware implementations for amd-pstate: one is full MSR support and another is shared memory support. It can use X86_FEATURE_CPPC feature flag to distinguish the different types. Using the new AMD P-States method + kernel governors (*schedutil*, *ondemand*, ...) to manage the frequency update is the most appropriate bridge between AMD Zen based hardware processor and Linux kernel, the processor is able to adjust to the most efficiency frequency according to the kernel scheduler loading. Please check the detailed CPU feature and MSR register description in Processor Programming Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors: https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip Signed-off-by: Huang Rui --- drivers/cpufreq/Kconfig.x86 | 17 ++ drivers/cpufreq/Makefile | 1 + drivers/cpufreq/amd-pstate.c | 398 +++++++++++++++++++++++++++++++++++ 3 files changed, 416 insertions(+) create mode 100644 drivers/cpufreq/amd-pstate.c diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86 index 92701a18bdd9..21837eb1698b 100644 --- a/drivers/cpufreq/Kconfig.x86 +++ b/drivers/cpufreq/Kconfig.x86 @@ -34,6 +34,23 @@ config X86_PCC_CPUFREQ If in doubt, say N. +config X86_AMD_PSTATE + tristate "AMD Processor P-State driver" + depends on X86 + select ACPI_PROCESSOR if ACPI + select ACPI_CPPC_LIB if X86_64 && ACPI + select CPU_FREQ_GOV_SCHEDUTIL if SMP + help + This driver adds a CPUFreq driver which utilizes a fine grain + processor performance frequency control range instead of legacy + performance levels. This driver supports the AMD processors with + _CPC object in the SBIOS. + + For details, take a look at: + . + + If in doubt, say N. + config X86_ACPI_CPUFREQ tristate "ACPI Processor P-States driver" depends on ACPI_PROCESSOR diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile index 48ee5859030c..c8d307010922 100644 --- a/drivers/cpufreq/Makefile +++ b/drivers/cpufreq/Makefile @@ -25,6 +25,7 @@ obj-$(CONFIG_CPUFREQ_DT_PLATDEV) += cpufreq-dt-platdev.o # speedstep-* is preferred over p4-clockmod. obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o +obj-$(CONFIG_X86_AMD_PSTATE) += amd-pstate.o obj-$(CONFIG_X86_POWERNOW_K8) += powernow-k8.o obj-$(CONFIG_X86_PCC_CPUFREQ) += pcc-cpufreq.o obj-$(CONFIG_X86_POWERNOW_K6) += powernow-k6.o diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c new file mode 100644 index 000000000000..20ffbc30118f --- /dev/null +++ b/drivers/cpufreq/amd-pstate.c @@ -0,0 +1,398 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * amd-pstate.c - AMD Processor P-state Frequency Driver + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. All Rights Reserved. + * + * Author: Huang Rui + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include +#include +#include +#include + +#define AMD_PSTATE_TRANSITION_LATENCY 0x20000 +#define AMD_PSTATE_TRANSITION_DELAY 500 + +static struct cpufreq_driver amd_pstate_driver; + +struct amd_cpudata { + int cpu; + + struct freq_qos_request req[2]; + + u64 cppc_req_cached; + + u32 highest_perf; + u32 nominal_perf; + u32 lowest_nonlinear_perf; + u32 lowest_perf; + + u32 max_freq; + u32 min_freq; + u32 nominal_freq; + u32 lowest_nonlinear_freq; +}; + +static inline int pstate_enable(bool enable) +{ + return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable); +} + +DEFINE_STATIC_CALL(amd_pstate_enable, pstate_enable); + +static inline int amd_pstate_enable(bool enable) +{ + return static_call(amd_pstate_enable)(enable); +} + +static int pstate_init_perf(struct amd_cpudata *cpudata) +{ + u64 cap1; + + int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, + &cap1); + if (ret) + return ret; + + /* + * TODO: Introduce AMD specific power feature. + * + * CPPC entry doesn't indicate the highest performance in some ASICs. + */ + WRITE_ONCE(cpudata->highest_perf, amd_get_highest_perf()); + + WRITE_ONCE(cpudata->nominal_perf, CAP1_NOMINAL_PERF(cap1)); + WRITE_ONCE(cpudata->lowest_nonlinear_perf, CAP1_LOWNONLIN_PERF(cap1)); + WRITE_ONCE(cpudata->lowest_perf, CAP1_LOWEST_PERF(cap1)); + + return 0; +} + +DEFINE_STATIC_CALL(amd_pstate_init_perf, pstate_init_perf); + +static inline int amd_pstate_init_perf(struct amd_cpudata *cpudata) +{ + return static_call(amd_pstate_init_perf)(cpudata); +} + +static void pstate_update_perf(struct amd_cpudata *cpudata, u32 min_perf, + u32 des_perf, u32 max_perf, bool fast_switch) +{ + if (fast_switch) + wrmsrl(MSR_AMD_CPPC_REQ, READ_ONCE(cpudata->cppc_req_cached)); + else + wrmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, + READ_ONCE(cpudata->cppc_req_cached)); +} + +DEFINE_STATIC_CALL(amd_pstate_update_perf, pstate_update_perf); + +static inline void amd_pstate_update_perf(struct amd_cpudata *cpudata, + u32 min_perf, u32 des_perf, + u32 max_perf, bool fast_switch) +{ + static_call(amd_pstate_update_perf)(cpudata, min_perf, des_perf, + max_perf, fast_switch); +} + +static void amd_pstate_update(struct amd_cpudata *cpudata, u32 min_perf, + u32 des_perf, u32 max_perf, bool fast_switch) +{ + u64 prev = READ_ONCE(cpudata->cppc_req_cached); + u64 value = prev; + + value &= ~REQ_MIN_PERF(~0L); + value |= REQ_MIN_PERF(min_perf); + + value &= ~REQ_DES_PERF(~0L); + value |= REQ_DES_PERF(des_perf); + + value &= ~REQ_MAX_PERF(~0L); + value |= REQ_MAX_PERF(max_perf); + + if (value == prev) + return; + + WRITE_ONCE(cpudata->cppc_req_cached, value); + + amd_pstate_update_perf(cpudata, min_perf, des_perf, + max_perf, fast_switch); +} + +static int amd_pstate_verify(struct cpufreq_policy_data *policy) +{ + cpufreq_verify_within_cpu_limits(policy); + + return 0; +} + +static int amd_pstate_target(struct cpufreq_policy *policy, + unsigned int target_freq, + unsigned int relation) +{ + struct cpufreq_freqs freqs; + struct amd_cpudata *cpudata = policy->driver_data; + unsigned long max_perf, min_perf, des_perf, cap_perf; + + if (!cpudata->max_freq) + return -ENODEV; + + cap_perf = READ_ONCE(cpudata->highest_perf); + min_perf = READ_ONCE(cpudata->lowest_nonlinear_perf); + max_perf = cap_perf; + + freqs.old = policy->cur; + freqs.new = target_freq; + + des_perf = DIV_ROUND_CLOSEST(target_freq * cap_perf, + cpudata->max_freq); + + cpufreq_freq_transition_begin(policy, &freqs); + amd_pstate_update(cpudata, min_perf, des_perf, + max_perf, false); + cpufreq_freq_transition_end(policy, &freqs, false); + + return 0; +} + +static int amd_get_min_freq(struct amd_cpudata *cpudata) +{ + struct cppc_perf_caps cppc_perf; + + int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf); + if (ret) + return ret; + + /* Switch to khz */ + return cppc_perf.lowest_freq * 1000; +} + +static int amd_get_max_freq(struct amd_cpudata *cpudata) +{ + struct cppc_perf_caps cppc_perf; + u32 max_perf, max_freq, nominal_freq, nominal_perf; + u64 boost_ratio; + + int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf); + if (ret) + return ret; + + nominal_freq = cppc_perf.nominal_freq; + nominal_perf = READ_ONCE(cpudata->nominal_perf); + max_perf = READ_ONCE(cpudata->highest_perf); + + boost_ratio = div_u64(max_perf << SCHED_CAPACITY_SHIFT, + nominal_perf); + + max_freq = nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT; + + /* Switch to khz */ + return max_freq * 1000; +} + +static int amd_get_nominal_freq(struct amd_cpudata *cpudata) +{ + struct cppc_perf_caps cppc_perf; + + int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf); + if (ret) + return ret; + + /* Switch to khz */ + return cppc_perf.nominal_freq * 1000; +} + +static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata) +{ + struct cppc_perf_caps cppc_perf; + u32 lowest_nonlinear_freq, lowest_nonlinear_perf, + nominal_freq, nominal_perf; + u64 lowest_nonlinear_ratio; + + int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf); + if (ret) + return ret; + + nominal_freq = cppc_perf.nominal_freq; + nominal_perf = READ_ONCE(cpudata->nominal_perf); + + lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf; + + lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT, + nominal_perf); + + lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >> SCHED_CAPACITY_SHIFT; + + /* Switch to khz */ + return lowest_nonlinear_freq * 1000; +} + +static int amd_pstate_cpu_init(struct cpufreq_policy *policy) +{ + int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret; + struct device *dev; + struct amd_cpudata *cpudata; + + dev = get_cpu_device(policy->cpu); + if (!dev) + return -ENODEV; + + cpudata = kzalloc(sizeof(*cpudata), GFP_KERNEL); + if (!cpudata) + return -ENOMEM; + + cpudata->cpu = policy->cpu; + + ret = amd_pstate_init_perf(cpudata); + if (ret) + goto free_cpudata1; + + min_freq = amd_get_min_freq(cpudata); + max_freq = amd_get_max_freq(cpudata); + nominal_freq = amd_get_nominal_freq(cpudata); + lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata); + + if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) { + dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n", + min_freq, max_freq); + ret = -EINVAL; + goto free_cpudata1; + } + + policy->cpuinfo.transition_latency = AMD_PSTATE_TRANSITION_LATENCY; + policy->transition_delay_us = AMD_PSTATE_TRANSITION_DELAY; + + policy->min = min_freq; + policy->max = max_freq; + + policy->cpuinfo.min_freq = min_freq; + policy->cpuinfo.max_freq = max_freq; + + /* It will be updated by governor */ + policy->cur = policy->cpuinfo.min_freq; + + ret = freq_qos_add_request(&policy->constraints, &cpudata->req[0], + FREQ_QOS_MIN, policy->cpuinfo.min_freq); + if (ret < 0) { + dev_err(dev, "Failed to add min-freq constraint (%d)\n", ret); + goto free_cpudata1; + } + + ret = freq_qos_add_request(&policy->constraints, &cpudata->req[1], + FREQ_QOS_MAX, policy->cpuinfo.max_freq); + if (ret < 0) { + dev_err(dev, "Failed to add max-freq constraint (%d)\n", ret); + goto free_cpudata2; + } + + /* Initial processor data capability frequencies */ + cpudata->max_freq = max_freq; + cpudata->min_freq = min_freq; + cpudata->nominal_freq = nominal_freq; + cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq; + + policy->driver_data = cpudata; + + return 0; + +free_cpudata2: + freq_qos_remove_request(&cpudata->req[0]); +free_cpudata1: + kfree(cpudata); + return ret; +} + +static int amd_pstate_cpu_exit(struct cpufreq_policy *policy) +{ + struct amd_cpudata *cpudata; + + cpudata = policy->driver_data; + + freq_qos_remove_request(&cpudata->req[1]); + freq_qos_remove_request(&cpudata->req[0]); + kfree(cpudata); + + return 0; +} + +static struct cpufreq_driver amd_pstate_driver = { + .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS, + .verify = amd_pstate_verify, + .target = amd_pstate_target, + .init = amd_pstate_cpu_init, + .exit = amd_pstate_cpu_exit, + .name = "amd-pstate", +}; + +static int __init amd_pstate_init(void) +{ + int ret; + + if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) + return -ENODEV; + + if (!acpi_cpc_valid()) { + pr_debug("the _CPC object is not present in SBIOS\n"); + return -ENODEV; + } + + /* don't keep reloading if cpufreq_driver exists */ + if (cpufreq_get_current_driver()) + return -EEXIST; + + /* capability check */ + if (!boot_cpu_has(X86_FEATURE_CPPC)) { + pr_debug("AMD CPPC MSR based functionality is not supported\n"); + return -ENODEV; + } + + /* enable amd pstate feature */ + ret = amd_pstate_enable(true); + if (ret) { + pr_err("failed to enable amd-pstate with return %d\n", ret); + return ret; + } + + ret = cpufreq_register_driver(&amd_pstate_driver); + if (ret) + pr_err("failed to register amd_pstate_driver with return %d\n", + ret); + + return ret; +} + +static void __exit amd_pstate_exit(void) +{ + cpufreq_unregister_driver(&amd_pstate_driver); + + amd_pstate_enable(false); +} + +module_init(amd_pstate_init); +module_exit(amd_pstate_exit); + +MODULE_AUTHOR("Huang Rui "); +MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver"); +MODULE_LICENSE("GPL"); -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 07/22] cpufreq: amd: add fast switch function for amd-pstate 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (5 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 06/22] cpufreq: amd: introduce a new amd pstate driver to support future processors Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 08/22] cpufreq: amd: introduce the support for the processors with shared memory solution Huang Rui ` (14 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Introduce the fast switch function for amd-pstate on the AMD processors which support the full MSR register control. It's able to decrease the latency on interrupt context. Signed-off-by: Huang Rui --- drivers/cpufreq/amd-pstate.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c index 20ffbc30118f..cab266b8bf35 100644 --- a/drivers/cpufreq/amd-pstate.c +++ b/drivers/cpufreq/amd-pstate.c @@ -177,6 +177,39 @@ static int amd_pstate_target(struct cpufreq_policy *policy, return 0; } +static void amd_pstate_adjust_perf(unsigned int cpu, + unsigned long _min_perf, + unsigned long target_perf, + unsigned long capacity) +{ + unsigned long max_perf, min_perf, des_perf, + cap_perf, lowest_nonlinear_perf; + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); + struct amd_cpudata *cpudata = policy->driver_data; + + cap_perf = READ_ONCE(cpudata->highest_perf); + lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf); + + des_perf = cap_perf; + if (target_perf < capacity) + des_perf = DIV_ROUND_UP(cap_perf * target_perf, capacity); + + min_perf = READ_ONCE(cpudata->highest_perf); + if (_min_perf < capacity) + min_perf = DIV_ROUND_UP(cap_perf * _min_perf, capacity); + + if (min_perf < lowest_nonlinear_perf) + min_perf = lowest_nonlinear_perf; + + max_perf = cap_perf; + if (max_perf < min_perf) + max_perf = min_perf; + + des_perf = clamp_t(unsigned long, des_perf, min_perf, max_perf); + + amd_pstate_update(cpudata, min_perf, des_perf, max_perf, true); +} + static int amd_get_min_freq(struct amd_cpudata *cpudata) { struct cppc_perf_caps cppc_perf; @@ -293,6 +326,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy) /* It will be updated by governor */ policy->cur = policy->cpuinfo.min_freq; + policy->fast_switch_possible = true; + ret = freq_qos_add_request(&policy->constraints, &cpudata->req[0], FREQ_QOS_MIN, policy->cpuinfo.min_freq); if (ret < 0) { @@ -341,6 +376,7 @@ static struct cpufreq_driver amd_pstate_driver = { .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS, .verify = amd_pstate_verify, .target = amd_pstate_target, + .adjust_perf = amd_pstate_adjust_perf, .init = amd_pstate_cpu_init, .exit = amd_pstate_cpu_exit, .name = "amd-pstate", -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 08/22] cpufreq: amd: introduce the support for the processors with shared memory solution 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (6 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 07/22] cpufreq: amd: add fast switch function for amd-pstate Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 09/22] cpufreq: amd: add trace for amd-pstate module Huang Rui ` (13 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui In some of Zen2 and Zen3 based processors, they are using the shared memory that exposed from ACPI SBIOS. In this kind of the processors, there is no MSR support, so we add acpi cppc function as the backend for them. It is using a module param (shared_mem) to enable related processors manually. We will enable this by default once we address performance issue on this solution. Signed-off-by: Jinzhou Su Signed-off-by: Huang Rui --- drivers/cpufreq/amd-pstate.c | 72 ++++++++++++++++++++++++++++++++++-- 1 file changed, 68 insertions(+), 4 deletions(-) diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c index cab266b8bf35..68991c450fd5 100644 --- a/drivers/cpufreq/amd-pstate.c +++ b/drivers/cpufreq/amd-pstate.c @@ -35,6 +35,19 @@ #define AMD_PSTATE_TRANSITION_LATENCY 0x20000 #define AMD_PSTATE_TRANSITION_DELAY 500 +/* TODO: We need more time to fine tune processors with shared memory solution + * with community together. + * + * There are some performance drops on the CPU benchmarks which reports from + * Suse. We are co-working with them to fine tune the shared memory solution. So + * we disable it by default to go acpi-cpufreq on these processors and add a + * module parameter to be able to enable it manually for debugging. + */ +static bool shared_mem = false; +module_param(shared_mem, bool, 0444); +MODULE_PARM_DESC(shared_mem, + "enable amd-pstate on processors with shared memory solution (false = disabled (default), true = enabled)"); + static struct cpufreq_driver amd_pstate_driver; struct amd_cpudata { @@ -60,6 +73,19 @@ static inline int pstate_enable(bool enable) return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable); } +static int cppc_enable(bool enable) +{ + int cpu, ret = 0; + + for_each_online_cpu(cpu) { + ret = cppc_set_enable(cpu, enable); + if (ret) + return ret; + } + + return ret; +} + DEFINE_STATIC_CALL(amd_pstate_enable, pstate_enable); static inline int amd_pstate_enable(bool enable) @@ -90,6 +116,24 @@ static int pstate_init_perf(struct amd_cpudata *cpudata) return 0; } +static int cppc_init_perf(struct amd_cpudata *cpudata) +{ + struct cppc_perf_caps cppc_perf; + + int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf); + if (ret) + return ret; + + WRITE_ONCE(cpudata->highest_perf, amd_get_highest_perf()); + + WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf); + WRITE_ONCE(cpudata->lowest_nonlinear_perf, + cppc_perf.lowest_nonlinear_perf); + WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf); + + return 0; +} + DEFINE_STATIC_CALL(amd_pstate_init_perf, pstate_init_perf); static inline int amd_pstate_init_perf(struct amd_cpudata *cpudata) @@ -107,6 +151,19 @@ static void pstate_update_perf(struct amd_cpudata *cpudata, u32 min_perf, READ_ONCE(cpudata->cppc_req_cached)); } +static void cppc_update_perf(struct amd_cpudata *cpudata, + u32 min_perf, u32 des_perf, + u32 max_perf, bool fast_switch) +{ + struct cppc_perf_ctrls perf_ctrls; + + perf_ctrls.max_perf = max_perf; + perf_ctrls.min_perf = min_perf; + perf_ctrls.desired_perf = des_perf; + + cppc_set_perf(cpudata->cpu, &perf_ctrls); +} + DEFINE_STATIC_CALL(amd_pstate_update_perf, pstate_update_perf); static inline void amd_pstate_update_perf(struct amd_cpudata *cpudata, @@ -326,7 +383,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy) /* It will be updated by governor */ policy->cur = policy->cpuinfo.min_freq; - policy->fast_switch_possible = true; + if (boot_cpu_has(X86_FEATURE_CPPC)) + policy->fast_switch_possible = true; ret = freq_qos_add_request(&policy->constraints, &cpudata->req[0], FREQ_QOS_MIN, policy->cpuinfo.min_freq); @@ -376,7 +434,6 @@ static struct cpufreq_driver amd_pstate_driver = { .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS, .verify = amd_pstate_verify, .target = amd_pstate_target, - .adjust_perf = amd_pstate_adjust_perf, .init = amd_pstate_cpu_init, .exit = amd_pstate_cpu_exit, .name = "amd-pstate", @@ -399,8 +456,15 @@ static int __init amd_pstate_init(void) return -EEXIST; /* capability check */ - if (!boot_cpu_has(X86_FEATURE_CPPC)) { - pr_debug("AMD CPPC MSR based functionality is not supported\n"); + if (boot_cpu_has(X86_FEATURE_CPPC)) { + pr_debug("AMD CPPC MSR based functionality is supported\n"); + amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf; + } else if (shared_mem) { + static_call_update(amd_pstate_enable, cppc_enable); + static_call_update(amd_pstate_init_perf, cppc_init_perf); + static_call_update(amd_pstate_update_perf, cppc_update_perf); + } else { + pr_info("This processor supports shared memory solution, you can enable it with amd_pstate.shared_mem=1\n"); return -ENODEV; } -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 09/22] cpufreq: amd: add trace for amd-pstate module 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (7 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 08/22] cpufreq: amd: introduce the support for the processors with shared memory solution Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 10/22] cpufreq: amd: add boost mode support for amd-pstate Huang Rui ` (12 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Add trace event to monitor the performance value changes which is controlled by cpu governors. Signed-off-by: Huang Rui --- drivers/cpufreq/Makefile | 6 ++- drivers/cpufreq/amd-pstate-trace.c | 2 + drivers/cpufreq/amd-pstate-trace.h | 77 ++++++++++++++++++++++++++++++ drivers/cpufreq/amd-pstate.c | 4 ++ 4 files changed, 88 insertions(+), 1 deletion(-) create mode 100644 drivers/cpufreq/amd-pstate-trace.c create mode 100644 drivers/cpufreq/amd-pstate-trace.h diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile index c8d307010922..285de70af877 100644 --- a/drivers/cpufreq/Makefile +++ b/drivers/cpufreq/Makefile @@ -17,6 +17,10 @@ obj-$(CONFIG_CPU_FREQ_GOV_ATTR_SET) += cpufreq_governor_attr_set.o obj-$(CONFIG_CPUFREQ_DT) += cpufreq-dt.o obj-$(CONFIG_CPUFREQ_DT_PLATDEV) += cpufreq-dt-platdev.o +# Traces +CFLAGS_amd-pstate-trace.o := -I$(src) +amd_pstate-y := amd-pstate.o amd-pstate-trace.o + ################################################################################## # x86 drivers. # Link order matters. K8 is preferred to ACPI because of firmware bugs in early @@ -25,7 +29,7 @@ obj-$(CONFIG_CPUFREQ_DT_PLATDEV) += cpufreq-dt-platdev.o # speedstep-* is preferred over p4-clockmod. obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o -obj-$(CONFIG_X86_AMD_PSTATE) += amd-pstate.o +obj-$(CONFIG_X86_AMD_PSTATE) += amd_pstate.o obj-$(CONFIG_X86_POWERNOW_K8) += powernow-k8.o obj-$(CONFIG_X86_PCC_CPUFREQ) += pcc-cpufreq.o obj-$(CONFIG_X86_POWERNOW_K6) += powernow-k6.o diff --git a/drivers/cpufreq/amd-pstate-trace.c b/drivers/cpufreq/amd-pstate-trace.c new file mode 100644 index 000000000000..891b696dcd69 --- /dev/null +++ b/drivers/cpufreq/amd-pstate-trace.c @@ -0,0 +1,2 @@ +#define CREATE_TRACE_POINTS +#include "amd-pstate-trace.h" diff --git a/drivers/cpufreq/amd-pstate-trace.h b/drivers/cpufreq/amd-pstate-trace.h new file mode 100644 index 000000000000..647505957d4f --- /dev/null +++ b/drivers/cpufreq/amd-pstate-trace.h @@ -0,0 +1,77 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * amd-pstate-trace.h - AMD Processor P-state Frequency Driver Tracer + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. All Rights Reserved. + * + * Author: Huang Rui + */ + +#if !defined(_AMD_PSTATE_TRACE_H) || defined(TRACE_HEADER_MULTI_READ) +#define _AMD_PSTATE_TRACE_H + +#include +#include +#include + +#undef TRACE_SYSTEM +#define TRACE_SYSTEM amd_cpu + +#undef TRACE_INCLUDE_FILE +#define TRACE_INCLUDE_FILE amd-pstate-trace + +#define TPS(x) tracepoint_string(x) + +TRACE_EVENT(amd_pstate_perf, + + TP_PROTO(unsigned long min_perf, + unsigned long target_perf, + unsigned long capacity, + unsigned int cpu_id, + bool changed, + bool fast_switch + ), + + TP_ARGS(min_perf, + target_perf, + capacity, + cpu_id, + changed, + fast_switch + ), + + TP_STRUCT__entry( + __field(unsigned long, min_perf) + __field(unsigned long, target_perf) + __field(unsigned long, capacity) + __field(unsigned int, cpu_id) + __field(bool, changed) + __field(bool, fast_switch) + ), + + TP_fast_assign( + __entry->min_perf = min_perf; + __entry->target_perf = target_perf; + __entry->capacity = capacity; + __entry->cpu_id = cpu_id; + __entry->changed = changed; + __entry->fast_switch = fast_switch; + ), + + TP_printk("amd_min_perf=%lu amd_des_perf=%lu amd_max_perf=%lu cpu_id=%u changed=%s fast_switch=%s", + (unsigned long)__entry->min_perf, + (unsigned long)__entry->target_perf, + (unsigned long)__entry->capacity, + (unsigned int)__entry->cpu_id, + (__entry->changed) ? "true" : "false", + (__entry->fast_switch) ? "true" : "false" + ) +); + +#endif /* _AMD_PSTATE_TRACE_H */ + +/* This part must be outside protection */ +#undef TRACE_INCLUDE_PATH +#define TRACE_INCLUDE_PATH . + +#include diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c index 68991c450fd5..72a4e2258fe7 100644 --- a/drivers/cpufreq/amd-pstate.c +++ b/drivers/cpufreq/amd-pstate.c @@ -31,6 +31,7 @@ #include #include #include +#include "amd-pstate-trace.h" #define AMD_PSTATE_TRANSITION_LATENCY 0x20000 #define AMD_PSTATE_TRANSITION_DELAY 500 @@ -189,6 +190,9 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u32 min_perf, value &= ~REQ_MAX_PERF(~0L); value |= REQ_MAX_PERF(max_perf); + trace_amd_pstate_perf(min_perf, des_perf, max_perf, + cpudata->cpu, (value != prev), fast_switch); + if (value == prev) return; -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 10/22] cpufreq: amd: add boost mode support for amd-pstate 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (8 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 09/22] cpufreq: amd: add trace for amd-pstate module Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 11/22] cpufreq: amd: add amd-pstate frequencies attributes Huang Rui ` (11 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui If the sbios supports the boost mode of amd-pstate, let's switch to boost enabled by default. Signed-off-by: Huang Rui --- drivers/cpufreq/amd-pstate.c | 44 ++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c index 72a4e2258fe7..c5d786af199d 100644 --- a/drivers/cpufreq/amd-pstate.c +++ b/drivers/cpufreq/amd-pstate.c @@ -67,6 +67,8 @@ struct amd_cpudata { u32 min_freq; u32 nominal_freq; u32 lowest_nonlinear_freq; + + bool boost_supported; }; static inline int pstate_enable(bool enable) @@ -343,6 +345,45 @@ static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata) return lowest_nonlinear_freq * 1000; } +static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state) +{ + struct amd_cpudata *cpudata = policy->driver_data; + int ret; + + if (!cpudata->boost_supported) { + pr_err("Boost mode is not supported by this processor or SBIOS\n"); + return -EINVAL; + } + + if (state) + policy->cpuinfo.max_freq = cpudata->max_freq; + else + policy->cpuinfo.max_freq = cpudata->nominal_freq; + + policy->max = policy->cpuinfo.max_freq; + + ret = freq_qos_update_request(&cpudata->req[1], + policy->cpuinfo.max_freq); + if (ret < 0) + return ret; + + return 0; +} + +static void amd_pstate_boost_init(struct amd_cpudata *cpudata) +{ + u32 highest_perf, nominal_perf; + + highest_perf = READ_ONCE(cpudata->highest_perf); + nominal_perf = READ_ONCE(cpudata->nominal_perf); + + if (highest_perf <= nominal_perf) + return; + + cpudata->boost_supported = true; + amd_pstate_driver.boost_enabled = true; +} + static int amd_pstate_cpu_init(struct cpufreq_policy *policy) { int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret; @@ -412,6 +453,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy) policy->driver_data = cpudata; + amd_pstate_boost_init(cpudata); + return 0; free_cpudata2: @@ -440,6 +483,7 @@ static struct cpufreq_driver amd_pstate_driver = { .target = amd_pstate_target, .init = amd_pstate_cpu_init, .exit = amd_pstate_cpu_exit, + .set_boost = amd_pstate_set_boost, .name = "amd-pstate", }; -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 11/22] cpufreq: amd: add amd-pstate frequencies attributes 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (9 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 10/22] cpufreq: amd: add boost mode support for amd-pstate Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 12/22] cpufreq: amd: add amd-pstate performance attributes Huang Rui ` (10 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Introduce sysfs attributes to get the different level processor frequencies. Signed-off-by: Huang Rui --- drivers/cpufreq/amd-pstate.c | 46 ++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c index c5d786af199d..d462d5a28e08 100644 --- a/drivers/cpufreq/amd-pstate.c +++ b/drivers/cpufreq/amd-pstate.c @@ -477,6 +477,51 @@ static int amd_pstate_cpu_exit(struct cpufreq_policy *policy) return 0; } +/* Sysfs attributes */ + +/* This frequency is to indicate the maximum hardware frequency. + * If boost is not active but supported, the frequency will be larger than the + * one in cpuinfo. + */ +static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy, + char *buf) +{ + int max_freq; + struct amd_cpudata *cpudata; + + cpudata = policy->driver_data; + + max_freq = amd_get_max_freq(cpudata); + if (max_freq < 0) + return max_freq; + + return sprintf(&buf[0], "%u\n", max_freq); +} + +static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *policy, + char *buf) +{ + int freq; + struct amd_cpudata *cpudata; + + cpudata = policy->driver_data; + + freq = amd_get_lowest_nonlinear_freq(cpudata); + if (freq < 0) + return freq; + + return sprintf(&buf[0], "%u\n", freq); +} + +cpufreq_freq_attr_ro(amd_pstate_max_freq); +cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq); + +static struct freq_attr *amd_pstate_attr[] = { + &amd_pstate_max_freq, + &amd_pstate_lowest_nonlinear_freq, + NULL, +}; + static struct cpufreq_driver amd_pstate_driver = { .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS, .verify = amd_pstate_verify, @@ -485,6 +530,7 @@ static struct cpufreq_driver amd_pstate_driver = { .exit = amd_pstate_cpu_exit, .set_boost = amd_pstate_set_boost, .name = "amd-pstate", + .attr = amd_pstate_attr, }; static int __init amd_pstate_init(void) -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 12/22] cpufreq: amd: add amd-pstate performance attributes 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (10 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 11/22] cpufreq: amd: add amd-pstate frequencies attributes Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 13/22] cpupower: add AMD P-state capability flag Huang Rui ` (9 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Introduce sysfs attributes to get the different level amd-pstate performances. Signed-off-by: Huang Rui --- drivers/cpufreq/amd-pstate.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c index d462d5a28e08..c70274ae4046 100644 --- a/drivers/cpufreq/amd-pstate.c +++ b/drivers/cpufreq/amd-pstate.c @@ -513,12 +513,29 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli return sprintf(&buf[0], "%u\n", freq); } +/* In some of ASICs, the highest_perf is not the one in the _CPC table, so we + * need to expose it to sysfs. + */ +static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy, + char *buf) +{ + u32 perf; + struct amd_cpudata *cpudata = policy->driver_data; + + perf = READ_ONCE(cpudata->highest_perf); + + return sprintf(&buf[0], "%u\n", perf); +} + cpufreq_freq_attr_ro(amd_pstate_max_freq); cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq); +cpufreq_freq_attr_ro(amd_pstate_highest_perf); + static struct freq_attr *amd_pstate_attr[] = { &amd_pstate_max_freq, &amd_pstate_lowest_nonlinear_freq, + &amd_pstate_highest_perf, NULL, }; -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 13/22] cpupower: add AMD P-state capability flag 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (11 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 12/22] cpufreq: amd: add amd-pstate performance attributes Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 14/22] cpupower: add the function to check amd-pstate enabled Huang Rui ` (8 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Add AMD P-state capability flag in cpupower to indicate AMD new P-state kernel module support on Ryzen processors. Signed-off-by: Huang Rui --- tools/power/cpupower/utils/helpers/helpers.h | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h index 33ffacee7fcb..b4813efdfb00 100644 --- a/tools/power/cpupower/utils/helpers/helpers.h +++ b/tools/power/cpupower/utils/helpers/helpers.h @@ -73,6 +73,7 @@ enum cpupower_cpu_vendor {X86_VENDOR_UNKNOWN = 0, X86_VENDOR_INTEL, #define CPUPOWER_CAP_AMD_HW_PSTATE 0x00000100 #define CPUPOWER_CAP_AMD_PSTATEDEF 0x00000200 #define CPUPOWER_CAP_AMD_CPB_MSR 0x00000400 +#define CPUPOWER_CAP_AMD_PSTATE 0x00000800 #define CPUPOWER_AMD_CPBDIS 0x02000000 -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 14/22] cpupower: add the function to check amd-pstate enabled 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (12 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 13/22] cpupower: add AMD P-state capability flag Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 15/22] cpupower: initial AMD P-state capability Huang Rui ` (7 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui The processor with amd-pstate function also supports legacy ACPI hardware P-States feature as well. Once driver sets amd-pstate eanbled, the processor will respond the finer grain amd-pstate feature instead of legacy ACPI P-States. So it introduces the cpupower_amd_pstate_enabled() to check whether the current kernel enables amd-pstate or acpi-cpufreq module. Signed-off-by: Huang Rui --- tools/power/cpupower/utils/helpers/helpers.h | 10 ++++++++++ tools/power/cpupower/utils/helpers/misc.c | 18 ++++++++++++++++++ 2 files changed, 28 insertions(+) diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h index b4813efdfb00..e03cc97297aa 100644 --- a/tools/power/cpupower/utils/helpers/helpers.h +++ b/tools/power/cpupower/utils/helpers/helpers.h @@ -11,6 +11,7 @@ #include #include +#include #include "helpers/bitmask.h" #include @@ -136,6 +137,12 @@ extern int decode_pstates(unsigned int cpu, int boost_states, extern int cpufreq_has_boost_support(unsigned int cpu, int *support, int *active, int * states); + +/* AMD P-States stuff **************************/ +extern bool cpupower_amd_pstate_enabled(void); + +/* AMD P-States stuff **************************/ + /* * CPUID functions returning a single datum */ @@ -168,6 +175,9 @@ static inline int cpufreq_has_boost_support(unsigned int cpu, int *support, int *active, int * states) { return -1; } +static inline bool cpupower_amd_pstate_enabled(void) +{ return false; } + /* cpuid and cpuinfo helpers **************************/ static inline unsigned int cpuid_eax(unsigned int op) { return 0; }; diff --git a/tools/power/cpupower/utils/helpers/misc.c b/tools/power/cpupower/utils/helpers/misc.c index fc6e34511721..0c483cdefcc2 100644 --- a/tools/power/cpupower/utils/helpers/misc.c +++ b/tools/power/cpupower/utils/helpers/misc.c @@ -3,9 +3,11 @@ #include #include #include +#include #include "helpers/helpers.h" #include "helpers/sysfs.h" +#include "cpufreq.h" #if defined(__i386__) || defined(__x86_64__) @@ -83,6 +85,22 @@ int cpupower_intel_set_perf_bias(unsigned int cpu, unsigned int val) return 0; } +bool cpupower_amd_pstate_enabled(void) +{ + char *driver = cpufreq_get_driver(0); + bool ret = false; + + if (!driver) + return ret; + + if (!strcmp(driver, "amd-pstate")) + ret = true; + + cpufreq_put_driver(driver); + + return ret; +} + #endif /* #if defined(__i386__) || defined(__x86_64__) */ /* get_cpustate -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 15/22] cpupower: initial AMD P-state capability 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (13 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 14/22] cpupower: add the function to check amd-pstate enabled Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 16/22] cpupower: add the function to get the sysfs value from specific table Huang Rui ` (6 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui If kernel starts the amd-pstate module, the cpupower will initial the capability flag as CPUPOWER_CAP_AMD_PSTATE. And once amd-pstate capability is set, it won't need to set legacy ACPI relative capabilities anymore. Signed-off-by: Huang Rui --- tools/power/cpupower/utils/helpers/cpuid.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/tools/power/cpupower/utils/helpers/cpuid.c b/tools/power/cpupower/utils/helpers/cpuid.c index 72eb43593180..2a6dc104e76b 100644 --- a/tools/power/cpupower/utils/helpers/cpuid.c +++ b/tools/power/cpupower/utils/helpers/cpuid.c @@ -149,6 +149,19 @@ int get_cpu_info(struct cpupower_cpu_info *cpu_info) if (ext_cpuid_level >= 0x80000008 && cpuid_ebx(0x80000008) & (1 << 4)) cpu_info->caps |= CPUPOWER_CAP_AMD_RDPRU; + + if (cpupower_amd_pstate_enabled()) { + cpu_info->caps |= CPUPOWER_CAP_AMD_PSTATE; + + /* + * If AMD P-state is enabled, the firmware will treat + * AMD P-state function as high priority. + */ + cpu_info->caps &= ~CPUPOWER_CAP_AMD_CPB; + cpu_info->caps &= ~CPUPOWER_CAP_AMD_CPB_MSR; + cpu_info->caps &= ~CPUPOWER_CAP_AMD_HW_PSTATE; + cpu_info->caps &= ~CPUPOWER_CAP_AMD_PSTATEDEF; + } } if (cpu_info->vendor == X86_VENDOR_INTEL) { -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 16/22] cpupower: add the function to get the sysfs value from specific table 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (14 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 15/22] cpupower: initial AMD P-state capability Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 17/22] cpupower: introduce acpi cppc library Huang Rui ` (5 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Expose the helper into cpufreq header, then cpufreq driver can use this function to get the sysfs value if it has any specific sysfs interfaces. Signed-off-by: Huang Rui --- tools/power/cpupower/lib/cpufreq.c | 21 +++++++++++++++------ tools/power/cpupower/lib/cpufreq.h | 12 ++++++++++++ 2 files changed, 27 insertions(+), 6 deletions(-) diff --git a/tools/power/cpupower/lib/cpufreq.c b/tools/power/cpupower/lib/cpufreq.c index c3b56db8b921..c011bca27041 100644 --- a/tools/power/cpupower/lib/cpufreq.c +++ b/tools/power/cpupower/lib/cpufreq.c @@ -83,20 +83,21 @@ static const char *cpufreq_value_files[MAX_CPUFREQ_VALUE_READ_FILES] = { [STATS_NUM_TRANSITIONS] = "stats/total_trans" }; - -static unsigned long sysfs_cpufreq_get_one_value(unsigned int cpu, - enum cpufreq_value which) +unsigned long cpufreq_get_sysfs_value_from_table(unsigned int cpu, + const char **table, + unsigned index, + unsigned size) { unsigned long value; unsigned int len; char linebuf[MAX_LINE_LEN]; char *endp; - if (which >= MAX_CPUFREQ_VALUE_READ_FILES) + if (!table || index >= size || !table[index]) return 0; - len = sysfs_cpufreq_read_file(cpu, cpufreq_value_files[which], - linebuf, sizeof(linebuf)); + len = sysfs_cpufreq_read_file(cpu, table[index], linebuf, + sizeof(linebuf)); if (len == 0) return 0; @@ -109,6 +110,14 @@ static unsigned long sysfs_cpufreq_get_one_value(unsigned int cpu, return value; } +static unsigned long sysfs_cpufreq_get_one_value(unsigned int cpu, + enum cpufreq_value which) +{ + return cpufreq_get_sysfs_value_from_table(cpu, cpufreq_value_files, + which, + MAX_CPUFREQ_VALUE_READ_FILES); +} + /* read access to files which contain one string */ enum cpufreq_string { diff --git a/tools/power/cpupower/lib/cpufreq.h b/tools/power/cpupower/lib/cpufreq.h index 95f4fd9e2656..107668c0c454 100644 --- a/tools/power/cpupower/lib/cpufreq.h +++ b/tools/power/cpupower/lib/cpufreq.h @@ -203,6 +203,18 @@ int cpufreq_modify_policy_governor(unsigned int cpu, char *governor); int cpufreq_set_frequency(unsigned int cpu, unsigned long target_frequency); +/* + * get the sysfs value from specific table + * + * Read the value with the sysfs file name from specific table. Does + * only work if the cpufreq driver has the specific sysfs interfaces. + */ + +unsigned long cpufreq_get_sysfs_value_from_table(unsigned int cpu, + const char **table, + unsigned index, + unsigned size); + #ifdef __cplusplus } #endif -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 17/22] cpupower: introduce acpi cppc library 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (15 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 16/22] cpupower: add the function to get the sysfs value from specific table Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 18/22] cpupower: add amd-pstate sysfs definition and access helper Huang Rui ` (4 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Kernel ACPI subsytem introduced the sysfs attributes for acpi cppc library in below path: /sys/devices/system/cpu/cpuX/acpi_cppc/ And these attributes will be used for amd-pstate driver to provide some performance and frequency values. Signed-off-by: Huang Rui --- tools/power/cpupower/Makefile | 6 +-- tools/power/cpupower/lib/acpi_cppc.c | 59 ++++++++++++++++++++++++++++ tools/power/cpupower/lib/acpi_cppc.h | 21 ++++++++++ 3 files changed, 83 insertions(+), 3 deletions(-) create mode 100644 tools/power/cpupower/lib/acpi_cppc.c create mode 100644 tools/power/cpupower/lib/acpi_cppc.h diff --git a/tools/power/cpupower/Makefile b/tools/power/cpupower/Makefile index 3b1594447f29..e9b6de314654 100644 --- a/tools/power/cpupower/Makefile +++ b/tools/power/cpupower/Makefile @@ -143,9 +143,9 @@ UTIL_HEADERS = utils/helpers/helpers.h utils/idle_monitor/cpupower-monitor.h \ utils/helpers/bitmask.h \ utils/idle_monitor/idle_monitors.h utils/idle_monitor/idle_monitors.def -LIB_HEADERS = lib/cpufreq.h lib/cpupower.h lib/cpuidle.h -LIB_SRC = lib/cpufreq.c lib/cpupower.c lib/cpuidle.c -LIB_OBJS = lib/cpufreq.o lib/cpupower.o lib/cpuidle.o +LIB_HEADERS = lib/cpufreq.h lib/cpupower.h lib/cpuidle.h lib/acpi_cppc.h +LIB_SRC = lib/cpufreq.c lib/cpupower.c lib/cpuidle.c lib/acpi_cppc.c +LIB_OBJS = lib/cpufreq.o lib/cpupower.o lib/cpuidle.o lib/acpi_cppc.o LIB_OBJS := $(addprefix $(OUTPUT),$(LIB_OBJS)) override CFLAGS += -pipe diff --git a/tools/power/cpupower/lib/acpi_cppc.c b/tools/power/cpupower/lib/acpi_cppc.c new file mode 100644 index 000000000000..a07a8922eca2 --- /dev/null +++ b/tools/power/cpupower/lib/acpi_cppc.c @@ -0,0 +1,59 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "cpupower_intern.h" +#include "acpi_cppc.h" + +/* ACPI CPPC sysfs access ***********************************************/ + +static int acpi_cppc_read_file(unsigned int cpu, const char *fname, + char *buf, size_t buflen) +{ + char path[SYSFS_PATH_MAX]; + + snprintf(path, sizeof(path), PATH_TO_CPU "cpu%u/acpi_cppc/%s", + cpu, fname); + return cpupower_read_sysfs(path, buf, buflen); +} + +static const char *acpi_cppc_value_files[] = { + [HIGHEST_PERF] = "highest_perf", + [LOWEST_PERF] = "lowest_perf", + [NOMINAL_PERF] = "nominal_perf", + [LOWEST_NONLINEAR_PERF] = "lowest_nonlinear_perf", + [LOWEST_FREQ] = "lowest_freq", + [NOMINAL_FREQ] = "nominal_freq", + [REFERENCE_PERF] = "reference_perf", + [WRAPAROUND_TIME] = "wraparound_time" +}; + +unsigned long acpi_cppc_get_data(unsigned cpu, enum acpi_cppc_value which) +{ + unsigned long long value; + unsigned int len; + char linebuf[MAX_LINE_LEN]; + char *endp; + + if (which >= MAX_CPPC_VALUE_FILES) + return 0; + + len = acpi_cppc_read_file(cpu, acpi_cppc_value_files[which], + linebuf, sizeof(linebuf)); + if (len == 0) + return 0; + + value = strtoull(linebuf, &endp, 0); + + if (endp == linebuf || errno == ERANGE) + return 0; + + return value; +} diff --git a/tools/power/cpupower/lib/acpi_cppc.h b/tools/power/cpupower/lib/acpi_cppc.h new file mode 100644 index 000000000000..576291155224 --- /dev/null +++ b/tools/power/cpupower/lib/acpi_cppc.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef __ACPI_CPPC_H__ +#define __ACPI_CPPC_H__ + +enum acpi_cppc_value { + HIGHEST_PERF, + LOWEST_PERF, + NOMINAL_PERF, + LOWEST_NONLINEAR_PERF, + LOWEST_FREQ, + NOMINAL_FREQ, + REFERENCE_PERF, + WRAPAROUND_TIME, + MAX_CPPC_VALUE_FILES +}; + +extern unsigned long acpi_cppc_get_data(unsigned cpu, + enum acpi_cppc_value which); + +#endif /* _ACPI_CPPC_H */ -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 18/22] cpupower: add amd-pstate sysfs definition and access helper 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (16 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 17/22] cpupower: introduce acpi cppc library Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 19/22] cpupower: enable boost state support for amd-pstate module Huang Rui ` (3 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Introduce the marco definitions and access helper function for amd-pstate sysfs interfaces such as each performance goals and frequency levels in amd helper file. They will be used to read the sysfs attribute from amd-pstate cpufreq driver for cpupower utilities. Signed-off-by: Huang Rui --- tools/power/cpupower/utils/helpers/amd.c | 30 ++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/tools/power/cpupower/utils/helpers/amd.c b/tools/power/cpupower/utils/helpers/amd.c index 97f2c857048e..14c658daba4b 100644 --- a/tools/power/cpupower/utils/helpers/amd.c +++ b/tools/power/cpupower/utils/helpers/amd.c @@ -8,7 +8,10 @@ #include #include "helpers/helpers.h" +#include "cpufreq.h" +#include "acpi_cppc.h" +/* ACPI P-States Helper Functions for AMD Processors ***************/ #define MSR_AMD_PSTATE_STATUS 0xc0010063 #define MSR_AMD_PSTATE 0xc0010064 #define MSR_AMD_PSTATE_LIMIT 0xc0010061 @@ -146,4 +149,31 @@ int amd_pci_get_num_boost_states(int *active, int *states) pci_cleanup(pci_acc); return 0; } + +/* ACPI P-States Helper Functions for AMD Processors ***************/ + +/* AMD P-States Helper Functions ***************/ +enum amd_pstate_value { + AMD_PSTATE_HIGHEST_PERF, + AMD_PSTATE_MAX_FREQ, + AMD_PSTATE_LOWEST_NONLINEAR_FREQ, + MAX_AMD_PSTATE_VALUE_READ_FILES, +}; + +static const char *amd_pstate_value_files[MAX_AMD_PSTATE_VALUE_READ_FILES] = { + [AMD_PSTATE_HIGHEST_PERF] = "amd_pstate_highest_perf", + [AMD_PSTATE_MAX_FREQ] = "amd_pstate_max_freq", + [AMD_PSTATE_LOWEST_NONLINEAR_FREQ] = "amd_pstate_lowest_nonlinear_freq", +}; + +static unsigned long amd_pstate_get_data(unsigned int cpu, + enum amd_pstate_value value) +{ + return cpufreq_get_sysfs_value_from_table(cpu, + amd_pstate_value_files, + value, + MAX_AMD_PSTATE_VALUE_READ_FILES); +} + +/* AMD P-States Helper Functions ***************/ #endif /* defined(__i386__) || defined(__x86_64__) */ -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 19/22] cpupower: enable boost state support for amd-pstate module 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (17 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 18/22] cpupower: add amd-pstate sysfs definition and access helper Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 20/22] cpupower: move print_speed function into misc helper Huang Rui ` (2 subsequent siblings) 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui The legacy ACPI hardware P-States function has 3 P-States on ACPI table, the CPU frequency only can be switched between the 3 P-States. While the processor supports the boost state, it will have another boost state that the frequency can be higher than P0 state, and the state can be decoded by the function of decode_pstates() and read by amd_pci_get_num_boost_states(). However, the new AMD P-States function is different than legacy ACPI hardware P-State on AMD processors. That has a finer grain frequency range between the highest and lowest frequency. And boost frequency is actually the frequency which is mapped on highest performance ratio. The similiar previous P0 frequency is mapped on nominal performance ratio. If the highest performance on the processor is higher than nominal performance, then we think the current processor supports the boost state. And it uses amd_pstate_boost_init() to initialize boost for AMD P-States function. Signed-off-by: Huang Rui --- tools/power/cpupower/utils/helpers/amd.c | 18 ++++++++++++++++++ tools/power/cpupower/utils/helpers/helpers.h | 5 +++++ tools/power/cpupower/utils/helpers/misc.c | 2 ++ 3 files changed, 25 insertions(+) diff --git a/tools/power/cpupower/utils/helpers/amd.c b/tools/power/cpupower/utils/helpers/amd.c index 14c658daba4b..bde6065cabf4 100644 --- a/tools/power/cpupower/utils/helpers/amd.c +++ b/tools/power/cpupower/utils/helpers/amd.c @@ -175,5 +175,23 @@ static unsigned long amd_pstate_get_data(unsigned int cpu, MAX_AMD_PSTATE_VALUE_READ_FILES); } +void amd_pstate_boost_init(unsigned int cpu, int *support, int *active) +{ + unsigned long highest_perf, nominal_perf, cpuinfo_min, + cpuinfo_max, amd_pstate_max; + + highest_perf = amd_pstate_get_data(cpu, AMD_PSTATE_HIGHEST_PERF); + nominal_perf = acpi_cppc_get_data(cpu, NOMINAL_PERF); + + *support = highest_perf > nominal_perf ? 1 : 0; + if (!(*support)) + return; + + cpufreq_get_hardware_limits(cpu, &cpuinfo_min, &cpuinfo_max); + amd_pstate_max = amd_pstate_get_data(cpu, AMD_PSTATE_MAX_FREQ); + + *active = cpuinfo_max == amd_pstate_max ? 1 : 0; +} + /* AMD P-States Helper Functions ***************/ #endif /* defined(__i386__) || defined(__x86_64__) */ diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h index e03cc97297aa..c03925bea655 100644 --- a/tools/power/cpupower/utils/helpers/helpers.h +++ b/tools/power/cpupower/utils/helpers/helpers.h @@ -140,6 +140,8 @@ extern int cpufreq_has_boost_support(unsigned int cpu, int *support, /* AMD P-States stuff **************************/ extern bool cpupower_amd_pstate_enabled(void); +extern void amd_pstate_boost_init(unsigned int cpu, + int *support, int *active); /* AMD P-States stuff **************************/ @@ -177,6 +179,9 @@ static inline int cpufreq_has_boost_support(unsigned int cpu, int *support, static inline bool cpupower_amd_pstate_enabled(void) { return false; } +static void amd_pstate_boost_init(unsigned int cpu, + int *support, int *active) +{ return; } /* cpuid and cpuinfo helpers **************************/ diff --git a/tools/power/cpupower/utils/helpers/misc.c b/tools/power/cpupower/utils/helpers/misc.c index 0c483cdefcc2..e0d3145434d3 100644 --- a/tools/power/cpupower/utils/helpers/misc.c +++ b/tools/power/cpupower/utils/helpers/misc.c @@ -41,6 +41,8 @@ int cpufreq_has_boost_support(unsigned int cpu, int *support, int *active, if (ret) return ret; } + } else if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_PSTATE) { + amd_pstate_boost_init(cpu, support, active); } else if (cpupower_cpu_info.caps & CPUPOWER_CAP_INTEL_IDA) *support = *active = 1; return 0; -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 20/22] cpupower: move print_speed function into misc helper 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (18 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 19/22] cpupower: enable boost state support for amd-pstate module Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 21/22] cpupower: print amd-pstate information on cpupower Huang Rui 2021-11-30 12:36 ` [PATCH v5 22/22] Documentation: amd-pstate: add amd-pstate driver introduction Huang Rui 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui The print_speed can be as a common function, and expose it into misc helper header. Then it can be used on other helper files as well. Signed-off-by: Huang Rui --- tools/power/cpupower/utils/cpufreq-info.c | 59 ++++---------------- tools/power/cpupower/utils/helpers/helpers.h | 1 + tools/power/cpupower/utils/helpers/misc.c | 42 ++++++++++++++ 3 files changed, 54 insertions(+), 48 deletions(-) diff --git a/tools/power/cpupower/utils/cpufreq-info.c b/tools/power/cpupower/utils/cpufreq-info.c index f9895e31ff5a..b429454bf3ae 100644 --- a/tools/power/cpupower/utils/cpufreq-info.c +++ b/tools/power/cpupower/utils/cpufreq-info.c @@ -84,43 +84,6 @@ static void proc_cpufreq_output(void) } static int no_rounding; -static void print_speed(unsigned long speed) -{ - unsigned long tmp; - - if (no_rounding) { - if (speed > 1000000) - printf("%u.%06u GHz", ((unsigned int) speed/1000000), - ((unsigned int) speed%1000000)); - else if (speed > 1000) - printf("%u.%03u MHz", ((unsigned int) speed/1000), - (unsigned int) (speed%1000)); - else - printf("%lu kHz", speed); - } else { - if (speed > 1000000) { - tmp = speed%10000; - if (tmp >= 5000) - speed += 10000; - printf("%u.%02u GHz", ((unsigned int) speed/1000000), - ((unsigned int) (speed%1000000)/10000)); - } else if (speed > 100000) { - tmp = speed%1000; - if (tmp >= 500) - speed += 1000; - printf("%u MHz", ((unsigned int) speed/1000)); - } else if (speed > 1000) { - tmp = speed%100; - if (tmp >= 50) - speed += 100; - printf("%u.%01u MHz", ((unsigned int) speed/1000), - ((unsigned int) (speed%1000)/100)); - } - } - - return; -} - static void print_duration(unsigned long duration) { unsigned long tmp; @@ -254,11 +217,11 @@ static int get_boost_mode(unsigned int cpu) if (freqs) { printf(_(" boost frequency steps: ")); while (freqs->next) { - print_speed(freqs->frequency); + print_speed(freqs->frequency, no_rounding); printf(", "); freqs = freqs->next; } - print_speed(freqs->frequency); + print_speed(freqs->frequency, no_rounding); printf("\n"); cpufreq_put_available_frequencies(freqs); } @@ -277,7 +240,7 @@ static int get_freq_kernel(unsigned int cpu, unsigned int human) return -EINVAL; } if (human) { - print_speed(freq); + print_speed(freq, no_rounding); } else printf("%lu", freq); printf(_(" (asserted by call to kernel)\n")); @@ -296,7 +259,7 @@ static int get_freq_hardware(unsigned int cpu, unsigned int human) return -EINVAL; } if (human) { - print_speed(freq); + print_speed(freq, no_rounding); } else printf("%lu", freq); printf(_(" (asserted by call to hardware)\n")); @@ -316,9 +279,9 @@ static int get_hardware_limits(unsigned int cpu, unsigned int human) if (human) { printf(_(" hardware limits: ")); - print_speed(min); + print_speed(min, no_rounding); printf(" - "); - print_speed(max); + print_speed(max, no_rounding); printf("\n"); } else { printf("%lu %lu\n", min, max); @@ -350,9 +313,9 @@ static int get_policy(unsigned int cpu) return -EINVAL; } printf(_(" current policy: frequency should be within ")); - print_speed(policy->min); + print_speed(policy->min, no_rounding); printf(_(" and ")); - print_speed(policy->max); + print_speed(policy->max, no_rounding); printf(".\n "); printf(_("The governor \"%s\" may decide which speed to use\n" @@ -436,7 +399,7 @@ static int get_freq_stats(unsigned int cpu, unsigned int human) struct cpufreq_stats *stats = cpufreq_get_stats(cpu, &total_time); while (stats) { if (human) { - print_speed(stats->frequency); + print_speed(stats->frequency, no_rounding); printf(":%.2f%%", (100.0 * stats->time_in_state) / total_time); } else @@ -486,11 +449,11 @@ static void debug_output_one(unsigned int cpu) if (freqs) { printf(_(" available frequency steps: ")); while (freqs->next) { - print_speed(freqs->frequency); + print_speed(freqs->frequency, no_rounding); printf(", "); freqs = freqs->next; } - print_speed(freqs->frequency); + print_speed(freqs->frequency, no_rounding); printf("\n"); cpufreq_put_available_frequencies(freqs); } diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h index c03925bea655..fbbfa6047c83 100644 --- a/tools/power/cpupower/utils/helpers/helpers.h +++ b/tools/power/cpupower/utils/helpers/helpers.h @@ -200,5 +200,6 @@ extern struct bitmask *offline_cpus; void get_cpustate(void); void print_online_cpus(void); void print_offline_cpus(void); +void print_speed(unsigned long speed, int no_rounding); #endif /* __CPUPOWERUTILS_HELPERS__ */ diff --git a/tools/power/cpupower/utils/helpers/misc.c b/tools/power/cpupower/utils/helpers/misc.c index e0d3145434d3..d693c96cd09c 100644 --- a/tools/power/cpupower/utils/helpers/misc.c +++ b/tools/power/cpupower/utils/helpers/misc.c @@ -164,3 +164,45 @@ void print_offline_cpus(void) printf(_("cpupower set operation was not performed on them\n")); } } + +/* + * print_speed + * + * Print the exact CPU frequency with appropriate unit + */ +void print_speed(unsigned long speed, int no_rounding) +{ + unsigned long tmp; + + if (no_rounding) { + if (speed > 1000000) + printf("%u.%06u GHz", ((unsigned int) speed/1000000), + ((unsigned int) speed%1000000)); + else if (speed > 1000) + printf("%u.%03u MHz", ((unsigned int) speed/1000), + (unsigned int) (speed%1000)); + else + printf("%lu kHz", speed); + } else { + if (speed > 1000000) { + tmp = speed%10000; + if (tmp >= 5000) + speed += 10000; + printf("%u.%02u GHz", ((unsigned int) speed/1000000), + ((unsigned int) (speed%1000000)/10000)); + } else if (speed > 100000) { + tmp = speed%1000; + if (tmp >= 500) + speed += 1000; + printf("%u MHz", ((unsigned int) speed/1000)); + } else if (speed > 1000) { + tmp = speed%100; + if (tmp >= 50) + speed += 100; + printf("%u.%01u MHz", ((unsigned int) speed/1000), + ((unsigned int) (speed%1000)/100)); + } + } + + return; +} -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 21/22] cpupower: print amd-pstate information on cpupower 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (19 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 20/22] cpupower: move print_speed function into misc helper Huang Rui @ 2021-11-30 12:36 ` Huang Rui 2021-11-30 12:36 ` [PATCH v5 22/22] Documentation: amd-pstate: add amd-pstate driver introduction Huang Rui 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui amd-pstate kernel module is using the fine grain frequency instead of acpi hardware pstate. So the performance and frequency values should be printed in frequency-info. Signed-off-by: Huang Rui --- tools/power/cpupower/utils/cpufreq-info.c | 9 ++++--- tools/power/cpupower/utils/helpers/amd.c | 28 ++++++++++++++++++++ tools/power/cpupower/utils/helpers/helpers.h | 5 ++++ 3 files changed, 39 insertions(+), 3 deletions(-) diff --git a/tools/power/cpupower/utils/cpufreq-info.c b/tools/power/cpupower/utils/cpufreq-info.c index b429454bf3ae..f828f3c35a6f 100644 --- a/tools/power/cpupower/utils/cpufreq-info.c +++ b/tools/power/cpupower/utils/cpufreq-info.c @@ -146,9 +146,12 @@ static int get_boost_mode_x86(unsigned int cpu) printf(_(" Supported: %s\n"), support ? _("yes") : _("no")); printf(_(" Active: %s\n"), active ? _("yes") : _("no")); - if ((cpupower_cpu_info.vendor == X86_VENDOR_AMD && - cpupower_cpu_info.family >= 0x10) || - cpupower_cpu_info.vendor == X86_VENDOR_HYGON) { + if (cpupower_cpu_info.vendor == X86_VENDOR_AMD && + cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_PSTATE) { + amd_pstate_show_perf_and_freq(cpu, no_rounding); + } else if ((cpupower_cpu_info.vendor == X86_VENDOR_AMD && + cpupower_cpu_info.family >= 0x10) || + cpupower_cpu_info.vendor == X86_VENDOR_HYGON) { ret = decode_pstates(cpu, b_states, pstates, &pstate_no); if (ret) return ret; diff --git a/tools/power/cpupower/utils/helpers/amd.c b/tools/power/cpupower/utils/helpers/amd.c index bde6065cabf4..a1115891d76d 100644 --- a/tools/power/cpupower/utils/helpers/amd.c +++ b/tools/power/cpupower/utils/helpers/amd.c @@ -193,5 +193,33 @@ void amd_pstate_boost_init(unsigned int cpu, int *support, int *active) *active = cpuinfo_max == amd_pstate_max ? 1 : 0; } +void amd_pstate_show_perf_and_freq(unsigned int cpu, int no_rounding) +{ + printf(_(" AMD PSTATE Highest Performance: %lu. Maximum Frequency: "), + amd_pstate_get_data(cpu, AMD_PSTATE_HIGHEST_PERF)); + /* If boost isn't active, the cpuinfo_max doesn't indicate real max + * frequency. So we read it back from amd-pstate sysfs entry. + */ + print_speed(amd_pstate_get_data(cpu, AMD_PSTATE_MAX_FREQ), no_rounding); + printf(".\n"); + + printf(_(" AMD PSTATE Nominal Performance: %lu. Nominal Frequency: "), + acpi_cppc_get_data(cpu, NOMINAL_PERF)); + print_speed(acpi_cppc_get_data(cpu, NOMINAL_FREQ) * 1000, + no_rounding); + printf(".\n"); + + printf(_(" AMD PSTATE Lowest Non-linear Performance: %lu. Lowest Non-linear Frequency: "), + acpi_cppc_get_data(cpu, LOWEST_NONLINEAR_PERF)); + print_speed(amd_pstate_get_data(cpu, AMD_PSTATE_LOWEST_NONLINEAR_FREQ), + no_rounding); + printf(".\n"); + + printf(_(" AMD PSTATE Lowest Performance: %lu. Lowest Frequency: "), + acpi_cppc_get_data(cpu, LOWEST_PERF)); + print_speed(acpi_cppc_get_data(cpu, LOWEST_FREQ) * 1000, no_rounding); + printf(".\n"); +} + /* AMD P-States Helper Functions ***************/ #endif /* defined(__i386__) || defined(__x86_64__) */ diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h index fbbfa6047c83..5f6862502dbf 100644 --- a/tools/power/cpupower/utils/helpers/helpers.h +++ b/tools/power/cpupower/utils/helpers/helpers.h @@ -142,6 +142,8 @@ extern int cpufreq_has_boost_support(unsigned int cpu, int *support, extern bool cpupower_amd_pstate_enabled(void); extern void amd_pstate_boost_init(unsigned int cpu, int *support, int *active); +extern void amd_pstate_show_perf_and_freq(unsigned int cpu, + int no_rounding); /* AMD P-States stuff **************************/ @@ -182,6 +184,9 @@ static inline bool cpupower_amd_pstate_enabled(void) static void amd_pstate_boost_init(unsigned int cpu, int *support, int *active) { return; } +static inline void amd_pstate_show_perf_and_freq(unsigned int cpu, + int no_rounding) +{ return; } /* cpuid and cpuinfo helpers **************************/ -- 2.25.1 ^ permalink raw reply [flat|nested] 23+ messages in thread * [PATCH v5 22/22] Documentation: amd-pstate: add amd-pstate driver introduction 2021-11-30 12:36 [PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism Huang Rui ` (20 preceding siblings ...) 2021-11-30 12:36 ` [PATCH v5 21/22] cpupower: print amd-pstate information on cpupower Huang Rui @ 2021-11-30 12:36 ` Huang Rui 21 siblings, 0 replies; 23+ messages in thread From: Huang Rui @ 2021-11-30 12:36 UTC (permalink / raw) To: Rafael J . Wysocki, Viresh Kumar, Shuah Khan, Borislav Petkov, Peter Zijlstra, Ingo Molnar, Giovanni Gherdovich, linux-pm Cc: Deepak Sharma, Alex Deucher, Mario Limonciello, Steven Noonan, Nathan Fontenot, Jinzhou Su, Xiaojian Du, linux-kernel, x86, Huang Rui Introduce the amd-pstate driver design and implementation. Signed-off-by: Huang Rui --- Documentation/admin-guide/acpi/cppc_sysfs.rst | 2 + Documentation/admin-guide/pm/amd-pstate.rst | 383 ++++++++++++++++++ .../admin-guide/pm/working-state.rst | 1 + 3 files changed, 386 insertions(+) create mode 100644 Documentation/admin-guide/pm/amd-pstate.rst diff --git a/Documentation/admin-guide/acpi/cppc_sysfs.rst b/Documentation/admin-guide/acpi/cppc_sysfs.rst index fccf22114e85..e53d76365aa7 100644 --- a/Documentation/admin-guide/acpi/cppc_sysfs.rst +++ b/Documentation/admin-guide/acpi/cppc_sysfs.rst @@ -4,6 +4,8 @@ Collaborative Processor Performance Control (CPPC) ================================================== +.. _cppc_sysfs: + CPPC ==== diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst new file mode 100644 index 000000000000..6bafb9354ba0 --- /dev/null +++ b/Documentation/admin-guide/pm/amd-pstate.rst @@ -0,0 +1,383 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. include:: + +=============================================== +``amd-pstate`` CPU Performance Scaling Driver +=============================================== + +:Copyright: |copy| 2021 Advanced Micro Devices, Inc. + +:Author: Huang Rui + + +Introduction +=================== + +``amd-pstate`` is the AMD CPU performance scaling driver that introduces a +new CPU frequency control mechanism on modern AMD APU and CPU series in +Linux kernel. The new mechanism is based on Collaborative Processor +Performance Control (CPPC) which provides finer grain frequency management +than legacy ACPI hardware P-States. Current AMD CPU/APU platforms are using +the ACPI P-states driver to manage CPU frequency and clocks with switching +only in 3 P-states. CPPC replaces the ACPI P-states controls, allows a +flexible, low-latency interface for the Linux kernel to directly +communicate the performance hints to hardware. + +``amd-pstate`` leverages the Linux kernel governors such as ``schedutil``, +``ondemand``, etc. to manage the performance hints which are provided by +CPPC hardware functionality that internally follows the hardware +specification (for details refer to AMD64 Architecture Programmer's Manual +Volume 2: System Programming [1]_). Currently ``amd-pstate`` supports basic +frequency control function according to kernel governors on some of the +Zen2 and Zen3 processors, and we will implement more AMD specific functions +in future after we verify them on the hardware and SBIOS. + + +AMD CPPC Overview +======================= + +Collaborative Processor Performance Control (CPPC) interface enumerates a +continuous, abstract, and unit-less performance value in a scale that is +not tied to a specific performance state / frequency. This is an ACPI +standard [2]_ which software can specify application performance goals and +hints as a relative target to the infrastructure limits. AMD processors +provides the low latency register model (MSR) instead of AML code +interpreter for performance adjustments. ``amd-pstate`` will initialize a +``struct cpufreq_driver`` instance ``amd_pstate_driver`` with the callbacks +to manage each performance update behavior. :: + + Highest Perf ------>+-----------------------+ +-----------------------+ + | | | | + | | | | + | | Max Perf ---->| | + | | | | + | | | | + Nominal Perf ------>+-----------------------+ +-----------------------+ + | | | | + | | | | + | | | | + | | | | + | | | | + | | | | + | | Desired Perf ---->| | + | | | | + | | | | + | | | | + | | | | + | | | | + | | | | + | | | | + | | | | + | | | | + Lowest non- | | | | + linear perf ------>+-----------------------+ +-----------------------+ + | | | | + | | Lowest perf ---->| | + | | | | + Lowest perf ------>+-----------------------+ +-----------------------+ + | | | | + | | | | + | | | | + 0 ------>+-----------------------+ +-----------------------+ + + AMD P-States Performance Scale + + +.. _perf_cap: + +AMD CPPC Performance Capability +-------------------------------- + +Highest Performance (RO) +......................... + +It is the absolute maximum performance an individual processor may reach, +assuming ideal conditions. This performance level may not be sustainable +for long durations and may only be achievable if other platform components +are in a specific state; for example, it may require other processors be in +an idle state. This would be equivalent to the highest frequencies +supported by the processor. + +Nominal (Guaranteed) Performance (RO) +...................................... + +It is the maximum sustained performance level of the processor, assuming +ideal operating conditions. In absence of an external constraint (power, +thermal, etc.) this is the performance level the processor is expected to +be able to maintain continuously. All cores/processors are expected to be +able to sustain their nominal performance state simultaneously. + +Lowest non-linear Performance (RO) +................................... + +It is the lowest performance level at which nonlinear power savings are +achieved, for example, due to the combined effects of voltage and frequency +scaling. Above this threshold, lower performance levels should be generally +more energy efficient than higher performance levels. This register +effectively conveys the most efficient performance level to ``amd-pstate``. + +Lowest Performance (RO) +........................ + +It is the absolute lowest performance level of the processor. Selecting a +performance level lower than the lowest nonlinear performance level may +cause an efficiency penalty but should reduce the instantaneous power +consumption of the processor. + +AMD CPPC Performance Control +------------------------------ + +``amd-pstate`` passes performance goals through these registers. The +register drives the behavior of the desired performance target. + +Minimum requested performance (RW) +................................... + +``amd-pstate`` specifies the minimum allowed performance level. + +Maximum requested performance (RW) +................................... + +``amd-pstate`` specifies a limit the maximum performance that is expected +to be supplied by the hardware. + +Desired performance target (RW) +................................... + +``amd-pstate`` specifies a desired target in the CPPC performance scale as +a relative number. This can be expressed as percentage of nominal +performance (infrastructure max). Below the nominal sustained performance +level, desired performance expresses the average performance level of the +processor subject to hardware. Above the nominal performance level, +processor must provide at least nominal performance requested and go higher +if current operating conditions allow. + +Energy Performance Preference (EPP) (RW) +......................................... + +Provides a hint to the hardware if software wants to bias toward performance +(0x0) or energy efficiency (0xff). + + +Key Governors Support +======================= + +``amd-pstate`` can be used with all the (generic) scaling governors listed +by the ``scaling_available_governors`` policy attribute in ``sysfs``. Then, +it is responsible for the configuration of policy objects corresponding to +CPUs and provides the ``CPUFreq`` core (and the scaling governors attached +to the policy objects) with accurate information on the maximum and minimum +operating frequencies supported by the hardware. Users can check the +``scaling_cur_freq`` information comes from the ``CPUFreq`` core. + +``amd-pstate`` mainly supports ``schedutil`` and ``ondemand`` for dynamic +frequency control. It is to fine tune the processor configuration on +``amd-pstate`` to the ``schedutil`` with CPU CFS scheduler. ``amd-pstate`` +registers adjust_perf callback to implement the CPPC similar performance +update behavior. It is initialized by ``sugov_start`` and then populate the +CPU's update_util_data pointer to assign ``sugov_update_single_perf`` as +the utilization update callback function in CPU scheduler. CPU scheduler +will call ``cpufreq_update_util`` and assign the target performance +according to the ``struct sugov_cpu`` that utilization update belongs to. +Then ``amd-pstate`` updates the desired performance according to the CPU +scheduler assigned. + + +Processor Support +======================= + +The ``amd-pstate`` initialization will fail if the _CPC in ACPI SBIOS is +not existed at the detected processor, and it uses ``acpi_cpc_valid`` to +check the _CPC existence. All Zen based processors support legacy ACPI +hardware P-States function, so while the ``amd-pstate`` fails to be +initialized, the kernel will fall back to initialize ``acpi-cpufreq`` +driver. + +There are two types of hardware implementations for ``amd-pstate``: one is +`Full MSR Support `_ and another is `Shared Memory Support +`_. It can use :c:macro:`X86_FEATURE_CPPC` feature flag (for +details refer to Processor Programming Reference (PPR) for AMD Family +19h Model 51h, Revision A1 Processors [3]_) to indicate the different +types. ``amd-pstate`` is to register different ``static_call`` instances +for different hardware implementations. + +Currently, some of Zen2 and Zen3 processors support ``amd-pstate``. In the +future, it will be supported on more and more AMD processors. + +Full MSR Support +----------------- + +Some new Zen3 processors such as Cezanne provide the MSR registers directly +while the :c:macro:`X86_FEATURE_CPPC` CPU feature flag is set. +``amd-pstate`` can handle the MSR register to implement the fast switch +function in ``CPUFreq`` that can shrink latency of frequency control on the +interrupt context. The functions with ``pstate_xxx`` prefix represent the +operations of MSR registers. + +Shared Memory Support +---------------------- + +If :c:macro:`X86_FEATURE_CPPC` CPU feature flag is not set, that means the +processor supports shared memory solution. In this case, ``amd-pstate`` +uses the ``cppc_acpi`` helper methods to implement the callback functions +that defined on ``static_call``. The functions with ``cppc_xxx`` prefix +represent the operations of acpi cppc helpers for shared memory solution. + + +AMD P-States and ACPI hardware P-States always can be supported in one +processor. But AMD P-States has the higher priority and if it is enabled +with :c:macro:`MSR_AMD_CPPC_ENABLE` or ``cppc_set_enable``, it will respond +to the request from AMD P-States. + + +User Space Interface in ``sysfs`` +================================== + +``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to +control its functionality at the system level. They located in the +``/sys/devices/system/cpu/cpufreq/policyX/`` directory and affect all CPUs. :: + + root@hr-test1:/home/ray# ls /sys/devices/system/cpu/cpufreq/policy0/*amd* + /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_highest_perf + /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_lowest_nonlinear_freq + /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_max_freq + + +``amd_pstate_highest_perf / amd_pstate_max_freq`` + +Maximum CPPC performance and CPU frequency that the driver is allowed to +set in percent of the maximum supported CPPC performance level (the highest +performance supported in `AMD CPPC Performance Capability `_). +In some of ASICs, the highest CPPC performance is not the one in the _CPC +table, so we need to expose it to sysfs. If boost is not active but +supported, this maximum frequency will be larger than the one in +``cpuinfo``. +This attribute is read-only. + +``amd_pstate_lowest_nonlinear_freq`` + +The lowest non-linear CPPC CPU frequency that the driver is allowed to set +in percent of the maximum supported CPPC performance level (Please see the +lowest non-linear performance in `AMD CPPC Performance Capability +`_). +This attribute is read-only. + +For other performance and frequency values, we can read them back from +``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`. + + +``amd-pstate`` vs ``acpi-cpufreq`` +====================================== + +On majority of AMD platforms supported by ``acpi-cpufreq``, the ACPI tables +provided by the platform firmware used for CPU performance scaling, but +only provides 3 P-states on AMD processors. +However, on modern AMD APU and CPU series, it provides the collaborative +processor performance control according to ACPI protocol and customize this +for AMD platforms. That is fine-grain and continuous frequency range +instead of the legacy hardware P-states. ``amd-pstate`` is the kernel +module which supports the new AMD P-States mechanism on most of future AMD +platforms. The AMD P-States mechanism will be the more performance and energy +efficiency frequency management method on AMD processors. + +Kernel Module Options for ``amd-pstate`` +========================================= + +``shared_mem`` +Use a module param (shared_mem) to enable related processors manually with +**amd_pstate.shared_mem=1**. +Due to the performance issue on the processors with `Shared Memory Support +`_, so we disable it for the moment and will enable this by default +once we address performance issue on this solution. + +The way to check whether current processor is `Full MSR Support `_ +or `Shared Memory Support `_ : :: + + ray@hr-test1:~$ lscpu | grep cppc + Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm + +If CPU Flags have cppc, then this processor supports `Full MSR Support +`_. Otherwise it supports `Shared Memory Support `_. + + +``cpupower`` tool support for ``amd-pstate`` +=============================================== + +``amd-pstate`` is supported on ``cpupower`` tool that can be used to dump the frequency +information. And it is in progress to support more and more operations for new +``amd-pstate`` module with this tool. :: + + root@hr-test1:/home/ray# cpupower frequency-info + analyzing CPU 0: + driver: amd-pstate + CPUs which run at the same hardware frequency: 0 + CPUs which need to have their frequency coordinated by software: 0 + maximum transition latency: 131 us + hardware limits: 400 MHz - 4.68 GHz + available cpufreq governors: ondemand conservative powersave userspace performance schedutil + current policy: frequency should be within 400 MHz and 4.68 GHz. + The governor "schedutil" may decide which speed to use + within this range. + current CPU frequency: Unable to call hardware + current CPU frequency: 4.02 GHz (asserted by call to kernel) + boost state support: + Supported: yes + Active: yes + AMD PSTATE Highest Performance: 166. Maximum Frequency: 4.68 GHz. + AMD PSTATE Nominal Performance: 117. Nominal Frequency: 3.30 GHz. + AMD PSTATE Lowest Non-linear Performance: 39. Lowest Non-linear Frequency: 1.10 GHz. + AMD PSTATE Lowest Performance: 15. Lowest Frequency: 400 MHz. + + +Diagnostics and Tuning +======================= + +Trace Events +-------------- + +There are two static trace events that can be used for ``amd-pstate`` +diagnostics. One of them is the cpu_frequency trace event generally used +by ``CPUFreq``, and the other one is the ``amd_pstate_perf`` trace event +specific to ``amd-pstate``. The following sequence of shell commands can +be used to enable them and see their output (if the kernel is generally +configured to support event tracing). :: + + root@hr-test1:/home/ray# cd /sys/kernel/tracing/ + root@hr-test1:/sys/kernel/tracing# echo 1 > events/amd_cpu/enable + root@hr-test1:/sys/kernel/tracing# cat trace + # tracer: nop + # + # entries-in-buffer/entries-written: 47827/42233061 #P:2 + # + # _-----=> irqs-off + # / _----=> need-resched + # | / _---=> hardirq/softirq + # || / _--=> preempt-depth + # ||| / delay + # TASK-PID CPU# |||| TIMESTAMP FUNCTION + # | | | |||| | | + -0 [015] dN... 4995.979886: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=15 changed=false fast_switch=true + -0 [007] d.h.. 4995.979893: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=7 changed=false fast_switch=true + cat-2161 [000] d.... 4995.980841: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=0 changed=false fast_switch=true + sshd-2125 [004] d.s.. 4995.980968: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=4 changed=false fast_switch=true + -0 [007] d.s.. 4995.980968: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=7 changed=false fast_switch=true + -0 [003] d.s.. 4995.980971: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=3 changed=false fast_switch=true + -0 [011] d.s.. 4995.980996: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=11 changed=false fast_switch=true + +The cpu_frequency trace event will be triggered either by the ``schedutil`` scaling +governor (for the policies it is attached to), or by the ``CPUFreq`` core (for the +policies with other scaling governors). + + +Reference +=========== + +.. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming, + https://www.amd.com/system/files/TechDocs/24593.pdf + +.. [2] Advanced Configuration and Power Interface Specification, + https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf + +.. [3] Processor Programming Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors + https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip + diff --git a/Documentation/admin-guide/pm/working-state.rst b/Documentation/admin-guide/pm/working-state.rst index f40994c422dc..5d2757e2de65 100644 --- a/Documentation/admin-guide/pm/working-state.rst +++ b/Documentation/admin-guide/pm/working-state.rst @@ -11,6 +11,7 @@ Working-State Power Management intel_idle cpufreq intel_pstate + amd-pstate cpufreq_drivers intel_epb intel-speed-select -- 2.25.1 --