{"id":136,"date":"2026-01-20T00:00:00","date_gmt":"2026-01-19T23:00:00","guid":{"rendered":"https:\/\/helloblog.io\/fi\/wp-bench-wordpress-ai-benchmark\/"},"modified":"2026-01-20T00:00:00","modified_gmt":"2026-01-19T23:00:00","slug":"wp-bench-wordpress-ai-benchmark","status":"publish","type":"post","link":"https:\/\/helloblog.io\/fi\/wp-bench-wordpress-ai-benchmark\/","title":{"rendered":"WP-Bench: miten hyvin AI oikeasti osaa WordPressi\u00e4 (ja miten testaat sen itse)"},"content":{"rendered":"\n<p>WordPress-kehityksess\u00e4 AI-avustaja on parhaimmillaan silloin, kun se ymm\u00e4rt\u00e4\u00e4 muutakin kuin PHP-syntaksin: miten hookit ketjuuntuvat, miten tietokantakyselyt tehd\u00e4\u00e4n oikein, mit\u00e4 WordPress Coding Standards k\u00e4yt\u00e4nn\u00f6ss\u00e4 tarkoittaa ja miss\u00e4 kohtaa tietoturva menee rikki (nonce, capability checks, escaping\/sanitization). T\u00e4st\u00e4 syntyy my\u00f6s ongelma: mallien vertailu \u201cyleisill\u00e4\u201d ohjelmointitesteill\u00e4 ei kerro kovin paljon WordPress-arkeen liittyv\u00e4st\u00e4 osaamisesta.<\/p>\n\n\n\n<p>T\u00e4t\u00e4 aukkoa paikkaamaan WordPress-projekti on tuonut esiin <strong>WP-Benchin<\/strong> \u2013 virallisen WordPress AI -benchmarkin. Sen idea on yksinkertainen: mitataan nimenomaan sit\u00e4, miten hyvin kielimallit suoriutuvat WordPress-kehityksen osa-alueista, sek\u00e4 teoriassa ett\u00e4 k\u00e4yt\u00e4nn\u00f6n koodissa.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Mik\u00e4 WP-Bench on (ja mit\u00e4 se mittaa)?<\/h2>\n\n\n\n<p>WP-Bench on avoimen l\u00e4hdekoodin benchmark (vertailutesti), joka arvioi kielimallien WordPress-osaamista kahdessa ulottuvuudessa:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li><strong>Knowledge<\/strong>: monivalintakysymyksi\u00e4 WordPress-konsepteista, core-API:sta, hookeista, tietoturvak\u00e4yt\u00e4nn\u00f6ist\u00e4 ja koodausstandardeista. Mukana painotus my\u00f6s uudempiin lis\u00e4yksiin kuten <strong>Abilities API<\/strong> ja <strong>Interactivity API<\/strong> (ensimm\u00e4isell\u00e4 maininnalla: WordPressin uudemmat rajapinnat\/arkkitehtuuripalikat, joiden osaaminen ei v\u00e4ltt\u00e4m\u00e4tt\u00e4 ole ehtinyt mallien koulutusdataan).<\/li>\n\n\n<li><strong>Execution<\/strong>: koodin generointiteht\u00e4vi\u00e4, jotka arvioidaan ajamalla syntynyt koodi oikeassa WordPress-ajoymp\u00e4rist\u00f6ss\u00e4. Arviointi sis\u00e4lt\u00e4\u00e4 staattista analyysi\u00e4 ja ajonaikaisia assertioita (testiv\u00e4itt\u00e4mi\u00e4).<\/li>\n\n<\/ul>\n\n\n\n<p>Oleellinen twist on se, ett\u00e4 <strong>WordPress itse toimii \u201ctuomarina\u201d<\/strong>: generoitua koodia ei vain verrata tekstiin, vaan se ajetaan eristetyss\u00e4 ymp\u00e4rist\u00f6ss\u00e4 ja katsotaan, toimiiko se, noudattaako se standardeja ja onko siin\u00e4 ilmeisi\u00e4 turvallisuusongelmia.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Miksi t\u00e4m\u00e4 on k\u00e4yt\u00e4nn\u00f6ss\u00e4 t\u00e4rke\u00e4\u00e4 WordPress-kehitt\u00e4j\u00e4lle?<\/h2>\n\n\n\n<p>WordPress py\u00f6ritt\u00e4\u00e4 isoa osaa webist\u00e4, mutta AI-mallit arvioidaan usein teht\u00e4vill\u00e4, joissa WordPressin omat reunaehdot puuttuvat. WP-Benchin arvo tulee siit\u00e4, ett\u00e4 se tuo vertailuun sen, mik\u00e4 WordPressiss\u00e4 oikeasti on tyypillisesti haastavaa: API:t, plugin-arkkitehtuuri, hook-pohjainen ohjelmointimalli ja tietoturvan perustekniikat.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li><strong>Ty\u00f6kalun valinta<\/strong>: jos rakennat AI-ominaisuuksia pluginin sis\u00e4\u00e4n tai k\u00e4yt\u00e4t koodiavustajaa p\u00e4ivitt\u00e4in, haluat mallin, joka tuottaa WordPressiin sopivaa koodia eik\u00e4 vain \u201cmelkein toimivaa PHP:t\u00e4\u201d.<\/li>\n\n\n<li><strong>Mallien kehityksen suuntaaminen<\/strong>: tavoite on, ett\u00e4 WP-Benchist\u00e4 muodostuu standardi, jonka AI-laboratoriot ajavat my\u00f6s ennen julkaisua. Kun WordPress-tulokset ovat n\u00e4kyviss\u00e4, niist\u00e4 tulee signaali, johon kannattaa optimoida.<\/li>\n\n\n<li><strong>Avoin leaderboard<\/strong>: ty\u00f6n alla on julkinen tulostaulukko, joka tekee mallien suoriutumisesta l\u00e4pin\u00e4kyv\u00e4mp\u00e4\u00e4 ja auttaa kehitt\u00e4ji\u00e4 vertailemaan vaihtoehtoja.<\/li>\n\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Miten WP-Benchin arviointi toimii?<\/h2>\n\n\n\n<p>Execution-puolella WP-Benchin \u201charness\u201d (ajokehys) k\u00e4y l\u00e4pi melko suoran putken, jossa mallin tuottama koodi vied\u00e4\u00e4n WordPress-ajoon:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n\n<li>Ajokehys l\u00e4hett\u00e4\u00e4 mallille promptin, jossa pyydet\u00e4\u00e4n WordPress-koodia tiettyyn teht\u00e4v\u00e4\u00e4n.<\/li>\n\n\n<li>Generoitu koodi toimitetaan WordPress-runtimeen <strong>WP-CLI:n<\/strong> kautta (WP-CLI = WordPressin komentorivity\u00f6kalu).<\/li>\n\n\n<li>Runtime tekee staattista analyysi\u00e4: syntaksi, koodausstandardit ja tietoturvapoikkeamat.<\/li>\n\n\n<li>Koodi ajetaan hiekkalaatikossa, jossa testit\/assertiot tarkistavat toiminnallisuuden.<\/li>\n\n\n<li>Tulokset palautuvat JSON-muodossa pisteiden ja yksityiskohtaisen palautteen kanssa.<\/li>\n\n<\/ol>\n\n\n\n<div class=\"wp-block-group callout callout-info is-style-info is-layout-flow wp-block-group-is-layout-flow\" style=\"border-width:1px;border-radius:8px;padding-top:1rem;padding-right:1.5rem;padding-bottom:1rem;padding-left:1.5rem\">\n\n<h4 class=\"wp-block-heading callout-title\">Mit\u00e4 t\u00e4m\u00e4 tarkoittaa arjessa?<\/h4>\n\n\n<p>Benchmark ei palkitse pelkk\u00e4\u00e4 \u201coikean n\u00e4k\u00f6ist\u00e4\u201d koodia. Se pyrkii mittaamaan sit\u00e4, tuottaako malli toimivaa, WordPress-ymp\u00e4rist\u00f6\u00f6n istuvaa ja turvallisilta perusratkaisuiltaan j\u00e4rkev\u00e4\u00e4 koodia.<\/p>\n\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Pikak\u00e4ynnistys: benchmark ajoon omalla koneella<\/h2>\n\n\n\n<p>Alla on tiivistetty polku, jolla saat WP-Benchin py\u00f6rim\u00e4\u00e4n. Kokonaisuus koostuu Python-puolen ajokehyksest\u00e4 ja erillisest\u00e4 WordPress-runtime-osasta (joka hoitaa arvioinnin).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Asenna ajokehys (Python)<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>python3 -m venv .venv &amp;&amp; source .venv\/bin\/activate\npip install -e .\/python\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#B392F0\">python3<\/span><span style=\"color:#79B8FF\"> -m<\/span><span style=\"color:#9ECBFF\"> venv<\/span><span style=\"color:#9ECBFF\"> .venv<\/span><span style=\"color:#E1E4E8\"> &#x26;&#x26; <\/span><span style=\"color:#79B8FF\">source<\/span><span style=\"color:#9ECBFF\"> .venv\/bin\/activate<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">pip<\/span><span style=\"color:#9ECBFF\"> install<\/span><span style=\"color:#79B8FF\"> -e<\/span><span style=\"color:#9ECBFF\"> .\/python<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">2) Lis\u00e4\u00e4 API-avaimet .env-tiedostoon<\/h3>\n\n\n\n<p>WP-Bench tukee useita mallitarjoajia. K\u00e4yt\u00e4nn\u00f6ss\u00e4 lis\u00e4\u00e4t projektin juureen <code>.env<\/code>-tiedoston ja viet sinne avaimet niille palveluille, joita aiot k\u00e4ytt\u00e4\u00e4:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>OPENAI_API_KEY=sk-...\nANTHROPIC_API_KEY=sk-ant-...\nGOOGLE_API_KEY=...\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#E1E4E8\">OPENAI_API_KEY<\/span><span style=\"color:#F97583\">=<\/span><span style=\"color:#9ECBFF\">sk-...<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">ANTHROPIC_API_KEY<\/span><span style=\"color:#F97583\">=<\/span><span style=\"color:#9ECBFF\">sk-ant-...<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">GOOGLE_API_KEY<\/span><span style=\"color:#F97583\">=<\/span><span style=\"color:#9ECBFF\">...<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">3) K\u00e4ynnist\u00e4 WordPress runtime (arvioija)<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>cd runtime\nnpm install\nnpm start\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#79B8FF\">cd<\/span><span style=\"color:#9ECBFF\"> runtime<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">npm<\/span><span style=\"color:#9ECBFF\"> install<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">npm<\/span><span style=\"color:#9ECBFF\"> start<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">4) Aja benchmark<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>cd ..\nwp-bench run --config wp-bench.example.yaml\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#79B8FF\">cd<\/span><span style=\"color:#9ECBFF\"> ..<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">wp-bench<\/span><span style=\"color:#9ECBFF\"> run<\/span><span style=\"color:#79B8FF\"> --config<\/span><span style=\"color:#9ECBFF\"> wp-bench.example.yaml<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>Tulokset kirjoitetaan tiedostoon <code>output\/results.json<\/code>, ja testikohtaiset lokit l\u00f6ytyv\u00e4t muodossa <code>output\/results.jsonl<\/code>. N\u00e4m\u00e4 ovat hyvi\u00e4, kun haluat n\u00e4hd\u00e4, miss\u00e4 kohtaa malli kompastui (esim. v\u00e4\u00e4r\u00e4 hook, puuttuva escaping tai koodistandardi-virhe).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Usean mallin vertailu yhdell\u00e4 ajolla<\/h2>\n\n\n\n<p>WP-Benchin yksi k\u00e4yt\u00e4nn\u00f6llisimmist\u00e4 piirteist\u00e4 on se, ett\u00e4 voit listata useita malleja samaan konfiguraatioon ja ajaa ne per\u00e4kk\u00e4in. N\u00e4in saat vertailutaulukon ilman, ett\u00e4 joudut s\u00e4\u00e4t\u00e4m\u00e4\u00e4n komentoja joka kierroksella.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>models:\n  - name: gpt-4o\n  - name: gpt-4o-mini\n  - name: claude-sonnet-4-20250514\n  - name: claude-opus-4-5-20251101\n  - name: gemini\/gemini-2.5-pro\n  - name: gemini\/gemini-2.5-flash\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#85E89D\">models<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">gpt-4o<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">gpt-4o-mini<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">claude-sonnet-4-20250514<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">claude-opus-4-5-20251101<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">gemini\/gemini-2.5-pro<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">gemini\/gemini-2.5-flash<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>Mallinimet seuraavat LiteLLM:n konventioita (LiteLLM = kirjasto\/kerros, joka yhten\u00e4ist\u00e4\u00e4 eri tarjoajien mallien kutsumista ja nime\u00e4mist\u00e4).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Konfiguraation t\u00e4rkeimm\u00e4t kohdat<\/h2>\n\n\n\n<p>K\u00e4yt\u00e4nn\u00f6ss\u00e4 kopioit <code>wp-bench.example.yaml<\/code>-tiedoston ja muokkaat sit\u00e4. Olennaisimmat osat ovat datasetin l\u00e4hde, ajettava suite ja arvioijan (grader) runtime-polku:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>dataset:\n  source: local              # 'local' or 'huggingface'\n  name: wp-core-v1           # suite name\n\nmodels:\n  - name: gpt-4o\n\ngrader:\n  kind: docker\n  wp_env_dir: .\/runtime      # path to wp-env project\n\nrun:\n  suite: wp-core-v1\n  limit: 10                  # limit tests (null = all)\n  concurrency: 4\n\noutput:\n  path: output\/results.json\n  jsonl_path: output\/results.jsonl\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#85E89D\">dataset<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  source<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">local<\/span><span style=\"color:#6A737D\">              # 'local' or 'huggingface'<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">wp-core-v1<\/span><span style=\"color:#6A737D\">           # suite name<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">models<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">gpt-4o<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">grader<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  kind<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">docker<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  wp_env_dir<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">.\/runtime<\/span><span style=\"color:#6A737D\">      # path to wp-env project<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">run<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  suite<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">wp-core-v1<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  limit<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#79B8FF\">10<\/span><span style=\"color:#6A737D\">                  # limit tests (null = all)<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  concurrency<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#79B8FF\">4<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">output<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  path<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">output\/results.json<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  jsonl_path<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">output\/results.jsonl<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Hy\u00f6dylliset CLI-komennot<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>wp-bench run --config wp-bench.yaml          # aja configilla\nwp-bench run --model-name gpt-4o --limit 5   # nopea smoke test yhdelle mallille\nwp-bench dry-run --config wp-bench.yaml      # validoi config ilman mallikutsuja\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#B392F0\">wp-bench<\/span><span style=\"color:#9ECBFF\"> run<\/span><span style=\"color:#79B8FF\"> --config<\/span><span style=\"color:#9ECBFF\"> wp-bench.yaml<\/span><span style=\"color:#6A737D\">          # aja configilla<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">wp-bench<\/span><span style=\"color:#9ECBFF\"> run<\/span><span style=\"color:#79B8FF\"> --model-name<\/span><span style=\"color:#9ECBFF\"> gpt-4o<\/span><span style=\"color:#79B8FF\"> --limit<\/span><span style=\"color:#79B8FF\"> 5<\/span><span style=\"color:#6A737D\">   # nopea smoke test yhdelle mallille<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">wp-bench<\/span><span style=\"color:#9ECBFF\"> dry-run<\/span><span style=\"color:#79B8FF\"> --config<\/span><span style=\"color:#9ECBFF\"> wp-bench.yaml<\/span><span style=\"color:#6A737D\">      # validoi config ilman mallikutsuja<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Milt\u00e4 projektin rakenne n\u00e4ytt\u00e4\u00e4?<\/h2>\n\n\n\n<p>Repo on jaettu selke\u00e4sti ajokehyksen, runtime-arvioijan ja datasetien kesken. T\u00e4m\u00e4 on hyv\u00e4 my\u00f6s kontribuutiota ajatellen: jos haluat lis\u00e4t\u00e4 teht\u00e4vi\u00e4, kosket yleens\u00e4 vain dataset-hakemistoa.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>.\n\u251c\u2500\u2500 python\/          # Benchmark harness (pip installable)\n\u251c\u2500\u2500 runtime\/         # WordPress grader plugin + wp-env config\n\u251c\u2500\u2500 datasets\/        # Test suites (local JSON + Hugging Face builder)\n\u251c\u2500\u2500 notebooks\/       # Results visualization and reporting\n\u2514\u2500\u2500 output\/          # Benchmark results (gitignored)\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#79B8FF\">.<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">\u251c\u2500\u2500<\/span><span style=\"color:#9ECBFF\"> python\/<\/span><span style=\"color:#6A737D\">          # Benchmark harness (pip installable)<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">\u251c\u2500\u2500<\/span><span style=\"color:#9ECBFF\"> runtime\/<\/span><span style=\"color:#6A737D\">         # WordPress grader plugin + wp-env config<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">\u251c\u2500\u2500<\/span><span style=\"color:#9ECBFF\"> datasets\/<\/span><span style=\"color:#6A737D\">        # Test suites (local JSON + Hugging Face builder)<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">\u251c\u2500\u2500<\/span><span style=\"color:#9ECBFF\"> notebooks\/<\/span><span style=\"color:#6A737D\">       # Results visualization and reporting<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">\u2514\u2500\u2500<\/span><span style=\"color:#9ECBFF\"> output\/<\/span><span style=\"color:#6A737D\">          # Benchmark results (gitignored)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Testisuitet: knowledge vs execution<\/h2>\n\n\n\n<p>Testisuitet l\u00f6ytyv\u00e4t polusta <code>datasets\/suites\/&lt;suite-name&gt;\/<\/code>. Jokaisessa suitessa on erikseen <code>execution\/<\/code> (kooditeht\u00e4v\u00e4t) ja <code>knowledge\/<\/code> (monivalinnat), tyypillisesti JSON-tiedostoina kategorioittain.<\/p>\n\n\n\n<p>Oletussuite <code>wp-core-v1<\/code> kattaa WordPress coren perusasioita: API:t, hookit, tietokantaoperaatiot ja tietoturvapatternit.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Dataset Hugging Facesta<\/h3>\n\n\n\n<p>Datasetin voi my\u00f6s ladata Hugging Facen kautta konfiguraatiolla:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>dataset:\n  source: huggingface\n  name: WordPress\/wp-bench-v1\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#85E89D\">dataset<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  source<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">huggingface<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">WordPress\/wp-bench-v1<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Nykytila ja rajoitteet, jotka kannattaa tiet\u00e4\u00e4<\/h2>\n\n\n\n<p>WP-Bench on julkaistu varhaisessa vaiheessa, ja se n\u00e4kyy muutamassa kohdassa. N\u00e4m\u00e4 eiv\u00e4t tee ty\u00f6kalusta hy\u00f6dyt\u00f6nt\u00e4 \u2014 p\u00e4invastoin, ne kertovat mihin suuntaan benchmarkia on tarkoitus vahvistaa \u2014 mutta tuloksia kannattaa tulkita niiden l\u00e4pi.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li><strong>Dataset on viel\u00e4 pieni<\/strong>: jotta vertailusta tulisi kattava, tarvitaan lis\u00e4\u00e4 testej\u00e4 eri WordPress-API:hin ja yleisiin arkkitehtuurimalleihin (my\u00f6s \u201coikean el\u00e4m\u00e4n\u201d edge case -tilanteita).<\/li>\n\n\n<li><strong>Versiopaino uuteen<\/strong>: benchmark painottaa t\u00e4ll\u00e4 hetkell\u00e4 WordPress 6.9 -aikakauden ominaisuuksia, kuten Abilities API ja Interactivity API. T\u00e4m\u00e4 on osin tarkoituksellista (uudet rajapinnat ovat usein malleille vaikeita), mutta se my\u00f6s vinouttaa tuloksia, koska osa malleista on koulutettu ennen n\u00e4it\u00e4 muutoksia.<\/li>\n\n\n<li><strong>Saturaatio vanhoissa aiheissa<\/strong>: varhaiset testit n\u00e4yttiv\u00e4t, ett\u00e4 vanhemmista WordPress-perusasioista mallit saavat helposti eritt\u00e4in korkeita pisteit\u00e4. Silloin kysymykset eiv\u00e4t en\u00e4\u00e4 erottele malleja, ja tarvitaan aidosti vaikeampia (ei vain \u201cuusia\u201d) teht\u00e4vi\u00e4.<\/li>\n\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Yhteenveto: mihin WP-Bench istuu kehitt\u00e4j\u00e4n ty\u00f6kalupakkiin?<\/h2>\n\n\n\n<p>WP-Bench on k\u00e4yt\u00e4nn\u00f6llinen tapa mitata, miten hyvin valitsemasi kielimalli selvi\u00e4\u00e4 nimenomaan WordPress-teht\u00e4vist\u00e4 \u2014 sek\u00e4 tietopohjaisesti ett\u00e4 ajettavan koodin tasolla. Jos k\u00e4yt\u00e4t AI:ta plugin- tai teemakehityksess\u00e4, benchmark antaa konkreettisen keinon vertailla malleja samalla viivalla ja l\u00f6yt\u00e4\u00e4 ne kohdat, joissa malli tuottaa eniten kitkaa (tai riski\u00e4) WordPress-projekteissa.<\/p>\n\n\n<div class=\"references-section\">\n                <h2>Viitteet \/ L\u00e4hteet<\/h2>\n                <ul class=\"references-list\"><li><a href=\"https:\/\/make.wordpress.org\/ai\/2026\/01\/14\/introducing-wp-bench-a-wordpress-ai-benchmark\/\" target=\"_blank\" rel=\"noopener noreferrer\">Introducing WP-Bench: A WordPress AI Benchmark<\/a><\/li><li><a href=\"https:\/\/github.com\/WordPress\/wp-bench\" target=\"_blank\" rel=\"noopener noreferrer\">WP-Bench GitHub README<\/a><\/li><\/ul>\n            <\/div>","protected":false},"excerpt":{"rendered":"<p>Koodiavustajat osaavat usein yleist\u00e4 ohjelmointia, mutta WordPressin omituisuudet paljastuvat vasta k\u00e4yt\u00e4nn\u00f6ss\u00e4. WP-Bench tuo WordPress-spesifin tavan mitata, mitk\u00e4 mallit ymm\u00e4rt\u00e4v\u00e4t core-API:t, hookit ja turvalliset k\u00e4yt\u00e4nn\u00f6t oikeasti.<\/p>\n","protected":false},"author":58,"featured_media":135,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[79],"tags":[25,80,33,10,7],"class_list":["post-136","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","tag-ai","tag-benchmark","tag-tietoturva","tag-wordpress","tag-wp-cli"],"_links":{"self":[{"href":"https:\/\/helloblog.io\/fi\/wp-json\/wp\/v2\/posts\/136","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/helloblog.io\/fi\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/helloblog.io\/fi\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/helloblog.io\/fi\/wp-json\/wp\/v2\/users\/58"}],"replies":[{"embeddable":true,"href":"https:\/\/helloblog.io\/fi\/wp-json\/wp\/v2\/comments?post=136"}],"version-history":[{"count":0,"href":"https:\/\/helloblog.io\/fi\/wp-json\/wp\/v2\/posts\/136\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/helloblog.io\/fi\/wp-json\/wp\/v2\/media\/135"}],"wp:attachment":[{"href":"https:\/\/helloblog.io\/fi\/wp-json\/wp\/v2\/media?parent=136"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/helloblog.io\/fi\/wp-json\/wp\/v2\/categories?post=136"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/helloblog.io\/fi\/wp-json\/wp\/v2\/tags?post=136"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}