{"id":143,"date":"2026-01-20T00:00:00","date_gmt":"2026-01-19T23:00:00","guid":{"rendered":"https:\/\/helloblog.io\/et\/wp-bench-wordpressi-ai-benchmark\/"},"modified":"2026-01-20T00:00:00","modified_gmt":"2026-01-19T23:00:00","slug":"wp-bench-wordpressi-ai-benchmark","status":"publish","type":"post","link":"https:\/\/helloblog.io\/et\/wp-bench-wordpressi-ai-benchmark\/","title":{"rendered":"WP-Bench: the official WordPress AI benchmark that makes models actually write WP code"},"content":{"rendered":"\n<p>The WordPress project has launched a new initiative that is surprisingly practical for developers: <strong>WP-Bench<\/strong> is the official WordPress AI benchmark, a standardized test suite for evaluating how well different language models (LLMs) handle WordPress-specific tasks. Not \u201cwrite me an arbitrary PHP function\u201d, but the real thing: hooks, core APIs, plugin architecture, security patterns, and coding standards.<\/p>\n\n\n\n<p>One important nuance: WP-Bench does not stop at quiz questions. It has the model generate code and grades the result <strong>in a real WordPress runtime<\/strong> with automated checks. That makes it more of a \u201creliability test\u201d than a purely theoretical knowledge check.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Why does WordPress need its own benchmark?<\/h2>\n\n\n\n<p>Most AI models and their comparison tables are built around general programming tasks. 
For WordPress this is a problem, because WP development has a lot of specifics: conventions, global state, the world of hooks (actions\/filters), security principles (nonces, capability checks), database access patterns, REST API quirks, and so on.<\/p>\n\n\n\n<p>WP-Bench fills this gap with three main goals:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li><strong>Understand today\u2019s models:<\/strong> when you pick a tool (e.g. a coding assistant, a content-creation\/automation plugin, or an \u201cAI agent\u201d), you want to know which model actually performs better in a WordPress context.<\/li>\n\n\n<li><strong>Influence tomorrow\u2019s models:<\/strong> when AI labs run pre-release evaluations, WordPress capability should be one of the metrics, not an afterthought. A benchmark creates an incentive to optimize for millions of WP users and developers.<\/li>\n\n\n<li><strong>Move toward an open leaderboard:<\/strong> according to the project\u2019s plans, a public leaderboard will be built that shows model results on WordPress tasks and gives the community transparency.<\/li>\n\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">How WP-Bench grades a model: knowledge vs. execution<\/h2>\n\n\n\n<p>WP-Bench looks at a model\u2019s capability from two angles:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li><strong>Knowledge<\/strong> \u2013 multiple-choice questions about WordPress concepts: APIs, hooks, security patterns, coding standards. 
There is also a separate emphasis on newer additions such as the <strong>Abilities API<\/strong> and the <strong>Interactivity API<\/strong> (described as \u201cmodern additions\u201d).<\/li>\n\n\n<li><strong>Execution<\/strong> \u2013 code-generation tasks whose output is actually run in a WordPress environment and graded automatically (static analysis + runtime assertions).<\/li>\n\n<\/ul>\n\n\n\n<p>The practical value for a developer comes from that second part. If a model can \u201ctalk the talk\u201d but produces code that fails analysis or breaks at runtime, that shows up in the score.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The grading pipeline in brief<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n\n<li>The benchmark harness sends the model a prompt asking it to generate WordPress code.<\/li>\n\n\n<li>The generated code is passed to the WordPress runtime using <strong>WP-CLI<\/strong>.<\/li>\n\n\n<li>The runtime performs static analysis: syntax, coding standards, security checks.<\/li>\n\n\n<li>The code is executed in a sandbox and the assertions\/tests are run.<\/li>\n\n\n<li>The results come back as JSON with a score and detailed feedback.<\/li>\n\n<\/ol>\n\n\n\n<div class=\"wp-block-group callout callout-info is-style-info is-layout-flow wp-block-group-is-layout-flow\" style=\"border-width:1px;border-radius:8px;padding-top:1rem;padding-right:1.5rem;padding-bottom:1rem;padding-left:1.5rem\">\n\n<h4 class=\"wp-block-heading callout-title\">What does this mean for a developer?<\/h4>\n\n\n<p>If a model \u201challucinates\u201d a hook name, uses the wrong API, or forgets nonce\/capability checks, WP-Bench is increasingly able to catch it, based on actual execution rather than guesswork.<\/p>\n\n<\/div>\n\n\n\n<h2 
class=\"wp-block-heading\">Quick start: getting WP-Bench running on your own machine<\/h2>\n\n\n\n<p>The WP-Bench repo is public on GitHub, and the setup is a typical \u201ctwo worlds\u201d arrangement: a <strong>Python<\/strong>-based harness plus a WordPress runtime grader built on <strong>Node<\/strong>\/wp-env. Below is a short path to getting your first run done.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Install the Python harness (virtual environment + editable install)<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>python3 -m venv .venv &amp;&amp; source .venv\/bin\/activate\npip install -e .\/python\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path 
class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#B392F0\">python3<\/span><span style=\"color:#79B8FF\"> -m<\/span><span style=\"color:#9ECBFF\"> venv<\/span><span style=\"color:#9ECBFF\"> .venv<\/span><span style=\"color:#E1E4E8\"> &#x26;&#x26; <\/span><span style=\"color:#79B8FF\">source<\/span><span style=\"color:#9ECBFF\"> .venv\/bin\/activate<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">pip<\/span><span style=\"color:#9ECBFF\"> install<\/span><span style=\"color:#79B8FF\"> -e<\/span><span style=\"color:#9ECBFF\"> .\/python<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">2) Put your API keys in a .env file<\/h3>\n\n\n\n<p>WP-Bench relies on the model providers\u2019 APIs. 
Add a <code>.env<\/code> file to the project root (or manage the keys in your usual secrets solution):<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>OPENAI_API_KEY=sk-...\nANTHROPIC_API_KEY=sk-ant-...\nGOOGLE_API_KEY=...\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 
0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#E1E4E8\">OPENAI_API_KEY<\/span><span style=\"color:#F97583\">=<\/span><span style=\"color:#9ECBFF\">sk-...<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">ANTHROPIC_API_KEY<\/span><span style=\"color:#F97583\">=<\/span><span style=\"color:#9ECBFF\">sk-ant-...<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">GOOGLE_API_KEY<\/span><span style=\"color:#F97583\">=<\/span><span style=\"color:#9ECBFF\">...<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">3) Start the WordPress runtime (grader)<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" 
readonly>cd runtime\nnpm install\nnpm start\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#79B8FF\">cd<\/span><span style=\"color:#9ECBFF\"> runtime<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">npm<\/span><span style=\"color:#9ECBFF\"> install<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">npm<\/span><span style=\"color:#9ECBFF\"> start<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">4) Run the benchmark<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" 
stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>cd ..\nwp-bench run --config wp-bench.example.yaml\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#79B8FF\">cd<\/span><span style=\"color:#9ECBFF\"> ..<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">wp-bench<\/span><span style=\"color:#9ECBFF\"> run<\/span><span style=\"color:#79B8FF\"> --config<\/span><span style=\"color:#9ECBFF\"> wp-bench.example.yaml<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>The output is written to <code>output\/results.json<\/code>, and more detailed per-test logs go to <code>output\/results.jsonl<\/code>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Comparing multiple models in one run (practical model selection)<\/h2>\n\n\n\n<p>If you are choosing a model for your team (or want to confirm whether a \u201ccheaper\u201d option is 
good enough for WordPress), WP-Bench can run a sequential comparison from a single configuration. You list the models in the config and the harness outputs a comparison table.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>models:\n  - name: gpt-4o\n  - name: gpt-4o-mini\n  - name: claude-sonnet-4-20250514\n  - name: claude-opus-4-5-20251101\n  - name: gemini\/gemini-2.5-pro\n  - name: gemini\/gemini-2.5-flash\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 
4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#85E89D\">models<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">gpt-4o<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">gpt-4o-mini<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">claude-sonnet-4-20250514<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">claude-opus-4-5-20251101<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">gemini\/gemini-2.5-pro<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">gemini\/gemini-2.5-flash<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>Model names follow <strong>LiteLLM<\/strong> conventions (a practical standard that lets you connect different providers behind a single interface). 
Reference: <a href=\"https:\/\/docs.litellm.ai\/docs\/providers\">LiteLLM providers<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Configuration: which knobs actually matter?<\/h2>\n\n\n\n<p>The WP-Bench example config shows clearly what you will care about most: the dataset source, the grader type (Docker), the suite name, concurrency, and the output files.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>dataset:\n  source: local              # 'local' or 'huggingface'\n  name: wp-core-v1           # suite name\n\nmodels:\n  - name: gpt-4o\n\ngrader:\n  kind: docker\n  wp_env_dir: .\/runtime      # path to wp-env project\n\nrun:\n  suite: wp-core-v1\n  limit: 10                  # limit tests (null = all)\n  concurrency: 4\n\noutput:\n  path: 
output\/results.json\n  jsonl_path: output\/results.jsonl\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#85E89D\">dataset<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  source<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">local<\/span><span style=\"color:#6A737D\">              # 'local' or 'huggingface'<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">wp-core-v1<\/span><span style=\"color:#6A737D\">           # suite name<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">models<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#E1E4E8\">  - <\/span><span style=\"color:#85E89D\">name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">gpt-4o<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">grader<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  kind<\/span><span style=\"color:#E1E4E8\">: <\/span><span 
style=\"color:#9ECBFF\">docker<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  wp_env_dir<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">.\/runtime<\/span><span style=\"color:#6A737D\">      # path to wp-env project<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">run<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  suite<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">wp-core-v1<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  limit<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#79B8FF\">10<\/span><span style=\"color:#6A737D\">                  # limit tests (null = all)<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  concurrency<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#79B8FF\">4<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">output<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  path<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">output\/results.json<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  jsonl_path<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">output\/results.jsonl<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Useful CLI commands for everyday use<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 
16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>wp-bench run --config wp-bench.yaml          # run with a config file\nwp-bench run --model-name gpt-4o --limit 5   # quick test of a single model\nwp-bench dry-run --config wp-bench.yaml      # validate the config without calling any APIs\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#B392F0\">wp-bench<\/span><span style=\"color:#9ECBFF\"> run<\/span><span style=\"color:#79B8FF\"> 
--config<\/span><span style=\"color:#9ECBFF\"> wp-bench.yaml<\/span><span style=\"color:#6A737D\">          # run with a config file<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">wp-bench<\/span><span style=\"color:#9ECBFF\"> run<\/span><span style=\"color:#79B8FF\"> --model-name<\/span><span style=\"color:#9ECBFF\"> gpt-4o<\/span><span style=\"color:#79B8FF\"> --limit<\/span><span style=\"color:#79B8FF\"> 5<\/span><span style=\"color:#6A737D\">   # quick test of a single model<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">wp-bench<\/span><span style=\"color:#9ECBFF\"> dry-run<\/span><span style=\"color:#79B8FF\"> --config<\/span><span style=\"color:#9ECBFF\"> wp-bench.yaml<\/span><span style=\"color:#6A737D\">      # validate the config without calling any APIs<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">How are the test suites structured?<\/h2>\n\n\n\n<p>The WP-Bench repo is split into logical parts: the Python harness, the WordPress runtime, datasets, notebooks for visualization, and an output directory for results.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" 
fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>.\n\u251c\u2500\u2500 python\/          # Benchmark harness (pip installable)\n\u251c\u2500\u2500 runtime\/         # WordPress grader plugin + wp-env config\n\u251c\u2500\u2500 datasets\/        # Test suites (local JSON + Hugging Face builder)\n\u251c\u2500\u2500 notebooks\/       # Results visualization and reporting\n\u2514\u2500\u2500 output\/          # Benchmark results (gitignored)\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#79B8FF\">.<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">\u251c\u2500\u2500<\/span><span style=\"color:#9ECBFF\"> python\/<\/span><span style=\"color:#6A737D\">          # Benchmark harness (pip installable)<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">\u251c\u2500\u2500<\/span><span style=\"color:#9ECBFF\"> runtime\/<\/span><span style=\"color:#6A737D\">         # WordPress 
grader plugin + wp-env config<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">\u251c\u2500\u2500<\/span><span style=\"color:#9ECBFF\"> datasets\/<\/span><span style=\"color:#6A737D\">        # Test suites (local JSON + Hugging Face builder)<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">\u251c\u2500\u2500<\/span><span style=\"color:#9ECBFF\"> notebooks\/<\/span><span style=\"color:#6A737D\">       # Results visualization and reporting<\/span><\/span>\n<span class=\"line\"><span style=\"color:#B392F0\">\u2514\u2500\u2500<\/span><span style=\"color:#9ECBFF\"> output\/<\/span><span style=\"color:#6A737D\">          # Benchmark results (gitignored)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>Suites live in <code>datasets\/suites\/&lt;suite-name&gt;\/<\/code> and are split into two parts:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li><code>execution\/<\/code> \u2013 code generation tasks plus assertions (JSON files grouped by category).<\/li>\n\n\n<li><code>knowledge\/<\/code> \u2013 multiple-choice questions (also JSON files grouped by category).<\/li>\n\n<\/ul>\n\n\n\n<p>The default suite <strong>wp-core-v1<\/strong> covers WordPress core APIs, hooks, database operations, and security patterns.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Dataset from Hugging Face (if you want a standard source)<\/h3>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#24292e\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 
1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" style=\"color:#e1e4e8;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>dataset:\n  source: huggingface\n  name: WordPress\/wp-bench-v1\n<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki github-dark\" style=\"background-color:#24292e;color:#e1e4e8\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color:#85E89D\">dataset<\/span><span style=\"color:#E1E4E8\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  source<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">huggingface<\/span><\/span>\n<span class=\"line\"><span style=\"color:#85E89D\">  name<\/span><span style=\"color:#E1E4E8\">: <\/span><span style=\"color:#9ECBFF\">WordPress\/wp-bench-v1<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Current status and limitations to 
keep in mind<\/h2>\n\n\n\n<p>WP-Bench is at an early stage, and the WordPress team itself points out several areas with room for improvement:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li><strong>Dataset size:<\/strong> the current number of tests is fairly small. For the benchmark to be \u201ctruly representative\u201d, it needs more cases covering different APIs and patterns.<\/li>\n\n\n<li><strong>Version skew:<\/strong> the benchmark leans toward newer WordPress 6.9 topics (e.g. the Abilities API and the Interactivity API). This is partly deliberate, because new APIs are exactly where models stumble, but it also introduces bias (many models were trained on older knowledge).<\/li>\n\n\n<li><strong>Benchmark \u201csaturation\u201d:<\/strong> early experiments showed that models score very high on older WP concepts. Questions and tasks therefore need to be designed to give a strong signal, not merely confirm the obvious.<\/li>\n\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">How to contribute (if you want it to measure real life)<\/h2>\n\n\n\n<p>WP-Bench\u2019s value depends directly on the quality of its test cases. The WordPress ecosystem has decades of experience with \u201cplaces where people (and AI) get it wrong\u201d: complex hook interactions, edge-case security checks, wrong assumptions about WP_Query, multisite quirks, and so on. 
Exactly these kinds of things are gold for a benchmark.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li>Add test cases (execution or knowledge).<\/li>\n\n\n<li>Run the benchmark against the models you actually use and see where they fail.<\/li>\n\n\n<li>Improve the grading logic so that scoring is stricter and more consistent.<\/li>\n\n\n<li>Contribute results toward the public leaderboard (the project\u2019s goal is to make it public).<\/li>\n\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Resources<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li>WP-Bench GitHub Repository: <a href=\"https:\/\/github.com\/WordPress\/wp-bench\">https:\/\/github.com\/WordPress\/wp-bench<\/a><\/li>\n\n\n<li>AI Building Blocks for WordPress: <a href=\"https:\/\/make.wordpress.org\/ai\/2025\/07\/17\/ai-building-blocks\/\">https:\/\/make.wordpress.org\/ai\/2025\/07\/17\/ai-building-blocks\/<\/a><\/li>\n\n\n<li>#core-ai Slack channel: <a href=\"https:\/\/wordpress.slack.com\/archives\/C08TJ8BPULS\">https:\/\/wordpress.slack.com\/archives\/C08TJ8BPULS<\/a><\/li>\n\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p>WP-Bench is essentially an attempt to bring to AI tooling for WordPress development what we already have elsewhere: a comparable yardstick. The approach is especially strong in that code does not stay as text in a chat window but is actually run in the WordPress runtime and graded automatically. 
If the project grows in terms of datasets and stricter grading, it could become one of the more important signals for choosing a model specifically for WP work.<\/p>\n\n\n<div class=\"references-section\">\n                <h2>References \/ Sources<\/h2>\n                <ul class=\"references-list\"><li><a href=\"https:\/\/make.wordpress.org\/ai\/2026\/01\/14\/introducing-wp-bench-a-wordpress-ai-benchmark\/\" target=\"_blank\" rel=\"noopener noreferrer\">Introducing WP-Bench: A WordPress AI Benchmark<\/a><\/li><li><a href=\"https:\/\/github.com\/WordPress\/wp-bench\" target=\"_blank\" rel=\"noopener noreferrer\">WP-Bench GitHub README<\/a><\/li><li><a href=\"https:\/\/make.wordpress.org\/ai\/2025\/07\/17\/ai-building-blocks\/\" target=\"_blank\" rel=\"noopener noreferrer\">AI Building Blocks for WordPress<\/a><\/li><li><a href=\"https:\/\/wordpress.slack.com\/archives\/C08TJ8BPULS\" target=\"_blank\" rel=\"noopener noreferrer\">#core-ai Slack channel<\/a><\/li><li><a href=\"https:\/\/docs.litellm.ai\/docs\/providers\" target=\"_blank\" rel=\"noopener noreferrer\">LiteLLM Providers<\/a><\/li><\/ul>\n            <\/div>","protected":false},"excerpt":{"rendered":"<p>If you use a coding assistant in a WordPress plugin or build AI-powered features, one question is unavoidable: how well does the model actually understand WordPress? 
WP-Bench tries to give that question a measurable answer.<\/p>\n","protected":false},"author":48,"featured_media":142,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[25,76,77,9,8],"class_list":["post-143","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-wordpressi-okosusteem","tag-ai","tag-arendustooriistad","tag-benchmark","tag-wordpress","tag-wp-cli"],"_links":{"self":[{"href":"https:\/\/helloblog.io\/et\/wp-json\/wp\/v2\/posts\/143","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/helloblog.io\/et\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/helloblog.io\/et\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/helloblog.io\/et\/wp-json\/wp\/v2\/users\/48"}],"replies":[{"embeddable":true,"href":"https:\/\/helloblog.io\/et\/wp-json\/wp\/v2\/comments?post=143"}],"version-history":[{"count":0,"href":"https:\/\/helloblog.io\/et\/wp-json\/wp\/v2\/posts\/143\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/helloblog.io\/et\/wp-json\/wp\/v2\/media\/142"}],"wp:attachment":[{"href":"https:\/\/helloblog.io\/et\/wp-json\/wp\/v2\/media?parent=143"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/helloblog.io\/et\/wp-json\/wp\/v2\/categories?post=143"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/helloblog.io\/et\/wp-json\/wp\/v2\/tags?post=143"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}