pokutuna.com

pokutuna

Web Developer / Software Engineer

Hyogo, Japan

Publications

Podcast - Backyard Hatena
2022-04-08
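#8 Asking id:pokutuna about the present and future of cocopy
On the podcast run by the Technology Group, Hatena's engineering organization, I talked about the Business Platform team's work and the Chrome extension cocopy.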
TechTalk - Google Cloud Born Digital Summit
2022-03-17
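The road to making Hatena's ad delivery system cloud native
We migrated the ad delivery system to Google Cloud. The talk presented how we use various GCP products, focusing on the process of replacing delivery-log aggregation, previously done in Elasticsearch, with BigQuery.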

Products

npm package
@pokutuna/envelop-response-cache-firestore
2022-09-15
Firestore cache implementation for @envelop/response-cache plugin
Chrome Extension
cocopy
2020-09-23
a chrome extension to copy text by your code
post: Introducing cocopy, a Chrome extension that lets you write JS to transform a URL or page content and copy the result
npm package
@pokutuna/requestlog-cloudfunctions
2020-04-08
a middleware to write request logs for Google Cloud Functions
post: Grouping logs in Cloud Functions

ぽ靴な缶
2025-11-17
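Matsuo Lab LLM course: application deadline is almost here!! & memories of the 2024 course
Diary, AI
Link: Large Language Model Course, Advanced Track, 2025 Autumn - Matsuo-Iwasawa Laboratory, The University of Tokyo (Matsuo Lab)

Applications close 2025/11/19 (Wed) at 10:00 AM!!! There is also a quota for working professionals. I took part last year and got a lot out of it, so I recommend it. I applied because I had been using various vendors' LLM APIs by feel and was frustrated by how little I understood. I suspect many web engineers are in the same position. Last year's lecture slides are public.

Link: LLM Large Language Model Course 2024 lecture slides - Matsuo-Iwasawa Laboratory, The University of Tokyo (Matsuo Lab)

Oh, I think you need a business card to apply. Being told two days before the deadline may not leave time to prepare one. Oh well.

I received an Excellence Award in the final-assignment competition of the 2024 course

I took part in last year's course and received an Excellence Award, placing 8th in the general ranking and 3rd in contributions. I kept missing the chance to brag about this.

(Photo caption: I like that the frame is an NN)

The lectures were good, but what I valued most was having the competition as a chance to get hands-on. The competition was as follows:

- Train a pretrained model to build an LLM that scores highly on a modified version of ELYZA-tasks-100
  - The topic of each task was replaced with content from Japanese TV programs aired since September 2024, which raised the difficulty slightly
- Constraints
  - Allowed pretrained models: LLM-jp-3 or Gemma2 (pretraining your own was also allowed)
  - Answer 100 questions within 1 hour on a single L4 GPU
  - The model must be publishable (data and model licenses respected)
- The preliminary round was scored automatically; the final was judged by the participants

My setup was:

- Main model: gemma-2-9b with continued pretraining → SFT → DPO
- Sub-model: Gemma was weak on Japanese-culture questions such as proverbs and idioms, so I prepared llm-jp-3-13b with SFT

I had planned to build a classifier to route questions by their text, but the question wording had so little variation that regular expressions were enough. Just before the deadline I built my own training data for the question types the model was weak at (pokutuna/tasks-ime-and-kakko-jp). Since the final was judged by human eyes, I figured raw TeX notation or Markdown would look bad, so I tried replacing it. It was a fun chance to try all sorts of things, some useful and some not. At first I was stingy about paying for GPUs, but what working adults have and students do not is money... I can use GPUs... so cheap... and I learned to inject yen into RunPod.

(Photo caption: Arrived too early)

What struck me personally was the award ceremony & social gathering: an enormous range of people came. From high school students to people in their 60s, a bedridden participant, people who train LLMs for work, researchers, and doctors. The backgrounds were extremely diverse. It was only a year ago, yet so much has happened in AI that it already feels nostalgic. Here are my notes from when I started out knowing absolutely nothing, written from bottom to top.

Link: 2024/11 LLM tuning diary - pokutuna

A topic that happens to be heating up on Twitter right now:

Link: Startups from Matsuo Lab - Matsuo-Iwasawa Laboratory, The University of Tokyo (Matsuo Lab)

One reason it gets seen as shady is that now and then someone who merely took the course says things like "I'm a Matsuo Lab graduate!!" With 3000-4000 participants every year, that is bound to happen. That same openness is what gave me a great experience, so I hope the course continues as it is now while the offenders get smacked down.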
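The regex-based routing between the main model and the sub-model mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the model labels, the patterns, and the function name `routeQuestion` are hypothetical and are not taken from the actual competition code.

```typescript
// Hypothetical labels for the two models described in the post.
type ModelName = "gemma-2-9b-tuned" | "llm-jp-3-13b-sft";

// Assumed patterns: questions about proverbs (ことわざ) or idioms (慣用句)
// go to the sub-model that handles Japanese-culture questions better.
const SUB_MODEL_PATTERNS: RegExp[] = [/ことわざ/, /慣用句/];

// Pick a model for a question by testing its text against the patterns.
function routeQuestion(question: string): ModelName {
  const needsSubModel = SUB_MODEL_PATTERNS.some((p) => p.test(question));
  return needsSubModel ? "llm-jp-3-13b-sft" : "gemma-2-9b-tuned";
}

const proverbRoute = routeQuestion("「猿も木から落ちる」ということわざの意味を説明してください");
const summaryRoute = routeQuestion("次の文章を要約してください");
console.log(proverbRoute, summaryRoute);
```

A fixed pattern list like this only works because the question wording had little variation, as the post notes; a benchmark with freer phrasing would need a real classifier.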
ぽ靴な缶
2025-08-04
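An MCP server that can read what is open in your browser
AI, Tools, Made
I built one: @pokutuna/mcp-chrome-tabs

(Demo)

Because it relies on AppleScript, it only runs on macOS.

Why I built it

There are already many ways to let an LLM operate a browser: browser-use, playwright-mcp, mcp-chrome, and so on. These tools and MCP servers are handy, but they bundle many tools and put heavy pressure on the context. You may have to open Chrome's debugging port, give up your usual user directory, or install a bridging browser extension. All of that is fairly tedious, and I would not want it for everyday use. So I switch between sets of MCP servers depending on the task.

I just want the LLM to read the page I have open

I realized that I rarely want browser automation. Most of the time I just want the LLM to read the page I am looking at. Most AI applications can now fetch and read a URL or use a search engine, which is convenient, but there are plenty of annoyances:

- I paste a URL, it does not read it, and I have to add "fetch it"
- It starts fetching but is blocked by robots.txt
- "I'll search for a different page instead." Please don't
- After it searches, I stop the run because "this is about something else..." or "it's referencing an old version..."
- It has no credentials and cannot see the page, and handing them over is extra work
- It reads an enormous page and dies with a Prompt too long error

I built @pokutuna/mcp-chrome-tabs out of the feeling "just read the page I'm looking at right now!!"

Features

- Gets tab content via AppleScript
  - Sends no additional HTTP requests and reads the page content as it is
- Extracts the main content and converts it to Markdown
  - Uses defuddle, which aims to be an alternative to @mozilla/readability
- Only 3 tools, with short descriptions and short output, so it suits everyday use
  - list_tabs: get the list of currently open tabs
  - read_tab_content: get the main content of the specified tab as Markdown
  - open_in_new_tab: lets the AI open a URL in the user's browser
- Experimental support for MCP Resources
  - Tab candidates show up in Claude Code's @ completion

There are several similar simple MCP servers, but surprisingly none of them satisfied me [1], so I built my own. I put chrome in the name, but with an option it also works with Safari.

Usage

Add it as an MCP server like this:

    {
      "mcpServers": {
        "chrome-tabs": {
          "command": "npx",
          "args": ["-y", "@pokutuna/mcp-chrome-tabs@latest"]
        }
      }
    }

For Claude Code, use the following command:

    $ claude mcp add -s user chrome-tabs -- npx -y @pokutuna/mcp-chrome-tabs@latest

Then, from the Chrome menu, choose View > Developer > Allow JavaScript from Apple Events to allow Chrome to be controlled via AppleScript. [2]

Real-world usage

I have used it almost every day since I built it. Some examples:

- With meeting notes open: "List the decisions as bullet points"
- When a coding agent works on something so new that the LLM has no information about it, I open the official documentation and references and say "Check the relevant content from the tab list before you start"
- With a design document that is not in the repository open: "Reflect the important points from the current tab in CLAUDE.md"

All of this can be done in other ways, such as carefully passing content by hand, but being able to give an instruction and have the page read right away is convenient. Some pages are not extracted well because of how defuddle picks out the main content, and I would like to improve that, but it works in most cases. Because it extracts body text, pages such as GitHub code views do not come out well, but the model checks the URL and then uses gh, or I prompt it to.

MCP Resources protocol support

Besides Tools, I also tried supporting MCP's Resources protocol. In Claude Code you can reference a specific file with @file, and Resources can be made to appear among those completion candidates. Typing @cur completes to a reference to tab://current, and @{part of a page title or hostname} lets you pick from the candidates that match.

(Image caption: tabs appear as @ completion candidates and can be selected)

I also implemented detecting changes in the set of tabs and sending a listChanged notification that prompts the client to re-fetch the resource list, but Claude Code does not support it yet, so the list stays as it was at startup. It will probably be supported eventually...

Summary

Give it a try.

Bonus: an official bridge extension for playwright-mcp seems to be coming

You can find it by looking through the repository. It does work, but it is probably still a PoC: it cannot reconnect and is not friendly at all.

Link: playwright-chrome-extension - pokutuna

[1] The Chrome connector works in a similar way, but the implementation is lacking; for example, it does not cut out the main content.
[2] Chrome tightened this restriction and this setting opens a side hole in it, so some caution is needed.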
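The tab-set change detection behind the listChanged notification could look roughly like the sketch below. It only covers the comparison step and makes several assumptions: the `TabRef` shape, the key format, and the function names are hypothetical, and the actual server may track tabs differently. Sending the notification itself is left out.

```typescript
// Hypothetical description of an open tab; the real server's types may differ.
interface TabRef {
  windowId: string;
  tabId: string;
  url: string;
}

interface TabDiff {
  added: string[];
  removed: string[];
  changed: boolean;
}

// Build a comparable key so that a navigation inside a tab also counts as a change.
function tabKey(tab: TabRef): string {
  return `${tab.windowId}:${tab.tabId}:${tab.url}`;
}

// Compare two snapshots of the tab list and report what appeared or disappeared.
function diffTabs(prev: TabRef[], next: TabRef[]): TabDiff {
  const prevKeys = prev.map(tabKey);
  const nextKeys = next.map(tabKey);
  const prevSet = new Set(prevKeys);
  const nextSet = new Set(nextKeys);
  const added = nextKeys.filter((k) => !prevSet.has(k));
  const removed = prevKeys.filter((k) => !nextSet.has(k));
  return { added, removed, changed: added.length > 0 || removed.length > 0 };
}

const before: TabRef[] = [
  { windowId: "1", tabId: "10", url: "https://example.com/a" },
  { windowId: "1", tabId: "11", url: "https://example.com/b" },
];
const after: TabRef[] = [
  { windowId: "1", tabId: "11", url: "https://example.com/b" },
  { windowId: "1", tabId: "12", url: "https://example.com/c" },
];
const tabDiff = diffTabs(before, after);
console.log(tabDiff);
```

A server would presumably take such snapshots on a timer or on each request and notify the client only when `changed` is true, so that clients are not asked to re-fetch an identical list.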
ぽ靴な缶
2025-07-04
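Articles I haven't written, summer 2025
Diary, Life
A memorial service for the things I meant to write on the blog but never did.

- Received a Google Open Source Peer Bonus Award
  - For contributions to Dataform. This was at the end of 2023...
- The Majestouch Xacro M10SP is good
  - I do not want to get into building my own keyboards but wanted a split one. The macro key layout is good and I use it a lot
- Got the Google Professional Cloud Security Engineer certification
  - The voucher seemed to be meant for renewing a certification, but I wanted to learn something new. Honestly there was a lot of knowledge I will not use (US regulations and so on), but it was interesting
- Gave a lecture on AI at Hatena Internship 2024
  - In step with the times, I talked about AI, from the historical background to "don't take benchmarks at face value"
- Took the public course "Large Language Models 2024" from the Matsuo-Iwasawa Lab at the University of Tokyo
  - The final assignment was a competition among the students. Out of roughly 3000 (?) participants I placed 8th in the general ranking and 3rd in contributions. I want to write this one up properly
- Spoke in the keynote of Modern App Summit '25
  - I presented in a 10-minute slot within the keynote of a Google Cloud event. I thought a short slot would be easy, but speaking to an audience that did not come to hear my talk is hard. It ended up vague, and I regret that

I do intend to write about these in more detail. I do intend to.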
ぽ靴な缶
2025-04-14
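I turned my favorite theme from my Emacs days into a VSCode theme
VSCode, Made, AI, OSS activities
I made it together with AI. It looks like this.

Link: pokutuna.vscode-gnome2like-theme (marketplace.visualstudio.com)

The color scheme of this theme originates from color-theme-gnome2, which is included in Emacs's color-theme.el.

Hometown

I started using Emacs as a student, around 2008, and used it at work for a long time. But the TypeScript editing experience gradually raised VSCode's share, I needed Live Share for pair programming, and by now VSCode has become my main editor. I hardly launch Emacs anymore, yet I think of Emacs as my hometown (furusato). Back then I was devoted to color-theme-gnome2 from color-theme.el.

(Image quoted from emacsmirror/color-theme-modern)

Inside this green-based theme, neither Dark nor Light, I did all of my programming. I loved the color scheme and wanted to bring it to VSCode, but the number of colors to specify was large and I was slow to start. A VSCode theme requires far more colors than an Emacs theme, and according to my old notes I tried around 2019 and lost interest. Now that coding agents are in fashion, I decided to try again and implemented it. Giving instructions to Roo Code with sonnet 3.7 and gemini-2.0-pro-exp (2.5 was not out when I was writing it), the theme came together quickly and reached about 70% completion in no time.

Tweet: "It has been a long time since VSCode became my main editor, but I am recreating in VSCode, together with Roo Code, the gnome2 theme from color-theme.el that I used all through my Emacs days. As it comes together I am about to cry from nostalgia pic.twitter.com/BSRclQTgRb" - pokutuna (@pokutuna), March 2, 2025

From there, however, it took a long time. I had to fill in UI areas that lacked color settings, tune syntax highlighting token by token, and fix inconsistencies. After about two weeks of using it at work most of my complaints were gone, so I published it. You can get it here.

Link: GNOME2-like Theme - Visual Studio Marketplace

Not too black and not too white: a unique, pleasing theme that is easy on the eyes. Because I use them while working, it also looks decent in Cline/Roo. I do not think badge.background and badge.foreground should be used for a panel that shows token counts, but I settled on colors that look natural both as a badge and as a panel.

(Image caption: RooCode)

At first I considered a straight port and thought it would be fun to bring all of color-theme.el to VSCode. But quite a few colors have to be added to fit VSCode's UI, and nobody would be happy if I completed themes I have no attachment to by filling in colors arbitrarily with AI, so I dropped the idea. I decided to publish only gnome2, the one I actually used, and to add a -like suffix because I supplemented and rearranged some colors. That may be a little rude to GNOME itself (the window manager), but it has been that way since color-theme.el. This kind of green has half become my identity: this blog's theme uses it too, and in games with character customization I always end up adding green. I tend to use the easy-to-remember #008080.

(Image caption: AC6)

Know-how

- Delete the color values from a VSCode color theme JSON and instruct the AI to fill it in as a template
- For Emacs color names, provide the mapping to hex values as a csv
  - pokutuna/vscode-gnome2like-theme@main - resources/colornames.csv
- color-theme.el is huge, so cut out only the part you need. It is GPL, so the derivative work must be GPL too
- A theme is a VSCode extension and is developed with the Debugger, but a bare theme JSON is not recognized as an extension, so put the following in .vscode/launch.json:

    {
      "version": "0.2.0",
      "configurations": [
        {
          "name": "Extension",
          "type": "extensionHost",
          "request": "launch",
          "args": [
            "--extensionDevelopmentPath=${workspaceFolder}"
          ]
        }
      ]
    }

- To find the name of a color setting from the UI:
  - Read Theme Color | Visual Studio Code Extension API carefully
  - Open Developer Tools inside VSCode and trace it from var() or the Computed pane
- For syntax highlighting I want as few language-specific token rules as possible, so I instruct the AI accordingly and have it replace them once the theme is mostly done
- I also had the AI write the script for taking the store screenshots
  - When I asked "can you switch user directories in some nice way?", it wrote a script that uses Portable mode, symlinks the theme under development, and opens sample code
  - Writing this myself would have been a bit tedious. Running it, splitting the window, lining up the code, and taking the screenshots is my job
  - pokutuna/vscode-gnome2like-theme@main - resources/samples/setup_screenshot.sh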

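The tip about giving Emacs color names a csv mapping to hex values can be illustrated with a small lookup. This is a sketch under assumptions: the two-column `name,hex` layout and the function names are hypothetical, and the actual resources/colornames.csv in the repository may be laid out differently. The two sample rows are standard X11 color values.

```typescript
// Assumed csv layout: one "name,hex" pair per line, no header row.
const sampleCsv = ["DarkSlateGray,#2f4f4f", "PaleGreen,#98fb98"].join("\n");

// Parse the csv text into a lookup table keyed by lower-cased color name.
function parseColorCsv(csv: string): Map<string, string> {
  const table = new Map<string, string>();
  for (const line of csv.split("\n")) {
    const [name, hex] = line.split(",").map((cell) => cell.trim());
    if (name && hex) {
      table.set(name.toLowerCase(), hex.toLowerCase());
    }
  }
  return table;
}

// Resolve a color name to hex; values that already look like hex pass through.
function resolveColor(value: string, table: Map<string, string>): string | undefined {
  if (/^#[0-9a-f]{6}$/i.test(value)) {
    return value.toLowerCase();
  }
  return table.get(value.toLowerCase());
}

const colorTable = parseColorCsv(sampleCsv);
const background = resolveColor("darkslategray", colorTable);
const literal = resolveColor("#008080", colorTable);
const missing = resolveColor("NoSuchColor", colorTable);
console.log(background, literal, missing);
```

Handing the model a table like this means it does not have to recall named colors from memory, which is presumably why the tip helps.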
langchain-ai/langchainjs
2024-08-18
Replacement Character(�) appears in multibyte text output from Google VertexAI
Checked other resources

* I added a very descriptive title to this issue.
* I searched the LangChain.js documentation with the integrated search.
* I used the GitHub search to find a similar question and didn't find it.
* I am sure that this is a bug in LangChain.js rather than my code.
* The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

Make the model output long texts containing multibyte characters as a stream.

    import { VertexAI } from "@langchain/google-vertexai";

    // Set your project ID and pass the credentials according to the doc.
    // https://js.langchain.com/docs/integrations/llms/google_vertex_ai
    const project = "YOUR_PROJECT_ID";

    const langchainModel = new VertexAI({
      model: "gemini-1.5-pro-preview-0409",
      location: "us-central1",
      authOptions: { projectId: project },
    });

    // EN: List as many Japanese proverbs as possible.
    const prompt = "日本のことわざをできるだけたくさん挙げて";

    for await (const chunk of await langchainModel.stream(prompt)) {
      process.stdout.write(chunk);
    }

Error Message and Stack Trace (if applicable)

(No errors or stack traces occur)

Output Example: Includes Replacement Characters (�)

    ## ��：知恵の宝庫
    日本のことわざは、長い歴史の中で培われた知恵や教訓が詰まった、短い言葉の宝庫で��いくつかご紹介しますね。
    **人生・教訓**
    * **井の中の蛙大海を知らず** (I no naka no kawazu taikai wo shirazu): 狭い世界しか知らない者のたとえ。
    * **石の上にも三年** (Ishi no ue ni mo san nen): ��強く努力すれば成功する。
    * **案ずるより産むが易し** (Anzuru yori umu ga yasushi): 心配するよりも行動した方が良い。
    * **転��** (Korobanu saki no tsue): 前もって準備をすることの大切さ。
    * **失敗は成功のもと** (Shippai wa seikou no moto): 失敗から学ぶことで成功��る。
    **人��関係**
    * **類は友を呼ぶ** (Rui wa tomo wo yobu): 似た者同士が仲良くなる。
    * **情けは人の為ならず** (Nasake wa hito no tame narazu): 人に親切にすることは巡り巡��て自分に良いことが返ってくる。
    * **人の振り見て我が振り直せ** (Hito no furi mite waga furi naose): 他人の行動を見て自分の行動を反省する。
    * **出る杭は打たれる** (Deru kui wa utareru): 他人より目��つ��叩かれる。
    * **三人寄れば文殊の知恵** (Sannin yoreba monju no chie): みんなで知恵を出し合えば良い考えが浮かぶ。
    ...

Description

This issue occurs when requesting outputs from the model in languages that include multibyte characters, such as Japanese, Chinese, Russian, Greek, and various other languages, or in texts that include emojis 😎.

It is caused by the way streams containing multibyte characters are handled, combined with the behavior of the buffer.toString() method in Node.

langchainjs/libs/langchain-google-gauth/src/auth.ts, Line 15 in a1ed4fe:

    data.on("data", (data) => this.appendBuffer(data.toString()));

When receiving a stream containing multibyte characters, the point at which a chunk arrives (that is, when the readable.on('data', ...) callback is executed) may fall in the middle of a character's byte sequence. For instance, the emoji "👋" is represented in UTF-8 as 0xF0 0x9F 0x91 0x8B. The callback might be executed after only 0xF0 0x9F has been received.

buffer.toString() attempts to decode byte sequences assuming UTF-8 encoding. If the bytes are invalid, it does not throw an error; instead it silently outputs a REPLACEMENT CHARACTER (�).
https://nodejs.org/api/buffer.html#buffers-and-character-encodings

To resolve this, use TextDecoder with the stream option.
https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder/decode

Related Issues

The issue has been reported below, but it persists even in the latest version.
#4113

The same issue occurred when using Google Cloud's client libraries instead of LangChain, but it has been fixed.
googleapis/nodejs-vertexai#78
googleapis/nodejs-vertexai#86

I will send a Pull Request later, but I am not familiar with this codebase, and there are many google-related packages under libs/ which I have not grasped enough. Any advice would be appreciated.

System Info

macOS
node v20.12.2
langchain versions

    $ npm list --depth=1 | grep langchain
    ├─┬ @langchain/community@0.0.54
    │ ├── @langchain/core@0.1.61
    │ ├── @langchain/openai@0.0.28
    ├─┬ @langchain/google-vertexai@0.0.12
    │ ├── @langchain/core@0.1.61 deduped
    │ └── @langchain/google-gauth@0.0.12
    ├─┬ langchain@0.1.36
    │ ├── @langchain/community@0.0.54 deduped
    │ ├── @langchain/core@0.1.61 deduped
    │ ├── @langchain/openai@0.0.28 deduped
    │ ├── @langchain/textsplitters@0.0.0
    │ ├── langchainhub@0.0.8
pokutuna opened on 2024-05-04
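The decoding behavior described in the issue above can be reproduced in isolation. The sketch below is not the code from the actual fix; it only contrasts decoding each chunk independently (roughly what calling toString() per chunk does) with a single TextDecoder used with the stream option. The function names and the two-chunk split are illustrative assumptions.

```typescript
// The UTF-8 bytes of "👋" (0xF0 0x9F 0x91 0x8B), split across two chunks
// the way a network stream might deliver them.
const chunks: Uint8Array[] = [
  new Uint8Array([0xf0, 0x9f]),
  new Uint8Array([0x91, 0x8b]),
];

// Decode every chunk on its own: incomplete sequences become U+FFFD.
function decodePerChunk(parts: Uint8Array[]): string {
  return parts.map((part) => new TextDecoder("utf-8").decode(part)).join("");
}

// Decode with one decoder instance: with { stream: true } it holds back
// an incomplete trailing sequence until the next chunk arrives.
function decodeAsStream(parts: Uint8Array[]): string {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (const part of parts) {
    text += decoder.decode(part, { stream: true });
  }
  // A final call without the stream option flushes any buffered bytes.
  text += decoder.decode();
  return text;
}

const perChunk = decodePerChunk(chunks);
const streamed = decodeAsStream(chunks);
console.log(JSON.stringify(perChunk), JSON.stringify(streamed));
```

The key design point is that the decoder instance must outlive a single chunk; creating a new decoder per callback brings back the original problem even if the stream option is set.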
langchain-ai/langchainjs
2024-08-16
google[patch]: fix: handling multibyte characters in stream for google-vertexai-web
Fixes #6501 I have fixed this issue similarly to #5286. The approach is the same, but we need to use components that work in the Browser environment instead of Node. I previously fixed the same issue for @langchain/google-vertexai in #5285. Although I don't use @langchain/google-vertexai-web myself, I've also fixed this package as it was requested in the issue.
pokutuna opened on 2024-08-12
langchain-ai/langchainjs
2024-08-16
Replacement Character(�) appears in multibyte text output from Google VertexAI Web
Checked other resources

* I added a very descriptive title to this issue.
* I searched the LangChain.js documentation with the integrated search.
* I used the GitHub search to find a similar question and didn't find it.
* I am sure that this is a bug in LangChain.js rather than my code.
* The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

Make the model output long texts containing multibyte characters as a stream.

    import { VertexAI } from "@langchain/google-vertexai-web";

    const langchainModel = new VertexAI({
      model: "gemini-1.5-pro-001",
      location: "us-central1",
    });

    // EN: List as many Japanese proverbs as possible.
    const prompt = "日本のことわざをできるだけたくさん挙げて";

    const stream = await langchainModel.stream(prompt);
    const reader = stream.getReader();

    let buf = "";
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buf += value;
    }
    console.log(buf);

This code can be executed by creating a service account key from the Google Cloud Console and running it with the following command:

    $ GOOGLE_WEB_CREDENTIALS=$(cat ./key.json) npx tsx sample.ts

Error Message and Stack Trace (if applicable)

(No errors or stack traces occur)

Output Example: Includes Replacement Characters (�)

    ## ��本の諺 (ことわざ) - できるだけたくさん！
    **一般的な知��**
    * 石の上にも三年 (いしのうえにもさんねん) - Perseverance will pay off.
    * 七転び八起き (ななころびやおき) - Fall seven times, stand up eight.
    * 継続は力なり (けいぞくはちからなり) - Persistence is power.
    * 急がば回れ (い��がばまわれ) - Haste makes waste.
    * 井の中の蛙大海を知らず (いのなかのかわずたいかいをしらず) - A frog in a well knows nothing of the great ocean.
    * 良��は��に苦し (りょうやくはくちにくい) - Good medicine tastes bitter.
    * 猿も木から落ちる (さるもきからおちる) - Even monkeys fall from trees.
    * 転石苔を生ぜず (てんせきこけをしょうぜず) - A rolling stone gathers no moss.
    * 覆水盆に返らず (ふくすいぼんにかえらず) - Spilled water will not return to the tray.
    * 後生の祭り (ごしょうの��り) - Too late for regrets.
    * 習うより慣れろ (ならうよりなれろ) - Experience is the best teacher.
    * 鉄は熱いうちに打て (てつはあついうちにうて) - Strike while the iron is hot.
    ...

Description

This is the same issue as #5285. While #5285 is about @langchain/google-vertexai, this issue also occurs in @langchain/google-vertexai-web.

The problem occurs when a stream chunk is cut in the middle of a multibyte character. For detailed reasons, please refer to #5285.

I will submit a Pull Request with the fix shortly.

System Info

macOS
node v20.12.2
langchain versions

    $ npm list --depth=1 | grep langchain
    ├─┬ @langchain/google-vertexai-web@0.0.25
    │ ├── @langchain/core@0.2.23
    │ └── @langchain/google-webauth@0.0.25
    ├─┬ @langchain/google-vertexai@0.0.25
    │ ├── @langchain/core@0.2.23 deduped
    │ └── @langchain/google-gauth@0.0.25
    ├─┬ langchain@0.2.15
    │ ├── UNMET OPTIONAL DEPENDENCY @langchain/anthropic@*
    │ ├── UNMET OPTIONAL DEPENDENCY @langchain/aws@*
    │ ├── UNMET OPTIONAL DEPENDENCY @langchain/cohere@*
    │ ├── UNMET OPTIONAL DEPENDENCY @langchain/community@*
    │ ├── @langchain/core@0.2.23 deduped
    │ ├── UNMET OPTIONAL DEPENDENCY @langchain/google-genai@*
    │ ├── @langchain/google-vertexai@0.0.25 deduped
    │ ├── UNMET OPTIONAL DEPENDENCY @langchain/groq@*
    │ ├── UNMET OPTIONAL DEPENDENCY @langchain/mistralai@*
    │ ├── UNMET OPTIONAL DEPENDENCY @langchain/ollama@*
    │ ├── @langchain/openai@0.2.6
    │ ├── @langchain/textsplitters@0.0.3
pokutuna opened on 2024-08-12
kubeflow/pipelines
2024-07-16
[sdk] Bug when trying to iterate a list of dictionaries with ParallelFor
Environment

KFP SDK version: kfp==2.0.0b16
All dependencies version:

    kfp==2.0.0b16
    kfp-pipeline-spec==0.2.2
    kfp-server-api==2.0.0b1

Steps to reproduce

When running the code snippet below the following error is raised:

    kfp.components.types.type_utils.InconsistentTypeException: Incompatible argument passed to the input 'val_a' of component 'add': Argument type 'STRING' is incompatible with the input type 'NUMBER_INTEGER'

    # Imports added so the snippet is self-contained.
    from kfp import compiler, dsl


    @dsl.component()
    def add(val_a: int, val_b: int) -> int:
        return val_a + val_b


    @dsl.pipeline()
    def model_training_pipeline() -> None:
        with dsl.ParallelFor(
            items=[{"a": 1, "b": 10}, {"a": 2, "b": 20}], parallelism=1
        ) as item:
            task = add(val_a=item.a, val_b=item.b)


    compiler.Compiler().compile(
        pipeline_func=model_training_pipeline, package_path="/app/pipeline.yaml"
    )

Expected result

According to the ParallelFor documentation, the code sample above should compile without errors. The add component should receive the values of the dictionaries as integer arguments.

Materials and Reference

The code snippet below is a modification of the code snippet above, changing the add component to accept string arguments.

    # Imports added so the snippet is self-contained.
    from kfp import compiler, dsl


    @dsl.component()
    def add(val_a: str, val_b: str) -> int:
        return int(val_a) + int(val_b)


    @dsl.pipeline()
    def model_training_pipeline() -> None:
        with dsl.ParallelFor(
            items=[{"a": 1, "b": 10}, {"a": 2, "b": 20}], parallelism=1
        ) as item:
            task = add(val_a=item.a, val_b=item.b)


    compiler.Compiler().compile(
        pipeline_func=model_training_pipeline, package_path="/app/pipeline.yaml"
    )

The pipeline compiles without errors with this modification; however, it fails to run in Google Vertex Pipelines. The add component doesn't even run and throws the following error in the UI:

    Failed to evaluate the expression with error: INVALID_ARGUMENT: Failed to parseJson from string.; Failed to evaluate the parameter_expression_selector.

As the component's code is not even executed, it seems that the problem occurs when executing the DAG.

Here is the content of the pipeline.yaml that was compiled.

    # PIPELINE DEFINITION
    # Name: model-training-pipeline
    components:
      comp-add:
        executorLabel: exec-add
        inputDefinitions:
          parameters:
            val_a:
              parameterType: STRING
            val_b:
              parameterType: STRING
        outputDefinitions:
          parameters:
            Output:
              parameterType: NUMBER_INTEGER
      comp-for-loop-2:
        dag:
          tasks:
            add:
              cachingOptions:
                enableCache: true
              componentRef:
                name: comp-add
              inputs:
                parameters:
                  val_a:
                    componentInputParameter: pipelinechannel--loop-item-param-1
                    parameterExpressionSelector: parseJson(string_value)["a"]
                  val_b:
                    componentInputParameter: pipelinechannel--loop-item-param-1
                    parameterExpressionSelector: parseJson(string_value)["b"]
              taskInfo:
                name: add
        inputDefinitions:
          parameters:
            pipelinechannel--loop-item-param-1:
              parameterType: STRUCT
    deploymentSpec:
      executors:
        exec-add:
          container:
            args:
            - --executor_input
            - '{{$}}'
            - --function_to_execute
            - add
            command:
            - sh
            - -c
            - "\nif ! [ -x \"$(command -v pip)\" ]; then\n python3 -m ensurepip ||\
              \ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
              \ python3 -m pip install --quiet --no-warn-script-location 'kfp==2.0.0-beta.16'\
              \ && \"$0\" \"$@\"\n"
            - sh
            - -ec
            - 'program_path=$(mktemp -d)

              printf "%s" "$0" > "$program_path/ephemeral_component.py"

              python3 -m kfp.components.executor_main --component_module_path "$program_path/ephemeral_component.py" "$@"

              '
            - "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
              \ *\n\ndef add(val_a: str, val_b: str) -> int:\n return int(val_a) +\
              \ int(val_b)\n\n"
            image: python:3.7
    pipelineInfo:
      name: model-training-pipeline
    root:
      dag:
        tasks:
          for-loop-2:
            componentRef:
              name: comp-for-loop-2
            iteratorPolicy:
              parallelismLimit: 1
            parameterIterator:
              itemInput: pipelinechannel--loop-item-param-1
              items:
                raw: '[{"a": 1, "b": 10}, {"a": 2, "b": 20}]'
            taskInfo:
              name: for-loop-2
    schemaVersion: 2.1.0
    sdkVersion: kfp-2.0.0-beta.16

Impacted by this bug? Give it a 👍.
lucasvbalves opened on 2023-05-09
