ホーダチ-Hodatsu | LLM Researcher × AI Engineer

11.1K posts


@hokazuya

AI Engineer. I only post about cutting-edge tech, things that catch my interest, and random notes | My face-photo is made with Midjourney | Personal projects: github.com/hokar3361 | huggingface.co/hokar3361

Hokkaido, Joined May 2013

19.4K Followers

ホーダチ-Hodatsu | LLM Researcher × AI Engineer’s posts

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

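Disney's walkable floor tiles for VR have come a ridiculously long way. They can even rotate a chair. Honestly, it's at the point where you don't need a treadmill.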


ホーダチ-Hodatsu | LLM Researcher × AI Engineer

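Something wild just dropped. Even if you can't draw, sketch something roughly close and the AI turns the overall feel into a logo. Even with my poor drawing skills, I can easily make decent-looking logos and icons.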


ホーダチ-Hodatsu | LLM Researcher × AI Engineer

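Offering this for free is seriously amazing. Shinjuku, auto-generated from satellite data and usable in Unreal Engine (1/10). If maps of all of Japan and the whole world can be recreated like this, virtual worlds will expand much further. Nothing but dreams here. For the record, I have zero familiarity with Shinjuku, so I can't tell what's what.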


ホーダチ-Hodatsu | LLM Researcher × AI Engineer

Runway is too good. It adds motion to a selected region of an image and animates it. It's at the "what even is this" level.


ホーダチ-Hodatsu | LLM Researcher × AI Engineer

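I looked into video-conversion AI for the first time in a while, and Viggle AI has advanced like crazy. They call it a style change, but this is what you get from the original video plus a single still image of the person you want to swap in. Unavoidable cast replacements, remakes, and so on: it felt like a technology with plenty of use cases.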


ホーダチ-Hodatsu | LLM Researcher × AI Engineer

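What is this. Google Earth has evolved like crazy. Aerial footage around Tokyo Station. Google Earth now has an aerial filming feature. I could create real-world footage with almost the same feel as doing camera work in UE. On top of that, all the geographic information it holds is rendered. This might be a new way of producing content. By the way, I have no filming sense and no familiarity with the area.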


ホーダチ-Hodatsu | LLM Researcher × AI Engineer

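Motion-generation AI for games is surging ahead. motorica, an AI for motion matching, is impressive. It removes the need for motion capture and produces high-quality animation like Assassin's Creed or The Last of Us in minutes rather than months. What a time to be alive.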


ホーダチ-Hodatsu | LLM Researcher × AI Engineer

Love this so much

Apparently someone implemented unnecessary explosions in a code editor

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

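A super useful one just arrived. Not logos this time, but ultra-realistic pictures. So a sketch at this level is enough? I have zero artistic sense, so I'll rely on AI for both logos and pictures.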


ホーダチ-Hodatsu | LLM Researcher × AI Engineer

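What is this. The realism is insane. And it's J-Moshi, made in Japan. If it is genuinely at this level, the quality is far beyond OpenAI's Realtime API. It honestly sounds like nothing but a human speaking.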

From Atsumoto Ohashi

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

What on earth. The match on the TV is showing up on the board right in front of you.

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

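Deep Research is just the best. For researching a family trip plan, I gave it instructions and it looked up the public transportation to use, down to the timetables, and laid everything out precisely. The output was prose, but I simply copied it and had it rendered as a table with HTMLCanvas, and this is the result. Now all that's left is showing it to the family.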

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

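The site is here: Logo Diffusion logodiffusion.com. It worked during the day, but it's now so popular that the server seems overloaded and Generate keeps failing. (It's daytime over there, after all.) Usage is very simple:
1. Sign in from the page above
2. Select ImageToImage in the left-hand menu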

AI Logo Maker | Logo Maker | Logo Diffusion

From logodiffusion.com

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

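Whoa, something amazing showed up right on New Year's Day

An AI paint tool that apparently lets you set face direction, background, clothing, and more through instructions and an intuitive UI. I really like services that make me feel even I could make something with them.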

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

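Anyone who works with Japanese OCR, please give this a try. It's an OSS OCR product called "Yomitoku". I saw it a while ago and meant to try it, but got busy and never got around to it. An acquaintance reminded me it existed, I realized I still hadn't touched it, and so I tried it.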

The image shows a comparison between two documents: one labeled '入力画面' (Input Screen) and the other 'OCR の結果' (OCR Result). The left side displays the original document, while the right side shows the output after applying the OCR tool 'Yomitoku'. The post text highlights the effectiveness of Yomitoku, an open-source OCR product from Japan, praising its high accuracy in extracting Japanese text from PDFs, including tables and images. The user emphasizes the tool's quality, offline capability, and CPU efficiency, suggesting it's comparable to paid services like Azure Document Intelligence. The image is a clear demonstration of Yomitoku's capabilities, showcasing its potential for widespread adoption in Japan.

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

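This is nice. This time it's not limited to MacBooks: as long as Docker runs, you can quickly get a Linux desktop for AI agent automation without cluttering your own PC. A container with a GUI has arrived as open source: Bytebot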

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

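This is great. An official GitHub MCP server has arrived that lets you carry out complex GitHub workflows and actions in natural language. Looking up module details was quietly tedious and time-consuming, so researching and then also implementing things was a slog.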

The image is a screenshot of a webpage detailing the GitHub MCP (Model Context Protocol) server, which allows users to implement complex GitHub workflows and actions using natural language. The page is in Japanese and includes sections for installation options (VS Code Install Server and VS Code Insiders Install Server) and features. The context from the post suggests that this tool is highly praised for simplifying the process of setting up and implementing GitHub actions, which was previously time-consuming and cumbersome. The post text indicates excitement about the official launch of this server by GitHub, highlighting its potential impact on API usage based on MCP compatibility.

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

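[OpenAI Spring Update: quick summary]
① GPT-4o API announced. Compared with GPT-4 Turbo:
・2x faster
・Half the price
・Rate limits relaxed to allow up to 5x the usage
② Voice (Text-To-Speech):
・Faster response
・Emotional expression (a bit too emotional, lol)
・Singing (so it can sing too)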

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

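A super practical OSS that feels like the finished form of RAG has arrived. It's "RAGFlow", an OSS project with 55K stars. Drop vague RAG and move to "evidence-based answers". The perennial challenges of LLMs are credibility and accuracy. However far LLMs advance, they can't keep up with constantly changing information. That means adoption stalls in the many business tasks where accuracy matters.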

The image showcases the RAGFlow open-source software, which is highlighted as a practical and advanced implementation of Retrieval-Augmented Generation (RAG). The screenshot includes the RAGFlow interface, displaying various components like 'finance', 'medical', 'websearch', 'weibo', and 'smalltalk' categories, along with a chat interface showing a sample conversation. The post text emphasizes RAGFlow's ability to provide 'evidence-based answers' by understanding document structures and linking responses to specific sources, making it suitable for critical areas like healthcare, legal, and internal operations. The image also features navigation options like 'Document', 'Roadmap', 'Twitter', 'Discord', and 'Demo', indicating community engagement and development resources.
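The idea of "evidence-based answers" linked to specific sources can be illustrated with a toy sketch. This is not RAGFlow's API or retrieval method: the chunk data, the keyword-overlap scoring, and the function names `retrieve` and `answer_with_citations` are all assumptions made for illustration only.

```python
import re

# Hypothetical document chunks, each tagged with the source it came from.
CHUNKS = [
    {"source": "policy.pdf#p3", "text": "Refunds are accepted within 30 days of purchase."},
    {"source": "faq.md#shipping", "text": "Standard shipping takes 5 business days."},
    {"source": "policy.pdf#p7", "text": "Refunds for sale items are store credit only."},
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, chunks, top_k=2):
    """Rank chunks by keyword overlap with the question (a stand-in for real retrieval)."""
    q = tokenize(question)
    scored = [(len(q & tokenize(c["text"])), c) for c in chunks]
    scored = [s for s in scored if s[0] > 0]  # drop chunks with no overlap at all
    scored.sort(key=lambda s: s[0], reverse=True)
    return [c for _, c in scored[:top_k]]


def answer_with_citations(question, chunks):
    """Return the supporting passages, each followed by its source tag."""
    hits = retrieve(question, chunks)
    if not hits:
        return "No supporting passage found."
    return "\n".join(f'{c["text"]} [{c["source"]}]' for c in hits)


print(answer_with_citations("How long do refunds take?", CHUNKS))
```

The point of the sketch is the shape of the output: every returned passage carries the source it was taken from, and when nothing matches the function says so instead of guessing.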

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

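Real-time speed camera. Apparently it's done with vision. It doesn't use radar; it works from video, so it could also be applied after the fact. I thought of dashcams and surveillance cameras, but if after-the-fact analysis works, that seems sufficient. Vision looks set to be useful in many areas.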

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

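This is good news. The standard for all developers is finally about to change. VSCode's agent mode is now free for all users. "Agent mode" has officially launched in VSCode, which many developers use.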

The image is a diagram illustrating the architecture of an AI agent integrated into VSCode, referred to as 'Agent Mode'. It shows how a user interacts with the agent, which in turn communicates with a large language model (LLM) in a loop. The agent can perform various tasks such as reading files, editing files, running commands in the terminal, and more, by interfacing with different tools and APIs. The diagram includes components like the user, agent, LLM, tools, and various APIs (e.g., GitHub, VS Code Extension). This integration signifies a significant shift in standard development practices, making AI-assisted development a norm in the developer community, as highlighted in the post by ホーダチ-Hodatsu | LLM Researcher × AI Engineer (@hokazuya). The post emphasizes that this feature will become a standard tool for all developers, enhancing productivity by providing an 'additional self' within the editor.
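The user, agent, LLM, and tools loop described in the diagram can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not VSCode's implementation: `fake_llm` is a hypothetical stand-in for a real model call, and the tools `read_file` and `run_command` are stubs that return placeholder strings.

```python
from typing import Callable


# Hypothetical tools; a real agent would actually read files or run terminal commands.
def read_file(path: str) -> str:
    return f"<contents of {path}>"


def run_command(cmd: str) -> str:
    return f"<output of {cmd}>"


TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": read_file,
    "run_command": run_command,
}


def fake_llm(history: list[str]) -> dict:
    """Stand-in for an LLM call: requests one tool, then finishes."""
    tool_results = [h for h in history if h.startswith("tool:")]
    if not tool_results:
        return {"action": "read_file", "input": "main.py"}
    return {"action": "finish", "input": f"Done after {len(tool_results)} tool call(s)."}


def run_agent(task: str, max_steps: int = 5) -> str:
    """Loop: ask the model what to do, run the chosen tool, feed the result back."""
    history = [f"user: {task}"]
    for _ in range(max_steps):
        decision = fake_llm(history)
        if decision["action"] == "finish":
            return decision["input"]
        tool = TOOLS[decision["action"]]
        history.append(f"tool: {tool(decision['input'])}")
    return "Stopped: step limit reached."


print(run_agent("Summarize main.py"))
```

The `max_steps` cap is a design choice of this sketch: it keeps the loop from running forever if the model never decides to finish.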

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

Whoa. Way too convenient. For a GitHub codebase, just change the URL to gitdiagram and you get this. I'm definitely using this.
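The URL swap can be sketched as a small helper. This assumes the service lives at gitdiagram.com and mirrors the `owner/repo` path of the GitHub URL; the function name `to_gitdiagram_url` and the example repository path are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def to_gitdiagram_url(github_url: str) -> str:
    """Swap the github.com host for gitdiagram.com, keeping the repository path."""
    parts = urlsplit(github_url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {github_url}")
    # Query string and fragment are dropped; only scheme and path are carried over.
    return urlunsplit((parts.scheme, "gitdiagram.com", parts.path, "", ""))


# Hypothetical repository path, used only to show the transformation.
print(to_gitdiagram_url("https://github.com/owner/repo"))
```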

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

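I get it now: how to use Grok. Ask it to research based on "X posts", looking up posts from year XXXX onward, and to summarize the trend chronologically, and it finds out fairly accurately when something was posted about. This is something OpenAI's Deep Research can't do,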