Akari Asai

2,008 posts
Akari Asai
@AkariAsai
Incoming Assistant Professor (Hiring Ph.D. students for Fall 2026) & research scientist, OLMo. akariasai @ 🦋

Akari Asai’s posts

Pinned
1/ Hiring PhD students at CMU SCS (LTI/MLD) for Fall 2026 (Deadline 12/10) 🎓 I work on open, reliable LMs: augmented LMs & agents (RAG, tool use, deep research), safety (hallucinations, copyright), and AI for science, code & multilinguality & open to bold new ideas! FAQ in 🧵
1/ Introducing ᴏᴘᴇɴꜱᴄʜᴏʟᴀʀ: a retrieval-augmented LM to help scientists synthesize knowledge 📚 With open models & 45M-paper datastores, it outperforms proprietary systems & matches human experts. Try out our demo! We also introduce ꜱᴄʜᴏʟᴀʀQᴀʙᴇɴᴄʜ,
🚨 I’m on the job market this year! 🚨 I’m completing my Ph.D. (2025), where I identify and tackle key LLM limitations like hallucinations by developing new models—Retrieval-Augmented LMs—to build more reliable real-world AI systems. Learn more in the thread! 🧵
Overview of Akari's research. More information is at https://akariasai.github.io/
New paper 🚨 arxiv.org/abs/2211.09260 Can we train a single search system that satisfies our diverse information needs? We present 𝕋𝔸ℝ𝕋 🥧 the first multi-task instruction-following retriever, trained on 𝔹𝔼ℝℝ𝕀 🫐, a collection of 40 retrieval tasks with instructions! 1/N
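The core idea of an instruction-following retriever is that the task instruction is combined with the query, so one retriever can serve many tasks. A minimal sketch of that input format — `format_query` is a hypothetical helper, and TART's actual input format may differ:

```python
def format_query(instruction: str, query: str) -> str:
    """Prepend a task instruction to a query before retrieval.

    A separator token keeps the instruction and query distinct;
    the [SEP] convention here is an assumption, not TART's exact format.
    """
    return f"{instruction} [SEP] {query}"


# The same query under different instructions yields different retrieval inputs,
# which is what lets a single retriever cover many tasks.
q = "who wrote Hamlet?"
qa_input = format_query("Retrieve a Wikipedia paragraph that answers this question.", q)
dup_input = format_query("Find a question that is a duplicate of this one.", q)
```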
(Jumping on the bandwagon...) I entered the University of Tokyo in the humanities track and initially went into the Faculty of Economics, but I ended up graduating from the Department of Electronic and Information Engineering in the Faculty of Engineering, and now I do NLP/machine learning research in a CS Ph.D. program in the US. I sometimes wish I had started programming earlier (I had no experience until I was 20), but it's fun 😀
Quote
五十嵐祐花
@00_
This is really belated, but it's almost laughable how reckless it was for someone who has only ever had below-average math talent (I couldn't even fully solve more than two problems on the UTokyo math exam) and who, until my second year of high school, thought I'd go the humanities route, to end up in a Ph.D. program at MIT. How did this happen....?
𝗛𝗼𝘄 𝗰𝗮𝗻 𝘄𝗲 𝗯𝘂𝗶𝗹𝗱 𝗺𝗼𝗿𝗲 𝗿𝗲𝗹𝗶𝗮𝗯𝗹𝗲 𝗟𝗠-𝗯𝗮𝘀𝗲𝗱 𝘀𝘆𝘀𝘁𝗲𝗺𝘀? Our new position paper advocates for retrieval-augmented LMs (RALMs) as the next gen. of LMs, exploring the promises, limitations, and a roadmap for wider adoption. arxiv.org/abs/2403.03187 🧵
New paper 🚨 Can LLMs perform well across languages? Our new benchmark BUFFET enables fair evaluation of few-shot NLP across languages at scale. Surprisingly, LLMs + in-context learning (incl. ChatGPT) are often outperformed by much smaller fine-tuned LMs 🍽️tinyurl.com/BuffetFS
Recently I gave a lecture about retrieval-augmented LMs like RAG, covering their advantages, an overview of diverse methods, and current limitations & opportunities, based on this position paper. akariasai.github.io/assets/pdf/aka video: shorturl.at/ahmq8 Feedback is welcome :)
Quote
Akari Asai
@AkariAsai
𝗛𝗼𝘄 𝗰𝗮𝗻 𝘄𝗲 𝗯𝘂𝗶𝗹𝗱 𝗺𝗼𝗿𝗲 𝗿𝗲𝗹𝗶𝗮𝗯𝗹𝗲 𝗟𝗠-𝗯𝗮𝘀𝗲𝗱 𝘀𝘆𝘀𝘁𝗲𝗺𝘀? Our new position paper advocates for retrieval-augmented LMs (RALMs) as the next gen. of LMs, exploring the promises, limitations, and a roadmap for wider adoption. arxiv.org/abs/2403.03187 🧵
Our paper got the ACL 2023 Best Video Award (presented at EMNLP)! The video is available at youtu.be/hJbxW0xct2E?si This 5-minute video summarizes the interesting findings on (1) when LLMs hallucinate (and why scaling may not help) and (2) how retrieval-augmented LMs alleviate it.
Quote
Alex Mallen
@alextmallen
Our work "When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories" will appear in #ACL2023!! This is my first NLP conference paper and I'm very happy I got to pursue this project with these amazing people at UW! x.com/AkariAsai/stat…
This is a comprehensive list of the must-read papers on the recent progress of self-supervised NLP models (and the impressive capabilities of LLMs), with great summary slides! I also love the role-playing paper-reading seminar format! (colinraffel.com/blog/role-play)
Quote
Daniel Khashabi
@DanielKhashabi
For my first course at @jhuclsp, I am leading a class on recent developments in "self-supervised models." Here is the list of the papers and slides we cover: self-supervised.cs.jhu.edu Would love to hear Twitter's suggestions for additional exciting developments to discuss!
A powerful retriever + pre-trained generator (e.g., DPR+T5) often relies on spurious cues / generates hallucinations. Our 𝕖𝕧𝕚𝕕𝕖𝕟𝕥𝕚𝕒𝕝𝕚𝕥𝕪-guided generator learns to focus on the right passages when generating and shows large improvements in QA/fact verification/dialogue👇
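The intuition behind evidentiality-guided generation is that retrieved passages are scored for whether they actually support an answer, and non-supporting passages are down-weighted or dropped before generation. A hypothetical sketch: the real work trains an evidentiality classifier jointly with the generator, whereas `score_evidentiality` below is just a term-overlap stand-in, and the threshold is an invented parameter.

```python
def score_evidentiality(passage: str, question: str) -> float:
    """Stand-in evidentiality score: fraction of question terms in the passage.

    The actual method uses a learned classifier, not lexical overlap.
    """
    q_terms = set(question.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0

def filter_passages(passages: list[str], question: str,
                    threshold: float = 0.3) -> list[str]:
    """Keep only passages judged likely to support the answer."""
    return [p for p in passages if score_evidentiality(p, question) >= threshold]
```

Conditioning the generator only on passages that pass the evidentiality check is what reduces reliance on spurious, topically similar but non-supporting text.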
Our #ICLR2020 camera-ready version, code, and blog are now available! paper: arxiv.org/abs/1911.10470 code: github.com/AkariAsai/lear blog: blog.einstein.ai/learning-to-re You can train, evaluate, and run an interactive demo on your machine. We also release the models for reproducibility.
Quote
Akari Asai
@AkariAsai
New work with Kazuma Hashimoto, @HannaHajishirzi, @RichardSocher, and @CaimingXiong at @SFResearch and @uwnlp! Our trainable graph-based retriever-reader framework for open-domain QA advances state of the art on HotpotQA, SQuAD Open, Natural Questions Open. 👇1/7