Announcing Scry: A research tool over arXiv, EA Forum, etc. w/ SQL + Vibe/Vector Algebra

exopriors

Today I'm releasing ExoPriors Scry, a new research tool over the EA Forum, arXiv, HackerNews, community-archive.org, and more. Try it here.

TLDR

You and your agent can now query this rich dataset with the full expressive power of SQL (Structured Query Language) + embedding vector algebra. You DON'T have to be familiar with these, Claude writes the search queries for you. Just ask Claude whatever questions you have about the tool.
It's best driven via Claude Code (with Opus 4.5). I highly recommend getting acquainted with it. It can be driven by Claude in the browser--you just have to change one setting and paste in the prompt--but it's significantly slower and less agentic.
Even with all the fanciness of AI so far and Deep Research agents, this is a novel search/research experience, that allows you to do things you can't do anywhere else.
Exploration is encouraged! Text embeddings are kind of magical. This an easy way to get more familiar with them, like how to compose "vibes" like @FTX_controversy - (@guilt_tone - @guilt_topic), to find writing that was about the FTX crisis and distinctly without guilty tones, but that can mention "guilt".
There's an Alerts functionality. You can be emailed when some new post, comment, arXiv paper, etc., is ingested and meets your very nuanced criteria.

Some example usage:

> find the 4 posts over 100 karma that are most distant from each other in every way (not the average of them). we want to create 4 quadrants.

> what are the most whimsical and uplifting EA Forum posts?

> find me the most serious documents about civilizational refuges, and anyone who could be able to be rapidly mobilized to help with that.

> who flip flopped multiple times on FTX crisis, being against -> for sam -> against, or for sam -> against -> for sam, in time

> Which EA Forum posts predicted things about AI that turned out to be wrong, and what were the common failure modes?

> exhaustively explore who has tended to coin the most new terms

> the people who have had very uplifting and whimsical EA posts, how are they doing. does anyone seem like they could use support. followed up with: search for all of lesswrong, twitter, etc., of people who plausibly feel abandoned by EA. do sophisticated search.

^ I got a very meaningful response, but will leave out results for privacy.

Some cool aspects of this project:

hardening up a SQL database enough to let the public run queries. There's so much collective trauma about SQL injection attacks that most people have forgotten that this is possible.
I've built on syntactic sugar for using custom vectors. Agents can embed arbitrary queries and refer to them with @vector_handle syntax. This compactness helps agents reason efficiently, and let's us not have to pass around 8kb vectors.
product-as-a-prompt, agent-copilot-targeted UX as a paradigm. It was pretty cool realizing I could e.g. just describe my /feedback API endpoint in the prompt, to open up the easy communication channel with users and help me iterate on the project better.
There's currently over 23M embeddings and 600 GB of indexes for query performance, and long query time limits. You can learn a lot about the actual shape of the world with exhaustive search, like finding that no single person across Hacker News mentions a combination of particular topics across their comments.
there's an Alerts feature. The system ingests thousands of papers, posts, articles, comments a day, you can specify an arbitrary SQL query that will run multiple times a day, and get an email when the output changes. If you find a query you like, you can just tell Claude to submit that query as an Alert (and it'll use our API appropriately).

Bounty Offer

When I'm in the mindset of building this, I tend to struggle at coming up with interesting research questions. I will pay $12/$8/$4 to the three most interesting queries (rough prompt + SQL) provided in the comments, on January 1st.

Strategic Thoughts

I was able to build this in a few weeks with Opus 4.5 and GPT-5.2-codex. I would love to see EA getting more oriented to the new realities of software production, particularly since December 2025. Twitter is where the edge is regarding AI-assisted "hyper engineering" thought--I recommend following some of the builders I follow such as https://x.com/doodlestein and https://x.com/steipete.

I firmly see epistemic infrastructural software production as the premier mechanism of change in the world and witnessing existential security in action. Bits must be fought predominantly with bits (and good culture for guiding what gets built).

Happy to take any feedback!

P.S. If anyone wants to sponsor this, I could ingest a large fraction of all academic papers, Bluesky, and dozens of other sources that you point out.

35 Reactions

More posts like this

Comments12

Sorted by

New & upvoted

Click to highlight new comments since: Today at 11:05 AM

david_reinsteinJan 94

Trying this out for using for various Unjournal.org processes (like prioritizing research, finding potential evaluators, linking research to pivotal questions) and projects (assessing LLM vs human research evaluations). Some initial forays (comming from a conversation with Xyra). I still need to human-check it.

~prompt to Claude code about @Toby_Ord and How Well Does RL Scale?

``Toby Ord's writing -- what do the clusters look like? What other research/experts come closest to his post .... https://forum.effectivealtruism.org/posts/TysuCdgwDnQjH3LyY/how-well-does-rl-scale``

This interactive visual (with some extra prompts)

Key Takeaways

For “How Well Does RL Scale?”, the closest intellectual neighbors are:
ML researchers focused on compute efficiency and scaling laws (e.g. 1a3orn, Pablo Villalobos), and
- AI safety researchers analyzing inference scaling implications (e.g. Ryan Kidd, ryan_greenblatt).
Stuart Armstrong appears to be the only researcher in this corpus writing substantively across all three of Toby Ord’s main intellectual themes:
- AI scaling
- Existential risk / longtermism
- Infinite ethics and decision theory

Most Similar Researchers to“How Well Does RL Scale?”

Top Authors (by Semantic Similarity)

Author	Distance	Key Work
1a3orn	0.143	Parameter Scaling Comes for RL, Maybe; New Scaling Laws for LLMs
Pablo Villalobos	0.150	Trading off compute in training and inference
Matrice Jacobine	0.168	Does RL Really Incentivize Reasoning Capacity in LLMs?
Lukas Finnveden	0.169	Before smart AI, there will be many mediocre or specialized AIs
ryan_greenblatt	0.179	What’s going on with AI progress and trends?
Ryan Kidd	0.188	Implications of the inference scaling paradigm for AI safety

@1a3orn @Pablo Villalobos @Matrice Jacobine🔸🏳️‍⚧️ @Lukas Finnveden @Ryan Greenblatt @Ryan Kidd -- if you have a chance, let me know if this is accurate/relevant.

Most Similar arXiv Paper

Paper	Authors	Distance
The Art of Scaling RL Compute for LLMs	Khatri et al.	0.125
Webscale-RL: Automated Data Pipeline for Scaling RL	Cen et al.	0.142
AReaL: Large-Scale Asynchronous RL for Language Reasoning	Fu et al.	0.142
Does RLHF Scale?	Hou et al.	0.162

Toby Ord’s Writing Clusters

Based on semantic analysis, Toby Ord’s writing appears to fall into three main thematic clusters.

Cluster 1: AI Scaling & Timelines (Recent)

“How Well Does RL Scale?” (Oct 2025)
- Core claim: RL training scales poorly; most gains come from inference scaling (longer chains of thought), not training compute.
- Key implication: This likely lengthens AI timelines and affects governance and safety strategies.

Cluster 2: Existential Risk & Longtermism

The Precipice and related work
“The Precipice Revisited” (Jul 2024)
“Shaping Humanity’s Longterm Trajectory” (Jul 2023)
“Robust Longterm Comparisons” (May 2024)

Topics include:

Existential risk estimation
Discounting
Anthropic effects / anthropic shadow

Cluster 3: Infinite Ethics & Decision Theory

“Evaluating the Infinite” (Sep 2025) — hyperreal-based approach to divergent sums and integrals
“What’s wrong with the dogmatic solution to recklessness?” (Feb 2023)
“The Ethics of Copying Conscious States” (Aug 2025)

Topics include:

Pascal’s Mugging
Unbounded utility
Decision-making under infinity

Matrice Jacobine🔸🏳️‍⚧️Jan 91

This was a linkpost, I didn't write that paper.

david_reinsteinJan 92

Thanks, helpful to understand the limitations of this tool

EdoAradDec 26 20256

I'm curious about how representative the EA forum is to the EA community, particularly in regards to worldview. One thing you could try is to take the EA community surveys done by RP, and estimate how a random representative of "the forum" would answer, where the random representative might be a random user or a random comment/post weighted by karma or amount of text

exopriorsDec 27 2025*2

I fed your message to Scry! Seems the forum is significantly more longtermist.

Clara Torres Latorre 🔸Dec 27 20255

99% karma-weighted of tagged posts about AI seems wrong

if you check the top 4 posts of all time, the 1st and 3rd are about FTX, the 2nd about earning to give and the 4th about health, totalling > 2k karma

might want to check for bugs

exopriorsDec 27 2025*1

You're so right, I should have caught that. The database data is good, but Opus 4.5 did a rare error. I've since updated the response.

EdoAradDec 28 20252

Im still in awe of current AI capabilities

exopriorsDec 31 20251

It's hard to fathom what's straightforwardly possible already.

exopriorsJan 141

It's important to mention that this project has received $0 in funding from EAIF, LTFF, Manifund, and other EA/EA-adjacent orgs. That tells you A LOT about the epistemic inclination and technological orientation of funders in the community. If it were not for a random angel investor that just ASKED to invest $5k after this went viral on Hacker News, I would be in a bleak financial situation and plausibly unable to even maintain this tool.

david_reinsteinJan 92

Did a decent job for this academic paper, but I think it’s hampered by only having content from Arxiv and various EA/tech forums. Still, it generated some interesting leads.

https://gistcdn.githack.com/daaronr/b9447c40a7a6b948f399073496f98c37/raw/scanner_elasticity_experts.html

Prompt:

... find the most relevant authors and work for Observational price variation in scanner data cannot reproduce experimental price elasticities https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4899765 -- we're looking for methodological experts to evaluate this for The Unjournal to inform our pivotal question "How do plant-based products substitute for animal products (welfare footprint)?"

exopriorsJan 11

Update: The service has been improving a lot. Infra is more robust, higher query limits, more sources, a further optimized Claude Prompt with much improved guidance for meaningful embedding vector manipulation. Someone could write some interesting quicktakes from the research you're easily able to do with this.

[comment deleted]Jan 91

Deleted by The Unjournal (bot), 01/09/2026

Reason: meant to post from regular account