Listen here: Spotify, Google Podcasts, Pocket Casts, Apple. Or, just search for "Nonlinear Library" in your preferred podcasting app.
We are excited to announce the launch of The Nonlinear Library, which allows you to easily listen to top EA content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs.
In the rest of this post, we’ll explain our reasoning for the audio library, why it’s useful, why it’s potentially high impact, its limitations, and our plans. You can read it here or listen to the post in podcast form here.
Goal: increase the number of people who read EA research
An EA koan: if your research is high quality, but nobody reads it, does it have an impact?
Generally speaking, the theory of change of research is that you investigate an area, come to better conclusions, people read those conclusions, they make better decisions, all ultimately leading to a better world. So the answer is no. Barring some edge cases (1), if nobody reads your research, you usually won’t have any impact.
Research → Better conclusion → People learn about conclusion → People make better decisions → The world is better
Nonlinear is working on the third step of this pipeline: increasing the number of people engaging with the research. By increasing the total number of EA articles read, we’re increasing the impact of all of that content.
This is often relatively neglected because researchers typically prefer doing more research instead of promoting their existing output. Some EAs seem to think that if their article was promoted one time, in one location, such as the EA Forum, then surely most of the community saw it and read it. In reality, it is rare that more than a small percentage of the community will read even the top posts. This is an expected-value tragedy, when a researcher puts hundreds of hours of work into an important report which only a handful of people read, dramatically reducing its potential impact.
Here are some purely hypothetical numbers just to illustrate this way of thinking:
Imagine that you, a researcher, have spent 100 hours producing outstanding research that is relevant to 1,000 out of a total of 10,000 EAs.
Each relevant EA who reads your research will generate $1,000 of positive impact. So, if all 1,000 relevant EAs read your research, you will generate $1 million of impact.
You post it to the EA Forum, where posts receive 500 views on average. Let’s say, because your report is long, only 20% read the whole thing - that’s 100 readers. So you’ve created 100*1,000 = $100,000 of impact. Since you spent 100 hours and created $100,000 of impact, that’s $1,000 per hour - pretty good!
But if you were to spend, say, 1 hour promoting your report - for example, by posting links on EA-related Facebook groups - to generate another 100 readers, that would produce another $100,000 of impact. That’s $100,000 per marginal hour, or ~$2,000 per hour ($200,000 over 101 hours) taking into account the fixed cost of doing the original research.
Likewise, if another 100 EAs were to listen to your report while commuting, that would generate an incremental $100,000 of impact - at virtually no cost, since it’s fully automated.
In this illustrative example, you’ve nearly tripled your cost-effectiveness and impact with one extra hour spent sharing your findings and having a public system that turns it into audio for you.
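The arithmetic in this example can be checked with a short script. This is only a sketch of the hypothetical numbers above; the function and variable names are ours, not part of any Nonlinear tooling.

```python
# Purely hypothetical figures taken from the illustrative example above.
VALUE_PER_READER = 1_000   # dollars of impact per relevant reader
RESEARCH_HOURS = 100       # hours spent producing the report
PROMOTION_HOURS = 1        # extra hour spent sharing links


def impact_per_hour(readers: int, hours: int) -> float:
    """Dollars of impact generated per hour of researcher time."""
    return readers * VALUE_PER_READER / hours


forum_readers = 500 * 20 // 100   # 500 views, 20% read the whole report
promo_readers = 100               # readers gained from one hour of promotion
audio_readers = 100               # listeners gained from the automated audio version

baseline = impact_per_hour(forum_readers, RESEARCH_HOURS)
with_promo = impact_per_hour(
    forum_readers + promo_readers, RESEARCH_HOURS + PROMOTION_HOURS
)
with_audio = impact_per_hour(
    forum_readers + promo_readers + audio_readers,
    RESEARCH_HOURS + PROMOTION_HOURS,
)

print(f"Forum only:        ${baseline:,.0f} per hour")
print(f"Plus promotion:    ${with_promo:,.0f} per hour")
print(f"Plus audio:        ${with_audio:,.0f} per hour")
print(f"Overall multiple:  {with_audio / baseline:.2f}x")
```

The overall multiple comes out just under 3, which is where "nearly tripled" comes from: three times the readers for 101 hours instead of 100.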
Another way the audio library is high expected value is that instead of acting as a multiplier on just one researcher or one organization, it acts as a multiplier on nearly the entire output of the EA research community. This allows for two benefits: long-tail capture and the power of large numbers and multipliers.
Long-tail capture. The value of research is extremely long tailed, with a small fraction of the research having far more impact than others. Unfortunately, it’s not easy to do highly impactful research or predict in advance which topics will lead to the most traction. If you as a researcher want to do research that dramatically changes the landscape, your odds are low. However, if you increase the impact of most of the EA community’s research output, you also “capture” the impact of the long tails when they occur. Your probability of applying a multiplier to very impactful research is actually quite high.
Power of large numbers and multipliers. If you apply a multiplier to a bigger number, you have a proportionately larger impact. This means that even a small increase in the multiplier leads to outsized improvements in output. For example, if a single researcher toiled away to increase their readership by 50%, that would likely have a smaller impact than the Nonlinear Library increasing the readership of the EA Forum by even 1%. This is because 50% times a small number is still very small, whereas 1% times a large number is actually quite large. And there’s reason to believe that the library could have much larger effects on readership, which brings us to our next section.
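To make the multiplier comparison concrete, here is a minimal sketch. The post gives no readership figures, so both baselines below are invented assumptions chosen only to show the shape of the argument.

```python
# All readership figures here are hypothetical, chosen only for illustration.
def extra_reads(baseline_reads: int, percent_increase: int) -> int:
    """Additional reads gained from a percentage increase on a baseline."""
    return baseline_reads * percent_increase // 100


single_researcher_reads = 500    # assumed yearly reads of one researcher's posts
whole_forum_reads = 1_000_000    # assumed yearly reads across the whole forum

researcher_gain = extra_reads(single_researcher_reads, 50)  # 50% of a small number
forum_gain = extra_reads(whole_forum_reads, 1)              # 1% of a large number

print(f"Researcher gains {researcher_gain} reads; the forum gains {forum_gain} reads")
```

With these assumed baselines, the 1% forum-wide increase adds far more reads than the 50% increase for a single researcher. The conclusion depends entirely on how much larger the second baseline is than the first.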
Why it’s useful
EA needs more audio content
EA has a vibrant online community, and there is an amazing amount of well researched, insightful, and high impact content. Unfortunately, it’s almost entirely in writing and very little is in audio format.
There are a handful of great podcasts, such as the 80,000 Hours and FLI podcasts, and some books are available on Audible. However, these episodes come out relatively infrequently and the books even less so. There are a few other EA-related podcasts, including one for the EA Forum, but a substantial percentage have become dormant, as is far too common for channels because of the considerable amount of effort required to put out episodes.
There are a lot of listeners
The limited availability of audio is a shame because many people love to listen to content. For example, ever since the 80,000 Hours podcast came out, a common way for people to become more fully engaged in EA is to mainline all of their episodes. Many others got involved through binging the HPMOR audiobook, as Nick Lowry puts it in this meme. We are definitely a community of podcast listeners.
Why audio? Often, you can’t read with your eyes but you can with your ears, for example when you’re working out, commuting, or doing chores. Sometimes it’s just for a change of pace. In addition, some people find listening to be easier than reading. Because it feels easier, they choose to spend time learning that might otherwise be spent on lower value things.
Regardless, if you like to listen to EA content, you’ll quickly run out of relevant podcasts - especially if you’re listening at 2-3x speed - and have to either use your own text-to-speech software or listen to topics that are less relevant to your interests.
Existing text-to-speech solutions are sub-optimal
We’ve experimented extensively with text-to-speech software over the years, and all of the dozens of programs we’ve tried have fairly substantial flaws. In fact, a huge inspiration for this project was our frustration with the existing solutions and thinking that there must be a better way. Here are some of the problems that often occur with these apps:
- They're glitchy, frequently crashing, losing your spot, failing at handling formatting edge cases, etc.
- Their playlists don’t work or exist, so you’ll pause every 2-7 minutes to pick a new article to read, making it awkward to use during commutes, workouts, or chores. Or maybe you can’t change the order, like with Pocket, which makes it unusable for many.
- They’re platform specific, forcing you to download yet another app, instead of, say, the podcast app you already use.
- Pause buttons on headphones don’t work, making it exasperating to use when you’re being interrupted frequently.
- Their UI is bad, requiring you to constantly fiddle around with the settings.
- They don’t automatically add new posts. You have to do it manually, thus often missing important updates.
- They use old, low-quality voices, instead of the newer, way better ones. Voices have improved a lot in the last year.
- They cost money, creating yet another barrier to the content.
- They limit you to 2x speed (at most), and their original voices are slower than most human speech, so it’s more like 1.75x. This is irritating if you’re used to faster speeds.
In the end, this leads to only the most motivated people using the services, leaving out a huge percentage of the potential audience.
How The Nonlinear Library fixes these problems
To make it as seamless as possible for EAs to use, we decided to release it as a podcast so you can use the podcast app you’re already familiar with. Additionally, podcast players tend to be reasonably well designed and offer great customizability of playlists and speeds.
We’re paying for some of the best AI voices because old voices suck. And we spent a bunch of time fixing weird formatting errors and mispronunciations and have a system to fix other recurring ones. If you spot any frequent mispronunciations or bugs, please report them in this form so we can continue improving the service.
Initially, as an MVP, we’re just posting each day’s top upvoted articles from the EA Forum, Alignment Forum, and LessWrong. (2) We are planning on increasing the size and quality of the library over time to make it a more thorough and helpful resource.
Why not have a human read the content?
The Astral Codex Ten podcast and other rationalist podcasts do this. We seriously considered this, but it’s just too time consuming, and there is a lot of written content. Given the value of EA time, both financially and counterfactually, this wasn’t a very appealing solution. We looked into hiring remote workers but that would still have ended up costing at least $30 an episode. This compares to approximately $1 an episode via text-to-speech software.
Beyond the lower cost, text-to-speech also lets us build a far more complete library. If we did this with humans and we invested a ton of time and management, we might be able to convert seven articles a week. At that rate, we’d never be able to keep up with new posts, let alone include the historical posts that are so valuable. With text-to-speech software, we could potentially keep up with all new posts and convert the old ones, creating a much more complete repository of EA content. Just imagine being able to listen to over 80% of EA writing you’re interested in compared to less than 1%.
Additionally, the automaticity of text-to-speech fits with Nonlinear’s general strategy of looking for interventions that have “passive impact”. Passive impact is the altruistic equivalent of passive income, where you make an upfront investment and then generate income with little to no ongoing maintenance costs. If we used human readers, we’d have a constant ongoing cost of managing them and hiring replacements. With TTS, after setting it up, we can mostly let it run on its own, freeing up our time to do other high impact activities.
Finally, and least importantly, there is something delightfully ironic about having an AI talk to you about how to align future AI.
On a side note, if for whatever reason you would not like your content in The Nonlinear Library, just fill out this form. We can remove that particular article or add you to a list to never add your content to the library, whichever you prefer.
Future Playlists (“Bookshelves”)
There are a lot of sub-projects that we are considering doing or are currently working on. Here are some examples:
- Top of all time playlists: a playlist of the top 300 upvoted posts of all time on the EA Forum, one for LessWrong, etc. This allows people to binge all of the best content EA has put out over the years. Depending on their popularity, we will also consider setting up top playlists by year or by topic. As the library grows we’ll have the potential to have even larger lists as well.
- Playlists by topic (or tag): a playlist for biosecurity, one for animal welfare, one for community building, etc.
- Playlists by forum: one for the EA Forum, one for LessWrong, etc.
- Archives. Our current model focuses on turning new content into audio. However, there is a substantial backlog of posts that would be great to convert.
- Org specific podcasts. We'd be happy to help EA organizations set up their own podcast version of their content. Just reach out to us.
- Other? Let us know in the comments if there are other sources or topics you’d like covered.
Who we are
We're Nonlinear, a meta longtermist organization focused on reducing existential and suffering risks. More about us.
(1) Sometimes the researcher is the same person as the person who puts the results into action, such as Charity Entrepreneurship’s model. Sometimes it’s a longer causal chain, where the research improves the conclusions of another researcher, which improves the conclusions of another researcher, and so forth, but eventually it ends in real world actions. Finally, there is often the intrinsic happiness of doing good research felt by the researcher themselves.
(2) The current upvote thresholds for which articles are converted are:
- 25 for the EA Forum
- 30 for LessWrong
- No threshold for the Alignment Forum due to low volume
This is based on the frequency of posts, relevance to EA, and quality at certain upvote levels.
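As a rough illustration of how thresholds like these could be applied, here is a minimal sketch. It is not Nonlinear's actual code: the `Post` class, the `should_convert` function, and the choice to treat the threshold as inclusive are all assumptions made for the example, and the footnote does not say whether "upvotes" means karma or vote count.

```python
from dataclasses import dataclass

# Thresholds as stated in footnote (2); 0 stands for "no threshold".
UPVOTE_THRESHOLDS = {
    "EA Forum": 25,
    "LessWrong": 30,
    "Alignment Forum": 0,
}


@dataclass
class Post:
    """Hypothetical minimal record for a forum post."""
    title: str
    forum: str
    upvotes: int


def should_convert(post: Post) -> bool:
    """Return True if the post meets its forum's threshold (assumed inclusive)."""
    threshold = UPVOTE_THRESHOLDS.get(post.forum)
    if threshold is None:
        # Forums not listed in the footnote are skipped in this sketch.
        return False
    return post.upvotes >= threshold


posts = [
    Post("Example A", "EA Forum", 25),
    Post("Example B", "LessWrong", 29),
    Post("Example C", "Alignment Forum", 3),
]
to_convert = [p.title for p in posts if should_convert(p)]
print(to_convert)
```

Keeping the thresholds in one dictionary means a change to a single number, or a new source, touches only that table.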
I would love to be able to listen to Open Philanthropy research reports in this way.
Whatever happens with the discussions about copyright, I really hope this continues to exist. I listened to six forum posts at 5am today while walking a baby around to sleep... very good for parental mental health
3 months on, and this has become one of the most valuable EA/Alignment/Rationality dissemination innovations I've seen. Has replaced almost all my more vapid listening. Would get through an extra 10-20 hours of content a week. Thank you Nonlinear/Kat/Emerson
Thank you for creating this resource. I listened to the audio version of this post and was impressed with the quality of the speech synthesis. Virtually indistinguishable from a human, and with a very pleasant voice tone.
Yeah, they've gotten way better in the last year or so. A huge jump in quality.
I think a lot of people who were turned off by the robot-iness of TTS in the past should try again because they've come such a long way.
We’re not really planning on doing any moderation. In some rare cases we might decide to remove something if we consider it a very clear info hazard.
However, even if we convert pretty much all EA content, I doubt we’d ever remove a lower bar of at least 5 upvotes, which I think would allow the community itself to moderate as opposed to leaving it in our hands at Nonlinear.
Are you getting author's consent before turning their work into a podcast?
It’s a bot so it does it automatically. We figured since the vast majority of people will want to have their content read by more people that it made more sense to by default convert and let people opt out by filling out this form.
If someone deletes their original post, do you auto-remove it from the podcast as well? That would seem important to me.
I do think that there's an interesting fuzzy boundary here between "derivative work" and "interpretative tool".
e.g. with the framing "turn it into a podcast" I feel kind of uncomfortable and gut-level wish I was consulted on that happening to any of my posts.
But here's another framing: it's pretty easy to imagine a near-future world where anyone who wants can have a browser extension which will read things to them at this quality level rather than having visual fonts. If I ask "am I in favour of people having access to that browser extension?", I'm a fairly unambiguous yes. And then the current project can be seen as selectively providing early access to that technology. And that seems ... pretty fine?
This actually makes me more favourable to the version with automated rather than human readers. Human readers would make it seem more like a derivative work, whereas the automation makes the current thing seem closer to an interpretative tool.
If we had to ask each person before converting their text to audio, it just wouldn’t feasibly happen.
The way we thought about it was that >99.9% of people will be thrilled to have their writing read by more people. For the <0.01% who won’t, we made it easy for them to opt out, either for a particular article or for all their work. This way the whole community and hundreds of EAs can have access to great content and the <0.01% can also keep their writing in only written format.
This way everybody wins. :) I think the utilitarian case for it is strong.
Especially once it’s more known, people will know that if their post gets enough upvotes it’ll be converted, so they can request to not have it converted beforehand if they want. The potential harm is small and mitigated and the potential upside is huge.
These responses do seem curiously blithe about the question of whether or not this is legal.
FWIW I think I endorse Kat's reasoning here. I don't think it matters if it is illegal if I'm correct in suspecting that the only people who could bring a copyright claim are the authors, and assuming the authors are happy with the system being used. This is analogous to the way it is illegal, by violating minimum wage laws, to do work for your own company without paying yourself, but the only person who has standing to sue you is AFAIK yourself.
Not a lawyer, not claiming to know the legal details of these cases, but I think this standing thing is real and an appropriate way to handle
I've seen this reasoning a lot, where EA organisations assume they won't get sued because the only people they're illegally using the data of are other EAs, and as someone whose data has been misused with this reasoning, I don't love it!
This is a reason to fix the system! My point is that it reduces to "make all the authors happy with how you are doing things", there is not some spooky extra thing having to do with illegality
TBC I do not endorse using people's content in a way they aren't happy with, but I would still have that same belief if it wasn't illegal at all to do so.
My more detailed response is:
As such, flagrantly violating the law on a fairly large scale (and the scale is an important part of the pitch here) seems like a dubious idea. Especially if you also go on public record in a way that suggests you know it's illegal and don't care.
<0.01% is definitely overconfident given at that point I had already expressed misgivings and we do not have 10,000 authors on the EA Forum.
(I'm not against my writing being podcastified in principle but I want to check out any podcast services who broadcast my work in advance to decide if I'm happy to be associated with them. I'm strongly against someone else making that decision for me.)
If I were to receive such messages, I would likely fail to respond (unintentionally) at least 20% of the time.
I don't think this analysis is right. The character of use may be educational, but on the other hand, I'm not sure you're transforming it - you're simply reproducing the text as audio. The nature of work goes beyond presenting public domain things like facts and ideas, by quoting the text as a whole. The amount used is maximal. The effect on market is substantial, in that it prevents the author from selling their writing as audio.
As for precedents. Well... Righthaven v. Hoehn established that it is OK to present a full editorial article in the setting of a noncommercial online post that discusses it. But here you're just presenting the work in its entirety, and not using it for comment. Google Books scanning was deemed legal. But that was because they don't represent the whole book. Another relevant example, although it was settled out of court, is that Audible bought rights to some books, and then got sued for allowing the text to be read aloud by TTS AI. If you don't even own the text, then having the text read is I think a larger infringement.
So I doubt that narrating a whole post is fair use - rather it looks like a copyright infringement.
Would love if you could do this for the EA Intro Fellowship syllabus (e.g. here's one syllabus, but note that the syllabus is continuously updating between semesters and different universities use different syllabi)
This is cool! I have some specific thoughts and questions on the TTS software that you list for personal use, and how other TTS / podcast options might compete with this Library:
- What I personally do to listen to an article I'm on is I say "Read it" to the Google Assistant on my phone, and it reads aloud any article I view on my phone in a really nice AI voice. They have a few nice voices to choose from, with different accents. I think it's even a bit more human-like than the current one used by The Nonlinear Library, which is already pretty human-like. So I think I'm likely to stick to using my Google Assistant rather than the Library, although this could be useful for when I want to download and listen to EA articles when I'm offline.
My phone is a Google Pixel 4a running Android 11, but I think this read aloud feature is probably available on any Android phone with Google Assistant. Have you tried using it? If yes, what made you not mention it in this article? I haven't tried using Evie, and I think I tried @Voice before and wasn't pleased with it.
My guess is the Google Assistant is a bit better (in terms of ease of use and voice quality) than Evie or @Voice. It has these 2 other c
Thank you for making this! I’ve been using this generously the past few days. One suggestion I’ll make is I think it would be nice if the episode descriptions were something like:
“Beginning of post (50 words?)
Link to post”
Having the post links immediately available would be super helpful for reading the comments while listening.
Thanks for doing this! I've found it useful, and I expect that it will increase my engagement with EA Forum/LW content going forward.
This is great and very helpful. Looking forward to listening.
Re 'a human reader', we are doing this for EA Forum posts on the EA Forum Podcast ... but only for selected posts
I also read and discuss some other EA-ish content (esp. relevant academic work) on my Found in the Struce podcast.
What a cool project! I listen to the vast majority of my reading these days and am perpetually out of good things to read.
The linked audio is reasonably high quality, and more importantly, it doesn't have some of the formatting artifacts that other TTS programs have. Well done.
Your story for why this is a potentially high impact project is plausible to me, especially given how much you've automated. I have independently been thinking about building something similar, but with a very different story for why it could be worth my time to do it. That means th...
After several months of use, I've found this tool incredibly useful and time-saving.
It seems like this feed is now capturing only a small portion of EA Forum posts meeting a 25 karma threshold or a 25 total positive votes threshold. (Not sure which of those thresholds footnote 2 is indicating you intended to use. But I notice that this post was included in the feed despite having <25 individual votes, while having >25 karma.)
E.g., this post wasn't included, and some of the recent "MIRI conversations" posts weren't captured.
I have become surprisingly dependent on this feed surprisingly quickly, so (a) I'm wondering if you ...
Could you include AI governance newsletters? In particular, I’d like to be able to listen to ChinAI, Import.ai, Charlotte Stix’s newsletter, and CSET’s newsletter. (Caveat that I haven’t read the thread about copyright and that I have no idea if these authors would be happy with their newsletters being added to this feed.)
It seems like on Spotify and Apple Podcasts, only episodes since Oct 20 are available? (But I think last week older things were available.) Do older eps automatically get deleted? Is there a way to find them still?
(Apologies if this was already covered in the post/comments - I only skimmed.)
Distinguish or isolate the intro and conclusion about the Nonlinear Library from the main content.
Ideally, use a different voice (e.g., a human recording). A common solution is to insert a chime. Either of those would require splicing audio files. If you want to keep to just splicing text files, maybe there's a way of inserting something to make the TTS pause?
Thanks for this, I've found this very helpful for consuming more EA Forum content. Are there ways to only have EA Forum posts on our podcast provider's "subscriber" feed, as opposed to LW and AF posts too? Eg I find that if I have a lot of podcast episodes in my podcast feed, I am less likely to listen to any of them, as it's harder to find the podcasts that I really want to listen to.
I suspect this could look like different podcast "feed"/profile with only EA Forum posts (and maybe similarly for AF and LW posts, eg they could have their own feed/profile with only podcast episodes for posts on those platforms).
I've noticed some question posts get included in the feed. I did find these useful, but maybe it'd be good to have the narrator or title explicitly note that this is a question post, and maybe it'd be good to have the narrator read the top answer or two?
Did you consider creating an audiobook from a new EA handbook? I think that it would be very useful, especially if the handbook is used during introductory fellowships...
I see you've started including some text from the post in each episode description, which is useful! Could you also include the URL to the post, at the top of the episode description? I often want to check out comments on interesting posts.
Maybe posts that were published before the Nonlinear Library started uploading but which have (say) >100 karma, got Forum prizes, and/or got LW curation should also be included in the feed?
(Apologies if this was already suggested in the post or comments - I only skimmed)
Is there a way to have this feed sorted by some rough sense of value?
Likewise, is there a way to trim the intro off the episodes?
As per the thread with Pablo, I think the podcast sounds pretty good. Having said that, I do have one small suggested improvement. When I look at the logo (a Sierpinski triangle), and think about what it's supposed to represent, it makes me think of a pyramid, or of growing replicators: "one person recruits three, and so on". In particular, although this may seem kind of unfair, it kinda reminds me of this scene from The Office. Given that movement building is a major project of the org, that's probably not the connotation that you want. I realize ...
Uhh, cool! :)
Did you consider including some content from or info about the comments on the respective article? E.g. I could imagine it would save me some time if such a reader would also read out top-level comments above, say, 10 karma and how many sub-comments are below each. Roughly half of the time I want to check out the comment section after listening on Pocket, and maybe half of those times would be covered by the above readouts.
Could post authors also get to first listen to what their post would sound like?
(For some posts, it might be perhaps difficult to know in advance whether the automatic narration would cause too many misunderstandings to be a net positive in audio form. This might be especially relevant for posts that were never meant...
Interesting! Of course, the experience might be, in some ways, quite confusing compared to a human narration. For example, the automatic narration does not seem to separate headings or quotes from the main text. Could the AI be taught to identify headings and quotes, and make them stand out?
(E.g., headings might be ideally narrated with longer pauses, and quotes perhaps even in a different voice.)
Another difference between automatic vs. human narration: The automatic narration does not notify the listener whenever they might miss some meaningful hyperlink ...
Exciting! But where's the podcast's URL? All I can find is a Spotify link.
Edit: I was able to track it down, here it is https://spkt.io/f/8692/7888/read_8617d3aee53f3ab844a309d37895c143
There are some recent posts, for example this one, that are just the intro and outro (22 seconds long) and miss the main post. Would be great if this bug could be fixed.
Please include links to originals in podcast episode details.
Posted by Eliezer Yudkowsky a few days ago:
I'd be happy to help formatting the text if you want
Will you tag each EA forum post that is added to the Nonlinear library with the 'audio' tag? I want to make sure we don't duplicate these on the EA Forum Podcast or on Found in the Struce or elsewhere.
Of course sometimes a human reading might be worth doing on top of a machine-read one... but probably better to first focus on spreading the audio widely.
I'm quite liking this tool, and found myself consuming more content in both EA and LW.
Quick suggestion that I think could be helpful. I noticed many of the posts are quite short, less than 1 min long, might contain a few links, or be a quick question for commenters, making them less useful for listeners. I'm not sure what additional filter could be applied, but considering one could be valuable.
This is awesome! I did something similar for Astral Codex Ten (feed, post) a while back. The human version is also good, if you like that kind of thing.