
As of August 2023, Speechify is probably your best option for listening to PDFs and Google Docs.

I’ve made a 3-minute demo to show how it works (and sounds).

Listening to PDFs

For PDFs, the key strengths of Speechify are:

  1. Multi-modal UI: listening to academic papers is nice, but often you really need to use your eyes—to read a graph, grok a formula, or skip to the interesting parts. Speechify's interface makes it easy to switch between listening and reading, whether on desktop, mobile phone or iPad. To skip to a particular sentence, just click on it.
  2. Solid mobile apps (iOS and Android): files are synchronised quickly, and it remembers your playback position across devices. You can read your PDFs, tap to skip to a given sentence, etc.
  3. Great voices: the best AI voices I've heard. The pronunciation of specialist terms is imperfect, but much better than any other service.
  4. Filtering: the app can filter out citations, URLs and parenthetical remarks.
  5. Speed: everything is fast; most of the UI is well-designed.
  6. OCR: you can listen to PDFs even when they are just scans of a physical book.
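
To make the filtering point concrete, here is a minimal sketch of what stripping citations, URLs and parenthetical remarks before narration might look like. This is not Speechify's implementation, which isn't public; the function name `clean_for_narration` and all three patterns are my own assumptions, and they only cover simple cases (numeric bracket citations, non-nested parentheses).

```python
import re

# Hypothetical patterns for illustration only; not taken from Speechify.
# Parenthetical remarks without nesting, e.g. "(Smith et al., 2020)".
PARENTHETICAL = re.compile(r"\s*\([^()]*\)")
# Numeric bracket citations, e.g. "[3]", "[1, 2]", "[4-6]".
BRACKET_CITATION = re.compile(r"\[\d+(?:\s*[,\u2013-]\s*\d+)*\]")
# URLs, leaving any trailing sentence punctuation in place.
URL = re.compile(r"(?:https?://|www\.)\S*[^\s.,;:!?)]")


def clean_for_narration(text: str) -> str:
    """Remove text that is tedious to hear read aloud."""
    # Parentheticals go first, so a URL inside brackets is removed with them.
    text = PARENTHETICAL.sub("", text)
    text = BRACKET_CITATION.sub("", text)
    text = URL.sub("", text)
    # Tidy up: no space before punctuation, no runs of whitespace.
    text = re.sub(r"\s+([.,;:])", r"\1", text)
    text = re.sub(r"\s{2,}", " ", text)
    return text.strip()


if __name__ == "__main__":
    sample = "Bednets work (see https://example.com/paper) [3]."
    print(clean_for_narration(sample))
```

Simple rules like these also show why the narration still sometimes includes text it should skip: running headers, copyright footers and tables don't follow a pattern this easy to match.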

The main shortcomings are:

a. It sometimes narrates text it should obviously skip. There should be more skip options, e.g. skip all tables.

b. It can't really handle formulae.

c. You can't annotate PDFs in their app.

At TYPE III AUDIO, we’ve been thinking about building a “listen to PDF” feature that delivers a narration to your podcast app. We’re still considering this, but my current view is that a multi-modal UI is the ideal setup for proper engagement with academic PDFs, and that Speechify are well on course to nailing it.

Listening to a PDF with Speechify. See my 3-minute demo.

Listening to Google Docs

Option 1. Use the Speechify app (more features, less secure)

If you grant Speechify access to your Google Drive, then you can import Google Docs into their web app. They convert your doc into a PDF, and you get the same experience I described above.

The main problem is that you have to give Speechify access to all documents on your Google Drive. For some people, that will be an unacceptable security risk.  

Option 2. Use the browser extension (more secure, fewer features)

You can use the Speechify browser extension to listen to Google Docs directly on docs.google.com.

The listening experience is similar to PDFs, but:

  1. You can't filter out citations, URLs, etc.
  2. If you want to listen on mobile, you have to click the "Bookmark" icon. When you do this, all the formatting of the Google Doc is lost.

During installation, the extension requests access to all your tabs. However, you can change this to “When you click the extension”. If you do that, your browser will only give Speechify access to a particular document (or other URL) when you click the icon.

Note: Speechify may not be suitable for listening to extremely sensitive documents

Speechify sends the text of your document to their text-to-speech API. If you are dealing with extremely sensitive material, the security/convenience tradeoff may not be worth it. Ask your IT security team if you’re unsure.

Bonus: Listen to any web page

You can also use Speechify to listen to any web page.

Their browser extension is the best way to do this. If you want to listen on your mobile, click “Bookmark” to add the article to your library.

The Speechify app also has an “add via URL” function, but it’s weirdly bad at detecting the start of the article text. You’ll often have to skip through navigation elements and other useless stuff before you get to the start of the article. I imagine they'll fix this soon.

Have you tried Speechify?

If you’ve tried Speechify, I’d love to hear from you, especially if you’re listening to academic PDFs. Do you use it regularly? If not—why not? How could it be improved?

Comment below or write to peter@type3.audio.


Comments


I also like Speechify. One tip:

If you do the free 3-day trial, and then don't subscribe immediately afterward, they may send you a 50% discount code to encourage you to subscribe, reducing the cost from $139 to $70/year. (At least, this worked for me a few months ago.)

The most frustrating part of Speechify for me is that: a) big documents sometimes cause the app on my phone to crash (which is maybe more of a problem with my phone) and b) like Peter says, it will read random superfluous text, which is more of a problem with academic papers/books. (For instance, I'm listening to a 400-page book right now, and at the end of each page it says "Copyright 2023 University of Chicago Press, All Rights Reserved," which got old around page 3.) 

On Microsoft Edge (the browser) there's a "read aloud" option that offers a range of natural voices for websites and PDFs. It's only slightly worse than Speechify, and it's free, so it can give you a glimpse of whether $139/year might be worth it for you.

@peterhartree Feel free to ignore: Are there any updates to your workflow for listening to G Docs or PDFs? The post is now roughly a year old.

- I have been using Speechify for a year now. I think it's decent, but the UI and the frequent crashes are a bit annoying for me. So I just wanted to see if there are better products out there :)

So cool!!!! Thank you :)

Quick note/warning about Speechify:

  • I've been using it for about 6 months. 
  • I've had several bugs using the app and customer support couldn't help. I spent a total of about 2 hours trying to fix them, but nothing worked.
  • The main problem was that I could not use it outside my home. Every 2 minutes or so it would detect "poor internet connection" and shut down. This is extremely annoying when you're trying to listen to a newspaper while walking. 
  • Hopefully they will fix this. I still use it, but only occasionally.

Experience regarding discount

  • I tried the 50% discount, but they had increased the original price to $200, so I had to pay $100. I messaged support, and after several weeks they finally refunded $30. So the discount works, but you may have to contact support :)

In case anyone else was wondering about pricing: most of the features described require the premium plan, which is $139 / year.

(There is a free version, but it only includes some low-quality voices and doesn't allow changing speed, so it's not very useful.)

Thanks for this, this is a topic I am very interested in. To me, the killer feature missing in Speechify is the ability to highlight and sync those highlights. Or more broadly, annotating in a multimodal way is difficult.

I instead use Goodreader, where you can have e.g. a Dropbox folder of all your PDFs synced across desktop and mobile; and you can annotate those PDFs while listening, then sync to Dropbox.

The downside of Goodreader is that the voice is pretty bad, and also that you can't reflow the text to make it easier to read on mobile while in audio mode.

PS: The Readwise Reader app seems to be working towards an excellent all-in-one reader with TTS capability, but I've found the existing version a little too slow, and it has some other issues. It's still in development, though, and seems very promising.
