How to listen to PDFs & Google Docs

peterhartree

This is a linkpost for https://blog.type3.audio/how-to-listen-to-academic-papers/

As of August 2023, Speechify is probably your best option.

I’ve made a 3-minute demo to show how it works (and sounds).

Listening to PDFs

For PDFs, the key strengths of Speechify are:

Multi-modal UI: listening to academic papers is nice, but often you really need to use your eyes—to read a graph, grok a formula, or skip to the interesting parts. Speechify's interface makes it easy to switch between listening and reading, whether on desktop, mobile phone or iPad. To skip to a particular sentence, just click on it.
Solid mobile apps (iOS and Android): files are synchronised quickly, and it remembers your playback position across devices. You can read your PDFs, tap to skip to a given sentence, etc.
Great voices: the best AI voices I've heard. The pronunciation of specialist terms is imperfect, but much better than any other service.
Filtering: the app can filter out citations, URLs and parenthetical remarks.
Speed: everything is fast; most of the UI is well-designed.
OCR: you can listen to PDFs even when they are just scans of a physical book.
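
To give a feel for what the filtering step involves: Speechify's actual rules aren't public, so the following is only a rough sketch of how citations, URLs and parenthetical remarks could be stripped from text before narration. The function name and the regular expressions are my own assumptions for illustration, not anything taken from Speechify.

```python
import re

# Hypothetical patterns for illustration only; not Speechify's actual rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Numeric bracket citations such as [3], [3, 4] or [3-5].
BRACKET_CITATION_RE = re.compile(r"\[\d+(?:\s*[,–-]\s*\d+)*\]")
# Any non-nested parenthetical, which also covers "(Smith et al., 2020)".
PARENTHETICAL_RE = re.compile(r"\([^()]*\)")


def clean_for_narration(text: str) -> str:
    """Remove URLs, bracket citations and parentheticals, then tidy spacing."""
    text = URL_RE.sub("", text)
    text = BRACKET_CITATION_RE.sub("", text)
    text = PARENTHETICAL_RE.sub("", text)
    # Remove spaces left stranded before punctuation, then collapse runs of whitespace.
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)
    text = re.sub(r"\s{2,}", " ", text)
    return text.strip()


sample = (
    "Sleep matters (Smith et al., 2020) for memory [3, 4]. "
    "See https://example.com/paper for details."
)
print(clean_for_narration(sample))
```

A simple approach like this will sometimes remove text you wanted to hear (any parenthetical goes, not just citations), which is one reason it helps to be able to switch filters on and off.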

The main shortcomings are:

a. It sometimes narrates text it should obviously skip. There should be more skip options, e.g. skip all tables.
b. It can't really handle formulae.
c. You can't annotate PDFs in their app.

At TYPE III AUDIO, we’ve been thinking about building a “listen to PDF” feature that delivers a narration to your podcast app. We’re still considering this, but my current view is that a multi-modal UI is the ideal setup for proper engagement with academic PDFs, and that Speechify are well on course to nail it.

Listening to a PDF with Speechify. See my 3-minute demo.

Listening to Google Docs

Option 1. Use the Speechify app (more features, less secure)

If you grant Speechify access to your Google Drive, then you can import Google Docs into their web app. They convert your doc into a PDF, and you get the same experience I described above.

The main problem is that you have to give Speechify access to all documents on your Google Drive. For some people, that will be an unacceptable security risk.

Option 2. Use the browser extension (more secure, fewer features)

You can use the Speechify browser extension to listen to Google Docs directly on docs.google.com.

The listening experience is similar to PDFs, but:

You can't filter out citations, URLs, etc.
If you want to listen on mobile, you have to click the "Bookmark" icon. When you do this, all the formatting of the Google Doc is lost.

During installation, the extension requests access to all your tabs, but you can change this to “When you click the extension”. If you do that, your browser will only give Speechify access to a particular document (or other URL) when you click the icon.

Note: Speechify may not be suitable for listening to extremely sensitive documents

Speechify sends the text of your document to their text-to-speech API. If you are dealing with extremely sensitive material, the security/convenience tradeoff may not be worth it. Ask your IT security team if you’re unsure.

Bonus: Listen to any web page

You can also use Speechify to listen to any web page.

Their browser extension is the best way to do this. If you want to listen on your mobile, click “Bookmark” to add the article to your library.

The Speechify app also has an “add via URL” function, but it’s weirdly bad at detecting the start of the article text. You’ll often have to skip through navigation elements and other useless stuff before you get to the start of the article. I imagine they'll fix this soon.

Have you tried Speechify?

If you’ve tried Speechify, I’d love to hear from you, especially if you’re listening to academic PDFs. Do you use it regularly? If not—why not? How could it be improved?

Comment below or write to peter@type3.audio.

Effective Altruism Forum