PDF to Audiobook


Professional project
Python code solution and explanation
PyPDF2, gtts, audiobook, text-to-speech

Introduction

Using AI text-to-speech, a Python script can convert the text of a PDF to speech, creating a free audiobook. Modern text-to-speech engines sound natural enough to make listening to a long document comfortable, even if they do not replace a human narrator.

Today we’re going to write a Python script that takes a PDF file and converts it into speech, so you can spend less time reading and more time listening!

Solution

First, let’s import the libraries we need. PyPDF2 reads the PDF. gtts (gTTS) is a Python library for Google Translate’s text-to-speech service, so it needs an internet connection. Finally, os lets us open the MP3 once it has been saved.

from PyPDF2 import PdfReader
from gtts import gTTS
import os

Now, download the PDF and save it in the same directory as the Python project. Then, choose the page you want to read and extract its text. Pages are indexed from zero, so reader.pages[0] is the first page.

filename = "file_to_convert.pdf"
reader = PdfReader(filename)
page = reader.pages[0]
text = page.extract_text()
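
The snippet above reads a single page. As a minimal sketch of how several pages could be gathered for a longer audiobook, the helper below joins the text of a page range. The name collect_text and its 1-based first_page and last_page parameters are hypothetical and not part of PyPDF2; the helper only assumes each page object has an extract_text() method, as the pages in reader.pages do.

```python
def collect_text(pages, first_page=1, last_page=None):
    """Join the text of pages first_page..last_page (1-based, inclusive).

    `pages` is any sequence of objects with an extract_text() method,
    such as PdfReader(...).pages. Hypothetical helper, not part of PyPDF2.
    """
    if first_page < 1:
        raise ValueError("first_page is 1-based and must be at least 1")
    if last_page is None:
        last_page = len(pages)

    chunks = []
    # Human page numbers start at 1, Python indexes start at 0.
    for index in range(first_page - 1, min(last_page, len(pages))):
        # A page with no extractable text (for example a scanned image)
        # is treated as empty and skipped.
        page_text = pages[index].extract_text() or ""
        if page_text.strip():
            chunks.append(page_text.strip())
    return "\n".join(chunks)
```

With the reader from the walkthrough, text = collect_text(reader.pages) would gather the whole document, and collect_text(reader.pages, 2, 5) only pages 2 to 5.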

Then, store the text and the language code in variables and pass them to gTTS. Setting slow=False makes it read at normal speed.

my_text = text
my_language = "en"

file = gTTS(text=my_text, lang=my_language, slow=False)
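
Text extracted from a PDF often keeps the line breaks of the page layout, which can make the narration pause in odd places. The sketch below shows one possible way to tidy the text before handing it to gTTS. The function name clean_for_speech is hypothetical, and the two rules it applies are assumptions about what typical extracted text looks like, not something gTTS requires.

```python
import re


def clean_for_speech(text):
    """Tidy extracted PDF text so it reads as continuous prose."""
    # Rejoin words hyphenated across a line break: "conver-\nter" -> "converter".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse remaining line breaks and runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()
```

In the walkthrough this would slot in as my_text = clean_for_speech(text). One limitation: a genuine hyphenated compound that happens to break at the end of a line loses its hyphen. It is also worth checking that the result is not empty before calling gTTS, since a scanned page may yield no text at all.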

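Now, we can save the speech as an MP3 and then play it. Passing the file name to os.system opens the file in the default player on Windows only.

file.save("newaudiofile.mp3")
# Windows: the shell opens the MP3 in the default player.
# macOS: os.system("open newaudiofile.mp3"); most Linux desktops: os.system("xdg-open newaudiofile.mp3")
os.system("newaudiofile.mp3")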
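
Opening the MP3 by passing its name to os.system relies on Windows file associations. As a hedged sketch of a more portable approach, the helper below picks an opener based on sys.platform. The names player_command and play are hypothetical, and the sketch assumes the open command on macOS and the xdg-open command on Linux desktops are available.

```python
import os
import subprocess
import sys


def player_command(path, platform=sys.platform):
    """Return the command that opens `path` in the default app.

    Returns None on Windows, where os.startfile is used instead.
    """
    if platform.startswith("win"):
        return None
    if platform == "darwin":
        return ["open", path]  # macOS
    return ["xdg-open", path]  # assumption: a Linux desktop with xdg-utils


def play(path):
    """Open `path` with the operating system's default player."""
    command = player_command(path)
    if command is None:
        os.startfile(path)  # only exists on Windows
    else:
        subprocess.run(command, check=False)
```

Calling play("newaudiofile.mp3") after file.save(...) would then take the place of the os.system line.
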

Final code

To view my full code, please visit my GitHub repository: https://github.com/Gursehaj-Singh/pdf-to-audiobook

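from PyPDF2 import PdfReader
from gtts import gTTS
import os

# Read the first page of the PDF (pages are indexed from zero)
filename = "file_to_convert.pdf"
reader = PdfReader(filename)
page = reader.pages[0]
text = page.extract_text()

my_text = text
my_language = "en"

# Convert the text to speech at normal speed
file = gTTS(text=my_text, lang=my_language, slow=False)

# Save the MP3, then open it (this os.system call works on Windows only)
file.save("newaudiofile.mp3")
os.system("newaudiofile.mp3")

Example run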

Contents of file_to_convert.pdf

Hello this is a pdf to audiobook converter. Enjoy!

newaudiofile.mp3
