Pdf to Text Converter in Python with source code

PDF to TEXT

PyPDF2: PyPDF2 is a third-party PDF toolkit for Python. It is not part of the standard library, so it has to be installed separately. It is a pure-Python package, which means it runs on any platform that runs Python without depending on compiled external libraries.

It is a very useful tool for websites that manage PDFs. PyPDF2 is capable of:

· extracting document information

· splitting documents page by page

· merging documents page by page

· cropping pages

· merging multiple pages into a single page

· encrypting and decrypting PDF files

· and more!
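As an illustration of one item from the list above, splitting a document page by page, here is a minimal sketch. It assumes PyPDF2 2.0 or later, where the classes are named PdfReader and PdfWriter. The helper names split_pdf and page_filename, and the output naming scheme, are hypothetical and chosen only for this example.

```python
import os


def page_filename(stem, page_number):
    """Build an output name such as 'report_page_003.pdf' (1-based page number)."""
    return f"{stem}_page_{page_number:03d}.pdf"


def split_pdf(pdf_path, out_dir):
    """Hypothetical helper: write each page of pdf_path to its own PDF in out_dir."""
    # Imported here so page_filename can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter  # needs `pip install PyPDF2`

    os.makedirs(out_dir, exist_ok=True)
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    written = []
    with open(pdf_path, "rb") as pdf_file:
        reader = PdfReader(pdf_file)
        for number, page in enumerate(reader.pages, start=1):
            # One writer per page, so each output file holds a single page
            writer = PdfWriter()
            writer.add_page(page)
            out_path = os.path.join(out_dir, page_filename(stem, number))
            with open(out_path, "wb") as out_file:
                writer.write(out_file)
            written.append(out_path)
    return written
```

Calling split_pdf("8259.pdf", "pages") would, under these assumptions, create one single-page PDF per page inside a "pages" folder and return the list of paths written.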

Installation: PyPDF2 is published on PyPI, so you can install it with the pip command.

pip install PyPDF2

Extracting Text From PDF:

In this article, you will learn how to extract text from a PDF using the PyPDF2 library.

Steps:

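import PyPDF2

# "8259.pdf" is the name of the PDF file to read
with open("8259.pdf", "rb") as pdf_obj:
    # PdfReader was called PdfFileReader before PyPDF2 3.0
    pdf_reader = PyPDF2.PdfReader(pdf_obj)
    # pages[0] is the first page (formerly getPage(0)); change the index for another page
    page_obj = pdf_reader.pages[0]
    # extract_text() was formerly extractText()
    page_text = page_obj.extract_text()

print(page_text)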

PdfReader() (named PdfFileReader() before PyPDF2 3.0): Creates a reader object from a PDF file that has been opened in binary read mode ("rb").

pages[n] (getPage(n) before PyPDF2 3.0): Returns a single page. It takes one value, the page number to retrieve, counted from 0 for the first page.

extract_text() (extractText() before PyPDF2 3.0): Extracts the text of that page and returns it as a string. To get the text of the whole PDF, call it on every page in turn.
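To go from a single page to a whole-file PDF to text converter, the same calls can be looped over every page. The following is a minimal sketch, assuming PyPDF2 2.0 or later. The function names extract_all_text and pdf_to_text are hypothetical and not part of PyPDF2.

```python
def extract_all_text(reader, separator="\n"):
    """Join the text of every page of a reader-like object.

    `reader` is expected to expose `.pages`, where each page has an
    `extract_text()` method (as PyPDF2's PdfReader does).
    """
    parts = []
    for page in reader.pages:
        # Fall back to an empty string if a page yields no text
        parts.append(page.extract_text() or "")
    return separator.join(parts)


def pdf_to_text(pdf_path, txt_path):
    """Hypothetical helper: write the text of pdf_path into txt_path."""
    # Imported here so extract_all_text can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader  # needs `pip install PyPDF2`

    with open(pdf_path, "rb") as pdf_file:
        text = extract_all_text(PdfReader(pdf_file))
    with open(txt_path, "w", encoding="utf-8") as out_file:
        out_file.write(text)
    return text
```

Calling pdf_to_text("8259.pdf", "8259.txt") would then save the extracted text next to the PDF. Note that scanned, image-only pages usually produce little or no text with this approach, since no OCR is involved.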
