
PDF Parser


rfeio

Hi,

 

My site has several PDF documents that users can download. I would like to index the content of those PDF files, so that users can search for a given term and the site returns the PDF files that are relevant.

 

I was thinking that the best way of doing this might be to parse the content of the PDF files and save it in a MySQL table. When a user runs a search, the script would look in the table and return the names of the PDF files relevant to the search.
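To make the idea concrete, here is a rough sketch of the "store the text, then search the table" part. It is written in Python rather than PHP, and it uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs without a database server. The table name pdf_index, its columns, and the function names are all made up for illustration, and it assumes the text of each PDF has already been extracted.

```python
import sqlite3


def create_index(conn):
    # One row per PDF: the file name and its extracted plain text.
    # (Hypothetical schema; a MySQL table would look much the same.)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_index ("
        "filename TEXT PRIMARY KEY, content TEXT NOT NULL)"
    )


def index_document(conn, filename, text):
    # INSERT OR REPLACE is SQLite syntax; in MySQL this would be
    # REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.
    conn.execute(
        "INSERT OR REPLACE INTO pdf_index (filename, content) VALUES (?, ?)",
        (filename, text),
    )


def search(conn, term):
    # Escape LIKE wildcards so a user typing % or _ searches literally.
    escaped = (
        term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT filename FROM pdf_index "
        "WHERE content LIKE ? ESCAPE '\\' ORDER BY filename",
        (f"%{escaped}%",),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_index(conn)
    index_document(conn, "manual.pdf", "How to install the widget")
    index_document(conn, "pricing.pdf", "Price list for all widgets")
    print(search(conn, "install"))
```

A plain LIKE search scans every row, which is probably fine for a handful of documents. For a larger collection, a MySQL FULLTEXT index queried with MATCH ... AGAINST may be a better fit.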

 

I would need some guidance on how to parse a PDF file, since I've never done this before. Also, is this the best way of achieving what I want?
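For the parsing step itself, one possible route is to not parse the PDF format by hand at all, but to call the pdftotext command-line tool (shipped with Xpdf and poppler-utils) and capture its output. The sketch below is in Python rather than PHP, assumes pdftotext is installed on the server, and uses helper names (build_command, extract_text) that are made up for illustration. The same command could be run from PHP.

```python
import shutil
import subprocess


def build_command(pdf_path):
    # "-enc UTF-8" sets the output encoding; the trailing "-" tells
    # pdftotext to write the text to standard output instead of a file.
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def extract_text(pdf_path):
    # Assumption: pdftotext is on the PATH. Fail with a clear message if not.
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils or Xpdf")
    result = subprocess.run(
        build_command(pdf_path), capture_output=True, check=True
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Note that this only works for PDFs that contain a text layer. Scanned documents that are just page images would need OCR first.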

 

Thanks!

 

Rfeio

https://forums.phpfreaks.com/topic/152257-pdf-parser/

Thanks guys!

 

dgoosens, from the look of the site you mentioned, it seems to be more about converting HTML into PDF. I think what I need is the opposite.

 

bluejay002, are you referring to HTML tags? That wouldn't suit me, since I would like to be able to search the content of the PDF files.

https://forums.phpfreaks.com/topic/152257-pdf-parser/#findComment-799623


Hi Rfeio,

My mistake. I thought html2pdf could edit PDFs as well, but I can't find any info about it.

You might want to have a look at FPDF then:

http://fpdf.org/

 

https://forums.phpfreaks.com/topic/152257-pdf-parser/#findComment-800115

