Hi All,

I'm new to PHP. I'm working on a job portal where people upload their resumes. I'm trying to read the content of the resume (pdf, doc, docx, or txt file) and display it on the user's profile page once they upload the file.

 

Any help on how to read and display the content of pdf/doc/docx/txt files?

 

Thanks in advance.

There are many file system functions. Take a look here:

http://us.php.net/manual/en/ref.filesystem.php

The PHP file system functions are not going to solve this problem. Simply reading the contents of the file source is only going to provide a solution for txt documents. The format of the content in a PDF or Word document can be very complex and nearly impossible to extract in a meaningful way. The reason is you have no way to decipher what is what within the document. There will be different styles/formats on how elements are related. For example, did the user format Positions/Companies/Responsibilities using different header formats or was it formatted using a table layout?
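For the txt case, where the file system functions are enough, a minimal sketch might look like the following. The function name `render_text_resume` is hypothetical, and the sketch assumes the upload is UTF-8 text; it escapes the content before output so that anything resembling markup in the resume is shown literally rather than interpreted by the browser.

```php
<?php
// Hypothetical helper: returns HTML-safe markup for a plain-text resume.
// Assumes the uploaded file is UTF-8 encoded text.
function render_text_resume(string $path): string
{
    $text = file_get_contents($path);
    if ($text === false) {
        throw new RuntimeException("Could not read resume: $path");
    }
    // Escape first so any markup in the upload is displayed literally,
    // then turn line breaks into <br /> tags to keep the layout readable.
    $safe = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    return nl2br($safe);
}
```

On the profile page this could be used as `echo render_text_resume($uploadedPath);`, where `$uploadedPath` stands for wherever the portal stores the file.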

 

Trying to build your own parsing engine will require an intimate understanding of all the possible constructs used in those different formats. It would take you weeks (more likely months) to get the knowledge needed to even start. But, even then, you would still have the problem of restructuring the content in a meaningful way.

 

As I see it, you have three options:

 

1. Find third party scripts/utilities to read those document types for you. The good ones will cost you money and may require additional software to be installed on the server (which would require you to have a dedicated server, i.e. more $$$). If they are good, they will be able to reformat the content into an HTML format.

 

2. Have the user enter their resume into a text area. Many online recruiting sites require this. You could even try using one of the plug-ins that allow the user to submit formatted text (such as the one this site uses when you go to the "preview" page).

 

3. Allow the user to upload a document, but don't parse it; just include a link wherever users need to access the resumes.
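As a rough illustration of option 1 for one format only: a .docx file is a ZIP archive whose main body is stored as XML in `word/document.xml`, so its raw text can be pulled out without a commercial tool. The sketch below assumes PHP's zip extension (`ZipArchive`) is enabled; the function name `docx_to_text` is hypothetical. It discards all styling, tables, and headers, so the "what is what" problem described above remains, and it does nothing for the older .doc format or for PDF.

```php
<?php
// Hypothetical helper: pulls the plain text out of a .docx file.
// Assumes PHP's zip extension (ZipArchive) is enabled on the server.
function docx_to_text(string $path): string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Not a readable .docx file: $path");
    }
    // The main body of a .docx lives in this XML part of the archive.
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException("No word/document.xml in: $path");
    }
    // End each paragraph with a newline, then drop the remaining XML tags.
    $xml = str_replace('</w:p>', "</w:p>\n", $xml);
    $text = html_entity_decode(strip_tags($xml), ENT_QUOTES | ENT_XML1, 'UTF-8');
    return trim($text);
}
```

The result is unformatted text, so it is better suited to searching than to a faithful display of the resume.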

Thanks for your response.

But the problem is:

1) I cannot give the user a text area to copy and paste their resume into, as there are thousands of existing users and it is impossible to make them edit their profiles at this point.

2) This site already has a link where users can download their resumes. We would like to display the resume as text too, as this would make it easier for employers to search for the right person for their job.

 

So would converting the Word/PDF files to txt files and then displaying them on the web page help?

 

 

So would converting the Word/PDF files to txt files and then displaying them on the web page help?

 

Well, a txt file is easily readable. But then your problem is how that conversion is going to be done and who is going to do it. As mentioned before, there are tools that can extract the text from Word/PDF docs, but the output will still likely require review and tweaking by a human.
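One way the conversion step could be automated is by calling command-line converters from PHP. The sketch below assumes that `pdftotext` (from poppler-utils) and `antiword` are installed on the server, which is not a given on shared hosting, and that `shell_exec` is not disabled. Both function names are hypothetical, and the extracted text would still need the human review mentioned above.

```php
<?php
// Hypothetical helper: builds the shell command that writes a resume's
// text to stdout. Assumes pdftotext (poppler-utils) and antiword are
// installed on the server; neither ships with PHP.
function text_conversion_command(string $path): string
{
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    // Never pass an uploaded file's path to the shell unescaped.
    $arg = escapeshellarg($path);
    switch ($ext) {
        case 'pdf':
            // "-layout" tries to preserve the page layout; "-" means stdout.
            return "pdftotext -layout $arg -";
        case 'doc':
            return "antiword $arg";
        default:
            throw new InvalidArgumentException("No converter configured for .$ext");
    }
}

// Runs the converter and returns the text, or null if nothing came back
// (tool missing, shell_exec disabled, or an empty document).
function convert_resume_to_text(string $path): ?string
{
    $output = shell_exec(text_conversion_command($path));
    return (is_string($output) && $output !== '') ? $output : null;
}
```

For the thousands of existing resumes, a function like this could be run once in a batch job and the text stored alongside each profile, rather than converting on every page view.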

Perhaps I should have explained further: the link was for insight into the functions available for txt files specifically. I have not looked into how .pdf and .doc files work with PHP. That is all you, mjdamato.
