willyeb70 Posted January 4, 2017

Dear all,

I would like to ask a question that I have been unable to solve on my own. I will briefly explain my problem.

I want to populate a database with the data contained in a table inside an HTML file and, if possible, repeat this for every HTML file. I have about 2000 files to process.

I did extensive research on the internet and found some solutions based on regex and others using a DOM parser extension, but neither worked properly.

Unfortunately my situation is a little complex: the HTML file that contains the table also holds other information I do not need and other HTML tags I have to remove, and the table structure is not the same in every file. I have at least 7-8 kinds of tables, and none of them has header tags <TH>. A sample structure is this:

<Table>
<Tr ><Td >TABLE 1 </ td></ Tr>
<Tr ><Td> Column1 </ td><Td> Column2 </ td><Td> Column3 </ td><Td> COLONNA4 </ td><Td> COLONNA5 </ td><Td> COLONNA6 </ td><Td> COLONNA7 </ td></ Tr>
<Tr ><Td >1 </ td><Td> USER 1 </ td><Td> M </ td><Td> ROME </ td><Td> RM </ td><Td> 11111111 </ td><Td> 22222222 </ td></ Tr>
........
</ Table>

That is just an example, because in some files there are not 7 columns but a different number, with different names.

Do you think I have a chance with PHP, or with other tools that can extract the data and insert it into an SQL table?

My little project is not for commercial purposes; it is non-profit and only for study.

Thank you all for your attention.

Greetings,
Willy

Psycho Posted January 4, 2017

Well, it depends. If you can come up with specific rules on how the tables should be processed, then yes. These types of problems should first be analyzed without any thought to how they would be coded. Start by writing the instructions you would give a person to process the data by hand. If you can do that, THEN proceed to writing code that follows those instructions.

Looking at the example above, I can *guess* at some possible rules. For example, the first table row (TR) contains the name of the table. Or does that only apply when there is only one TD in the row? The second row contains the headers for the table. Rows three to the end contain the data associated with those headers. If those rules are accurate, then it is a simple task to read the data and correlate it with the header names. I could write some sample code, but I'm not going to do that based on a guess of what the rules should be.

Now, assuming you can define the rules for getting the data, storing it in the database is another matter. Since the HTML tables have different lengths and different fields, I have no way of knowing how the data should be stored. I would need some idea of how the data is going to be used in order to make an intelligent decision. Do the tables of data have any relationship to one another?

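For illustration only, here is what the guessed rules above could look like in code. This is a minimal sketch, not a tested solution for the real files. It assumes that row one holds the table name, row two holds the headers, every later row is data, and the markup has already been cleaned so that closing tags contain no stray space. It uses Python's standard html.parser module as one of the "other tools" mentioned in the question; the names TableRows and parse_table are hypothetical.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently open, or None
        self._cell = None  # text fragments of the cell currently open, or None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so <Tr> and <TR> both arrive as "tr".
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (other page content, whitespace) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Apply the assumed rules: row 1 = title, row 2 = headers, rest = data."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    rows = parser.rows
    if len(rows) < 2:
        raise ValueError("expected at least a title row and a header row")
    title = " ".join(rows[0])
    headers = rows[1]
    # zip() silently drops extra cells when a data row is longer than the headers.
    records = [dict(zip(headers, row)) for row in rows[2:]]
    return title, records


SAMPLE = """
<Table>
<Tr><Td>TABLE 1</td></Tr>
<Tr><Td> Column1 </td><Td> Column2 </td><Td> Column3 </td></Tr>
<Tr><Td>1</td><Td> USER 1 </td><Td> M </td></Tr>
</Table>
"""

title, records = parse_table(SAMPLE)
print(title)
print(records)
```

Because each record is keyed by whatever header names the file happens to contain, the differing column counts are handled at the parsing stage. How to store those records is still the open question raised above.
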
Barand Posted January 4, 2017

If that is a sample of your actual HTML markup, you have other problems that could hamper processing. Your closing tag names do not match the opening tags (<Tr> ... </ tr>), and the space after the "</" is invalid. It should be "</tr>" (no space).

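If the real files do contain closing tags like "</ td>", one option is to clean them up before handing the markup to any parser. The following is a rough sketch under that assumption, written in Python; the function name normalize_closing_tags is hypothetical. It only removes the whitespace inside closing tags and lowercases their names, and it leaves opening tags and all other content untouched.

```python
import re

# Matches a closing tag with optional whitespace after "</" and before ">",
# for example "</ td>" or "</ Tr >". Group 1 is the tag name.
_CLOSE_TAG = re.compile(r"</\s*([A-Za-z][A-Za-z0-9]*)\s*>")


def normalize_closing_tags(html):
    """Rewrite closing tags such as '</ Tr>' as '</tr>'."""
    return _CLOSE_TAG.sub(lambda m: "</" + m.group(1).lower() + ">", html)


print(normalize_closing_tags("<Tr ><Td >TABLE 1 </ td></ Tr>"))
```

A regex is used here only for this narrow repair step. Extracting the table data itself is better left to a real HTML parser.
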
NotionCommotion Posted January 5, 2017

Is this something that only you will do from your own computer, or something other users will do from their potentially different browser types? If the former, I would consider a browser solution. Use JavaScript (or a library built on it, such as jQuery) to read the DOM, and just a form or Ajax to send the data to your PHP/MySQL server.