Parsing Text File


gigantorTRON

Hello,

I'm working with a text file that contains the names of websites and their corresponding URLs in the following fashion:

<a href="www.cnn.com">CNN News</a>

Each site and title is separated by line breaks. I'm not very experienced with fopen, fread, etc. and was wondering if anyone could give me pointers on how to go about reading the text file line by line and saving the URL and title in separate columns in a database.

 

Oh, and one more thing. The categories of these sites are also included every so often. I was hoping to save the category in a third field. Ex:

 

News (\n new line here)
<a href="www.cnn.com">CNN News</a> (\n)
<a href="www.bbc.com">BBC News </a>

etc.

 

Thanks!

Link to comment
https://forums.phpfreaks.com/topic/77361-parsing-text-file/
Share on other sites

Here's your pointer: file().

Then you could do this (read comments):

 

<?php
$array = file('filename.txt');
foreach ($array as $key => $value) {
    $url = preg_replace("/.*href=\"(.*)\">.*/", "$1", $value, 1); //retrieve URLs
    $title = preg_replace("/.*\">(.*)<\/a>.*/", "$1", $value, 1); //retrieve titles
    //escape both values so a quote in a title can't break the query
    $url = mysql_real_escape_string($url);
    $title = mysql_real_escape_string($title);
    mysql_query("INSERT INTO `table` (`url`, `title`) VALUES ('$url', '$title')"); //insert into database
}
?>

 

Dunno about your last question. Didn't test the code, hope it works.

 

EDIT: Just tested without the query part, had a small error in the last preg_replace; it's corrected now.
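
A side note on file(): it keeps the trailing newline on every element unless told otherwise. Here is a minimal sketch of the two flags that deal with that, assuming the file layout from the first post. The temp file and its sample contents are made up so the snippet runs on its own.

```php
<?php
// Hypothetical sample data, written to a temp file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'links');
file_put_contents(
    $path,
    "News\n<a href=\"www.cnn.com\">CNN News</a>\n\n<a href=\"www.bbc.com\">BBC News </a>\n"
);

// FILE_IGNORE_NEW_LINES strips the newline at the end of each element;
// FILE_SKIP_EMPTY_LINES drops lines that are empty once the newline is gone.
$lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($path);

foreach ($lines as $line) {
    echo '[' . $line . "]\n"; // brackets make stray whitespace visible
}
```

Note that the trailing space inside "BBC News " is part of the line itself, so it still needs a trim() later.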


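And, to remove the extra whitespace added to the titles/URLs (it's the line break at the end of each line: without the s modifier, . doesn't match a newline, so the newline is left in the result), add the s modifier to the pattern, i.e. the first preg_replace parameter: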

 

<?php
$array = file('filename.txt');
if ($array) {
    foreach ($array as $key => $value) {
        $url = preg_replace("/.*href=\"(.*)\">.*/s", "$1", $value, 1); //retrieve URLs
        $title = preg_replace("/.*\">(.*)<\/a>.*/s", "$1", $value, 1); //retrieve titles
        //escape both values so a quote in a title can't break the query
        $url = mysql_real_escape_string($url);
        $title = mysql_real_escape_string($title);
        mysql_query("INSERT INTO `table` (`url`, `title`) VALUES ('$url', '$title')"); //insert into database
    }
}
?>

 

I also added a check for the file before retrieving the strings.
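
To see what the s modifier changes, here is a small sketch that runs the URL pattern from the code above on a single hard-coded line, with and without it. The sample line is just the CNN example from the first post with its newline attached, the way file() returns it by default.

```php
<?php
// One line as file() returns it by default: the trailing newline is included.
$line = "<a href=\"www.cnn.com\">CNN News</a>\n";

// Without /s, "." never matches "\n", so the newline sits outside the match
// and is still there after the replacement.
$without = preg_replace("/.*href=\"(.*)\">.*/", "$1", $line, 1);

// With /s, the final ".*" swallows the newline as well.
$with = preg_replace("/.*href=\"(.*)\">.*/s", "$1", $line, 1);

var_dump($without === "www.cnn.com\n"); // bool(true)
var_dump($with === "www.cnn.com");      // bool(true)
```

Calling trim() on the result would get rid of the newline too, and it also catches a stray space such as the one in "BBC News ".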


I won't stop, will I? ;)

 

Made sure that the code I posted won't insert wrong stuff into the database when the category lines are passed:

 

<?php
$array = file('filename.txt');
if ($array) {
    foreach ($array as $key => $value) {
        if (strpos($value, "</a>") === false) {continue;} //skip the category lines
        $url = preg_replace("/.*href=\"(.*)\">.*/s", "$1", $value, 1); //retrieve URLs
        $title = preg_replace("/.*\">(.*)<\/a>.*/s", "$1", $value, 1); //retrieve titles
        //escape both values so a quote in a title can't break the query
        $url = mysql_real_escape_string($url);
        $title = mysql_real_escape_string($title);
        mysql_query("INSERT INTO `table` (`url`, `title`) VALUES ('$url', '$title')"); //insert into database
    }
}
?>
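
Another way to tell link lines from category lines is to let one preg_match() do both the check and the extraction. This is only a rough sketch: parseLinkLine() is a made-up helper name, and it assumes every link looks like the examples in the first post (double-quoted href, no other attributes). The nullable return type needs PHP 7.1 or later.

```php
<?php
/**
 * Returns ['url' => ..., 'title' => ...] for a link line, or null for
 * anything else (category names, blank lines).
 * parseLinkLine() is a hypothetical helper, not something from the thread.
 */
function parseLinkLine(string $line): ?array
{
    // [^"]* stops at the closing quote of href; .*? stops at the first </a>.
    if (!preg_match('/<a\s+href="([^"]*)"\s*>(.*?)<\/a>/i', $line, $m)) {
        return null;
    }
    return ['url' => $m[1], 'title' => trim($m[2])];
}

var_dump(parseLinkLine("News\n"));
var_dump(parseLinkLine("<a href=\"www.bbc.com\">BBC News </a>\n"));
```

The first call returns null because "News" is not a link, and the second returns the URL plus the title with the stray space trimmed off.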


Slight mod to the above code to get the categories:

 

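<?php
$array = file('filename.txt');
if ($array) {
    $cat = ''; //current category; empty until the first category line is seen
    foreach ($array as $key => $value) {
        if (trim($value) === '') {continue;} //skip blank lines so they don't reset the category
        if (strpos($value, "</a>") === false) {
            $cat = mysql_real_escape_string(trim($value)); //category line
        }
        else {
            $url = preg_replace("/.*href=\"(.*)\">.*/s", "$1", $value, 1); //retrieve URLs
            $title = preg_replace("/.*\">(.*)<\/a>.*/s", "$1", $value, 1); //retrieve titles
            $url = mysql_real_escape_string($url);
            $title = mysql_real_escape_string(trim($title)); //trim stray spaces, e.g. "BBC News "
            mysql_query("INSERT INTO `table` (`cat`,`url`, `title`) VALUES ('$cat', '$url', '$title')"); //insert into database
        }
    }
}
?>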
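
One caveat for anyone reading this later: the mysql_* functions were removed in PHP 7, so mysql_query() will not run on a current install. Below is a rough sketch of the same category-tracking loop using PDO with a prepared statement, which also takes care of quoting. saveLinks() and the `links` table name are assumptions made for the example, and the function expects an already-connected PDO object.

```php
<?php
/**
 * Inserts every link found in $lines, tagging each one with the most recent
 * category line. Returns the number of rows inserted.
 * saveLinks() and the `links` table (cat, url, title) are hypothetical names.
 */
function saveLinks(PDO $pdo, array $lines): int
{
    // Placeholders let the driver handle quoting, so no manual escaping is needed.
    $stmt = $pdo->prepare('INSERT INTO links (cat, url, title) VALUES (?, ?, ?)');
    $cat = '';
    $count = 0;

    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // blank lines must not reset the category
        }
        if (preg_match('/<a\s+href="([^"]*)"\s*>(.*?)<\/a>/i', $line, $m)) {
            $stmt->execute([$cat, $m[1], trim($m[2])]);
            $count++;
        } else {
            $cat = $line; // any non-link line is treated as a category name
        }
    }
    return $count;
}
```

It would be called with something like saveLinks($pdo, file('filename.txt')), where $pdo is whatever connection you already have open.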

