binumathew Posted March 8, 2009: Can you help me get the content of a webpage without the HTML tags?
corbin Posted March 8, 2009: http://php.net/file_get_contents (or you could use cURL, fopen, or fsockopen). And what do you mean by "avoiding the HTML tag"?
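A minimal sketch of the fetching step corbin points to, assuming `allow_url_fopen` is enabled for remote URLs. The helper name `fetchPage`, the 10-second timeout, and the user agent string are illustrative assumptions, not something from the thread.

```php
<?php
// Hypothetical helper (not from the thread): fetch a page with
// file_get_contents() and return null instead of false on failure.
function fetchPage(string $url, int $timeoutSeconds = 10): ?string
{
    // Stream context options for the http wrapper; the values are assumptions.
    $context = stream_context_create([
        'http' => [
            'timeout'    => $timeoutSeconds,
            'user_agent' => 'ExampleFetcher/0.1',
        ],
    ]);

    // file_get_contents() returns false (and raises a warning) on failure,
    // so the warning is suppressed and the result is checked explicitly.
    $html = @file_get_contents($url, false, $context);

    return $html === false ? null : $html;
}
```

Calling `fetchPage('http://www.phpfreaks.com')` would return the raw HTML as a string, or `null` if the request fails. cURL gives finer control over redirects and headers, but for a first experiment this is probably enough.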
binumathew Posted March 8, 2009 (Author): I want the text in the HTML file, that is, the text we see when viewing the webpage. I want to take that and add it to my database.
corbin Posted March 8, 2009: So you want to strip all formatting and end up with just the content? It seems like you could first pull out everything from <body> to </body> in the HTML, and then parse out all the HTML tags. Sounds like you probably don't have a legit reason, though.
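The two steps corbin describes could be sketched roughly as follows. The function name `bodyText` is hypothetical, and the regular expressions are a simplification that assumes reasonably well-formed markup; badly broken pages are better handled by a real parser.

```php
<?php
// Hypothetical helper (not from the thread): keep only what sits between
// <body> and </body>, then drop the remaining tags.
function bodyText(string $html): string
{
    // Grab the body section; fall back to the whole input if none is found.
    if (preg_match('#<body\b[^>]*>(.*?)</body>#is', $html, $m)) {
        $html = $m[1];
    }

    // Remove script and style blocks together with their contents,
    // because strip_tags() would otherwise leave that code behind as text.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html) ?? $html;

    // Strip the remaining tags and decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text) ?? $text);
}

$sample = '<html><head><title>Ignored</title></head>'
        . '<body><h1>Hello</h1><script>var x = 1;</script><p>Fish &amp; chips</p></body></html>';

echo bodyText($sample), "\n";
```

Note that `strip_tags()` does not insert any separator where a tag used to be, so text from adjacent elements can run together unless the markup already contains whitespace between them.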
binumathew Posted March 8, 2009 (Author): Yes, I want to avoid the tags and get the text. Can you give me a snippet to show how it works?
trq Posted March 8, 2009: Take a look at file_get_contents.
binumathew Posted March 8, 2009 (Author): How can I avoid the HTML tags?
trq Posted March 8, 2009: strip_tags.
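A short illustration of `strip_tags`, using a made-up HTML string. The point it tries to show is that the second argument is optional and lists the tags to keep, so there is no need to enumerate every tag you want removed.

```php
<?php
// Sample markup invented for this example.
$html = '<p>Read the <a href="/manual">manual</a> <b>first</b>.</p>';

// No second argument: every tag is removed.
$plain = strip_tags($html);

// The second argument lists tags to KEEP, not tags to remove.
$keepLinks = strip_tags($html, '<a>');

echo $plain, "\n";      // Read the manual first.
echo $keepLinks, "\n";  // Read the <a href="/manual">manual</a> first.
```

`strip_tags` only removes the tags themselves, so the contents of `<script>` and `<style>` elements remain in the output as plain text.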
Daniel0 Posted March 8, 2009: Parse it using the DOM extension or use regular expressions.
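One way the DOM extension route might look, as a sketch rather than a definitive implementation. The function name `visibleText` and the XPath expression are assumptions made for illustration; the code relies on `DOMDocument::loadHTML` and `DOMXPath`, both part of PHP's bundled DOM extension.

```php
<?php
// Hypothetical helper (not from the thread): collect the text nodes under
// <body> with the DOM extension, skipping script and style contents.
function visibleText(string $html): string
{
    $doc = new DOMDocument();

    // Real-world pages are rarely valid HTML; collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Every text node below <body> that is not inside <script> or <style>.
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//body//text()[not(ancestor::script) and not(ancestor::style)]');

    $parts = [];
    foreach ($nodes as $node) {
        $parts[] = $node->nodeValue;
    }

    // Join with spaces so neighbouring elements do not run together,
    // then collapse runs of whitespace.
    $text = implode(' ', $parts);
    return trim(preg_replace('/\s+/', ' ', $text) ?? $text);
}

$sample = '<html><body><h1>Title</h1><p>Some <em>visible</em> text.</p>'
        . '<script>alert(1);</script></body></html>';

echo visibleText($sample), "\n";
```

Joining text nodes with spaces is a design choice that suits word indexing, but it also puts a space wherever an inline tag splits a word or precedes punctuation. `loadHTML` also assumes ISO-8859-1 unless the page declares a charset, so pages in other encodings may need converting first.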
binumathew Posted March 8, 2009 (Author): But with strip_tags, do I have to give it every tag? How can I tell it to cover all of them?
binumathew Posted March 8, 2009 (Author): Sorry, I got it now.
binumathew Posted March 8, 2009 (Author): Can anybody give me a code snippet for that?
Daniel0 Posted March 8, 2009: preg_replace('#</?html>#i', '', file_get_contents('http://www.phpfreaks.com')); I don't understand what you're trying to do though.
binumathew Posted March 8, 2009 (Author): I am trying to develop a search engine, for educational purposes.
trq Posted March 8, 2009: There are code snippets in the relevant manual pages. You might start there.
binumathew Posted March 8, 2009 (Author): Thanks, sir.