Need help parsing a large amount of json data


wee493


I recently got the Twitter streaming API working, and now I have tons of JSON data that I don't know how to parse. I've done some research, but when I try json_decode() I get an error because the file has multiple lines.

 

This is what I tried, but I have no idea what I'm doing when it comes to JSON data and parsing:

$data = fopen('2009121023.txt', 'r');
$tweets = json_decode($data);
echo $tweets->screen_name;

 

Here is an example of one set of data. There are a couple hundred entries like this in each of my text files.

{"truncated":false,"text":"As in HCNOTMA: New CDs earlier this year from Buddy and Julie Miller http://tr.im/HgQS and Visqueen. http://tr.im/HgR3","favorited":false,"in_reply_to_user_id":null,"in_reply_to_status_id":null,"in_reply_to_screen_name":null,"source":"<a href=\"http://code.google.com/p/qwit/\" rel=\"nofollow\">Qwit</a>","created_at":"Fri Dec 11 05:21:52 +0000 2009","geo":null,"user":{"profile_background_color":"9ae4e8","favourites_count":0,"url":null,"profile_image_url":"http://a3.twimg.com/profile_images/568710581/karman_s_vortex_street_normal.png","notifications":null,"profile_text_color":"000000","time_zone":"Eastern Time (US & Canada)","description":"My name isn't Eddie and I'm not really all that large.","screen_name":"LargeEddie","statuses_count":9,"profile_link_color":"0000ff","profile_background_image_url":"http://s.twimg.com/a/1260393960/images/themes/theme1/bg.png","created_at":"Thu Dec 10 22:20:48 +0000 2009","profile_sidebar_fill_color":"e0ff92","geo_enabled":false,"profile_background_tile":false,"protected":false,"profile_sidebar_border_color":"87bc44","location":"Greensboro, NC, USA","name":"Tim Victor","following":null,"verified":false,"followers_count":0,"id":95997325,"utc_offset":-18000,"friends_count":1},"id":6557792502}


I'll be honest, I've never used JSON data before. Judging by json_decode(), it's designed to decode a single JSON string. From your description it sounds like your file contains many JSON strings, one per line. In that case you simply need to loop through the lines: either use file() to load the file into an array and iterate over it with foreach, or use file_get_contents() to load the file and explode it on newline characters.

 

As I say, it may not work since I've never worked with JSON strings, but the logic fits.
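If it helps, here's a minimal sketch of that approach. The file name and its contents are made up for illustration; the assumption is one complete JSON object per line, like your dump files:

```php
<?php
// Tiny sample standing in for one of the dump files: one JSON object per line.
file_put_contents('sample_tweets.txt',
    '{"text":"first tweet","user":{"screen_name":"alice"}}' . "\n" .
    '{"text":"second tweet","user":{"screen_name":"bob"}}' . "\n"
);

// file() splits the file into an array of lines; decode each line separately
// instead of handing the whole file to json_decode() at once.
$results = array();
foreach (file('sample_tweets.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
    $tweet = json_decode($line);
    if ($tweet !== null) { // json_decode() returns null for invalid JSON
        $results[] = $tweet->user->screen_name . ': ' . $tweet->text;
    }
}

print_r($results);
```

The null check matters because json_decode() returns null on any malformed line, and a stream dump can easily contain a truncated record.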



Thanks, I got it working with your advice, but there is still a small problem. I'm dealing with ~100 MB files. For testing purposes I've been using ~0.5 MB files, but when I try a larger file PHP gives an out-of-memory error. From searching around, it seems I need something like file(), but one that streams the file instead of reading the entire thing into memory. Is there anything you can suggest?

 

Fatal error: Out of memory (allocated 63700992) (tried to allocate 26738688 bytes)

 

Here is the code I'm using, if anyone is interested:

 

$data = file('2009121101.txt');

foreach ($data as $tweets) {
    $tweet = json_decode($tweets);

    echo '<br><hr><br>';
    echo 'Username: ' . $tweet->user->screen_name . '<br>';
    echo 'Tweet: ' . $tweet->text . '<br>';
    echo '<img src="' . $tweet->user->profile_image_url . '">';
    echo '<br><hr><br>';
}


If you have access, you can change some settings in php.ini, such as memory_limit, to let you deal with files of that size. Otherwise you can probably do it with something like fopen() and fread(), which I think won't require holding the full file in memory as you loop through it.
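A sketch of the streaming idea, using fgets() to pull one line at a time so only a single tweet is ever in memory, no matter how large the file is. The sample file and its contents are stand-ins for the real dump:

```php
<?php
// Sample stand-in for a ~100 MB dump: one JSON object per line.
file_put_contents('sample_stream.txt',
    '{"text":"hello","user":{"screen_name":"alice"}}' . "\n" .
    '{"text":"world","user":{"screen_name":"bob"}}' . "\n"
);

$names = array();
$fh = fopen('sample_stream.txt', 'r');

// fgets() reads a single line per call, so memory use stays constant
// regardless of the total file size.
while (($line = fgets($fh)) !== false) {
    $tweet = json_decode($line);
    if ($tweet !== null) {
        $names[] = $tweet->user->screen_name;
    }
}
fclose($fh);

print_r($names);
```

Unlike file(), which slurps the whole file into an array first, this never needs more memory than one line plus the decoded object.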



My host caps the php.ini memory setting at 64 MB; currently I have fread() set to read about 40 MB. With fopen() and fread() I can only read those 40 MB or so, but at least it's working somewhat.

 

Maybe someone else has some ideas?

 

Here's what I'm using.

$data = fopen('2009121110.txt', 'r+');
$data = fread($data, 188743680);
$data = explode('{"truncated"', $data);
// Using explode() to split the data into an array; later I add the '{"truncated"' back onto each element.
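For the re-prefixing step, here's one way to sketch it. The $data string below is a made-up stand-in for the file contents; the point is that explode() strips the '{"truncated"' marker, so it has to be glued back on before each chunk will decode:

```php
<?php
// Stand-in for the raw file contents: JSON objects run together,
// each starting with the '{"truncated"' key.
$data = '{"truncated":false,"text":"one"}{"truncated":true,"text":"two"}';

$chunks = explode('{"truncated"', $data);
array_shift($chunks); // drop the empty string before the first marker

$texts = array();
foreach ($chunks as $chunk) {
    // Re-attach the marker that explode() removed, then decode.
    $tweet = json_decode('{"truncated"' . $chunk);
    if ($tweet !== null) {
        $texts[] = $tweet->text;
    }
}

print_r($texts);
```

That said, this only helps once the data is in memory; to get past the 40 MB ceiling you'd still want to read the file in pieces rather than one giant fread().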

