
Tracking Impressions


SchweppesAle


Should you use PHP, AJAX, or server log files?

 

Granted, most commercial analytics tools today use JavaScript (AJAX). Does anyone know why these tools often report significantly lower page views than other methods,

 

e.g. PHP or even filtered server logs (AWStats)?

 

I just started a test using three different methods for counting "page views" (a rough sketch is below):

 

1) PHP

2) PHP with filtering of known bots

3) AJAX (POST request to a PHP script)

 

It seems #3 is already falling behind. :shrug:
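
Roughly what I mean by #1 and #2, as a sketch (the table and connection details are made up for the example); #3 just sends an AJAX POST to the same kind of script from JavaScript once the page has loaded:

<?php
// track.php - included at the top of every page for method #1.
// Method #2 adds the bot check below; method #3 requests a script like
// this via an AJAX POST after the page loads instead of including it.
// Hypothetical table: page_views(viewed_at, page, ip, user_agent).

$pdo = new PDO('mysql:host=localhost;dbname=stats', 'user', 'pass');

$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';

// Method #2: skip requests from a short list of known bot user agents.
$knownBots = array('googlebot', 'bingbot', 'slurp', 'msnbot', 'baiduspider');
foreach ($knownBots as $bot) {
    if (stripos($userAgent, $bot) !== false) {
        return; // don't count it
    }
}

$stmt = $pdo->prepare(
    'INSERT INTO page_views (viewed_at, page, ip, user_agent)
     VALUES (NOW(), :page, :ip, :ua)'
);
$stmt->execute(array(
    ':page' => $_SERVER['REQUEST_URI'] ?? '',
    ':ip'   => $_SERVER['REMOTE_ADDR'] ?? '',
    ':ua'   => $userAgent,
));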


AWStats on the server is "more accurate" because it counts at a higher level, at "a request was made to this page", so even if your page fails to load fully, or other errors prevent it from loading completely, it will still count as a page view.

 

If the commercial tools (I assume you are talking about Google Analytics, Yahoo Web Analytics, Omniture SiteCatalyst, WebTrends, etc.) are reporting significantly fewer page views, then more than likely:

 

a) you didn't install the tracking tool correctly, or

b) some other script on the page has an error which causes the tracking tool to fail to load.

 

edit:

 

or c) maybe you are looking at the wrong report(s). For example, a page view is not the same as a visit; apparently a lot of people confuse the two.

 

 


I actually tried placing the Google Analytics tracking code directly into the header of one of our smaller sites.

http://nptechnews.com/

 

Pageviews are still not matching up with AWStats though.


In order for AJAX tracking to work, everything has to go right. The client has to:

 

- Fully load the page

- Support JavaScript

- Successfully initialize the AJAX call back to the server

- Successfully run the AJAX POST function

 

There are a lot of things that can go wrong there.

 

Needless to say, the gold standard has always been analytics built on top of the native web server logs, although the roll-your-own server-side method does have the advantage of being aware of the specifics of your application. Sometimes those metrics are more important to you than anything the web logs can provide. For example, a movie review site can keep track of reviews read and then provide statistics like "total reviews for author Bob Jones" or "total reviews read of movies starring Brad Pitt". You'll never get those stats directly out of a web analytics tool.
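
As a sketch of what that kind of application-specific logging can look like (the table and column names here are invented for the example):

<?php
// Hypothetical table: review_reads(review_id, author, star, read_at).
// Call this from the review page once the review has been fetched.
function log_review_read(PDO $pdo, int $reviewId, string $author, string $star): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO review_reads (review_id, author, star, read_at)
         VALUES (:id, :author, :star, NOW())'
    );
    $stmt->execute(array(
        ':id'     => $reviewId,
        ':author' => $author,
        ':star'   => $star,
    ));
}

// "Total reviews read of movies starring Brad Pitt" is then just:
//   SELECT COUNT(*) FROM review_reads WHERE star = 'Brad Pitt';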

 

The other problem with analytics tools is that when you have a web farm, you really have to work out a scheme to combine the logs, or else look at each web server independently, which is far from ideal.

 

Personally, I've found the best approach to be a combination of a web log analytics tool and my own server-side, application-specific logging. I don't think AJAX is a good match at all for tracking page views.

You cannot put most of the commercial tracking codes into your header, because most of them boil down to outputting an image tag with data appended to the src URL, and you cannot have image tags inside the head tag.
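
A roll-your-own tracking pixel makes the mechanism obvious: the page prints an img tag pointing at something like /pixel.php near the end of the body, and that script logs the hit and returns a tiny image. The script name, log path, and pre-made GIF below are just placeholders for the sketch:

<?php
// pixel.php - the page embeds <img src="/pixel.php?page=/some/page">
// near the end of the body (not in the head). Log the hit, return a 1x1 GIF.

$line = sprintf(
    "%s\t%s\t%s\t%s\n",
    date('c'),
    $_GET['page'] ?? '',
    $_SERVER['REMOTE_ADDR'] ?? '',
    $_SERVER['HTTP_USER_AGENT'] ?? ''
);
file_put_contents('/var/log/myapp/pixel.log', $line, FILE_APPEND | LOCK_EX);

header('Content-Type: image/gif');
header('Cache-Control: no-store');
readfile(__DIR__ . '/transparent-1x1.gif'); // any pre-made 1x1 transparent GIF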


All good points.

 

That being said, suppose I were designing a banner-serving web application and had to track impressions for each banner somehow. Wouldn't I need AJAX to tally each impression after a banner has finished loading?

 

Or, for that matter, if I were to use page views as an indicator of impressions (I know there's a difference), are there any reliable repositories where I can periodically download a list of all known bots, etc.?

 

That way I would at least be able to filter most bot user agents/IPs out of our raw access logs.

 

I feel like I'm stuck between a rock and a hard place. AJAX is unreliable, and PHP/access logs have their limitations.


Banner tracking is usually done as a series of redirects.

 

The loader tracks the impression server-side and then returns the image/Flash/HTML content, whatever it happens to be.
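
In sketch form, something along these lines (the script name and table layout are placeholders):

<?php
// banner.php?id=123 - record one impression, then hand off to the creative.
// Hypothetical tables: banners(id, image_url), impressions(banner_id, shown_at, ip).

$pdo = new PDO('mysql:host=localhost;dbname=ads', 'user', 'pass');

$bannerId = (int) ($_GET['id'] ?? 0);

$stmt = $pdo->prepare('SELECT image_url FROM banners WHERE id = :id');
$stmt->execute(array(':id' => $bannerId));
$imageUrl = $stmt->fetchColumn();

if ($imageUrl === false) {
    http_response_code(404);
    exit;
}

$log = $pdo->prepare(
    'INSERT INTO impressions (banner_id, shown_at, ip) VALUES (:id, NOW(), :ip)'
);
$log->execute(array(':id' => $bannerId, ':ip' => $_SERVER['REMOTE_ADDR'] ?? ''));

// The impression is counted when the banner is requested, so no AJAX is
// needed; clicks would go through a similar redirect script so they can
// be counted the same way.
header('Location: ' . $imageUrl, true, 302);
exit;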

 

In terms of bot suppression, it's actually worse than you know, because not only do you need to look at the user agent data, you also need to look at the IP, since there are annoying bots that cloak themselves by spoofing the user agent. I might add that these are both reasons why server-side is a good way to do it, because the client does not have, nor can it be trusted to provide, the IP address or the user agent. With that said, neither can be entirely trusted server-side either, but at least you get what the web server sees.

 

On one site where I do this type of logging, I implemented some code that checks both the IP range and the user agent data against tables of known bots. While I didn't start with a database of known bots, there is a lot of info that can be found via Googling, on WebmasterWorld (which I personally don't subscribe to), and on other sites.
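
The check itself doesn't need to be fancy; it boils down to something like this (bot_agents and bot_networks are made-up table names, and this sketch only handles IPv4 ranges):

<?php
// Returns true if the request looks like a known bot, based on user-agent
// substrings and CIDR ranges stored in the database.
// Assumed tables: bot_agents(pattern), bot_networks(cidr) - maintained by hand.
function looks_like_bot(PDO $pdo, string $ip, string $userAgent): bool
{
    // User-agent substring match
    foreach ($pdo->query('SELECT pattern FROM bot_agents') as $row) {
        if (stripos($userAgent, $row['pattern']) !== false) {
            return true;
        }
    }

    // IPv4 CIDR match, e.g. "66.249.64.0/19"
    foreach ($pdo->query('SELECT cidr FROM bot_networks') as $row) {
        list($subnet, $bits) = explode('/', $row['cidr']);
        $mask = -1 << (32 - (int) $bits);
        if ((ip2long($ip) & $mask) === (ip2long($subnet) & $mask)) {
            return true;
        }
    }

    return false;
}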

 

I found a site with a database I could have started with, long after I'd already been manually updating my list based on the bots I saw: http://www.robotstxt.org/db/all.txt

 

In my case, I have a page that summarizes visits by IP for a given time period, and I can drill into that to see the user agent strings. I added some simple code that lets me click to add a string to the exclude list; the same goes for IP addresses. I used to look at these stats once a week, and my stats system allows me to exclude an IP address and delete all the click rows for it.

 

It's worked fairly well over the years. The only concern is that this puts a transaction load on the MySQL database, but the site in question doesn't get enough traffic for me to worry about the overhead.


Thanks, that's actually pretty similar to what I'm doing right now. 

 

That link you provided looks useful though. I should be able to write a script that parses each line and then updates our database periodically.
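
Something like this is what I have in mind, assuming the file keeps its "field: value" record layout with one robot-useragent line per robot (worth double-checking against the actual file), feeding a bot_agents(pattern) table like the one sketched above:

<?php
// Rough sketch of a cron script that pulls the robotstxt.org list and
// refreshes a bot_agents(pattern) table.
$pdo = new PDO('mysql:host=localhost;dbname=stats', 'user', 'pass');

$raw = file_get_contents('http://www.robotstxt.org/db/all.txt');
if ($raw === false) {
    exit("download failed\n");
}

$agents = array();
foreach (preg_split('/\r?\n/', $raw) as $line) {
    if (stripos($line, 'robot-useragent:') === 0) {
        $ua = trim(substr($line, strlen('robot-useragent:')));
        if ($ua !== '') {            // skip records with no user agent listed
            $agents[$ua] = true;     // dedupe
        }
    }
}

$pdo->beginTransaction();
$pdo->exec('DELETE FROM bot_agents');
$insert = $pdo->prepare('INSERT INTO bot_agents (pattern) VALUES (:p)');
foreach (array_keys($agents) as $ua) {
    $insert->execute(array(':p' => $ua));
}
$pdo->commit();

printf("loaded %d user agents\n", count($agents));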


On the earlier point about tracking codes in the header: I still don't see any scripting errors on that page. This is old news though :P
