Logging search engine spider crawls in WordPress

Have you been blogging for a while and wondered why search engines have not indexed your pages? Do you want to know which spiders are "visiting" your site every day? As a WordPress user, it is worth knowing which spiders crawl your site each day, so you can see how frequently search engine spiders come and carry out targeted SEO optimization.
In fact, it is very simple: just add the code below and then call the file, and you are done. Convenient, isn't it? Let's get started.
I had previously tried a few PHP spider-tracking tools, but the results were unsatisfactory. Most of those PHP programs need to be installed and also write the crawl records to MySQL, which is too much trouble. So I went looking for a simpler spider logger instead.
1. First, create a robots.php file in the root directory of your WordPress theme and add the following code:
<?php
// Identify known search engine spiders from the user-agent string.
function get_naps_bot()
{
    $useragent = strtolower($_SERVER['HTTP_USER_AGENT']);
    if (strpos($useragent, 'googlebot') !== false) {
        return 'Googlebot';
    }
    if (strpos($useragent, 'msnbot') !== false) {
        return 'MSNbot';
    }
    if (strpos($useragent, 'slurp') !== false) {
        return 'Yahoobot';
    }
    if (strpos($useragent, 'baiduspider') !== false) {
        return 'Baiduspider';
    }
    if (strpos($useragent, 'sohu-search') !== false) {
        return 'Sohubot';
    }
    if (strpos($useragent, 'lycos') !== false) {
        return 'Lycos';
    }
    if (strpos($useragent, 'robozilla') !== false) {
        return 'Robozilla';
    }
    return false;
}

// Current time as a UTC+8 (Beijing time) string.
function nowtime()
{
    $date = gmdate("Y-n-j H:i:s", time() + 8 * 3600);
    return $date;
}

$searchbot = get_naps_bot();
if ($searchbot) {
    $tlc_thispage = addslashes($_SERVER['HTTP_USER_AGENT']); // full user-agent string
    $url = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : ''; // referer (captured but not logged)
    $file = "robotslogs.txt";
    $time = nowtime();
    // Append one line per spider visit to the log file.
    $data = fopen($file, "a");
    fwrite($data, "Time: $time robot: $searchbot UA: $tlc_thispage\n");
    fclose($data);
}
?>
Upload it to your theme directory.
2. Add the following code in an appropriate place in footer.php or header.php to call robots.php:
<?php include('robots.php'); ?>
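If the relative include does not resolve on your setup, a slightly more explicit variant is to build an absolute path with WordPress's get_template_directory() function; this is only an alternative sketch, not part of the original instructions:

<?php
// Include robots.php from the current (parent) theme's directory.
include get_template_directory() . '/robots.php';
?>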
How the program works: it identifies spiders by their user-agent signature (such as Baiduspider or Googlebot), records the time of each crawl, and appends it to a log file, robotslogs.txt, generated in the site's root directory.
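With the code above, each visit is appended as one line to robotslogs.txt. An entry would look roughly like the following (the date and user-agent string here are only illustrative):

Time: 2024-5-1 10:23:45 robot: Googlebot UA: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)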
Program drawbacks: it cannot record which page the spider crawled, so the functionality is fairly basic.
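If you also want to record which page was crawled, one simple extension is to log $_SERVER['REQUEST_URI'] as well. The sketch below only shows how the logging block inside robots.php could be changed; the $file, $time, $searchbot and $tlc_thispage variables come from the script above, and this is a possible extension rather than part of the original code:

<?php
// Sketch: inside the if ($searchbot) { ... } block of robots.php,
// capture the requested path and include it in the log line.
$page = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '';

$data = fopen($file, "a");
fwrite($data, "Time: $time robot: $searchbot page: $page UA: $tlc_thispage\n");
fclose($data);
?>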