Generating an up-to-date sitemap of your WordPress blog for Google Webmaster Tools without plugins

I was depending on a WordPress plugin to generate the sitemap of my blog. However, I noticed that the sitemap was not being updated automatically. I searched the forums and found many people reporting the same problem. The most common complaints are that new posts are not reflected in the sitemap, that pages are missing from the sitemap, and so on.

The suggested solutions varied: other WordPress plugins, cron jobs that re-submit the sitemap every time, and so on.

My suggestion is to create a small PHP page that renders the sitemap dynamically. The good thing about WordPress is that it stores all the post and page metadata in MySQL tables.

A sample sitemap file looks as follows:

[code]

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.techthali.org/</loc>
<lastmod>2012-08-08T18:16:47+00:00</lastmod>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>http://www.techthali.org/web-fonts-from-cross-domain-can-cause-problems-in-ie9-and-firefox/</loc>
<lastmod>2012-07-24T12:42:31+00:00</lastmod>
<changefreq>monthly</changefreq>
<priority>0.2</priority>
</url>
<url>
<loc>http://www.techthali.org/introduction-to-node-js/</loc>
<lastmod>2012-07-06T13:03:56+00:00</lastmod>
<changefreq>monthly</changefreq>
<priority>0.2</priority>
</url>
<url>
<loc>http://www.techthali.org/introduction-to-cloud-computing/</loc>
<lastmod>2012-07-05T04:51:50+00:00</lastmod>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
</urlset>

[/code]
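
Every <url> element in the sample above has the same four children, so a single entry can be sketched as a small helper. This is only an illustration: the function name sitemap_url_entry is hypothetical, and the escaping step is my own assumption about what a safe implementation would need (for example, an "&" in a URL has to become "&amp;" inside XML).

[code]
<?php
// Hypothetical helper, for illustration only: builds one <url> entry of a sitemap.
function sitemap_url_entry($loc, $lastmod, $changefreq, $priority) {
    // Escape &, <, > and quotes so that the values are safe inside XML.
    $escape = function ($value) {
        return htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
    };
    return '<url>'
        . '<loc>' . $escape($loc) . '</loc>'
        . '<lastmod>' . $escape($lastmod) . '</lastmod>'
        . '<changefreq>' . $escape($changefreq) . '</changefreq>'
        // Format the priority with exactly one decimal place, e.g. 0.2 or 1.0.
        . '<priority>' . number_format((float) $priority, 1, '.', '') . '</priority>'
        . '</url>';
}

// Usage: reproduces one entry from the sample above.
echo sitemap_url_entry(
    'http://www.techthali.org/introduction-to-node-js/',
    '2012-07-06T13:03:56+00:00',
    'monthly',
    0.2
), "\n";
[/code]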

After a quick scan of the WordPress database tables, I was able to write a query that returns all the published posts and pages along with their last-modified timestamps. The following query fetches posts and pages newest first (ordered by post ID).

[code]
-- post_modified_gmt is used so that the hardcoded +00:00 offset is accurate
select post_name, date_format(post_modified_gmt,'%Y-%m-%dT%H:%i:%S+00:00') from wp_posts where post_status='publish' and post_type in ('post', 'page') order by id desc
[/code]

The next step was to create a small PHP page, “sitemap.php”, which executes the above query and renders the result in the sitemap format as follows.

[code]

<?php
// Note: the mysql_* functions were removed in PHP 7, so this version uses mysqli.
$con = mysqli_connect("{DB SERVER IP}", "{DB USER NAME}", "{DB USER PASSWORD}", "{WORDPRESS DB NAME}");
if (!$con) {
    die('Could not connect: ' . mysqli_connect_error());
}

// post_modified_gmt is used so that the hardcoded +00:00 offset is accurate.
$result = mysqli_query($con, "select post_name, date_format(post_modified_gmt,'%Y-%m-%dT%H:%i:%S+00:00') post_modified from wp_posts where post_status='publish' and post_type in ('post', 'page') order by id desc");
if (!$result) {
    die('Query failed: ' . mysqli_error($con));
}

// Tell browsers and crawlers that the response is XML.
header('Content-Type: application/xml; charset=UTF-8');
echo '<?xml version="1.0" encoding="UTF-8"?>';
echo '<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';

// Emit one <url> entry per published post or page.
while ($row = mysqli_fetch_assoc($result)) {
    echo '<url><loc>http://www.{YOUR DOMAIN NAME}/';
    echo htmlspecialchars($row['post_name']);
    echo '</loc><lastmod>';
    echo $row['post_modified'];
    echo '</lastmod>';
    echo '<changefreq>weekly</changefreq>';
    echo '<priority>1.0</priority></url>';
}
mysqli_close($con);
echo '</urlset>';
?>

[/code]

The above code renders a sitemap XML containing all the posts and pages published on a WordPress site. Since this is only an example, I have hardcoded the changefreq and priority values. We can enhance this by weighting entries based on the categories or tags associated with each post or page. There is a lot of scope to enhance the output, but I am leaving that to your creativity.
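
As one possible direction for that enhancement, here is a minimal sketch of picking the priority from a post's categories and the changefreq from its age. Everything in it is an assumption for illustration: the function names, the category slugs, the weights and the day thresholds are made up, and fetching the categories of each post from the database is not shown.

[code]
<?php
// Hypothetical category weights; the slugs and values are made up for illustration.
$category_weights = array('programming' => 0.8, 'internet' => 0.6);

// Returns the highest weight among the post's categories, or $default if none match.
function sitemap_priority(array $post_categories, array $weights, $default = 0.5) {
    $best = null;
    foreach ($post_categories as $slug) {
        if (isset($weights[$slug]) && ($best === null || $weights[$slug] > $best)) {
            $best = $weights[$slug];
        }
    }
    // Format with exactly one decimal place, as in the sample sitemap.
    return number_format($best === null ? $default : $best, 1, '.', '');
}

// Picks a changefreq value from the number of days since the last modification.
function sitemap_changefreq($days_since_modified) {
    if ($days_since_modified <= 7) {
        return 'daily';
    }
    if ($days_since_modified <= 30) {
        return 'weekly';
    }
    return 'monthly';
}

echo sitemap_priority(array('internet', 'programming'), $category_weights), "\n"; // prints 0.8
echo sitemap_changefreq(45), "\n"; // prints monthly
[/code]

The two return values would replace the hardcoded strings inside the <changefreq> and <priority> elements.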

Save the above code as sitemap.php (for example) and store it in the root (home) directory of your site.

Test the Sitemap

First, make sure that you can access the sitemap from a browser and that it is rendered correctly as XML. If your browser does not render it correctly, use “view source” to check the XML.
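
The same check can also be scripted. The sketch below is only an assumption of how one might do it with PHP's DOM extension: the function name sitemap_count_urls is hypothetical, and it checks well-formedness only, not validity against the sitemap schema.

[code]
<?php
// Hypothetical helper, for illustration only.
// Returns the number of <url> entries if $xml is well-formed XML, or false otherwise.
function sitemap_count_urls($xml) {
    if (!is_string($xml) || $xml === '') {
        return false;
    }
    // Collect parse errors internally instead of printing warnings.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return false;
    }
    return $doc->getElementsByTagName('url')->length;
}

// Usage (the domain is a placeholder, as in sitemap.php):
// $xml = file_get_contents('http://www.{YOUR DOMAIN NAME}/sitemap.php');
// var_dump(sitemap_count_urls($xml));
[/code]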

The next step is to make sure that Google's crawler (Googlebot) can access this sitemap and that it is well formed. The best option is to use Google Webmaster Tools, where you can test your sitemap.

Submit the URL of the sitemap and make sure that it is well formed and accepted. Once it has been tested, you can set it as the permanent sitemap for your WordPress site.

The next time you publish a new post or page, open sitemap.php and you should see it reflected in the sitemap.
The same sitemap URL can be submitted to Bing Webmaster Tools.

Hurray! Freedom from updating the sitemap!

