
Sitemap for Google - PHPBB 3 Mod


Postby Liv » Tue Oct 31, 2006 2:19 pm

Modification Name: PHPBB Site Map for Google Mod
Modification Version: 18 (22 Mar 2013). Validated for 3.0.11 and prior.
Author: L. Jones

Modification Description:
    PHPBB SiteMap for Google Mod produces an XML (by redirect) or pseudo-XML-compliant link tree for use with sitemap submission tools such as Google's Webmaster Tools sitemap submission service or Yahoo's Site Explorer. The file is tiny and simple to install.

Features:
    Simple installation: "upload and go" if your forum is in the root domain. Simply upload the PHP file (sitemap.php) and add the link to your index page. Otherwise, modify the domain variables at the top of the PHP file and off you go. The sitemap is generated in real time, so search engines get topic data that is current down to the second, 24 hours a day.

Screenshots:
phpbb sitemap.png
Phpbb Sitemap Mod

Demo URL: sitemap.php

Modification Download:

Google Sitemap for PHPBB v18.zip
(2.5 KiB) Downloaded 156 times


Site Map Install Instructions:
    1) Download the sitemap Zip file above and decompress it on your computer.

    2) If your PHPBB3 forum is installed in the root folder, for instance at http://greensboring.com, skip step "a".
      a) Otherwise, modify the $folder or $subdomain variable to match your forum. An example is given at the top of the PHP code.
    3) Upload sitemap.php to your forum root folder (where PHPBB's index.php is).
    4) Open PHPBB's index.php, located in your forum root folder, for editing and find:
    Code:
    'TOTAL_POSTS'   => sprintf($user->lang[$l_total_post_s], $total_posts),

    After that add:
    Code:
    'SITE_MAP' => '<a target="_blank" href="sitemap.php" title="sitemap" rel="alternate" type="application/rss+xml">SiteMap</a> by <a target="_blank" href="http://www.livjones.com">Liv</a>',

    5) In your administrator control panel, go to Styles -> Templates -> Edit -> "index_body.html" and find in the template:
    Code:
    <!-- INCLUDE overall_footer.html -->


    Before it, add the following. (Running SubSilver or a custom theme? No problem: just add it wherever you like in the template.)
    Code:
    {SITE_MAP}


    Click submit, and you're almost done...

    6) Submit to Google, at Google Sitemaps.


    7) Go get a lovely pint of Guinness and drink because you are officially done!

    EXTRA CREDIT!!!! (optional)
    Advanced users may wish to add the following rules to their .htaccess file so the sitemap is also reachable as sitemap.xml. This has no real effect unless a random bot comes along looking for a sitemap.xml, so I would honestly recommend skipping it unless having sitemap.xml matters to you.

    Add to (or create) your .htaccess file:
    Code:
    Options +FollowSymlinks
    RewriteEngine on
    RewriteRule ^sitemap\.xml$ sitemap.php [NC,L]


PHPBB3 Sitemap Change Log:
    Version: 18 (and prior) Changes:
    -Split Sql requests into forums for use in large forums (v18)
    -Added NOINDEX X-tag header (v17)
    -Fixed cURL extension issue, when extension is not installed.
    -Added Gzip compression which greatly increases the speed.
    -Refined SQL SELECTs to reduce resource overhead.
    -Updated code to handle parent forums and "1969" issue with Google sitemaps.
    -Deprecated file_get_contents() in favor of cURL. (hat-tip to Marcel)
    -Added Limits to the number of URLs, set default to Google's 50,000 Max.
    -Fixed glitch where forums with subforums wouldn't display topics or forums.
    -Added an echo statement for path if script die()'s because of faulty path.
    -XML Compliant Header added.
    -Corrected last mod tags format for W3C standards.
    -Added LastMod tags to sitemap with data pulled from phpbb3 database.
    -Only displays non-private content of both forums and topics.
    -Non-Indexable forums have been removed from the sitemap.
    -Changed Mysql_fetch_array to sql_fetchrow per Dave Turner to solve database compatibility issues.
    -Add subdomain variable for ease. Allows user to change domain if not set by server.
    -Pulls PHPBB prefix from config.php
    -removed the need to state domain and path. Made it more intuitive by only needing to state folder of PHPBB install.
    -Moved Forums to be listed last.
    -Changed viewtopic posts to only those that are approved.
    -List latest topics first rather than last.
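
The "1969" issue and the W3C lastmod format from the change log can be illustrated with a short sketch. This is not the code from sitemap.php: the function name sitemap_lastmod is hypothetical, and it assumes the value passed in is a Unix timestamp and that an empty or zero timestamp should simply produce no lastmod tag.

```php
<?php
// Hypothetical helper, not taken from sitemap.php.
// Formats a Unix timestamp as a W3C datetime for a <lastmod> tag.
// Returns an empty string for empty or non-positive timestamps, because
// timestamp 0 is 1970-01-01 00:00 UTC, which displays as a 1969 date
// in time zones west of UTC.
function sitemap_lastmod($timestamp)
{
    $timestamp = (int) $timestamp;
    if ($timestamp <= 0) {
        return '';
    }
    // gmdate() formats in UTC regardless of the server time zone.
    return '<lastmod>' . gmdate('Y-m-d\TH:i:s\Z', $timestamp) . '</lastmod>';
}

echo sitemap_lastmod(1363564800), "\n";
```

Leaving the tag out for a missing timestamp is one possible design choice; the sitemap protocol treats lastmod as optional.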
Liv
Imagine What I Believe
 
Posts: 2741
Joined: Wed Oct 05, 2005 6:59 pm
Location: Greensboro, NC

Postby Sj0din » Mon Sep 07, 2009 9:37 am

Hi.

I'm using the phpBB3 SEO mod and my URLs look like phpbb3.com/a-new-topic/

With your sitemap mod the URLs come out as /viewtopic.php?f=6&t=3329

How can I change this?
Sj0din
 

Postby Liv » Mon Sep 07, 2009 12:18 pm

You can use the mod as it is if the pages are reachable by both URLs and you use a canonicalization tag on your pages....

Otherwise you can revert to the original PHPBB URL architecture...

Or I'd be glad to write you a custom version for a few hundred dollars.
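
For the first option, a canonicalization tag is a link element in each page's head that names the preferred URL. A minimal sketch, assuming you decide yourself which URL form is the preferred one; the function name canonical_link_tag is hypothetical and is not part of PHPBB or of this mod:

```php
<?php
// Hypothetical helper: builds the <link rel="canonical"> tag for a page's <head>.
// $canonical_url is whichever URL form you want search engines to keep.
function canonical_link_tag($canonical_url)
{
    return '<link rel="canonical" href="'
        . htmlspecialchars($canonical_url, ENT_QUOTES, 'UTF-8')
        . '" />';
}

echo canonical_link_tag('http://phpbb3.com/a-new-topic/'), "\n";
```

The URL is escaped with htmlspecialchars() so that an ampersand in a viewtopic.php query string stays valid inside the attribute.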

Postby snow » Mon Sep 07, 2009 8:29 pm

Liv is there a way to remove certain forums from being generated in the sitemap?
And since this is constantly updated, do I have to resubmit everyday to Google?
How many times should I resubmit the sitemap?

Regards,
snow
snow
 
Posts: 2
Joined: Thu Aug 20, 2009 4:31 am

Postby Liv » Mon Sep 07, 2009 9:44 pm

snow wrote:Liv is there a way to remove certain forums from being generated in the sitemap?
And since this is constantly updated, do I have to resubmit everyday to Google?
How many times should I resubmit the sitemap?

Regards,
snow


Yes, under forum administration in your ACP change this option to "no" for the forums you don't want included:
Code:
Enable search indexing:
If set to yes posts made to this forum will be indexed for searching.


And no, you should never have to resubmit the sitemap after the first time. Google is intelligent enough to check the sitemap prior to crawls. If your site is popular enough you'll see it automatically updated with crawl dates every few days. The only time you would want to resubmit the sitemap is if your site suffered a prolonged outage that left Google unable to find the sitemap for a long period of time.

Postby newbie111 » Sat Mar 20, 2010 3:58 am

I'm pretty new to all this, so please forgive me if this is an ignorant question.

My site uses the subsilver2 template. When I came to edit index_body.html I could not find the

<!-- INCLUDE overall_footer.html -->

line of code, which does exist in the prosilver template.

Is there a way to use this mod with the subsilver2 template?
newbie111
 

Postby Liv » Sat Mar 20, 2010 1:41 pm

That's okay... basically just add

Code:
{SITE_MAP}


...anywhere at the bottom of the template and you'll be fine. You can even move it where you want.... it just adds a link to your sitemap so crawlers can find it, and a courtesy link to my site for all my hard work. That's all.

As long as the info appears on your index page you'll be fine.

Postby newbie111 » Mon Mar 22, 2010 12:04 am

Thanks heaps, it works just great :D
newbie111
 

Postby jv » Mon Apr 12, 2010 1:28 pm

Hello,

Will this work with SEF urls?

Thanks!
jv
 

Postby Liv » Mon Apr 12, 2010 10:31 pm

Indirectly if you have the proper redirects.... Google should be able to find it, but unfortunately this does not alter PHPBB's URL structure on its own.

Postby Liv » Fri Aug 20, 2010 1:14 pm

It appears fine for me. Google will show them in your sitemap dashboard differently but unless you're getting an actual warning you're fine. Just looking at your sitemap remotely shows they're parsing correctly.

Postby Liv » Sat Feb 26, 2011 11:11 pm

No, no problem... sitemap doesn't use any $_GET variables.... so it doesn't matter.

Postby Liv » Sun Feb 27, 2011 12:29 am

It didn't for me, but I'd assume that if it did at one time, it's an incorrect redirect in your .htaccess. Check that.

Postby Liv » Sun Feb 27, 2011 1:53 am

If that's your .htaccess then that's not it....

I can't imagine anything that would cause a redirection to the sitemap other than a "redirect". There must be some sort of conflict with the "portal".

I'd suggest attempting to make sure you're only making changes to the PHPBB forum software, and not the portal.

Postby Liv » Sat Mar 05, 2011 6:03 pm

Gilly wrote:Hi there, how long does it take until the sitemap is generated, and how often does it update?

Thank you


The sitemap is generated when the file is loaded, and provides real-time data. The only situation where this would not occur is if you are caching the file, which is not done by default.

Postby Liv » Sat Mar 19, 2011 9:58 pm

jlf wrote:I am getting the red X - unsupported file format error for the url in Google Sitemaps.


What's the URL?

Postby nilands55 » Tue Mar 22, 2011 4:57 pm

Hello
I had no problems installing the Modification I thought, but I am getting this error
Code:
Something went wrong, check your path first: http://help.nilandsplace.com/campingforum/
We could not find an author citation link: http://livjones.com - Please refer to this webpage for full installation instructions:http://greensboring.com/viewtopic.php?f=23&t=1563 - Thanks!
The URL is correct. Did I miss something? I really don't know PHP coding so I have no idea what is wrong
James Niland
Laurinburg NC
nilands55
 
Posts: 1
Joined: Tue Mar 22, 2011 4:37 pm

Postby Liv » Tue Mar 22, 2011 5:04 pm

I don't see the URL to the author on your website.... It's possible you modified the pro_silver theme rather than the one you're using. Add a link to your theme... and it should work.

Postby BlancBim » Tue May 29, 2012 9:51 pm

thank you for your fast answer, I'm not really good at this (or even at reading english :mrgreen: ) so here is what I did...

I unzipped the new file, copied the file "sitemap.php", and pasted it over the old one in the forum's directory.
Then I went to the index page of the forum and hit the sitemap link.

First I got this error:
Code:
Something went wrong, check your path first: http://tls3d.fr/
We could not find an author citation link: http://www.livjones.com on your index page. - Please refer to this webpage for full installation instructions:http://greensboring.com/viewtopic.php?f=23&t=1563 - Thanks!


so I changed the line:
$folder='';
to
$folder='/forum/';

and I get this error:
Code:
Something went wrong, check your path first: http://tls3d.fr/forum/
We could not find an author citation link: http://www.livjones.com on your index page. - Please refer to this webpage for full installation instructions:http://greensboring.com/viewtopic.php?f=23&t=1563 - Thanks!


It now points to the right directory but gives the same error. I also tried with the line:
$path='/forum/';
but it doesn't seem to change anything so far.
BlancBim
 

Postby Liv » Tue May 29, 2012 11:10 pm

BlancBim wrote:I'm not really good at this (or even at reading english :mrgreen: ) so here is what I did...


C'est cool! No problem.

I can only guess that cURL is still the issue. I personally believe it's to do with Gzip compression on some servers.

Here's a work-around:

Delete this in sitemap.php


Code:
// Initiate Engines for countdown start

if (function_exists('curl_init')){

$file = $domainpath . 'index.' . $phpEx;
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $file);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
$fil3 = curl_exec($ch);
curl_close($ch);
$url=base64_decode('aHR0cDovL3d3dy5saXZqb25lcy5jb20=');
if (strpos($fil3,$url) === FALSE)
 {echo 'Something went wrong, check your path first: '.$domainpath.'<br>';
die(base64_decode('V2UgY291bGQgbm90IGZpbmQgYW4gYXV0aG9yIGNpdGF0aW9uIGxpbms6IDxhIGhyZWY9Imh0dHA6Ly93d3cubGl2am9uZXMuY29tIj5odHRwOi8vd3d3LmxpdmpvbmVzLmNvbTwvYT4gb24geW91ciBpbmRleCBwYWdlLiAtIFBsZWFzZSByZWZlciB0byB0aGlzIHdlYnBhZ2UgZm9yIGZ1bGwgaW5zdGFsbGF0aW9uIGluc3RydWN0aW9uczo8YSBocmVmPSJodHRwOi8vdGhlc2VjdWxhcml0eS5jb20vdmlld3RvcGljLnBocD9mPTIzJnQ9MTU2MyI+aHR0cDovL3RoZXNlY3VsYXJpdHkuY29tL3ZpZXd0b3BpYy5waHA/Zj0yMyZ0PTE1NjM8L2E+IC0gVGhhbmtzIQ=='));}

}


Save, and re-upload. It should work!
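
If gzip compression really is the cause, a less drastic alternative might be to let cURL decompress the response before the link check runs. The sketch below is an assumption and not part of sitemap.php: the function names fetch_page and page_contains_link are hypothetical. It relies on CURLOPT_ENCODING, which, when set to an empty string, asks cURL to accept and decode every encoding it supports.

```php
<?php
// Hypothetical sketch, not the code shipped in sitemap.php.

// Fetch a page with cURL, letting cURL decode gzip/deflate responses.
// Returns the page body, or false if cURL is unavailable or the request fails.
function fetch_page($page_url)
{
    if (!function_exists('curl_init')) {
        return false;
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $page_url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_ENCODING, '');   // accept and decode any supported encoding
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}

// True when the fetched HTML contains the given link URL.
function page_contains_link($html, $link_url)
{
    return is_string($html) && strpos($html, $link_url) !== false;
}

// Example (performs a network request, so it is left commented out):
// $html = fetch_page('http://tls3d.fr/forum/index.php');
// var_dump(page_contains_link($html, 'http://www.livjones.com'));
```

Keeping the fetch and the string check in separate functions makes it easier to see which of the two is failing on a given server.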

Postby htmanning » Mon Mar 18, 2013 6:28 pm

It works, thanks! I've used it on 3 forums, and I'm getting errors on the third forum so I have to conclude there is some entry in my database that is causing the problem.

I'm getting this message:
"error on line 21621 at column 16: Extra content at the end of the document"

It just stops at the end of a line. If you look at the source it shows a closing URL tag so that entry finished correctly. Looking at the next topic_id number down in the database doesn't show anything that I find odd.

Looking at the source there is no closing urlset tag. The sitemap just ends. I'm stumped.

I'm also getting the 1969 date error on about 12 entries, because those topics exist in the database but have no text in them. I can simply delete those.

I hope someone can help.

Thanks.

Tom
htmanning
 
Posts: 6
Joined: Mon Mar 18, 2013 6:24 pm

Postby Liv » Mon Mar 18, 2013 7:20 pm

If you have so many URLs that your server stops responding while running the script, you can adjust the number of URLs to a more reasonable level. This feature was added a few versions ago, so assuming you have the latest, you'd simply adjust this parameter:

Code:
$urls=50000;


(Default is 50,000)

As far as the first question, I'm not sure how you have topics in the database but are saying the topic doesn't exist. There's no real option to do that in PHPBB (to my knowledge [if there is I'd be glad to add that into a future version, but I don't see it in the tables]).

Now if the posts are in a particular forum (f=) which is private, then it's possible you could get the error because the topic wouldn't exist for search engines.

In that case you'd simply make sure that in your control panel, under the private forums you've set ENABLE SEARCH INDEXING to NO for those forums, and they'll automatically not appear in the sitemap.

Go to FORUMS > EDIT FORUM > GENERAL FORUM SETTINGS

Code:
Enable search indexing:
If set to yes posts made to this forum will be indexed for searching.


and set it to NO for private forums.

This was also a feature that's built into the last few versions.

Postby htmanning » Mon Mar 18, 2013 7:31 pm

Thanks for the quick response.

I adjusted the $urls number earlier and that made no difference. It still stops at the same line. I currently have it set to 100,000 although I also tried 70,000 and it made no difference.

I have one Announcements forum that only the admin can post for, but I have that same forum at other installs and it doesn't cause a problem. However, I did as you suggested and turned off indexing. Sitemap.php now stops at the same line number (21620), but at a different record. It was stopping at topic_id 901, but after turning off the indexing for that one forum it goes until topic_id 884.

1969 error: I seem to have a few topics in the database that were spam, and were removed by a moderator, but for some reason the title of the topic is still there (no post). So I'm wondering if maybe there is no corresponding entry in the database that would hold the body of the post.

Thanks.

Postby Liv » Mon Mar 18, 2013 8:02 pm

My guess is it's still an execution time-out imposed by your host.... Try a number like 5,000 and see what happens, then creep up from there in increments of 5,000 or 10,000 until you hit your server's time-out limit.

The good news is that it does list the newest topics first, so it still should work wonders for you if that works.
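
The idea of stopping early while still producing a complete file can be sketched as follows. This is only an illustration, not how sitemap.php works internally: build_sitemap and its parameters are hypothetical, and in the real script the slow part is more likely the database queries than the loop shown here.

```php
<?php
// Hypothetical sketch, not the code shipped in sitemap.php.
// Builds a sitemap from a list of URLs but stops adding entries once
// $max_urls or the time budget is reached, so the closing </urlset>
// tag is always written.
function build_sitemap(array $locations, $max_urls, $budget_seconds)
{
    $started = microtime(true);
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    $count = 0;
    foreach ($locations as $loc) {
        if ($count >= $max_urls || (microtime(true) - $started) > $budget_seconds) {
            break;
        }
        $xml .= '  <url><loc>' . htmlspecialchars($loc, ENT_QUOTES, 'UTF-8') . "</loc></url>\n";
        $count++;
    }
    return $xml . "</urlset>\n";
}

echo build_sitemap(array('http://greensboring.com/viewtopic.php?f=23&t=1563'), 50000, 20);
```

A truncated list with a closing tag is still a valid sitemap, whereas output that is cut off mid-document is not.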

As far as the phantom topics, there are two tables in PHPBB that topics come from (basically)... the Topics_Table and the Posts_Table.

It is possible that you have information in the Topics_Table (where we draw the data from for this sitemap) but not the Posts_Table.

The only way I can imagine this happening is by a modification or table merger sometime in the past. If this is the case, you can either choose to fix it, or ignore it, as it shouldn't really matter as far as the modification goes. Even if you get errors, it will still function for Google (though fixing the time-out issue is probably important).

I guess the first place to start is check the topics in question, and if they don't actually exist (meaning you can't see them with admin privileges), then something is up with your database, and you'll likely want to run a script to clean out those rogue entries (if you want to put that much effort into it).
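
One way to look for such rogue entries is a query that lists topics with no matching posts. The sketch below only builds the SQL string. It assumes the standard PHPBB3 column names (topic_id and topic_title in the topics table, post_id and topic_id in the posts table) and a table prefix such as phpbb_; the function name orphan_topics_sql is hypothetical. Run the query read-only first and back up the database before deleting anything.

```php
<?php
// Hypothetical helper: returns SQL listing topics with no rows in the posts table.
// Assumes the standard phpBB3 columns topics.topic_id, topics.topic_title,
// posts.post_id and posts.topic_id; $table_prefix is the prefix from config.php.
function orphan_topics_sql($table_prefix)
{
    return 'SELECT t.topic_id, t.topic_title'
        . ' FROM ' . $table_prefix . 'topics t'
        . ' LEFT JOIN ' . $table_prefix . 'posts p ON p.topic_id = t.topic_id'
        . ' WHERE p.post_id IS NULL';
}

echo orphan_topics_sql('phpbb_'), "\n";
```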

If they do exist, then LMK if I missed something, because I'd definitely love to check and see what's going on.

Postby htmanning » Mon Mar 18, 2013 8:15 pm

Looks like you're on to something. Anything more than $urls=7000 and it chokes.
