Robots.txt: disallow a page

bzc0fq@gmail.com
 
 
Posts: 30
Joined: Sat Feb 26, 2022 3:46 pm

Robots.txt: disallow a page

Post by bzc0fq@gmail.com »

Is it possible to disallow a single page from a project?

Let's say that robots.txt looks like this:
User-agent: *
Allow: /
Allow: /aaa/
Allow: /bbb/
Allow: /ccc/


I would like it to look like this:
User-agent: *
Allow: /
Allow: /aaa/
Disallow: /aaa/a1.php
Allow: /aaa/a2.php
Allow: /bbb/
Allow: /ccc/

Could someone please advise how this can be done?

Thanks
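For readers wondering how a crawler would read the desired file above: a minimal sketch, assuming the longest-match rule from RFC 9309 (the most specific matching path wins, and Allow wins a tie). The function name is_allowed and the RULES list are made up for illustration, and wildcards (* and $) are not handled.

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, prefix) tuples for one user-agent group,
    for example ("disallow", "/aaa/a1.php"). Hypothetical helper, plain
    prefix matching only (no * or $ wildcards).
    """
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        # Longer prefix is more specific; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# The rules from the desired robots.txt in the question.
RULES = [
    ("allow", "/"),
    ("allow", "/aaa/"),
    ("disallow", "/aaa/a1.php"),
    ("allow", "/aaa/a2.php"),
    ("allow", "/bbb/"),
    ("allow", "/ccc/"),
]

print(is_allowed(RULES, "/aaa/a1.php"))  # False
print(is_allowed(RULES, "/aaa/a2.php"))  # True
```

Under this reading, the single Disallow line is enough to block /aaa/a1.php; the Allow lines only restate the default. Parsers that use first-match instead of longest-match may treat the leading "Allow: /" differently.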
BaconFries
 
 
Posts: 5880
Joined: Thu Aug 16, 2007 7:32 pm

Re: Robots.txt: disallow a page

Post by BaconFries »

Have you read the tutorial "Adding robots.txt to your website"? See the section "Pages and Folders": under 'Pages and Folders' you can override rules for individual pages and folders.
bzc0fq@gmail.com
 
 
Posts: 30
Joined: Sat Feb 26, 2022 3:46 pm

Re: Robots.txt: disallow a page

Post by bzc0fq@gmail.com »

I have read this tutorial and set the rule to 'disallow index, disallow follow' for a page, but robots.txt has not changed: the newly generated robots.txt file lists only folders, no pages. :(
Pablo
 
Posts: 23255
Joined: Sun Mar 28, 2004 12:00 pm
Location: Europe

Re: Robots.txt: disallow a page

Post by Pablo »

Note that some options do not affect robots.txt. For example, "disallow index, disallow follow" controls the meta tags of the page; those are not robots.txt options.
Rule -> "Allow index/disallow index" for files and folders is what gets added to robots.txt.

All these options are combined in this dialog so that they can be set from one place, but they can also be set via the page properties of each page.
bzc0fq@gmail.com
 
 
Posts: 30
Joined: Sat Feb 26, 2022 3:46 pm

Re: Robots.txt: disallow a page

Post by bzc0fq@gmail.com »

OK, but I cannot set "allow/disallow index" for pages, only "allow/disallow index, allow/disallow follow" and "not set".
I can set "allow/disallow index" only for folders....

What am I doing wrong?
Pablo
 
Posts: 23255
Joined: Sun Mar 28, 2004 12:00 pm
Location: Europe

Re: Robots.txt: disallow a page

Post by Pablo »

1. Select the page in the site tree
2. Select the rule
bzc0fq@gmail.com
 
 
Posts: 30
Joined: Sat Feb 26, 2022 3:46 pm

Re: Robots.txt: disallow a page

Post by bzc0fq@gmail.com »

OK... please let me explain... I will use the example from this tutorial: https://www.wysiwygwebbuilder.com/robots_txt.html
I followed the steps:
1. Choose a PAGE in the project (Pages and Folders under the Website tree - done - easy :))
2. Set the Rule for the page: here I have exactly the same rules that are shown in the tutorial ("allow/disallow index, allow/disallow follow" and "not set" for the PAGES). So, as I understand your comment, it is NOT possible to create an entry like "Disallow: /index.php", right?

Whatever I do, I cannot create an entry within robots.txt that contains a page definition :( - or I am too stupid to do this!
I do need this to restrict robots from indexing certain pages :(
Pablo
 
Posts: 23255
Joined: Sun Mar 28, 2004 12:00 pm
Location: Europe

Re: Robots.txt: disallow a page

Post by Pablo »

"allow/disallow index, allow/disallow follow" and "not set" for the PAGE.
For pages, this option controls the meta tags, not robots.txt:

<meta name="robots" content="noindex, nofollow">

robots.txt sets the global rules.
Meta tags override the rules for individual pages.
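To confirm that a published page really carries the tag, something like the following could be used. This is only a sketch using Python's standard html.parser; the class RobotsMetaParser and the function robots_directives are invented names, and the sample HTML is a stand-in for the page that Web Builder generates.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Split "noindex, nofollow" into individual lowercase directives.
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_directives(html):
    """Return the list of robots directives found in the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Stand-in for a generated page; a real check would read the published file.
page = '<html><head><meta name="robots" content="noindex, nofollow"></head><body></body></html>'
print(robots_directives(page))  # ['noindex', 'nofollow']
```

An empty list would mean the page has no robots meta tag, so only the robots.txt rules apply to it.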
bzc0fq@gmail.com
 
 
Posts: 30
Joined: Sat Feb 26, 2022 3:46 pm

Re: Robots.txt: disallow a page

Post by bzc0fq@gmail.com »

Should I read this to mean that I do not need a page entry in robots.txt, because robots get the information in a different way?
If so, it makes sense :)
Pablo
 
Posts: 23255
Joined: Sun Mar 28, 2004 12:00 pm
Location: Europe

Re: Robots.txt: disallow a page

Post by Pablo »

Correct, the meta tags in the page override the information in robots.txt. Therefore the generated robots.txt does not include the same information.
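One caveat worth checking: a crawler can only read a page's meta tag if robots.txt lets it fetch the page, so a page that relies on "noindex" should not sit in a disallowed folder. A rough sketch with Python's standard urllib.robotparser follows; the helper name can_crawl and the /private/ folder are assumptions for illustration, not something Web Builder generates.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, path, user_agent="*"):
    """Return True if `user_agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical robots.txt with one disallowed folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# The page with the noindex meta tag is still crawlable, so the tag can be seen.
print(can_crawl(ROBOTS_TXT, "/aaa/a1.php"))
# A page inside the disallowed folder is not fetched, so its meta tags go unread.
print(can_crawl(ROBOTS_TXT, "/private/x.php"))
```

Note that Python's parser applies the first matching rule in file order, so for files that mix Allow and Disallow lines its answer may differ from a crawler that uses longest-match.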
bzc0fq@gmail.com
 
 
Posts: 30
Joined: Sat Feb 26, 2022 3:46 pm

Re: Robots.txt: disallow a page

Post by bzc0fq@gmail.com »

Perfect...
Thank you for the explanation :)