22 years ago
p3k dots

klaus schallhorn omnisearch: "google claims to respect the robots.txt standard, but at least when fetching pages over https it ignores exactly the protocol that forbids spiders from accessing certain areas of a server".
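for context, here is a minimal sketch of what "respecting robots.txt" means on the crawler's side, using python's standard urllib.robotparser. the robots.txt content, the user agent name and the example.com urls are made up for illustration; nothing here describes how google's crawler actually works.

```python
from urllib.robotparser import RobotFileParser

# hypothetical robots.txt content, for illustration only
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is needed here
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # the check runs entirely inside the crawler; the server enforces nothing
    print(allowed("examplebot", "https://example.com/private/report.html"))
    print(allowed("examplebot", "https://example.com/public/index.html"))
```

note that the whole check happens in the client. a spider that simply skips the call to allowed() can still request the page.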

i am by no means an expert in web security. i have only fiddled around a little with ssl, certificates and public key encryption over the last few weeks. but one thing i noticed: among the first results you get when querying google's index for documents from secure servers, two thirds of the https servers also served the same documents via plain http, and one third of the servers returned self-signed, corrupted or expired certificates.

there's still a lot to do about these security issues, and a lot of bad things will still happen. it seems they must.

btw. neither robots.txt nor a secure server provides general protection against third-party access to a web server.

the first is imho only an agreement between server admins and robot creators. the https protocol, on the other hand, can in fact reliably restrict access to confidential data (so your data becomes somewhat secret, although not top secret), provided you are using client certificates.
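to make the client certificate point a bit more concrete, here is a rough sketch using python's standard ssl module. it builds a server-side tls context that refuses any client that cannot present a certificate signed by a ca the server trusts. the function name and all file paths are hypothetical, and this is one possible setup, not a complete or hardened server configuration.

```python
import ssl


def make_server_context(certfile=None, keyfile=None, client_ca=None):
    """Build a server-side TLS context that demands a client certificate.

    All three arguments are hypothetical file paths (PEM format):
    the server's own certificate and key, and the CA that signs
    the client certificates you want to let in.
    """
    # Purpose.CLIENT_AUTH yields a context meant for the server side
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # without this line the server accepts clients that present no certificate
    ctx.verify_mode = ssl.CERT_REQUIRED
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)
    if client_ca:
        # only clients whose certificate chains to this CA pass the handshake
        ctx.load_verify_locations(cafile=client_ca)
    return ctx


if __name__ == "__main__":
    ctx = make_server_context()
    print(ctx.verify_mode)
```

in real use you would pass something like make_server_context("server.pem", "server.key", "clients-ca.pem") and wrap the listening socket with ctx.wrap_socket(sock, server_side=True). unlike robots.txt, the check is enforced by the server during the handshake, not left to the goodwill of the client.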