robots.txt in quickgit.kde.org
Kevin Funk
kfunk at kde.org
Tue Jan 5 14:17:12 UTC 2016
On Wednesday, December 30, 2015 12:57:23 PM Ben Cooksley wrote:
> On Tue, Dec 29, 2015 at 11:16 PM, Kevin Funk <kfunk at kde.org> wrote:
> > On Tuesday, December 29, 2015 10:39:01 PM Ben Cooksley wrote:
> >> On Tue, Dec 29, 2015 at 7:59 AM, Lydia Pintscher <lydia at kde.org> wrote:
> >> > On Sun, Dec 27, 2015 at 12:35 PM, Ben Cooksley <bcooksley at kde.org> wrote:
> >> >>> Is there some place where search engines can easily index our source
> >> >>> code or are we shooting ourselves in the foot here?
> >> >>
> >> >> We could probably make it available by publishing the source trees
> >> >> used by LXR / EBN.
> >> >> This would only have the main branches obviously rather than
> >> >> everything though.
> >> >>
> >> >> I haven't checked, but LXR may already make its copy of the code
> >> >> accessible...
> >> >
> >> > I think making our sourcecode available to search engines is pretty
> >> > important for the reasons already mentioned by others. Do you need
> >> > help for it? If you write down what's needed I can help find someone
> >> > to do it.
> >>
> >> I've now provisioned https://sources.kde.org/
> >
> > I'm not sure this is super useful, to be honest (as mentioned in
> > #kde-sysadmins already).
> >
> > This is really just plain file serving, with no cross-references to either
> > LXR (or apidocs). This is basically a dead-end when you follow a result
> > on Google.
> >
> > Wouldn't it be possible to let robots index https://lxr.kde.org/source/
> > instead? We have the infrastructure...
>
> We'll give it a shot.
Just to stress again this would be *really* useful to have.
I answered a post on SO:
http://stackoverflow.com/a/34612692/592636
Tried to link kwallet's FindGpgme.cmake into the answer, and there's *no*
easy way to quickly get a link to KDE infrastructure serving the file via Google
(not even api.kde.org).
Try googling for "kwallet findgpgme.cmake" (very specific search after all):
https://www.google.de/search?q=kwallet+findgpgme.cmake
-> First result: GitHub..., rest: mildly interesting
Different issue I just noticed: There's no way to get the plain-text (raw)
representation of a given file on LXR, is there? Would be useful as well.
Cheers,
Kevin
> > Of course we need to block robots from all the pages that allow actively
> > *searching* LXR, in order to avoid abuse.
>
> Note that despite robots.txt, many spiders (including Google, Yahoo
> and Bing) will actively disregard the instructions in there.
> While they may not return the results - or may omit snippets of the page
> content - they have all been guilty (at least in the past) of
> disregarding our restrictions, resulting in downtime (which has in
> some cases necessitated full host reboots to fix) for numerous KDE.org
> subsites.
>
> This is why QuickGit and WebSVN have extremely restrictive robots.txt
> policies, in addition to blacklist rules within our web server
> configurations.
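
As a rough illustration of the policy discussed above (let robots index the
source browser, keep them off the search pages), here is a minimal Python
sketch using the standard library's urllib.robotparser. The Disallow paths
(/search, /ident) and the bot name are assumptions for illustration only, not
the actual lxr.kde.org configuration. Since robots.txt is purely advisory, this
only shows what a well-behaved crawler would do; it is no substitute for
server-side blocking.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: these paths are assumptions for illustration,
# not the real robots.txt served by lxr.kde.org.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /ident
Allow: /source/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if a crawler that honours robots.txt may request the URL."""
    return parser.can_fetch(agent, url)


parser = build_parser(ROBOTS_TXT)

# Source browsing is allowed, the search endpoint is not.
source_ok = may_fetch(parser, "https://lxr.kde.org/source/")
search_ok = may_fetch(parser, "https://lxr.kde.org/search?_string=kwallet")
```

A check like this can be run against a draft robots.txt before deploying it, to
confirm that the browsable tree stays open while the expensive pages are closed.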
>
> > Cheers,
> > Kevin
>
> Regards,
> Ben
>
> >> > Cheers
> >> > Lydia
> >>
> >> Regards,
> >> Ben
> >>
> >> > --
> >> > Lydia Pintscher - http://about.me/lydia.pintscher
> >> > KDE e.V. Board of Directors / KDE Community Working Group
> >> > http://kde.org - http://open-advice.org
> >> >
> >> >>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to
> >> >>> unsubscribe <<>>
> >
> > --
> > Kevin Funk | kfunk at kde.org | http://kfunk.org
--
Kevin Funk | kfunk at kde.org | http://kfunk.org
More information about the Plasma-devel mailing list