[kde-community] Official KDE mirror on github

Jos van den Oever jos at vandenoever.info
Mon Aug 17 08:24:46 BST 2015


On Monday 17 August 2015 04:05:45 Nicolás Alvarez wrote:
> 2015-08-17 3:55 GMT-03:00 Jos van den Oever <jos at vandenoever.info>:
> > On Monday 17 August 2015 07:46:44 Martin Graesslin wrote:
> > > Hi community,
> > > 
> > > over the last months I observed the following:
> > > * people not finding our git repositories
> > 
> > Searching on ixquick:
> > 'calligra git' https://community.kde.org/Calligra/Git
> > 'kde git' https://community.kde.org/Sysadmin/GitKdeOrgManual
> > 'kwin git' https://github.com/faho/kwin-tiling
> > 'plasma git' https://community.kde.org/Plasma/Active/Development
> > 
> > Searching on Google:
> > 'calligra git' https://community.kde.org/Calligra/Building/2
> > 'kde git' https://techbase.kde.org/Development/Git
> > 'kwin git' http://blog.martin-graesslin.com/blog/2014/04/kwin-moved-to-an-own-repository/
> > 'plasma git' https://aur.archlinux.org/packages/plasma-desktop-git/
> > 
> > On Google the highest link to GitHub was in position 4. Not too bad.
> > 
> > There was no link to https://projects.kde.org/ or
> > https://quickgit.kde.org/
> > 
> > What part of the KDE infrastructure can be fixed to make the repositories
> > easier to find?
> 
> http://quickgit.kde.org/robots.txt asks search engines not to index
> quickgit at all.
> 
> http://projects.kde.org/robots.txt asks search engines not to index the
> repository, but the project information, news, etc. should be indexable.
> See https://www.google.com/search?q=site:projects.kde.org for stuff that
> does get indexed currently.
> 
> Both of these blocks were done for server performance reasons: search
> engines were crawling the hell out of dynamically-generated repository
> history and bringing our servers to their knees.

There is a (non-standard) robots.txt directive that reduces the crawl
frequency.
E.g. "Crawl-delay: 10" asks crawlers that honor it to wait 10 seconds between
successive requests.
Neither projects.kde.org nor quickgit.kde.org uses this at the moment.

 http://stackoverflow.com/questions/17377835/robots-txt-what-is-the-proper-format-for-a-crawl-delay-for-multiple-user-agent
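
As a rough sketch of how a well-behaved crawler would read such a directive,
the snippet below parses a made-up robots.txt with Python's standard
urllib.robotparser module. The rules, the blocked path and the "ExampleBot"
user agent are assumptions for illustration only; they are not what
projects.kde.org or quickgit.kde.org actually serve.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the 10-second delay and the blocked path are
# illustrative assumptions, not the real KDE configuration.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /hypothetical-expensive-path/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Seconds a polite crawler should wait between two requests.
delay = parser.crawl_delay("ExampleBot")

# Ordinary pages stay indexable, the expensive path does not.
allowed = parser.can_fetch("ExampleBot", "/some/page")
blocked = parser.can_fetch("ExampleBot", "/hypothetical-expensive-path/log")

# Upper bound on requests per minute for a crawler that honors the delay.
max_requests_per_minute = 60 / delay

print(delay, allowed, blocked, max_requests_per_minute)
```

With a setup along these lines the expensive, dynamically generated pages
could stay blocked while everything else is crawled slowly instead of not at
all. Crawlers that ignore Crawl-delay would of course be unaffected by it.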

If we do not let search engines index our primary product (source code), then 
it's not strange that people cannot find it.

Cheers,
Jos



