<div dir="ltr"><div dir="ltr">On Mon, Mar 7, 2022 at 1:16 PM Aleix Pol &lt;<a href="mailto:aleixpol@kde.org">aleixpol@kde.org</a>&gt; wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Mar 5, 2022 at 8:36 AM Ben Cooksley &lt;<a href="mailto:bcooksley@kde.org" target="_blank">bcooksley@kde.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Fri, Mar 4, 2022 at 12:49 AM Aleix Pol &lt;<a href="mailto:aleixpol@kde.org" target="_blank">aleixpol@kde.org</a>&gt; wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div style="font-size:small">I'd say Wireshark is too low level for the problem we have here. We are talking about too many HTTP requests being made to specific URLs.</div></div></blockquote><div><br></div>Correct, I guess the difference in our approaches comes from taking a "before release" versus a "monitor after release" angle on things.<br>I'd like to see increased scrutiny during the development process as well, to make sure that we release code that operates properly from Day 1.</div></div></blockquote><div><br></div><div><div style="font-size:small">One way to do this could be commit hooks that do not allow code to reach certain services. 
(We discussed this in private chat.)</div><div style="font-size:small">We could also analyse the knsrc files we install at CMake time, but this has a very limited and specific scope.</div></div></div></div></blockquote><div><br></div><div>I've now applied two checks as part of the hooks, which will hopefully catch anything new being introduced.</div><div>We still need to ensure that anything pre-existing is sorted out, of course.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div style="font-size:small"></div><div style="font-size:small">I can think of two main measures:</div><div style="font-size:small">- Trigger an alarm (an e-mail notification?) if a specific UserAgent accounts for a given share of the queries we receive on a given day for the services we care about.</div><div style="font-size:small">- Offer plots to see how queries by UserAgent have evolved over the last couple of months (or couple of years).</div></div></blockquote><div><br></div><div>At the moment our ability to analyse our logs is somewhat limited by our Privacy Policy - <a href="https://kde.org/privacypolicy/" target="_blank">https://kde.org/privacypolicy/</a><br>Currently we don't have any provision for long-term storage of this information, even on an aggregated basis - so we would need to update this first.<br></div></div></div></blockquote><div><br></div><div style="font-size:small">Hopefully the NDA should help here and it doesn't seem all that far away. 
I know Neofytos and Ade have been working on it lately.<br></div></div></div></blockquote><div><br></div><div>The privacy policy will still need to be updated, but that can form part of the puzzle, yes.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div style="font-size:small"></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div>The second issue there is that we are transitioning users to contact a CDN-based endpoint (which is substantially more scalable).</div><div>This does mean we lose visibility on data such as User Agents and the URLs being impacted, though, as we only get aggregated data unless we ask for raw logs - which makes implementing something like what you've described much harder.</div></div></div></blockquote><div><br></div><div style="font-size:small">That does seem like a stopper. Still, it seems like it's not that big of a problem when there is a CDN, so we had better worry about the other cases.</div></div></div></blockquote><div><br></div><div>We should still be reasonable towards the CDN of course, but it makes it much more manageable, yes.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div style="font-size:small"><br></div><div style="font-size:small">Aleix</div></div></div></blockquote><div><br></div><div>Cheers,</div><div>Ben </div></div></div>