<br><br><div class="gmail_quote">On Mon, Oct 1, 2012 at 2:05 AM, Sven Langkamp <span dir="ltr">&lt;<a href="mailto:sven.langkamp@gmail.com" target="_blank">sven.langkamp@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="gmail_quote"><div class="im"><br></div><div>Really interesting solution. My idea was to shuffle the alpha (that would require less converts, but more other instructions) from the loaded pixel but this looks better. Unfortunately I don't have a cpu that has avx, so I can't test it. Would be interesting how this performs with SSE and integers instead of floats.</div>

</div></blockquote><div><br>Actually, I initially wanted to do the integer solution, but I couldn't find an instruction for packed (streamed) integer division. Maybe you know of one?<br><br>There is exactly one division in composite over: it scales the result by the new alpha value (srcBlend), so it doesn't look like it can be turned into a multiplication. This single division takes about 20% of the compositing time.<br>
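<br>To make it concrete where that division sits, here is a minimal scalar sketch. It assumes the usual "over" formula on non-premultiplied float pixels; Pixel and compositeOver are made-up names for illustration, only srcBlend is the name used above, and this is not the actual Krita code:<br>
<pre>
```cpp
// Hypothetical non-premultiplied RGBA pixel, all channels in [0, 1].
struct Pixel {
    float r, g, b, a;
};

// Sketch of a scalar "over" blend (assumed formula, not Krita's code).
// src is painted over dst with the given opacity.
Pixel compositeOver(Pixel src, Pixel dst, float opacity)
{
    // Effective source alpha after applying the layer opacity.
    float srcAlpha = src.a * opacity;

    // Alpha of the result: source plus whatever of dst still shows through.
    float newAlpha = srcAlpha + dst.a * (1.0f - srcAlpha);

    // Both pixels fully transparent: nothing to blend, avoid dividing by zero.
    if (newAlpha == 0.0f) {
        return Pixel{0.0f, 0.0f, 0.0f, 0.0f};
    }

    // The one division: the divisor changes per pixel, so it cannot be
    // replaced by a multiplication with a precomputed constant.
    float srcBlend = srcAlpha / newAlpha;

    // Linear interpolation from dst towards src by srcBlend.
    Pixel out;
    out.r = dst.r + (src.r - dst.r) * srcBlend;
    out.g = dst.g + (src.g - dst.g) * srcBlend;
    out.b = dst.b + (src.b - dst.b) * srcBlend;
    out.a = newAlpha;
    return out;
}
```
</pre>
In a float version this division maps directly onto a packed divide (SSE and AVX both have one for floats), whereas I don't know of a packed integer counterpart, so an integer version would need some workaround for this step.<br>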

<br>Still, I'd guess the integer solution would be a bit faster than the floating-point one.<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote">

<div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">2) We still need to decide what to do with grayscale selections.<br>

</blockquote><div><br></div></div><div>My favorite is still the composite op solution. </div></div></blockquote></div><br>+1 ;)<br><br clear="all"><br>-- <br>Dmitry Kazakov<br>