Comments:
Just ran perf stat on python3 hello world and got 6% branch misses. pypy3 had 3% but way more total branches...
g++ or clang++ with -O0 or -O3 and I still get 2.5% misses (though a lot fewer total branches, 800k vs 30M) using iostream with endl
same branches and misses for printf()
Also 3M instructions for printing hello world?
hmmm... 950k instructions and 220k branches for `int main() { return 0; }` must be overhead from perf or the OS; the assembly is basically zero eax then ret
Why can't compilers do this type of optimization?
That "session is over" was definitely not predicted.
Really good presentation, thank you!
CPU and compiler engineers are mad geniuses lol
I watched this talk sometime back around when it was first uploaded, and I am almost certain I missed that joke at the beginning: "it's a talk on performance; the closer you sit, the better the performance." I am so glad YT recommended it to me again.
You know that you have no real problems in your life if you are focused on the branch prediction pipeline of your CPU. ;-)
Really great talk, thank you!
Why does code like `rand() & 0x1` generate branches that can be missed? Doesn't this piece of code produce a branchless stream of instructions (just a bitwise AND)?
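A minimal sketch of the distinction (function names are mine, not from the talk): the AND itself is branchless, but a branch appears wherever its result is *used* as a condition, and on random data that branch is unpredictable.

```cpp
// The bitwise AND computes 0 or 1 without branching; the branch comes
// from the `if` that consumes it.
int sum_branchy(const int* a, const unsigned* b, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
        if (b[i] & 1u)          // data-dependent branch, ~50% mispredicted on random b
            sum += a[i];
    return sum;
}

int sum_branchless(const int* a, const unsigned* b, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += a[i] * int(b[i] & 1u);  // multiply by 0 or 1: no branch at all
    return sum;
}
```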
you write this kind of code and then some other person will look at your code and be like "was this guy on drugs when he wrote this?". I think all this makes C++ a horrid language
It’s crazy the quality of people Russia has lost
Really great topic and good info from Fedor, but his style takes some getting used to.
Very interesting, thanks
Awkward ending to say the least
“Predictor is good in prediction. We are not” - nice :)
This is an absolutely mind-blowing session
"The Art of Writing Efficient Programs" is a very good introductory book into this performance-focused universe, highly recommended
Thank you CppCon. Are the presentation slides available anywhere?
I suggest you look at `sudo perf top -e branch-misses` - it will tell you exactly where there are too many mispredicts. Hit enter on the top entry and drill down to the function. Build with debug info.
Nice to see a talk like this when there are so many people scoffing at branch-free, because they saw one example where a perfectly predictable branch skipped a few ops and ended up slightly faster.
My low-mid level (second from the bottom) interview was 70% leadership to start with, followed by 30% extremely basic technical questions. I didn't expect leadership questions at such a low level, and I messed up that interview so hard, but fortunately they hired all 4 people who made it to interview, so it's all good!
In PowerPC, the branch conditional instruction has the prediction baked into it. I've seen, for example, MetroWerks compiler output where its static predictions were very primitive (99% of forward branches predicted unlikely, 99% of backward branches predicted likely). I've yet to use the C++20 attributes, but [[likely]] and [[unlikely]] probably give you manual control of the prediction bit for those branches, which is neat.
Debug assertions and nullptr checks were totally the first thing I thought of when I learned about these attributes. Even if the compiler is smart enough to recognize a nullptr check and mark it as unlikely (I'm sure it is), it is nice to be able to self-document it with an attribute.
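A minimal C++20 sketch of that nullptr-check idea (the function name is mine): `[[unlikely]]` marks the error path, so the compiler lays the hot path out as a straight-line instruction stream and statically predicts the check as not-taken.

```cpp
#include <stdexcept>

// [[unlikely]] self-documents that the null case is the cold path.
int deref_checked(const int* p) {
    if (p == nullptr) [[unlikely]] {
        throw std::invalid_argument("null pointer");
    }
    return *p;  // hot path: falls through with no taken branch
}
```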
cool upload CppCon. I smashed that thumbs up on your video. Keep up the excellent work.
Always happy to see Fedor!
1. This video was made in 2021, with supposedly not an ancient CPU in the test system. I've heard so much about modern CPUs having redundant pipelines that keep evaluating both sides of the branch (while still keeping the branch prediction circuitry, in case the pipelines are "shorter" than the branch code paths). If that's true, why doesn't that make the handling of the compiler-generated assembly much more efficient? 10% (a minority of) wrong predictions should still lead to high efficiency.
2. Why isn't the compiler taking advantage of SIMD in the fastest branchless C version? Isn't there some compiler option you could have turned on to get the compiler to emit code that performs closer to the optimal assembly implementation?
... Are compiler implementors and CPU designers lying to us, or are they optimizing for unrepresentative/narrow test cases?
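On point 2: one thing that often helps, sketched below under my own assumptions (this is not the talk's exact benchmark), is writing the selection as straight-line arithmetic with no early exits or function calls. In that shape gcc/clang can usually auto-vectorize at -O2/-O3, and `-march=native` widens the SIMD lanes further.

```cpp
// Mask-select a[i] or b[i] without a jump: the ternary on a simple
// comparison compiles to a compare/negate, and the loop body is pure
// arithmetic, which is what the auto-vectorizer wants to see.
long sum_selected(const long* cond, const long* a, const long* b, int n) {
    long sum = 0;
    for (int i = 0; i < n; ++i) {
        long m = cond[i] ? -1L : 0L;      // all-ones or all-zeros mask
        sum += (a[i] & m) | (b[i] & ~m);  // branch-free select
    }
    return sum;
}
```

Whether the vectorizer actually fires still depends on the compiler version and flags, so it's worth checking the generated assembly.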
The session is over, thank you......
give him another hour
"The closer you sit, the better the performance" - I see what you did there :)
With x86, the biggest effect likely/unlikely has is to rearrange the instructions so that the unlikely branch is moved to a different area of the program. This makes the likely path a serial instruction stream, which is good for the instruction cache. It's also good for branch prediction when there is no entry yet in the branch prediction table, e.g. the first time through.
It can't optimize `c[i] = rand() >= 0` because the function `rand()` is a deterministic random generator: the internal state of the RNG must be advanced even though the returned value is discarded. The best one can get is `rand(); c[i] = 1;`
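A sketch of that best case (the function name is mine): the comparison can be folded, since `rand()` always returns a value in [0, RAND_MAX], but the call itself has a side effect, so the compiler must keep it.

```cpp
#include <cstdlib>

// Hand-written form of what the optimizer is allowed to reduce
// `c[i] = rand() >= 0` to: the call stays, the comparison folds to true.
void fill_ones(bool* c, int n) {
    for (int i = 0; i < n; ++i) {
        rand();       // cannot be removed: advances the hidden PRNG state
        c[i] = true;  // rand() >= 0 always holds, values are in [0, RAND_MAX]
    }
}
```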
Great talk, as always quality content from Fedor! I will definitely buy his book as I'm really interested in these types of optimisations.
Also, can someone explain why the optimisation with function pointers doesn't work when the functions are inlined?
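A hedged sketch of the trick in question (names are mine): the conditional branch is replaced by a data-dependent indirect call through a table, which is handled by the indirect-branch predictor instead. If the compiler can prove which targets are reachable and inlines them, it emits an ordinary conditional branch again, which is why the trick stops working once the functions are inlined.

```cpp
long add(long a, long b) { return a + b; }
long sub(long a, long b) { return a - b; }

using binop = long (*)(long, long);
const binop table[2] = {sub, add};  // index 0 = false, 1 = true

long apply(bool cond, long a, long b) {
    // Indirect call: no conditional branch in the source. A smart compiler
    // that sees through the table may inline both targets and reintroduce
    // the if/else, defeating the point.
    return table[cond](a, b);
}
```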
bool(x)+bool(y) is a bit scary; booleans shouldn't have arithmetic addition defined, they should have only boolean operations defined.
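For what it's worth, here is a minimal sketch of what that expression relies on (the function name is mine): in C++, `bool` promotes to `int` (false → 0, true → 1) in arithmetic, so the sum counts how many operands are truthy while evaluating both unconditionally, whereas `x || y` short-circuits and may branch.

```cpp
// Both operands are always evaluated; the result is 0, 1, or 2,
// and any nonzero result is "truthy" where a logical-or was intended.
int count_truthy(int x, int y) {
    return int(bool(x)) + int(bool(y));
}
```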
How does a C++ developer find a job? I'm not joking; I'm a dotnet developer, and I'm asking a serious question. Companies (as usual) want to see professional C++ developers right after they finish university, which is impossible.
Loved it. Thanks Fedor!
Having looked at the comments before watching the entire talk, I was a little worried the talk would end before the speaker got to close it. So I'll mention here that the cut-off happened during the question section at the end, after the last slide. Still somewhat abrupt!
Interesting and informative talk! I like the hands-on, example-driven approach.
What I don’t like is the constant interruptions (esp. ~ mins 30-40) from the audience questions. These are very hard to follow as a remote viewer and disrupt the flow.
It was a great ride! Thanks Fedor & @CppCon 👍
Serious question: if we repeat all the time "measure before changing", and compilers and processors may do a better job, then why do we think that, after we change the code once, it stays the fastest version? We made the code less readable, removed a branch, and added more work.
What if a new processor comes along, or a new compiler optimization, and the original code turns out to be better?
Session is over, thank you!
"Your session is over" :(
Marvellous. Thanks. Deep thinking...
Do your cats own laptops? LOL.
Why does the function pointer trick not work? Perhaps an expensive memory lookup for the function?
Really well presented!
Excellent talk about the branch predictor and ways to take advantage of it. When I first saw the title though, for some reason I initially thought the topic was about functional programming.
what a way to close the session