Even though we often hear terms like L1, L2, cache block size, etc., most programmers have a limited understanding of what cache really is. This is a beginner-friendly primer on how cache works.
This blog introduces three mechanisms to investigate the execution time of a Maven build. Having a reliable way to measure build execution time can help identify bottlenecks. This in turn helps make effective improvements, thereby contributing to higher developer productivity. Find out how to effe...
Ultra-low-latency (ULL) trading firms go to a lot of trouble to get their servers and switches within the same buildings as the exchanges they trade with to reduce latency. Some firms don’t even use layer 1 switches to be competitive.
I wrote 84 new matmul kernels to improve llamafile CPU performance.
My kernels go 2x faster than MKL for matrices that fit in L2 cache, which makes them a work in progress, since the speedup works best for prompts having fewer than 1,000 tokens.
Back in 2019, Nick Fitzgerald published always bump downwards, an article making the case that for bump allocators, bumping “down” (towards lower addresses) is better than bumping up. The biggest reasons for this are that bumping up requires 3 branches vs 2 for bumping down, and rounding down requires few...
We’re building an open source alternative to AWS. For IPv4 assignment and firewall rules, we use Linux’s Netfilter / Nftables. This subsystem provides a powerful way to handle packets addressed to the host. We recently came across flowtables - a network acceleration feature in the Linux kernel that ...
Grafana Beyla 1.2 offers improved Kubernetes support, including the ability to decorate metrics and traces with the metadata of Kubernetes pods and nodes.
This month’s PGSQL Phriday #015 topic is about UUIDs, hosted by Lætitia Avrot. Lætitia has called for a debate. No, no, no. I say let’s have an all-out war. A benchmark war. I have deci…
Calculate the optimal size for your bloom filter, see how many items a given filter can hold, or just admire the curvy graphs. Also borrow my MIT-licensed JavaScript for your own programs.
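As a rough illustration of the sizing arithmetic such a calculator performs, here is a minimal Python sketch using the standard bloom filter formulas m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2. The function names `optimal_bloom_size` and `bloom_capacity` are hypothetical and are not taken from the linked calculator's code.

```python
import math


def optimal_bloom_size(n_items: int, fp_rate: float) -> tuple[int, int]:
    """Return (bits, hash_count) for n_items at the target false-positive rate.

    Uses the standard formulas; names here are illustrative assumptions.
    """
    if n_items <= 0 or not 0.0 < fp_rate < 1.0:
        raise ValueError("need n_items > 0 and 0 < fp_rate < 1")
    # m = -n * ln(p) / (ln 2)^2, rounded up to a whole number of bits
    bits = math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded to the nearest whole hash function (at least 1)
    hashes = max(1, round(bits / n_items * math.log(2)))
    return bits, hashes


def bloom_capacity(bits: int, fp_rate: float) -> int:
    """Return how many items a filter of `bits` bits can hold at fp_rate,
    assuming the optimal number of hash functions is used."""
    if bits <= 0 or not 0.0 < fp_rate < 1.0:
        raise ValueError("need bits > 0 and 0 < fp_rate < 1")
    # Invert the sizing formula: n = -m * (ln 2)^2 / ln(p), rounded down
    return math.floor(-bits * (math.log(2) ** 2) / math.log(fp_rate))
```

For example, `optimal_bloom_size(1000, 0.01)` gives the bit count and number of hash functions for 1,000 items at a 1% false-positive rate, and `bloom_capacity` answers the reverse question for a filter whose size is already fixed.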