Inter Thread Latency
Message rates between threads are fundamentally determined by the latency of memory exchange between CPU cores. The minimum unit of transfer will be a cache line exchanged via shared caches or socket...
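A minimal sketch (not the article's actual test harness) of how such a round trip can be measured: two threads echo a sequence number to each other via volatile longs, so each iteration costs at least one cache-line exchange in each direction.

    public final class PingPongLatency
    {
        private static volatile long ping = -1;
        private static volatile long pong = -1;
        private static final long ITERATIONS = 10_000_000L;

        public static void main(final String[] args) throws Exception
        {
            final Thread echoThread = new Thread(() ->
            {
                for (long i = 0; i < ITERATIONS; i++)
                {
                    while (ping != i)
                    {
                        // busy spin until the other thread publishes i
                    }
                    pong = i;
                }
            });
            echoThread.start();

            final long start = System.nanoTime();
            for (long i = 0; i < ITERATIONS; i++)
            {
                ping = i;
                while (pong != i)
                {
                    // busy spin until the echo comes back
                }
            }
            final long duration = System.nanoTime() - start;
            echoThread.join();

            System.out.printf("average round trip = %dns%n", duration / ITERATIONS);
        }
    }
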
False Sharing && Java 7
In my previous post on False Sharing I suggested it can be avoided by padding the cache line with unused long fields. It seems Java 7 got clever and eliminated or re-ordered the unused fields, thus...
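For context, a commonly used workaround (a sketch here, not necessarily the article's exact code) is to spread the padding across a small class hierarchy and to reference the padding fields from a method: on HotSpot, superclass fields are typically laid out before subclass fields, and fields that are actually read are much harder to eliminate.

    class PaddingHead
    {
        protected long p1, p2, p3, p4, p5, p6, p7; // padding laid out before the hot field
    }

    class HotValue extends PaddingHead
    {
        protected volatile long value; // the field that must not share a cache line
    }

    public class PaddedValue extends HotValue
    {
        protected long p9, p10, p11, p12, p13, p14, p15; // padding laid out after the hot field

        public long get()
        {
            return value;
        }

        public void set(final long newValue)
        {
            value = newValue;
        }

        // Referencing the padding makes it much harder for the JVM to treat it as dead.
        public long sumPaddingToPreventOptimisation()
        {
            return p1 + p2 + p3 + p4 + p5 + p6 + p7 + p9 + p10 + p11 + p12 + p13 + p14 + p15;
        }
    }
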
Code Refurbishment
Within our industry we use a huge range of terminology. Unfortunately we don’t all agree on what individual terms actually mean. I so often hear people misuse the term “Refactoring” which has come to...
Disruptor 2.0 Released
Significantly improved performance and a cleaner API are the key takeaways for the Disruptor 2.0 concurrent programming framework for Java. This release is the result of all the great feedback we have...
Modelling Is Everything
I’m often asked, “What is the best way to learn about building high-performance systems?” There are many perfectly valid answers to this question but there is one thing that stands out for me above...
Adventures with AtomicLong
Sequencing events between threads is a common operation for many multi-threaded algorithms. These sequences could be used for assigning identity to orders, trades, transactions, messages, events,...
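As a simple illustration of that use case (a sketch, with illustrative names), an AtomicLong can hand out unique, monotonically increasing identifiers to any number of threads:

    import java.util.concurrent.atomic.AtomicLong;

    // Each caller receives the next value in the sequence; incrementAndGet()
    // performs the increment atomically across threads.
    public final class OrderIdGenerator
    {
        private final AtomicLong sequence = new AtomicLong(0L);

        public long nextOrderId()
        {
            return sequence.incrementAndGet();
        }
    }
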
Single Writer Principle
When trying to build a highly scalable system the single biggest limitation on scalability is having multiple writers contend for any item of data or resource. Sure, algorithms can be bad, but let’s...
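A hedged sketch of the idea (illustrative names, not code from the article): other threads send commands over a queue, and a single dedicated thread is the only one that ever mutates the state, so the state itself needs no locking.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public final class SingleWriterTotal
    {
        private final BlockingQueue<Long> commands = new ArrayBlockingQueue<>(64 * 1024);
        private long total = 0; // mutated only by the single writer thread

        // Called from any number of producer threads; they never touch 'total'.
        public void send(final long amount) throws InterruptedException
        {
            commands.put(amount);
        }

        // Run on exactly one thread: the single writer applies every mutation.
        public void writerLoop() throws InterruptedException
        {
            while (!Thread.currentThread().isInterrupted())
            {
                total += commands.take();
            }
        }
    }
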
Smart Batching
How often have we all heard that “batching” will increase latency? As someone with a passion for low-latency systems this surprises me. In my experience when batching is done correctly, not only does...
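A hedged sketch of the pattern (not the article's code): the consumer blocks for one item, then drains whatever else has arrived in the meantime, so the batch size adapts to the arrival rate and the expensive per-batch cost, here a placeholder flush(), is paid once per batch rather than once per item.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public final class SmartBatcher<T>
    {
        private final BlockingQueue<T> queue = new ArrayBlockingQueue<>(64 * 1024);
        private final List<T> batch = new ArrayList<>();

        public void publish(final T item) throws InterruptedException
        {
            queue.put(item);
        }

        public void consumeLoop() throws InterruptedException
        {
            while (!Thread.currentThread().isInterrupted())
            {
                batch.add(queue.take()); // wait for at least one item
                queue.drainTo(batch);    // then grab everything else available

                for (final T item : batch)
                {
                    process(item);
                }
                flush(); // pay the expensive cost once per batch

                batch.clear();
            }
        }

        private void process(final T item) { /* apply the item */ }
        private void flush() { /* e.g. write and sync the whole batch */ }
    }
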
Locks & Condition Variables - Latency Impact
In a previous article on Inter-Thread Latency I showed how it is possible to signal a state change between 2 threads with less than 50ns of latency. To many developers, writing concurrent code using...
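For reference, a minimal sketch of the lock-and-condition approach being measured: one thread signals a state change and another waits for it.

    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    public final class StateSignal
    {
        private final Lock lock = new ReentrantLock();
        private final Condition changed = lock.newCondition();
        private long state = 0;

        public void advance(final long newState)
        {
            lock.lock();
            try
            {
                state = newState;
                changed.signal();
            }
            finally
            {
                lock.unlock();
            }
        }

        public void awaitState(final long expected) throws InterruptedException
        {
            lock.lock();
            try
            {
                while (state < expected)
                {
                    changed.await();
                }
            }
            finally
            {
                lock.unlock();
            }
        }
    }
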
Java Lock Implementations
We all use 3rd party libraries as a normal part of development. Generally, we have no control over their internals. The libraries provided with the JDK are a typical example. Many of these...
Biased Locking, OSR, and Benchmarking Fun
After my last post on Java Lock Implementations, I got a lot of good feedback about my results and micro-benchmark design approach. As a result I now understand JVM warmup, On Stack Replacement (OSR)...
Java Sequential IO Performance
Many applications record a series of events to file-based storage for later use. This can be anything from logging and auditing, through to keeping a transaction redo log in an event sourced design...
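One of several ways to write such a log in Java, shown here as a hedged sketch rather than the article's benchmark code, is to batch records into a direct ByteBuffer and append them through a FileChannel:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import static java.nio.file.StandardOpenOption.APPEND;
    import static java.nio.file.StandardOpenOption.CREATE;
    import static java.nio.file.StandardOpenOption.WRITE;

    // Assumes each record is smaller than the buffer capacity.
    public final class SequentialLogWriter implements AutoCloseable
    {
        private final FileChannel channel;
        private final ByteBuffer buffer = ByteBuffer.allocateDirect(4096);

        public SequentialLogWriter(final Path file) throws IOException
        {
            channel = FileChannel.open(file, CREATE, WRITE, APPEND);
        }

        public void append(final byte[] record) throws IOException
        {
            if (buffer.remaining() < record.length)
            {
                flush();
            }
            buffer.put(record);
        }

        public void flush() throws IOException
        {
            buffer.flip();
            while (buffer.hasRemaining())
            {
                channel.write(buffer);
            }
            buffer.clear();
        }

        @Override
        public void close() throws IOException
        {
            flush();
            channel.close();
        }
    }
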
Fun with my-Channels Nirvana and Azul Zing
Since leaving LMAX I have been neglecting my blog a bit. This is not because I have not been doing anything interesting. Quite the opposite really, things have been so busy the blog has taken a back...
Invoke Interface Optimisations
I'm often asked about the performance differences between Java, C, and C++, and which is better. As with most things in life there is no black and white answer. A lot is often discussed about how...
Applying Back Pressure When Overloaded
How should a system respond when under sustained load? Should it keep accepting requests until its response times follow the deadly hockey stick, followed by a crash? All too often this is what...
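A hedged sketch of one simple form of back pressure (illustrative names): bound the work queue and make admission explicit, so the caller is told to back off or shed load instead of queueing without limit.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public final class BoundedRequestGate<T>
    {
        private final BlockingQueue<T> pending;

        public BoundedRequestGate(final int capacity)
        {
            pending = new ArrayBlockingQueue<>(capacity);
        }

        // Returns false when the system is saturated; the caller should back off or reject.
        public boolean tryAccept(final T request)
        {
            return pending.offer(request);
        }

        public T takeNext() throws InterruptedException
        {
            return pending.take();
        }
    }
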
Native C/C++ Like Performance For Java Object Serialisation
Do you ever wish you could turn a Java object into a stream of bytes as fast as it can be done in a native language like C++? If you use standard Java Serialization you could be disappointed with the...
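A hedged sketch of the general direction (the field set is illustrative, not the article's code): the object writes and reads its own fields directly against a ByteBuffer, avoiding the reflection and per-object metadata of standard Java Serialization.

    import java.nio.ByteBuffer;

    public final class TradeRecord
    {
        private long tradeId;
        private long timestamp;
        private double price;
        private long quantity;

        public void writeTo(final ByteBuffer buffer)
        {
            buffer.putLong(tradeId);
            buffer.putLong(timestamp);
            buffer.putDouble(price);
            buffer.putLong(quantity);
        }

        public void readFrom(final ByteBuffer buffer)
        {
            tradeId = buffer.getLong();
            timestamp = buffer.getLong();
            price = buffer.getDouble();
            quantity = buffer.getLong();
        }
    }
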
Memory Access Patterns Are Important
In high-performance computing it is often said that the cost of a cache-miss is the largest performance penalty for an algorithm. For many years the increase in speed of our processors has greatly...
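A minimal sketch of the effect (not the article's benchmark): summing the same array with a sequential walk versus a strided walk. On typical hardware the strided version defeats the prefetcher and misses cache far more often, so the same arithmetic takes much longer.

    import java.util.function.LongSupplier;

    public final class AccessPatterns
    {
        private static final int SIZE = 2048;
        private static final long[] data = new long[SIZE * SIZE];

        public static long rowWiseSum()
        {
            long sum = 0;
            for (int row = 0; row < SIZE; row++)
            {
                for (int col = 0; col < SIZE; col++)
                {
                    sum += data[(row * SIZE) + col]; // sequential, prefetch friendly
                }
            }
            return sum;
        }

        public static long columnWiseSum()
        {
            long sum = 0;
            for (int col = 0; col < SIZE; col++)
            {
                for (int row = 0; row < SIZE; row++)
                {
                    sum += data[(row * SIZE) + col]; // strides of SIZE * 8 bytes
                }
            }
            return sum;
        }

        public static void main(final String[] args)
        {
            for (int i = 0; i < 5; i++) // crude warm up and repeat
            {
                time("row wise   ", AccessPatterns::rowWiseSum);
                time("column wise", AccessPatterns::columnWiseSum);
            }
        }

        private static void time(final String name, final LongSupplier work)
        {
            final long start = System.nanoTime();
            final long result = work.getAsLong();
            System.out.printf("%s: %dms (sum=%d)%n", name, (System.nanoTime() - start) / 1_000_000, result);
        }
    }
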
Compact Off-Heap Structures/Tuples In Java
In my last post I detailed the implications of the access patterns your code takes to main memory. Since then I've had a lot of questions about what can be done in Java to enable more predictable...
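A hedged sketch of one approach, using a direct ByteBuffer and an illustrative record layout: store records as packed fields in off-heap memory and address them through a flyweight-style accessor rather than as individual objects.

    import java.nio.ByteBuffer;

    public final class OffHeapRecords
    {
        private static final int ID_OFFSET = 0;
        private static final int VALUE_OFFSET = 8;
        private static final int RECORD_SIZE = 16;

        private final ByteBuffer buffer;

        public OffHeapRecords(final int recordCount)
        {
            buffer = ByteBuffer.allocateDirect(recordCount * RECORD_SIZE);
        }

        public void put(final int index, final long id, final long value)
        {
            final int offset = index * RECORD_SIZE;
            buffer.putLong(offset + ID_OFFSET, id);
            buffer.putLong(offset + VALUE_OFFSET, value);
        }

        public long getId(final int index)
        {
            return buffer.getLong((index * RECORD_SIZE) + ID_OFFSET);
        }

        public long getValue(final int index)
        {
            return buffer.getLong((index * RECORD_SIZE) + VALUE_OFFSET);
        }
    }
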
Mechanical Sympathy Discussion Group
Lately a number of people have suggested I start a discussion group on the subject of mechanical sympathy, so I've taken the plunge and done it! The group can be a place to discuss topics related to...
Further Adventures With CAS Instructions And Micro Benchmarking
In a previous article I reported what appeared to be a performance issue with CAS/LOCK instructions on the Sandy Bridge microarchitecture compared to the previous Nehalem microarchitecture. Since...
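For context, a minimal sketch of the kind of CAS retry loop whose cost is being measured: keep attempting compareAndSet until no other thread has raced the update.

    import java.util.concurrent.atomic.AtomicLong;

    public final class CasCounter
    {
        private final AtomicLong counter = new AtomicLong(0L);

        public long increment()
        {
            long current;
            long next;
            do
            {
                current = counter.get();
                next = current + 1;
            }
            while (!counter.compareAndSet(current, next));

            return next;
        }
    }
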
CPU Cache Flushing Fallacy
Even from highly experienced technologists I often hear talk about how certain operations cause a CPU cache to "flush". This seems to be illustrating a very common fallacy about how CPU caches work,...
Printing Generated Assembly Code From The Hotspot JIT Compiler
Sometimes when profiling a Java application it is necessary to understand the assembly code generated by the Hotspot JIT compiler. This can be useful in determining what optimisation decisions have...
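As a pointer, the diagnostic flags involved look like the following; the class name is a placeholder, and a disassembler plugin (hsdis) must be available to the JVM for the output to be human readable.

    java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly MyBenchmark
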
Java Garbage Collection Distilled
Serial, Parallel, Concurrent, CMS, G1, Young Gen, New Gen, Old Gen, Perm Gen, Eden, Tenured, Survivor Spaces, Safepoints, and the hundreds of JVM startup flags. Does this all baffle you when trying to...
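By way of illustration only (the application name is a placeholder), collector selection and basic heap sizing come down to a handful of such flags:

    # Concurrent Mark Sweep for the old generation:
    java -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails MyApp
    # Garbage First (G1):
    java -Xms4g -Xmx4g -XX:+UseG1GC -XX:+PrintGCDetails MyApp
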
Lock-Based vs Lock-Free Concurrent Algorithms
Last week I attended a review session of the new JSR-166 StampedLock run by Heinz Kabutz at the excellent JCrete unconference. StampedLock is an attempt to address the contention issues that arise in a...
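A minimal sketch of the feature that makes StampedLock interesting for contended, read-heavy structures: an optimistic read that falls back to a full read lock only if a write intervened.

    import java.util.concurrent.locks.StampedLock;

    public final class OptimisticCounterPair
    {
        private final StampedLock lock = new StampedLock();
        private long x, y;

        public void move(final long deltaX, final long deltaY)
        {
            final long stamp = lock.writeLock();
            try
            {
                x += deltaX;
                y += deltaY;
            }
            finally
            {
                lock.unlockWrite(stamp);
            }
        }

        public long sum()
        {
            long stamp = lock.tryOptimisticRead();
            long currentX = x;
            long currentY = y;

            if (!lock.validate(stamp)) // a write intervened, take a full read lock
            {
                stamp = lock.readLock();
                try
                {
                    currentX = x;
                    currentY = y;
                }
                finally
                {
                    lock.unlockRead(stamp);
                }
            }

            return currentX + currentY;
        }
    }
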
Simple Binary Encoding
Financial systems communicate by sending and receiving vast numbers of messages in many different formats. When people use terms like "vast" I normally think, "really... how many?" So let's quantify...