Computers don’t cause problems. People do.


What trading systems look like

The past six months have seen a rash of headlines about computer glitches and failures affecting financial firms. First, BATS had to pull its IPO due to problems with its own trading platform. Then the start of trading in Facebook shares was delayed due to a NASDAQ systems failure. In June, a failed system upgrade affected hundreds of thousands of RBS customers in the UK. Finally, on August 1st, Knight Capital lost $440m in just 45 minutes due to problems with its market-making software.

Now the Chicago Fed has released a letter (PDF) containing the results of a survey, which reveals that “two-thirds of proprietary trading firms, and every exchange interviewed had experienced one or more errant algorithms”. I’m not surprised. I’ve done it myself, creating an ad-hoc market-making tool which malfunctioned when there were no other quotes in the market.
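That failure mode, quoting into an empty market, can be guarded against with a simple pre-quote sanity check. Here is a minimal sketch of the idea (the function and its parameters are hypothetical, not any real system's API; production market-making logic is far more involved):

```python
def make_quote(reference_quotes, spread=0.02):
    """Derive a two-sided quote from other participants' quoted prices.

    Refuses to quote when there are no reference prices, rather than
    falling back to a stale or nonsensical mid-price.
    """
    if not reference_quotes:
        # An empty market means there is nothing to anchor our price to.
        return None
    mid = sum(reference_quotes) / len(reference_quotes)
    bid = round(mid * (1 - spread / 2), 2)
    ask = round(mid * (1 + spread / 2), 2)
    return bid, ask

# With reference quotes, we quote around the mid:
print(make_quote([99.8, 100.0, 100.2]))  # (99.0, 101.0)

# With none, we stand aside instead of malfunctioning:
print(make_quote([]))  # None
```

The interesting line is the guard clause: the bug isn't in the pricing arithmetic, it's in assuming the inputs to that arithmetic will always exist.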

Admiral Hyman G. Rickover of the United States Navy is regarded as the “father of the nuclear navy”, having served as director of Naval Reactors for over three decades. He wrote:

Responsibility is a unique concept: it can only reside and inhere in a single individual. You may share it with others, but your portion is not diminished. You may delegate it, but it is still with you. You may disclaim it, but you cannot divest yourself of it. Even if you do not recognize it or admit its presence, you cannot escape it. If responsibility is rightfully yours, no evasion, or ignorance, or passing the blame can shift the burden to someone else. Unless you can point your finger at the person who is responsible when something goes wrong, then you have never had anyone really responsible.

We love blaming the computer when things go awry but the truth is that the root cause of failures like these can be traced back to people. It might be a lack of proper testing or the absence of appropriate controls to catch “fat finger” errors. Poor outage management procedures and crisis planning can exacerbate the impact of relatively minor problems. Overly automated, opaque “black box” systems can make it difficult to diagnose problems. Building increasingly complex, interconnected systems without also ensuring that the resources are in place to effectively manage those systems can result in an effective loss of control.
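“Fat finger” controls are conceptually simple: reject orders whose size or price is wildly out of line before they reach the market. A hedged sketch of such a pre-trade check (the names and thresholds are illustrative assumptions, not any real firm's limits):

```python
def validate_order(quantity, price, last_price,
                   max_quantity=10_000, max_deviation=0.10):
    """Flag orders that look like fat-finger errors.

    Returns a list of rejection reasons; an empty list means the
    order passes these basic pre-trade checks.
    """
    errors = []
    if quantity <= 0 or quantity > max_quantity:
        errors.append(f"quantity {quantity} outside limit {max_quantity}")
    if last_price > 0 and abs(price - last_price) / last_price > max_deviation:
        errors.append(f"price {price} deviates from last trade {last_price}")
    return errors

# A plausible order passes:
print(validate_order(100, 50.5, last_price=50.0))  # []

# A million shares at ten times the last price trips both checks:
print(validate_order(1_000_000, 500.0, last_price=50.0))
```

Checks like these cost almost nothing to build; the point of this post is that the decision to build (or skip) them is a human one, not a technical one.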

While retail and commercial banks exhibit a reluctance to ditch their monolithic, batch-based core banking systems in favour of modern technologies, investment banks and trading firms operate at the cutting edge of technology. In the zero-sum game of the financial markets, technology can be a strategic competitive advantage and there’s a lot more innovation in financial technology than most people realise. The following is an extract from a lengthy comment I posted on Fred Wilson’s blog late last year:

There is plenty of innovation in financial technology, and has been for years. And there have been loads of startups. Some of them have been so ridiculously successful that we forget that they started life as a tech startup (anyone ever heard of a company called Bloomberg?). Others are focused on large corporate clients, so they never cross most people’s radar (MarketAxess, TradeWeb, ION Trading). You don’t see OpenGamma in TechCrunch or TheNextWeb often because they’re not trying to be – their market doesn’t read those blogs. Some are so damned successful at what they do that they simply get bought out, like The Prediction Company.

The real challenge with financial technology, and the reason that so many people fail to understand (and, hence, fear) it, is the fact that the interplay between technology and the underlying business models is incredibly complex and has evolved over many years. If you don’t grasp all the factors that come into play, if you approach it purely from a technology viewpoint, then it’s as if you’re looking at a 3D object with 2D eyes – you see a circle and you think you see everything but, in fact, it’s a sphere (or, more likely, a 4D Möbius torsoid).

And, because the numbers are so big and it’s pretty much a zero-sum game, if you get it wrong, the consequences can be disastrous. That’s one of the reasons you often see old, crappy, legacy systems in financial institutions – they’ve evolved organically over years and it’s really difficult to replace them, so it’s sometimes less risky to keep the inefficient legacy system in place and pay people to manually make up for its deficiencies, than it is to spend millions building a system which may not work. “Mission-critical” takes on a new meaning if the mission is running ATMs and bank counter terminals.

Of course, if you get it right, the rewards can be huge – spend a couple of million building a system that works, it can be like a silver bullet – an uptick in revenue measured in tens of millions, better service for customers and a strategic competitive advantage that can last for years. That’s another reason innovation is often invisible – a company has no incentive to let the rest of the world know about the breakthrough that has enabled them to drastically reduce their cost-per-transaction because they’re now able to price at a point just below the competition (which attracts more business), while making twice or three times the profit on each transaction.

While most executives have come to realise that technology can be a revenue generator, the traditional classification of IT as a cost centre appears to lead them to focus more on managing costs than on managing risk. As companies’ reliance on technology grows, it’s important that they recognise that increasing, for example, the degree of automation and the speed of execution brings asymmetric risks that can result in a massive downside when things go wrong, as exemplified by Knight Capital.

What’s more, as the business (i.e. the revenue-generating operations, as opposed to the support operations) becomes more dependent upon, and intertwined with, technology, the humans who build, operate and use the technology become an increasingly important part of the overall system. The interplay between the different elements, and the increase in complexity and interconnectedness, can give rise to truly complex systems. Managing such systems requires an understanding of both business and technology, and the ability to ensure that all parts of such systems are managed robustly. To do otherwise is a recipe for disaster. I have no doubt that UBS’ front office trading systems are top-of-the-line but inadequate risk management systems allowed Kweku Adoboli to create fictitious positions to mask his rogue trading, which ended up costing UBS $2.3 billion.

Risk management systems don’t generate revenue, so they’re not a high priority when it comes to allocating budgets. Developers love writing code but they hate writing documentation. Writing trading algorithms is cool but exhaustively testing them isn’t. Trimming technology service, support and operations departments seems like an easy win when it’s time to reduce headcount but if it simply results in fewer people supporting the same systems, it may increase operational risk. The easiest way to deal with an outage is to prevent it from happening in the first place but you need resources in place to do that, and to prepare and plan for crises so that the impact is minimised.
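Exhaustive testing may be unglamorous, but edge cases like an empty order book or zero traded volume are exactly where errant algorithms are born. A minimal illustration (the function is hypothetical, not from any real system) of the kind of defensive coding and edge-case test that costs little and catches much:

```python
def vwap(trades):
    """Volume-weighted average price of (price, quantity) pairs.

    Returns None if nothing has traded. Dividing by zero here is
    exactly the kind of "errant algorithm" bug that only surfaces
    in a quiet market, long after the code shipped.
    """
    total_volume = sum(qty for _, qty in trades)
    if total_volume == 0:
        return None
    return sum(price * qty for price, qty in trades) / total_volume

# The edge case a hurried developer skips:
assert vwap([]) is None

# The happy path everyone tests:
assert vwap([(100.0, 200), (101.0, 100)]) == (100.0 * 200 + 101.0 * 100) / 300
```

The test for the empty case is one line. Whether anyone is given the time to write it is, again, a management decision.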

With technology, the details matter. Until financial services companies’ leadership recognise that fact and realise that they need to focus as much on risk as they do on revenue and costs, we’re likely to continue to see headlines implicitly blaming computers for massive outages affecting thousands of customers and unexpected multi-million dollar losses.

Written by jackgavigan

September 21, 2012 at 12:52 pm
