Tough Times For The Tokyo Stock Exchange’s CIO

by drjim on July 18, 2012

The Tokyo Stock Exchange needs its computers to stay up…

If I asked you what a CIO’s #1 job responsibility was, what would you say? After a lot of hemming and hawing, I think that you’d probably end up telling me that at the end of the day, a CIO has the ultimate responsibility for making sure that the company’s IT resources are available for the rest of the company to use at all times – that’s almost a part of the definition of information technology. That’s why what happened at the Tokyo Stock Exchange is such a big deal…

The Day That Tokyo Went Away

So what happened over at the Tokyo Stock Exchange? The way that things are set up at the TSE, there are eight main servers that support the exchange. These servers are used to distribute stock data to traders who then use the information to make decisions about trades that they want to make.

One of the servers failed. This is an event that the TSE had anticipated. When this happens, a switching of servers is supposed to occur so quickly that traders using the system won’t even know that a failure has occurred.

However, this time the switch didn’t happen. Because of a hardware problem, the TSE’s information-distribution system failed to do what it was supposed to do. As a result, the server went off-line and trading was affected.
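
To make the idea of a failover concrete, here is a minimal sketch in Python. It is not how the TSE's system actually works; the `Server` class, the `pick_active` function, and the server names are all made up for illustration. It only shows the basic logic: use the primary if it is healthy, otherwise switch to a healthy standby, and declare an outage if there is nothing left to switch to.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    """A hypothetical server with a simple up/down health flag."""
    name: str
    healthy: bool = True


def pick_active(primary: Server, standbys: List[Server]) -> Optional[Server]:
    """Return the server that should carry traffic, or None if all are down."""
    if primary.healthy:
        return primary
    for standby in standbys:
        if standby.healthy:
            return standby  # fail over to the first healthy standby
    return None  # nothing left to fail over to: this is an outage


primary = Server("feed-1")             # illustrative names only
standbys = [Server("feed-1-standby")]

primary.healthy = False                # the primary fails
active = pick_active(primary, standbys)
print(active.name if active else "OUTAGE")   # prints "feed-1-standby"

standbys[0].healthy = False            # the standby path is broken too
active = pick_active(primary, standbys)
print(active.name if active else "OUTAGE")   # prints "OUTAGE"
```

The second case is the one that matters: having a standby on paper does not help if the path to it is broken when you need it.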

Specifically, 241 different securities that are traded on the TSE could not be traded for over 2.5 hours. In the world of IT, this amount of downtime, while not good, isn’t that big of a deal. However, in the world of finance, 2.5 hours of not being able to trade is a very expensive mistake.
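
A quick back-of-the-envelope calculation shows why the same 2.5 hours looks so different from the two points of view. The sketch below is only an illustration: it assumes this was the only outage in the period measured, and the 5-hour trading day is a hypothetical figure chosen for the example, not a number from the TSE.

```python
def availability(downtime_hours: float, scheduled_hours: float) -> float:
    """Percentage of scheduled time that the service was actually up."""
    return 100.0 * (1.0 - downtime_hours / scheduled_hours)


HOURS_PER_YEAR = 24 * 365          # IT view: measured against a full calendar year
TRADING_DAY_HOURS = 5.0            # hypothetical length of one trading day

print(f"{availability(2.5, HOURS_PER_YEAR):.3f}%")     # prints "99.971%"
print(f"{availability(2.5, TRADING_DAY_HOURS):.3f}%")  # prints "50.000%"
```

Measured over a year, the exchange still looks highly available. Measured against the hours in which people can actually trade that day, half of the day is gone.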

This outage affected some big company names. During the outage, firms like Sony and Hitachi could not be traded. This isn’t the first problem that the TSE has had. A while back, the TSE had a number of outages that forced it to replace its entire aging trading platform.

Why Server Availability Is Such A Big CIO Deal

Making sure that your servers are up and working correctly is one of the most important jobs that a CIO has. Yes, we all know that servers will fail, but that’s why we have backup systems in place.

At the TSE, it’s clear that although they had a backup system in place, they had not tested it to ensure that it would work properly. What appears to have happened is that some networking component was not properly configured to deal with the server failure.

How can a CIO expect to be included in the company’s strategic decision-making processes if the very basic IT task of keeping the various IT systems up and running hasn’t been taken care of? The TSE did get its server back up and running in time to support the afternoon trading session; however, by then the damage had been done.

This outage is such a big deal because it may end up having an impact that extends far beyond the IT department. The TSE is planning on merging with the Osaka Securities Exchange. However, one of the challenges that the proposed merger is facing is regulator doubt that the two exchanges could successfully combine their trading platforms. A 2.5-hour outage at the TSE won’t make the people who need to approve the merger any more confident that it can be done successfully.

What All Of This Means For You

In the IT sector CIOs exist for a number of different reasons. We like to spend a lot of time talking about how a CIO can become a member of the team that is mapping out the company’s strategic future. However, before that can happen, the so-called “blocking and tackling” day-to-day tasks of an IT department need to be taken care of. This includes keeping the company’s servers up and running.

Over at the Tokyo Stock Exchange, there was an incident in which one of the exchange’s main servers became unavailable. This meant that trading in many different companies could not be performed. Clearly, one of the basic jobs of the IT department was not being performed.

Events like this can have far-reaching impacts that extend well beyond the IT department. In the case of the Tokyo Stock Exchange, a planned merger may now be in doubt because this glitch has exposed how difficult merging the two companies’ different trading platforms would really be.

When you are CIO, you need to make sure that the basic tasks that the IT department is responsible for are taken care of. It’s the successes that you have here on which you’ll be able to build your next steps. Companies are only now starting to recognize the true importance of information technology. Make sure that you have a solid IT base to grow your career from!

– Dr. Jim Anderson
Blue Elephant Consulting –
Your Source For Real World IT Department Leadership Skills™

Question For You: Server outages will always occur. What do you think a CIO should do in order to be ready for them when they happen?

What We’ll Be Talking About Next Time

Yea Cloud Computing! Everyone is in the process of falling in love with cloud computing and its sister Software-As-A-Service (SaaS). What CIO wouldn’t love an opportunity to no longer have to buy and pay to maintain computing hardware that was only going to become obsolete over time (this is almost a part of the definition of information technology)? Although SaaS does offer a lot of benefits, it’s starting to become clearer that there are some serious drawbacks to this solution as well…

2 Comments

Marcus Emmanuel Barnes July 28, 2012 at 12:52 am

Given that server outages happen, one should simulate as many failures as possible (at all possible failure points), documenting the recovery steps involved in detail. This will help staff diagnose failures and allow them to practice executing various recovery processes quickly should a real failure occur. Think of this as emergency preparedness for IT.

There is a big problem here though – simulating as many failures as possible takes considerable time and resources. Additionally, such simulations need to be ongoing to keep up to date with changes in hardware/software and in order to ensure staff are ready to respond to issues and fix them swiftly.

The CIO will likely have to create a detailed cost-benefit analysis to present to those on the executive team in order to secure the resources necessary to carry out such a systematic and ongoing program of “emergency preparedness for IT”. After that, it’s up to the executive team to determine what risks they are willing to take given the scale of their organization and the impact downtime would have, and the CIO to organize the IT staff to do the best given the resources available to them.
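
The kind of failure drill described in this comment can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the component names, the `REDUNDANT` set, and the `survives_without` check are all hypothetical stand-ins for a real environment, where each check would mean actually taking a component down in a test setup and watching what happens.

```python
from typing import Callable, Dict, List


def run_drill(components: List[str],
              survives_without: Callable[[str], bool]) -> Dict[str, bool]:
    """Fail each component in turn and record whether service survived."""
    return {name: survives_without(name) for name in components}


# Hypothetical setup: two redundant servers behind a single switch.
REDUNDANT = {"server-a", "server-b"}


def survives_without(failed: str) -> bool:
    """Toy rule: service survives only if the failed part has a backup."""
    return failed in REDUNDANT


results = run_drill(["server-a", "server-b", "switch"], survives_without)
for name, ok in results.items():
    print(f"{name}: {'survived' if ok else 'OUTAGE, needs a recovery plan'}")
```

In this made-up setup the drill flags the switch, the one part with no backup. That list of flagged components is a natural starting point for the recovery documentation and the cost-benefit discussion.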

Dr. Jim Anderson August 3, 2012 at 3:16 pm

Marcus: As you point out, more simulation can lead to better fault detection. However, simply making sure that you have a plan for what to do when you have an outage will help to minimize the damage that your next outage causes, even if you didn’t predict it by using simulations…
