
Trading halted on NYSE floor

wait this is a hoot....


"NYSE and NYSE MKT began the process of canceling all open orders, working with customers to reconcile orders and trades, restarting all customer gateways and failing over to back-up trading units located in our Mahwah, NJ datacenter so trading could be resumed in a normal state"


Mahwah is the backup now? (it's not, SHHHHH)

and it took them 3 ****ing hours to fail over?

Many of my customers have fault tolerance and redundancy; "failover" ranges from near instant to about 10 minutes.
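
To give a sense of what that kind of failover looks like, here's a minimal sketch of a health-check-based failover loop - the endpoints and timings are made up, not anything specific to any customer:

    # Minimal sketch of health-check-based failover (hypothetical endpoints).
    # Real setups do this behind load balancers or DNS, but the idea is the same:
    # probe the primary, and if it stops answering, send traffic to the standby.
    import time
    import urllib.request

    PRIMARY = "https://primary.example.com/health"  # hypothetical
    STANDBY = "https://standby.example.com/health"  # hypothetical

    def is_healthy(url, timeout=2):
        """Return True if the endpoint answers its health check within the timeout."""
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status == 200
        except OSError:
            return False

    active = PRIMARY
    while True:
        if active == PRIMARY and not is_healthy(PRIMARY):
            print("primary unhealthy - failing over to standby")
            active = STANDBY  # the switch itself is near-instant once detected
        print("routing traffic to:", active)
        time.sleep(10)  # poll interval; detection time is roughly this plus the timeout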

How many of your customers have that kind of traffic volume?
 
You have a system that 20,000 people connect to and that has 15 functions. If one person cannot execute one function but everything else works, is the system up or down? How does that failure get factored into uptime calculations?


This was more than one person. I won't get specific about who we provide service for.


I'll refrain from commenting on anything specific to NYSE since I work for them - though not in the area affected yesterday. I will say that, as a general rule, we don't make routine production changes during the week. That doesn't mean we never make them should circumstances warrant, and it doesn't mean anything that the release didn't use the word "emergency".


So you won't tell me whether Mahwah is the main site or, as they claim, the "backup site"?
 
I mean, I don't know what kinds of systems you manage, so I can't really say, but take an example: say that in one of the applications you manage, somebody submits a helpdesk ticket saying that their user profile page isn't loading. How long might it be before anybody looks at the ticket at all? If the page in fact isn't loading, how long before somebody diagnoses the problem? How long before a fix typically lands on the system?

Minutes, typically.



Now, if instead of a user's page not loading, the situation was that somebody's trade wasn't being recorded, that could leave the NYSE user losing $600,000 when the stock moves 1% before they realize the trade was never actually recorded. A bug of similar severity to the one that breaks the profile page might cause a trade not to be recorded. But where, in the profile page scenario, you count the time that bug is in place as uptime - and the bug may sit there for weeks, months, or even years - the NYSE would immediately shut down and would not come back up until they found the bug, fixed it, and corrected any data it had corrupted.

Would they not have the ability to roll back, or fail over, etc?


They admitted they failed over to a DR site - that takes 3 hrs?


Generally speaking, places that shoot for 99.999% uptime are places where users can deal with a few bugs and where the system behaving slowly for brief periods still counts as it being up. There are not a lot of environments that I'm aware of with 99.999% uptime where "up" means zero bugs and nearly instantaneous response times. And I can't think of any that hit that kind of zero-bug, instantaneous-response uptime with a massive worldwide distribution of networks, servers, and software doing extremely complex things, all running custom software.
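
To put a rough number on why that's hard: if a request has to pass through many components in series, the end-to-end availability is roughly the product of the component availabilities. This is my own back-of-the-envelope sketch, assuming independent failures:

    # Back-of-the-envelope: end-to-end availability of a chain of components,
    # assuming independent failures (a big assumption, but it shows the trend).
    def end_to_end_availability(component_availabilities):
        """Probability that every component in the request path is up at once."""
        result = 1.0
        for a in component_availabilities:
            result *= a
        return result

    print(end_to_end_availability([0.99999] * 10))  # ten five-nines components -> ~0.99990 (four nines)
    print(end_to_end_availability([0.9999] * 30))   # thirty four-nines components -> ~0.99700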

NYSE got an award someplace for not being down for three years - seems like that would count.



For example, Google's search engine is massive and equally complex, and it hits 99.999% uptime. But it does it by being very fault tolerant. If you run the same search 100 times, you won't actually get quite the same results each time, because different databases are getting fed new data at different times - this one won't be quite up to date on that, but that one will, and so on. Many of the stats they give you, for example about search queries, come back in the form of "approximately 500 - 540" and whatnot, because they're often using more scalable, faster techniques when they don't need to be 100% confident in 100% accuracy. Other large, complex systems often hit 99.999% by being slow. For example, most ecommerce platforms have no problem making a user wait even 5 or 10 seconds while they process things. The NYSE can't do either of those things, and I can't think of another equally complex system that can't do either of those things and still hits 99.999% with no bugs.
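
As an aside, here's a toy illustration of the accuracy-for-speed trade I'm describing - purely my own example, not how Google actually implements anything - estimating a count from a random sample instead of scanning everything:

    # Toy illustration of trading exactness for speed: estimate how many records
    # match a condition by scanning a random sample instead of the whole dataset.
    import random

    records = [random.randint(0, 999) for _ in range(1_000_000)]

    def exact_count(data):
        """Scan everything - exact, but touches every record."""
        return sum(1 for x in data if x < 100)

    def approximate_count(data, sample_size=10_000):
        """Scan a sample and scale up - fast, and 'approximately right'."""
        sample = random.sample(data, sample_size)
        hits = sum(1 for x in sample if x < 100)
        return hits * len(data) // sample_size

    print("exact:      ", exact_count(records))        # ~100,000
    print("approximate:", approximate_count(records))  # close, and much cheaper at scale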

So you are saying the NYSE does not have the resources to be fault tolerant?
 

1 - You didn't answer my question as to how you compute uptime in a complex system that may not be either wholly up or wholly down, may be down from the perspective of one user but not from another, or may be up but not meeting SLAs.

2 - I won't tell you anything about anything that isn't already public knowledge.
 

1. simple availability of the product to the end users.

2. NYSE Euronext data center in Mahwah, NJ IS the NYSE. ;)
 

And if you have 20,000 users and the product isn't available to one of them because someone fat-fingered a firewall rule, are you up or down?

If you have 20 functions and one isn't available because the database it hits isn't up, but the other 19 functions are available, are you up or down?
 

1. depends on the function of that particular user.

2. depends on that function.
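
One way to make "depends on the function" concrete is to weight each function by how much it matters and charge a partial outage as fractional downtime. A minimal sketch, with made-up weights and numbers:

    # Minimal sketch (made-up weights and numbers): count a partial outage as
    # fractional downtime, weighted by how important the broken function is.
    MINUTES_PER_MONTH = 30 * 24 * 60

    # function -> (importance weight, minutes it was unavailable this month)
    outages = {
        "submit_trade":    (0.50, 0),    # the function that really matters
        "profile_page":    (0.05, 180),  # broken for 3 hours, but low impact
        "reporting":       (0.20, 30),
        "everything_else": (0.25, 0),
    }

    weighted_downtime = sum(weight * minutes for weight, minutes in outages.values())
    availability = 1 - weighted_downtime / MINUTES_PER_MONTH

    print(f"weighted downtime: {weighted_downtime:.1f} minutes")
    print(f"availability:      {availability:.5%}")
    # A strict "any function down = down" policy would instead charge the full
    # 180 + 30 = 210 minutes and report a much lower number.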
 
So it's not so simple after all. The real world seldom is.


It's easy to not think that major institutions like the NYSE, WSJ, and United all just coincidentally had problems on the same day.

Oh, and now it seems the USG was hacked again.


And what of this anonymous post about the NYSE?
 

Thousands of companies out there. I think the odds of two or more having a major problem the same day are pretty good.

What anonymous NYSE post?
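
On the odds point, a quick back-of-the-envelope sketch - the number of companies and the per-day outage probability are pure assumptions - of how often you'd expect two or more to have a major problem on the same day:

    # Back-of-the-envelope: if each of N big companies independently has a major,
    # newsworthy outage on any given day with probability p (made-up numbers),
    # how likely is it that two or more of them have one on the same day?
    from math import comb

    N = 1000      # companies big enough to make the news (assumption)
    p = 0.001     # chance any one of them has a major outage on a given day (assumption)

    p_zero = (1 - p) ** N                        # nobody has an outage
    p_one = comb(N, 1) * p * (1 - p) ** (N - 1)  # exactly one does
    p_two_or_more = 1 - p_zero - p_one

    print(f"P(two or more outages on the same day) = {p_two_or_more:.1%}")
    # With these assumptions it comes out around 26% - on any given day, not that surprising.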
 
systems like that? If the IT department wasn't maintaining 99.999% uptime, they would be fired. It's industry standard.

I think your sysadmin is exaggerating by some fractions of a percentage point.
I think the standard is closer to 99.9%.
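
For reference, here's what the gap between those targets works out to in allowed downtime per year:

    # Convert availability targets into allowed downtime per year.
    MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

    for target in (0.999, 0.9999, 0.99999):
        allowed = (1 - target) * MINUTES_PER_YEAR
        print(f"{target:.3%} uptime allows about {allowed:,.0f} minutes of downtime per year")

    # 99.900% -> ~526 minutes (almost 9 hours) a year
    # 99.990% -> ~53 minutes a year
    # 99.999% -> ~5 minutes a year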

And how many systems are there?
What percentage is 3 of them out of those?
That's more the point.
 