Software can spot, then stop, brewing storms in cloud computing
Cloud computing can improve efficiency by tapping the power of multiple “virtual machines.” But what if one of those machines encounters a problem that then cascades through the entire system?
Researchers at North Carolina State University believe they have an answer: software that can identify and respond to potential anomalies in the cloud before they become serious problems.
Their program watches a wide variety of factors — memory used, network traffic, CPU usage, etc. — and uses those to determine what is “normal” for each virtual machine in a cloud environment.
Whenever a virtual machine deviates from that definition of normal, the software launches diagnostic tests, whose results can then be used to trigger the appropriate response and prevent a problem from escalating.
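To make the idea concrete, here is a minimal sketch of per-machine baselining in Python. It is not the researchers' software, and the article does not say which algorithm they use. This sketch assumes a simple rule: a reading is "abnormal" if it falls more than three standard deviations from that machine's own history. The metric names, the `VmBaseline` class, and the threshold are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical metric names; the article mentions memory, network traffic and CPU usage.
METRICS = ("cpu_pct", "mem_mb", "net_kbps")


class VmBaseline:
    """Learns what is 'normal' for one virtual machine from its own history."""

    def __init__(self, threshold=3.0, min_samples=5):
        self.threshold = threshold      # assumed cutoff, in standard deviations
        self.min_samples = min_samples  # readings needed before judging anything
        self.history = {m: [] for m in METRICS}

    def deviations(self, sample):
        """Return the metrics in `sample` that fall outside the learned range."""
        flagged = []
        for m in METRICS:
            values = self.history[m]
            if len(values) < self.min_samples:
                continue  # still learning this metric
            mu, sigma = mean(values), stdev(values)
            spread = max(sigma, 1e-9)  # avoid dividing by zero on flat history
            if abs(sample[m] - mu) / spread > self.threshold:
                flagged.append(m)
        return flagged

    def observe(self, sample):
        """Check a reading, then fold it into the baseline if it looks normal."""
        flagged = self.deviations(sample)
        if not flagged:
            for m in METRICS:
                self.history[m].append(sample[m])
        return flagged


# Usage: feed in ordinary readings, then one with a CPU spike.
baseline = VmBaseline()
for cpu in (20, 22, 19, 21, 20, 23):
    baseline.observe({"cpu_pct": cpu, "mem_mb": 512 + cpu, "net_kbps": 100 + cpu})

flagged = baseline.observe({"cpu_pct": 95, "mem_mb": 534, "net_kbps": 121})
print(flagged)  # a non-empty list is the point where diagnostic tests would start
```

No examples of bad behavior are supplied anywhere in the sketch. The baseline is built only from the machine's own readings, which is why this kind of approach does not depend on labeled data about past failures.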
One advantage of this approach is that it doesn’t require users to provide so-called “training data” about what constitutes abnormal behavior. That’s important, because training data can be hard to obtain in cloud systems. The new software strategy can also predict anomalies that have never been seen before.
“If we can identify the initial deviation and launch an automatic response, we can not only prevent a major disturbance, but actually prevent the user from even experiencing any change in system performance,” said Helen Gu, an assistant professor of computer science at NC State and co-author of a paper describing the research. “Also, it’s important to note that this program does not access any user’s individual information. We’re looking only at system-level behavior.”
Gu said the researchers’ next step is to incorporate more detailed “white box” diagnostic tools into the software so they can identify anomaly-causing software bugs and correct them.