As the amount of data gathered grows exponentially, so must the technology used to process it. According to the International Data Corporation, the volume of digital content in the world will grow to 2.7 billion terabytes in 2012, up 48 percent from 2011, and will reach 8 billion terabytes by 2015. That will be a lot of data!
The flood of data is coming from both structured corporate databases and unstructured data from Web pages, blogs, social networking messages, and other sources. Currently, for example, there are countless digital sensors worldwide in industrial equipment, automobiles, electrical meters, and shipping crates. Those sensors can measure and communicate location, movement, vibration, temperature, humidity, and even chemical changes in the air. Today, Big Business wields data like a weapon. Giant retailers, such as Walmart and Kohl’s, analyze sales, pricing, economic, demographic, and weather data to tailor product selections at particular stores and determine the timing of price markdowns.
Logistics companies like United Parcel Service mine data on truck delivery times and traffic patterns to fine-tune routing. A whole ecosystem of new businesses and technologies is springing up to engage with this new reality: companies that store data, companies that mine data for insight, and companies that aggregate data to make them manageable. However, it is an ecosystem that is still emerging, and its exact shape has yet to make itself clear.
Even though Big Data has been around for some time, one of the biggest challenges of working with it remains: assembling data and preparing them for analysis. Different systems store data in different formats, even within the same company. Assembling, standardizing, and cleaning data of irregularities—all without removing the information that makes them valuable—remain a central challenge.
Currently, Hadoop, an open source software framework derived from Google’s MapReduce and Google File System papers, is being used by several technology vendors to do just that. Hadoop maps tasks across a cluster of machines, splitting them into smaller subtasks, before reducing the results into one master calculation. It’s really an old grid-computing technique given new life in the age of cloud computing. Many of the challenges of yesterday remain today, and technology is just now catching up with the demands of Big Data analytics. However, Big Data remains a moving target.
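The map/split/reduce flow described above can be illustrated with a minimal, single-process sketch in plain Python (this is not Hadoop itself, and the input data are invented for illustration): the map step breaks each input record into smaller per-word subtasks, and the reduce step folds those partial results into one master calculation.

```python
from collections import defaultdict

def map_phase(records):
    """Map step: split each record into (key, value) pairs --
    the smaller subtasks Hadoop would spread across a cluster."""
    for record in records:
        for word in record.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Reduce step: fold the partial results into one master tally."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Toy "cluster input": in Hadoop these records would live in HDFS blocks.
logs = ["big data needs big tools", "data tools evolve"]
counts = reduce_phase(map_phase(logs))
# counts["big"] is 2; counts["evolve"] is 1
```

In real Hadoop the map and reduce phases run on separate machines with a shuffle-and-sort step between them; the grid-computing idea, however, is exactly this split-then-combine pattern.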
As the future brings more challenges, it will also deliver more solutions; Big Data has a bright future, with tomorrow delivering technologies that make the data easier to leverage. For example, Hadoop is converging with other technology advances such as high-speed data analysis, made possible by parallel computing, in-memory processing, and lower-cost flash memory in the form of solid-state drives.
The prospect of being able to process troves of data very quickly, in-memory, without time-consuming forays to retrieve information stored on disk drives, will be a major enabler, and this will allow companies to assemble, sort, and analyze data much more rapidly. For example, T-Mobile is using SAP’s HANA to mine data on its 30 million U.S. customers from stores, text messages, and call centers to tailor personalized deals.
What used to take T-Mobile a week to accomplish can now be done in three hours with the SAP system. Organizations that can utilize this capability to make faster and more informed business decisions will have a distinct advantage over competitors. In a short period of time, Hadoop has transitioned from relative obscurity as a consumer Internet project into the mainstream consciousness of enterprise IT.
Hadoop is designed to handle mountains of unstructured data. However, as it exists, the open source code is a long way from meeting enterprise requirements for security, management, and efficiency without some serious customization. Enterprise-scale Hadoop deployments require costly IT specialists who are capable of guiding a lot of somewhat disjointed processes. That currently limits adoption to organizations with substantial IT budgets.
As tomorrow delivers refined platforms, Hadoop and its derivatives will start to fit into the enterprise as a complement to existing data analytics and data warehousing tools, available from established business process vendors, such as Oracle, HP, and SAP. The key will be to make Hadoop much more accessible to enterprises of all sizes, which can be accomplished by creating high availability platforms that take much of the complexity out of assembling and preparing huge amounts of data for analysis.
Aggregating multiple steps into a streamlined automated process with significantly enhanced security will prove to be the catalyst that drives Big Data from today to tomorrow. Add those enhancements to new technologies, such as appliances, and the momentum should continue to pick up, thanks to easy management through user-friendly GUIs.
The true value of Big Data lies in the useful information that can be derived from it. The future of Big Data is therefore to do for data and analytics what Moore’s Law has done for computing hardware: exponentially increase the speed and value of business intelligence. Whether the need is to link geography and retail availability, use patient data to forecast public health trends, or analyze global climate trends, we live in a world full of data. Effectively harnessing Big Data will give businesses a whole new lens through which to see it.
However, the advance of Big Data technology doesn’t stop with tomorrow. Beyond tomorrow probably holds surprises that no one has even imagined yet. As technology marches ahead, so will the usefulness of Big Data. A case in point is IBM’s Watson, an artificial intelligence computer system capable of answering questions posed in natural language. In 2011, as a test of its abilities, Watson competed on the quiz show Jeopardy!, in the show’s only human-versus-machine match to date. In a two-game, combined-point match, broadcast in three episodes aired February 14–16, Watson beat Brad Rutter, the biggest all-time money winner on Jeopardy!, and Ken Jennings, the record holder for the longest championship streak (74 wins).
Watson had access to 200 million pages of structured and unstructured content consuming four terabytes of disk storage, including the full text of Wikipedia, but was not connected to the Internet during the game. Watson demonstrated that there are new ways to deal with Big Data and new ways to measure results, perhaps exemplifying where Big Data may be headed.
So what’s next for Watson? IBM has stated publicly that Watson was a client-driven initiative, and the company intends to push Watson in directions that best serve customer needs. IBM is now working with financial giant Citi to explore how the Watson technology could improve and simplify the banking experience. Watson’s applicability doesn’t end with banking, however; IBM has also teamed up with health insurer WellPoint to turn Watson into a machine that can support the doctors of the world.
According to IBM, Watson is best suited for use cases involving critical decision making based on large volumes of unstructured data. To drive the Big Data–crunching message home, IBM has stated that 90 percent of the world’s data was created in the last two years, and 80 percent of that data is unstructured. Furthering the value proposition of Watson and Big Data, IBM has also stated that five new research documents come out of Wall Street every minute, and medical information is doubling every five years.
IBM views the future of Big Data a little differently than other vendors do, most likely based on its Watson research. In IBM’s future, Watson becomes a service—as IBM calls it, Watson-as-a-Service—which will be delivered as a private or hybrid cloud service.
Watson aside, the health care industry seems a ripe source of predictions for how Big Data will evolve. Examples abound of the benefits of Big Data in the medical field; however, getting there is another story altogether. Health care (or in this context, “Big Medicine”) has some specific challenges to overcome and some specific goals to achieve to realize the potential of Big Data:
- Big Medicine is drowning in information while also dying of thirst. For those in the medical profession, that paradox can be summed up with a situation that most medical personnel face: When you’re in the institution and you’re trying to figure out what’s going on and how to report on something, you’re dying of thirst in a sea of information. There is a tremendous amount of information, so much so that it becomes a Big Data problem. How does one tap into that information and make sense of it? The answer has implications not only for patients but also for the service providers, ranging from nurses, physicians, and hospital administrators, even to government and insurance agencies. The big issue is that the data are not organized; they are a mixture of structured and unstructured data. How the data will ultimately be handled over the next few years will be driven by the government, which will require a tremendous amount of information to be recorded for reporting purposes.
- Technologies that tap into Big Data need to become more prevalent and even ubiquitous. From the patient’s perspective, analytics and Big Data will aid in determining which hospital in a patient’s immediate area is the best for treating his or her condition. Today there are a huge number of choices available, and most people choose by word of mouth, insurance requirements, doctor recommendations, and many other factors. Wouldn’t it make more sense to pick a facility based on report cards derived by analytics? That is the goal of the government, which wants patients to be able to look at a report card for various institutions. However, the only way to create that report card is to unlock all of the information and impose regulations and reporting. That will require various types of IT to tap into unstructured information, like dashboard technologies and analytics, business intelligence technologies, clinical intelligence technologies, and revenue cycle management intelligence for institutions.
- Decision support needs to be easier to access. Currently in medical institutions, evidence-based medicine and decision support is not as easy to access as it should be. Utilizing Big Data analytics will make the decision process easier and will provide the hard evidence to validate a particular decision path. For example, when a patient is suffering from a particular condition, there’s a high potential that something is going to happen to that patient because of his or her history. The likely outcomes or progressions can be brought up at the beginning of the care cycle, and the treating physician can be informed immediately. Information like that and much more will come from the Big Data analytics process.
- Information needs to flow more easily. From a patient’s perspective, health care today limits information. Patients often have little perspective on what exactly is happening, at least until a physician comes in. However, the majority of patients are apprehensive about talking to the physician. That becomes an informational blockade for both the physician and the patient and creates a situation in which it becomes more difficult for both physicians and patients to make choices. Big Data has the potential to solve that problem as well; the flow of information will be easier not only for physicians to manage but also for patients to access. For example, physicians will be able to look on their tablets or smartphones and see there is a 15-minute emergency-room wait over here and a 5-minute wait over there. Scheduling, diagnostic support, and evidence-based medicine support in the work flow will improve.
- Quality of care needs to be increased while driving costs down. From a cost perspective and a quality-of-care point of view, there are a number of different areas that can be improved by Big Data. For example, if a patient experiences an injury while staying in a hospital, the hospital will not be reimbursed for his or her care. The system can see that this has the potential to happen and can alert everyone. Big Data can enable a proactive approach for care that reduces accidents or other problems that affect the quality of care. By preventing problems and accidents, Big Data can yield significant savings.
- The physician–patient relationship needs to improve. Thanks to social media and mobile applications, which are benefiting from Big Data techniques, it is becoming easier to research health issues and allow patients and physicians to communicate more frequently. Stored data and unstructured data can be analyzed against social data to identify health trends. That information can then be used by hospitals to keep patients healthier and out of the facility. In the past, hospitals made more money the sicker a patient was and the longer they kept him or her there. However, with health care reform, hospitals are going to start being compensated for keeping patients healthier. Because of that there will be an explosion of mobile applications and even social media, allowing patients to have easier access to nurses and physicians. Health care is undergoing a transformation in which the focus is more on keeping patients healthy and driving down costs. These two major areas are going to drive a great deal of change, and a lot of evolution will take place from a health information technology point of view, all underpinned by the availability of data.
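As one small illustration of the institutional “report card” analytics envisioned in the list above, here is a toy sketch in Python. The hospitals, record fields, and metrics are entirely hypothetical and not part of any real reporting standard; a real report card would draw on regulated, standardized clinical data.

```python
# Hypothetical patient-visit records; in practice these would be
# aggregated from structured and unstructured clinical sources.
records = [
    {"hospital": "General", "readmitted": False, "wait_minutes": 15},
    {"hospital": "General", "readmitted": True,  "wait_minutes": 45},
    {"hospital": "Mercy",   "readmitted": False, "wait_minutes": 5},
    {"hospital": "Mercy",   "readmitted": False, "wait_minutes": 10},
]

def report_card(records):
    """Aggregate a per-hospital readmission rate and average wait time."""
    by_hospital = {}
    for r in records:
        h = by_hospital.setdefault(
            r["hospital"], {"n": 0, "readmits": 0, "wait": 0})
        h["n"] += 1
        h["readmits"] += r["readmitted"]  # bool counts as 0 or 1
        h["wait"] += r["wait_minutes"]
    return {
        name: {"readmission_rate": h["readmits"] / h["n"],
               "avg_wait": h["wait"] / h["n"]}
        for name, h in by_hospital.items()
    }

cards = report_card(records)
# cards["Mercy"]["avg_wait"] is 7.5; cards["General"]["readmission_rate"] is 0.5
```

The point is not the arithmetic, which is trivial, but that once the underlying data are unlocked and standardized, comparative scores like these fall out of simple aggregation.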
Health care proves that Big Data has definite value and will arguably be the leader in Big Data developments. However, the lessons learned by the health care industry can readily be applied to other business models, because Big Data is all about knowing how to utilize and analyze data to fit specific needs.
Taken from: Big Data Analytics: Turning Big Data into Big Money