Multiple sources are responsible for the growth in data that is applicable to Big Data technology. Some of these represent entirely new data sources, while others reflect a change in the resolution of existing data. Much of that growth can be attributed to industry digitization of content.
With companies now creating digital representations of existing data and capturing everything that is new, data growth rates over the last few years have been nearly infinite in percentage terms, simply because most of the businesses involved started from zero.
Many industries fall under the umbrella of new data creation and digitization of existing data, and most are becoming appropriate sources of data for Big Data technology. Those industries include the following:
- Transportation, logistics, retail, utilities, and telecommunications. Sensor data are being generated at an accelerating rate from fleet GPS transceivers, RFID (radio-frequency identification) tag readers, smart meters, and cell phones (call data records); these data are used to optimize operations and drive operational BI to realize immediate business opportunities.
- Health care. The health care industry is quickly moving to electronic medical records and images, which it wants to use for short-term public health monitoring and long-term epidemiological research programs.
- Government. Many government agencies are digitizing public records, such as census information, energy usage, budgets, Freedom of Information Act documents, electoral data, and law enforcement reporting.
- Entertainment media. The entertainment industry has moved to digital recording, production, and delivery in the past five years and is now collecting large amounts of rich content and user viewing behaviors.
- Life sciences. Low-cost gene sequencing (less than $1,000) can generate tens of terabytes of information that must be analyzed to look for genetic variations and potential treatment effectiveness.
- Video surveillance. Video surveillance is still transitioning from closed-circuit television to Internet protocol television cameras and recording systems that organizations want to analyze for behavioral patterns (security and service enhancement).
For many businesses, the additional data can come from self-service marketplaces, which record the use of affinity cards and track the sites visited, and can be combined with social networks and location-based metadata. This creates a goldmine of actionable consumer data for retailers, distributors, and manufacturers of consumer packaged goods.
The legal profession is adding to the multitude of data sources, thanks to the discovery process, which is dealing more frequently with electronic records and requiring the digitization of paper documents for faster indexing and improved access. Today, leading e-discovery companies are handling terabytes or even petabytes of information that need to be retained and reanalyzed for the full course of a legal proceeding.
Additional information and large data sets can be found on social media sites such as Facebook, Foursquare, and Twitter. A number of new businesses are now building Big Data environments on scale-out clusters of power-efficient multicore processors to leverage the nearly continuous streams of data that consumers, consciously or unconsciously, generate about themselves (e.g., likes, locations, and opinions).
Thanks to the network effect of successful sites, the total data generated can expand at an exponential rate. Some companies collected and analyzed over 4 billion data points (e.g., web site cut-and-paste operations) after information collection started, and within a year that total had expanded to 20 billion data points.
Taken from: Big Data Analytics: Turning Big Data into Big Money