Big Data are not just the domain of big enterprises; they are the creation of individual users interconnecting their lives and functional roles. Consumers are creating and manipulating more data through their interconnected devices. The Big Data "volume, variety, and velocity" paradigm drives demand not only for more data storage, but also for more ways to collect, manipulate, and act on data, which in turn drives steady demand across the industry.
Increasingly, the term "Big Data" is creeping into mainstream discussions outside of the IT sector. What does this term mean and why are data being called big now? Large government agencies and commercial enterprises have been managing and employing large data sets for many years. The difference now is that Big Data is increasingly part of everyday life – a tool and a challenge for individuals, households, educators, health care providers, small businesses, and many others who are creating and using data. This expanding universe of data generators and consumers is having an explosive impact on the hardware side of the industry. So, while Big Data may sound like an IT challenge, it's also a major growth driver in terms of a new array of data storage hardware solutions spanning enterprise to personal devices.
The era of "Big Data"
To start, what does it mean to call something "Big Data"? A recent IBM book, Understanding Big Data, offers this definition: "the term Big Data applies to information that can't be processed or analyzed using traditional processes or tools" (p. 3). Wikipedia's definition adds detail: Big Data refers to data sets characterized by both their sheer volume and their complexity, which can require hundreds of servers working in parallel when relational queries and business analytics programs are run against them.
Data points are collected at countless moments in our daily lives: tollway devices, keycard swipes, bank and credit cards, restaurant waitlist buzzers, cash registers, surveillance systems (municipal, private, business, etc.), smartphones, digital cameras, MP3 players, voice-memo recorders, intranet and internet traffic, and more. It is little wonder, then, that according to IBM's research, "[e]very day, we create 2.5 quintillion bytes of data – so much that 90% of the data in the world today has been created in the last two years alone."
Enterprise data is growing at similarly astounding rates, and the largest data collectors among us, at commercial, research, government, and academic institutions, are seeing Big Data growth that pushes the limits of most hardware and IT systems. Consumers make up a significant share of today's data generators. As consumers demand more ways to collect, share, and store aspects of their personal lives, and as they take more of their business lives mobile, data generation will only continue to grow, creating challenges for service providers that rely on solutions built on new hardware configurations.
For example, the online photo site Shutterfly holds an image archive in excess of 30 petabytes of data (one petabyte equals one thousand terabytes or one million gigabytes), according to Neil Day, Shutterfly’s senior vice president and chief technology officer, as cited in a recent interview with CIO. Cloud service providers like Shutterfly are growing as consumers create more digital data, but enterprises are also creating their own Big Data sets. Faced with concerns about intellectual property and security, the conundrum for many in enterprise IT is how to store, back up, and then access such data.
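The unit arithmetic in the parenthetical above can be checked with a few lines of Python. This is only an illustrative sketch: it assumes decimal (SI) units, as the article's own definition does, and the `convert` helper and `BYTES_PER_UNIT` table are hypothetical names introduced here, not part of any cited source.

```python
# Decimal (SI) storage units, matching the article's statement that one
# petabyte equals one thousand terabytes or one million gigabytes.
# (Assumption: binary units such as PiB/TiB are not intended.)
BYTES_PER_UNIT = {
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
}


def convert(value, from_unit, to_unit):
    """Convert a storage size between the decimal units listed above."""
    return value * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]


archive_pb = 30  # Shutterfly archive size cited in the text, in petabytes
archive_tb = convert(archive_pb, "PB", "TB")
archive_gb = convert(archive_pb, "PB", "GB")
print(f"{archive_pb} PB = {archive_tb:,.0f} TB = {archive_gb:,.0f} GB")
```

Under these assumptions, the 30-petabyte archive works out to 30,000 terabytes, or 30 million gigabytes.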
Big Data is not just about size; it is about "the three V's," as coined by IBM: "[t]hree characteristics define Big Data: volume, variety, and velocity" (Understanding Big Data, p. 5).
- Volume is just that: the sheer size of the growing tera- and petabyte data sets, regardless of the type of data being generated.
- Variety refers to the numerous different formats of both relational and non-relational data that are generated and stored.
- Velocity refers to the need for speed when accessing and manipulating these data, particularly when running business analytics on the data to gain timely business insight and intelligence.
Importantly, the true challenge lies not just in amassing the hardware to store all of the data, but in having the right storage configuration, architecture, and programs to deliver the reliability, availability, and access speeds that the corporation and its customers demand, as further discussed in this CIO article.
Big Data drives semi growth
The semiconductor market benefits from increased demand for chips all along the big data stream. The "volume, variety, and velocity" paradigm drives a demand not only for more data storage, but also for more ways to collect, manipulate, and execute on the data. In other words, the chips that go into consumer devices are seeing the same increase in demand as the chips that go into large-scale storage appliances.
Consumers' role in Big Data
On the consumer electronics front, the drivers are, perhaps unsurprisingly, the hottest devices today: smart wireless devices (SWDs) consisting of smart- and feature phones, digital cameras (whether part of a handset or stand-alone), tablets and eReaders, ultrabooks, and other similar portable devices. These devices are continually adding features to meet consumer demands, which revolve around increased memory and connectivity. According to a recent iSuppli report:
New smartphones and tablets will act as key catalysts for continued healthy growth of the mobile memory semiconductor space […] with revenue growing […] 6 percent this year. […] An even bigger 9 percent increase is anticipated in 2013 for mobile memory as more smartphone and tablet products requiring higher memory densities come into the marketplace, with revenue climbing to [US] $16.2 billion.
These increases in the mobile memory space come from growth in NAND and mobile DRAM, which are growing 14% and 12%, respectively, from 2011 levels. The important link here runs from the rising demand for SWDs, through how they are used and what data they collect, to the Big Data growth that these data feed at the cloud service providers' end.
The rise in the interconnection of devices, "The Internet of Things," is perhaps the central driving force behind Big Data growth. For devices to join the Internet of Things (IoT), they must be connectable, so there is an inherent growth driver both for the components that enable connectivity and for the data storage that connectivity requires. Simultaneously, there is a demand driver for the next-generation end-devices that leverage this feature set and capability (learn more about this exciting driver in the companion piece on the Internet of Things in this issue of MarketWatch Quarterly).
With the increase in SWDs, it is no surprise that mobile data traffic is increasing at astounding rates. EETimes cites Ericsson's prediction of "a five-fold growth in mobile data subscribers over the next five years [that] will drive a ten-fold increase in mobile traffic. What's more, the portion of that traffic generated in urban and metro areas will rise from about 25 percent to nearly 60 percent over that period […]." This situation is spurring rapid growth and innovation among cellular service providers and enterprise networks on the back-end, where there is a race to build cellular enterprise cloud systems, mobile backhaul networks, and related carrier space solutions.
These new solutions demand new hardware to handle Big Data transmitted in real time, data that challenge enterprises on all three hallmarks: volume, velocity, and variety. The hardware demands include next-generation hybrid array solutions for enterprise storage systems leveraging the "more sophisticated flash-based memory technology […]. Add-on SSDs [solid state drives] for arrays, PCIe-based Flash memory modules for servers, SSD-based cache devices sitting between servers and storage and even all-SSD or all-Flash arrays are now available to help customers increase the performance of their applications far beyond what they could get with spinning disk." (See CRN's lineup of nine such products.)
Of course, along with these increased demands for meeting the volume, variety, and velocity requirements of Big Data are increases in IT budgets. Small- and medium-sized businesses (with fewer than 1,000 employees) in particular are seeing 15% YoY increases in their IT budgets for the first half of 2012, according to Spiceworks' State of SMB IT survey, summarized in ComputerWorld. According to the survey report, "SMBs are spending more on technology across the board, from hardware and devices, to cloud services and virtualization. For example, 62% have deployed or plan to deploy tablets within the next six months."
Big Data and component challenges
The hard disk drive (HDD) segment faced significant challenges last year with the catastrophic flooding in Thailand, which wiped out significant amounts of inventory as well as production and assembly lines. As a result, average selling prices (ASPs) jumped by roughly 28% by 4Q11, and, while production has returned to normal, iSuppli concludes that, due to the limited supplier situation, an oligopoly remains and is able to control prices at these higher levels. The limited number of suppliers, together with rising demand for data storage and rising PC sales, will likely hold HDD prices above "pre-flood levels until 2014."
Meanwhile, HDDs will maintain their position in the market, but with dramatic increases in areal density. As this iSuppli report notes, the density increase for HDDs is significant and underscores the Big Data influence on hardware, which is driving a doubling of HDD densities by 2016:
Just five years ago, HDD storage capacity per platter was at a maximum of 180 gigabits per square inch. Platters crossed the terabyte (TB) level for the first time in 2007, with hard disk drives comprising two or more platters becoming more common as HDD storage capacities increased. Now with the 1 TB per-platter milestone already reached, 5-TB hard disk drives using five platters could be available on the market later this year.
CE demand for capturing more video and digital pictures is driving the significant growth forecast for storage capacity in CE devices. As this report from iSuppli forecasts, "from 2011 to 2016, the five-year compound annual growth rate (CAGR) for HDD areal densities will be equivalent to 19 percent. For this year, HDD areal densities are estimated to reach 780 Gb per square inch per platter, and then rise to 900 Gb per square inch next year." As the report goes on to note, digital video recording (DVR) demand is likely to propel growth in HDDs, which will need higher density to hold the higher volume of data being stored, especially high-definition (HD) digital video, whether downloaded or personally created.
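To see what a 19 percent CAGR implies over the forecast window, the compounding can be worked through directly. The sketch below is illustrative only: the function names are hypothetical, and the inputs are simply the figures quoted above (19 percent over five years, and 780 Gb and 900 Gb per square inch for consecutive years).

```python
def compound_growth(start, rate, years):
    """Value after compounding `start` at `rate` per year for `years` years."""
    return start * (1 + rate) ** years


def cagr(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


# Cumulative multiple implied by a 19% CAGR over the five years 2011 to 2016.
multiple = compound_growth(1.0, 0.19, 5)
print(f"Five-year density multiple at 19% CAGR: {multiple:.2f}x")

# Single-year growth implied by the quoted 780 -> 900 Gb per square inch step.
one_year = cagr(780, 900, 1)
print(f"Implied growth from 780 to 900 Gb/sq. in.: {one_year:.1%}")
```

Compounding 19 percent over five years gives a multiple of about 2.39, so the quoted CAGR is consistent with areal densities at least doubling by 2016. The single-year step from 780 to 900 Gb per square inch works out to roughly 15.4 percent, somewhat below the five-year average rate.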
SSD embedded storage
As the new lineup of ultrabooks comes to market, we find that the latest PC data storage designs are adopting embedded cache SSDs. Based on a recent iSuppli storage market report, and as reported by Solid State Technology:
Cache SSDs are the leading storage form factor in ultrabooks, growing to 23.9 million units shipped in 2012, a 2,660% increase over 2011. […]
Cache SSDs are a discrete, separate memory component alongside the device’s HDD, with both elements housed separately. A sample cache SSD configuration from Acer’s Aspire S3 ultrabook carried a 20GB SSD next to 320GB of hard disk space. Cache SSD shipments will jump to 67.7 million units next year, exceed 100 million in 2015, and hit 163 million by 2016.
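The shipment figures in the quoted report can be sanity-checked with simple percentage arithmetic. The sketch below is a rough illustration, not a figure from the report: the implied 2011 base is derived here from the quoted 2,660% increase, the helper names are hypothetical, and "next year" is assumed to mean the year after 2012.

```python
def implied_base(current, pct_increase):
    """Prior-period value implied by a percentage increase to `current`."""
    return current / (1 + pct_increase / 100)


def pct_change(old, new):
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100


units_2012 = 23.9  # million cache SSD units, as quoted for 2012
units_next = 67.7  # million units quoted for the following year (assumed 2013)

# A 2,660% increase means the 2012 figure is 27.6 times the 2011 figure.
implied_2011 = implied_base(units_2012, 2660)
growth_next = pct_change(units_2012, units_next)

print(f"Implied 2011 shipments: {implied_2011:.2f} million units")
print(f"Growth from 2012 to the following year: {growth_next:.0f}%")
```

On these assumptions, the 2011 base comes out at just under 0.9 million units, and the jump from 23.9 million to 67.7 million units is an increase of about 183 percent.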
Ongoing research and development in the SSD space underscores the sector's continued growth. One example is Toshiba's recent announcement of its latest SSD lineup, the THNSNF line, which is based on the 19-nm process (the smallest to date in the industry) and is touted as the densest, fastest, and thinnest, according to this review by ComputerWorld.
Size, weight, and density all matter when it comes to competing in the SWD space, but price matters as well. Demand for higher onboard storage density continues to rise, and this acts as a demand driver for both the HDD/SSD and memory markets within semi. Importantly, after generating and using data at this level, the user then moves the content to a longer-term, archival storage base, most commonly a cloud service today, which, in turn, contributes to the Big Data sets that service enterprises are wrangling. In this manner, the cycle of data generation, storage, use, and re-storage continues to grow rapidly.
Considering NAND growth, analysts at firms such as iSuppli anticipate that "the NAND flash memory market can expect healthy growth this year […] up 8 percent [from 2011 levels]. […] NAND revenue will climb continually during the next few years, hitting approximately [US] $30.9 billion by 2016 […]." The drivers are increases in NAND content for the hottest-selling CE devices: smartphones, tablets, and ultrabooks. According to DRAMeXchange (as reported here), because of present macroeconomic conditions, particularly the ongoing EU debt crisis coupled with the delayed release of Ivy Bridge, the NAND market will likely not see contract prices recover until 3Q12. It is likely that, with the added catalyst of the Windows 8 release, the new processors will help sales not only of ultrabooks, but also of tablet PCs.
On the DRAM side, while smartphones are the industry's greatest source of demand, and while the amount of DRAM in each of these end-devices is increasing, DRAM's share of the bill of materials (BOM) cost continues to decline, down 6.3% for 1Q12 on a year-over-year (YoY) basis, dropping to half of the total BOM cost from a year ago, according to this recent DRAM report by iSuppli. There are, however, hopes that, in the wake of the Elpida bankruptcy and Micron's acquisition of Elpida, there may be some abatement of DRAM price declines (consider this market review from Solid State Technology). Also supporting a positive DRAM outlook is the next generation of mobile DRAM, for which specifications were recently published. According to this ComputerWorld review, "The JESD209-3 low-power double data rate 3 (LPDDR-3) specification also increases the density of memory chips, [… and] is targeted at the latest generation of smartphones, tablets, ultra-thin notebooks and similar connected devices on the newest, high-speed 4G networks."
Big Data means a bigger semi future
The challenges posed by Big Data are only beginning for the enterprise market. We will continue to encounter both hardware and software demands as consumers call for more interconnected devices with feature-rich capabilities, thereby creating more data for which service providers can offer data handling solutions. The same demands will grow as businesses seek a competitive edge and increased market insight from business analytics run on more varied data sets, at quicker paces, and with more comprehensive security and backup capabilities. Importantly, as we have seen across every market sector we have mined for patterns, the semiconductor and electronics industries are seeing increased penetration rates, with heightened demand for solutions that leverage designs based on collaboration between hardware and software to maximize capabilities and functionality. Big Data is no different.
To address the challenges at many-terabyte to petabyte levels of data with the speed and dexterity of analytics expected by today's end-user, whether consumer or enterprise, dense storage infrastructure must not only be available to the hosts of cloud services, but must also fit the budget and personnel limits of small- to mid-sized businesses, as well as the consumer's home computer setup. What that means is increased semi penetration across verticals and markets, because Big Data are not just the domain of big enterprises; they are the creation of individual users interconnecting their lives and functional roles, and doing so through their increasingly interconnected devices (see the related article on "The Internet of Things" in this issue of MarketWatch Quarterly).