Saturday, October 29, 2011

Big Data Starts with ABCs

If you haven't noticed, Big Data has created a lot of buzz lately. Much of the buzz comes from the sheer wow factor of just how big “big” is. With the number of smartphones nearing 6 billion, all creating content, Facebook generating over 30 billion pieces of content a month, and data expected to grow 40% year over year, it's easy to see that big really is BIG.

In fact the digital universe has recently broken the zettabyte barrier. A zettabyte is approximately a thousand exabytes, or a billion terabytes. How big is that? To give you a sense of scale, it would take everyone on the planet posting to Twitter 24/7 for 100 years to generate a zettabyte.
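
To sanity-check that claim, here's a quick back-of-envelope calculation in Python. The population, tweet size, and calendar figures are rounded assumptions, not exact numbers:

    # Rough check: could ~7 billion people tweeting non-stop for 100 years
    # produce a zettabyte? All figures below are rounded assumptions.
    ZETTABYTE = 10**21                    # bytes
    people = 7e9                          # approx. world population, 2011
    tweet_bytes = 140                     # text payload of one tweet
    seconds = 100 * 365.25 * 24 * 3600    # 100 years, in seconds

    bytes_per_person_per_sec = ZETTABYTE / (people * seconds)
    tweets_per_sec = bytes_per_person_per_sec / tweet_bytes
    print(f"~{bytes_per_person_per_sec:.0f} bytes/person/sec, "
          f"about one tweet every {1 / tweets_per_sec:.0f} seconds")

It works out to roughly one tweet every three seconds from every person on Earth, around the clock, for a century.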

So you get the idea - it’s really big. 

As an IT organization you may be thinking that your own data growth will soon be stretching the limits of your infrastructure. One way to define big data is to look at your existing infrastructure, the amount of data you have now, and the rate at which it is growing. Is that growth starting to break your existing processes? If so, where?

“Big” refers to a size that's beyond the ability of your current tools to affordably capture, store, manage, and analyze your data. This is a practical definition, since “big” may be a different number for each organization that is trying, but unable, to extract business advantage from its data.


When we talk to our customers, we find that their existing infrastructure is breaking on three major axes:

  1. Complexity.  Data is no longer just text and numbers; it includes real-time events and spans shared infrastructure. Data is now linked at high fidelity and comes in many types. The sheer complexity of data is skyrocketing, and applying the usual algorithms for search, storage, and categorization becomes far harder.
  2. Speed.  How fast is the data coming at you? High definition video streaming over the Internet to storage and player devices, and full-motion video for surveillance, all have very high ingestion rates. You have to be able to keep up with the data flow. You need the compute, network, and storage to deliver high definition to thousands of people at once, with good viewing quality (see the sizing sketch after this list). For high performance computing you need systems that can perform trillions of operations per second and store petabytes of data.
  3. Volume.  All of the data you are collecting and generating has to be stored securely and kept available, sometimes forever. IT teams today are having to make decisions about what is “too much data”. Some flush their data each week and start again, but in certain applications, like healthcare, you can never delete the data. It has to live forever.
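
To put some rough numbers on the speed axis, here is a simple sizing sketch in Python. The bitrates, viewer counts, and camera counts are illustrative assumptions, not measurements:

    # Illustrative sizing for the "Speed" axis; all figures are assumptions.
    # Egress: HD streams at ~8 Mbit/s to 10,000 concurrent viewers.
    hd_stream_mbps = 8
    viewers = 10_000
    print(f"Aggregate egress: {hd_stream_mbps * viewers / 1000:.0f} Gbit/s")

    # Ingest: 1,000 surveillance cameras recording at ~4 Mbit/s each.
    cameras = 1_000
    camera_mbps = 4
    mb_per_sec = cameras * camera_mbps / 8       # megabytes per second
    print(f"Surveillance ingest: {mb_per_sec * 86_400 / 1e6:.1f} TB/day")

Even at these modest assumed rates you are looking at tens of gigabits of sustained throughput and tens of terabytes of new data arriving every day.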

These trends in data growth are something we at NetApp have been following for quite a while. We’ve been enhancing ONTAP to deal with the scale needed to handle large repositories of data, and we have made strategic acquisitions anticipating the need for high-density, high-performance storage (Engenio) and infinite content repositories (Bycast).

In conversations with our customers dealing with the onslaught of data, we have noticed three important use cases that are stretching the limits of their existing infrastructure.

We’ve named these axes the ABCs of Big Data.

  • Analytics - Analytics for extremely large data sets, so you can take advantage of that digital universe by turning data into information and gaining insight about your business to make better decisions.
  • Bandwidth - Performance for data-intensive workloads at really high speeds.
  • Content - Boundless, secure, scalable data storage that allows you to keep it forever.
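
To make the Analytics item a bit more concrete, here is a minimal, self-contained sketch of the pattern behind much of today's large-scale analytics: the classic MapReduce-style word count. On a real cluster the map and reduce phases would be distributed across many nodes (for example with Hadoop); this in-process version just shows the shape of the computation:

    from collections import Counter

    def map_phase(lines):
        # Emit (word, 1) pairs, one pair per word.
        for line in lines:
            for word in line.lower().split():
                yield word, 1

    def reduce_phase(pairs):
        # Sum the counts for each word.
        counts = Counter()
        for word, n in pairs:
            counts[word] += n
        return counts

    sample = ["big data starts with abcs", "big data is big"]
    print(reduce_phase(map_phase(sample)).most_common(3))

The appeal of this split is that the map phase can run independently on each chunk of a dataset far too large for any single machine to hold.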




Thursday, June 23, 2011

Moving from Clouds to Big Data



As many of you know I joined NetApp about a year ago. I've spent most of my time here developing the go-to-market strategy for NetApp's OnCommand Management Software. Last week the new portfolio was announced as part of our Cloud launch, and I posted about how OnCommand supports the Cloud with four fundamental elements in my NetApp blog post called Clouds OnCommand.

It was a lot of fun, and I learned a lot about the IT management market segment and how virtualization is pushing us all to look for new, creative, and innovative ways to manage the ensuing complexity. Check out the OnCommand story on YouTube.

It's been a refreshing change working at NetApp, a really great company with innovative products that is growing and doing well. I am thoroughly enjoying it.

Now I have taken on a new assignment to define our Big Data marketing strategy. Big Data is getting a lot of visibility as the proliferation of data dominates the landscape. The best definition of Big Data I've heard is that “Big Data” refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.

I'm looking forward to digging into this space and understanding how I can help NetApp define a winning strategy.  More to come soon.