{"id":643823,"date":"2013-02-25T13:00:02","date_gmt":"2013-02-25T18:00:02","guid":{"rendered":"http:\/\/gigaom.com\/?p=613686"},"modified":"2013-02-25T13:00:02","modified_gmt":"2013-02-25T18:00:02","slug":"emc-to-hadoop-competition-see-ya-wouldnt-wanna-be-ya","status":"publish","type":"post","link":"https:\/\/mereja.media\/index\/643823","title":{"rendered":"EMC to Hadoop competition: \u201cSee ya, wouldn\u2019t wanna be ya.\u201d"},"content":{"rendered":"<p>If, like many industry watchers, you\u2019ve been confused about EMC Greenplum\u2019s Hadoop strategy over the past couple years, Scott Yara has a message for you: \u201cWe\u2019re all in on Hadoop, period.\u201d<\/p>\n<p>Yara, Greenplum\u2019s co-founder and senior vice president of products, has a not-so-coded message for his big data market competitors, too. Put simply, he doesn\u2019t think they stand a chance against his company, and he served notice on Monday morning with the unveiling of the company\u2019s new Pivotal HD Hadoop distribution and Project Hawq in a staged event at San Francisco\u2019s Dogpatch Studios.<\/p>\n<p>Pivotal HD is a completely re-architected Hadoop distribution that has been natively fused with Greenplum\u2019s analytic database (that\u2019s the Project Hawq part), but Yara thinks it\u2019s a bigger deal than <a href=\"http:\/\/gigaom.com\/2013\/02\/21\/sql-is-whats-next-for-hadoop-heres-whos-doing-it\/\">just another SQL-on-Hadoop play<\/a>. In an interview last week, Yara told me that Project Hawq is the manifestation of Greenplum\u2019s <a href=\"http:\/\/gigaom.com\/2010\/07\/06\/emc-buys-greenplum\/\">decision to sell itself to EMC in 2010<\/a>, a move he thought would kickstart his company\u2019s founding vision of becoming the leading big data platform.<\/p>\n<h2 id=\"building-a-data-platform-costs\">Building a data platform costs money, and lots of it<\/h2>\n<p>But before the details, a little history. 
Greenplum\u2019s flagship product is an analytic database powered by a massively parallel processing (MPP) query engine. The company had raised nearly $100 million in venture capital around this technology since launching in 2003, but doing business in the enterprise software world is hard and expensive, and Greenplum needed more money.<\/p>\n<div id=\"attachment_502146\" class=\"wp-caption alignleft\" style=\"width: 310px\"><img loading=\"lazy\" decoding=\"async\" alt=\"Rob Mee of Pivotal Labs, Scott Yara of EMC, and Om Malik of GigaOM at Structure:Data 2012\" src=\"http:\/\/gigaom2.files.wordpress.com\/2012\/03\/1z5o1154.jpg?w=300&#038;h=200\" width=\"300\" height=\"200\" class=\"size-medium wp-image-502146\"><\/p>\n<p class=\"wp-caption-text\">Yara (left) with Pivotal Labs CEO Rob Mee and Om Malik at Structure: Data 2012 (c) 2012 Pinar Ozger. pinar@pinarozger.com<\/p>\n<\/div>\n<p>\u201cI thought it was going to take another couple hundred million dollars in investment for us to complete the technical vision we had and go to market,\u201d Yara explained. But finding that kind of money wasn\u2019t so easy in an investment environment where everyone was gaga over social apps like Facebook and Zynga. When EMC approached with a deal like the one it gave VMware in 2003 \u2014 essentially near complete independence bolstered by a huge R&#38;D and marketing budget \u2014 Greenplum couldn\u2019t refuse.<\/p>\n<p>Yara said Greenplum had known for a while that Hadoop was the key to any big data strategy going forward, but that it would take some time to build up its own technology. So, in 2011, it <a href=\"http:\/\/gigaom.com\/2011\/05\/09\/emc-hadoop\/\">entered into a reseller agreement with Hadoop startup MapR<\/a> to offer a premium product to appease enterprise customers while Greenplum\u2019s engineers got to work on what would become Pivotal HD. 
That deal with MapR is still in place, but it\u2019s no longer the focal point of Greenplum\u2019s Hadoop strategy.<\/p>\n<h2 id=\"big-investment-big-aspirations\">Big investment, big aspirations<\/h2>\n<p>The technology inside Pivotal HD is what companies should come to expect from a Hadoop distribution, Yara explained. It\u2019s essentially the Greenplum Database with its POSIX file system ripped out and replaced by the Hadoop Distributed File System. Whatever users can do on Greenplum\u2019s flagship database, they can do on Pivotal HD, only they can run Hadoop MapReduce jobs and house an HBase database, too.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" alt=\"hawq\" src=\"http:\/\/gigaom2.files.wordpress.com\/2013\/02\/hawq.jpg?w=708&#038;h=386\" width=\"708\" height=\"386\" class=\"aligncenter size-large wp-image-613705\"><\/p>\n<p>And when SQL-like features become an important part of Hadoop because it\u2019s so broadly installed that users are now seeking out broader utility, \u201cthat\u2019s when the bar gets raised in terms of the amount of capability that\u2019s required,\u201d Yara said. He said Pivotal HD includes years\u2019 worth of investment in Hadoop cluster-management technology and professional support, too, and that they will cost half as much as what Cloudera and Hortonworks charge. It\u2019s designed to run smoothly wherever customers want it to \u2014 physical servers, virtual servers or even cloud servers.<\/p>\n<p>Because they\u2019re so new, he said, competitive SQL-on-Hadoop offerings such as <a href=\"http:\/\/gigaom.com\/2012\/10\/24\/cloudera-makes-sql-a-first-class-citizen-in-hadoop\/\">Cloudera\u2019s Impala<\/a> can only handle about 20 percent of real-world workloads. 
Looking back at the capital investment in analytics and big data technologies past, things like Netezza, Teradata and Aster Data, Yara proffered, \u201cI don\u2019t think you could build [a full SQL-on-Hadoop] system for less than $25 to $50 million over five years.\u201d (Some of those new technologies, by the way, will have a chance to state their cases during a <a href=\"http:\/\/event.gigaom.com\/structuredata\/?utm_source=data&#38;utm_medium=editorial&#038;%2338;utm_campaign=intext&#038;%2338;utm_term=613686+emc-to-hadoop-competition-see-ya-wouldnt-wanna-be-ya&#038;%2338;utm_content=dharrisstructure\">Structure: Data<\/a> panel on March 21 that\u2019s all about Hadoop as the next-generation business intelligence platform.)<\/p>\n<p>Greenplum, by contrast, rebuilt its entire R&#38;D team to focus on bringing 10 years of database technology to Hadoop. \u201cWe literally have over 300 engineers working on our Hadoop platform,\u201d Yara said. \u201c\u2026 We\u2019re bringing all the power of EMC and VMware behind it.\u201d<\/p>\n<h2 id=\"the-data-warehouse-is-the-new-\">The data warehouse is the new mainframe<\/h2>\n<p>Looking past his competitive boasting, though, it\u2019s easy to see Yara\u2019s greater point when you ask him what all this Hadoop talk means for the data warehouse business on which Greenplum was built. He points to the mainframe business that fell from its high perch decades ago but still drives billions a year in revenue. 
A single MPP database system is still faster on certain workloads than SQL on Hadoop, but that gap will close over time and \u201cI do think the center of gravity will move toward HDFS,\u201d he said.<\/p>\n<p>Josh Klahr, a Pivotal HD product manager, noted the importance of being able to process all of a company\u2019s data right in a single scalable data store rather than operating numerous systems. He pointed to one customer that\u2019s storing a petabyte of data in Greenplum Database but wants to grow its data volume to 20 petabytes over the next few years and needs something like Hadoop to do that both financially and technically. He said Netflix\u2019s <a href=\"http:\/\/gigaom.com\/2013\/01\/10\/netflix-shows-off-its-hadoop-architecture\/\">decision to store all its data in Amazon S3<\/a> and bring analytic services to it is a good indicator of where the market is headed.<\/p>\n<p>A few years ago, Yara acknowledged, embracing Hadoop as the future might have been a scary proposition. However, he said, \u201cNow, if you don\u2019t embrace Hadoop as the new database platform, if you\u2019re a database vendor, that\u2019s a grave mistake.\u201d<\/p>\n<p><strong>Related research and analysis from GigaOM Pro:<\/strong><br \/>Subscriber content. 
<a href=\"http:\/\/pro.gigaom.com\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=613686+emc-to-hadoop-competition-see-ya-wouldnt-wanna-be-ya&#038;utm_content=dharrisstructure\">Sign up for a free trial<\/a>.<\/p>\n<ul>\n<li><a href=\"http:\/\/pro.gigaom.com\/2012\/04\/infrastructure-q1-cloud-and-big-data-woo-the-enterprise\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=613686+emc-to-hadoop-competition-see-ya-wouldnt-wanna-be-ya&#038;utm_content=dharrisstructure\">Infrastructure Q1: Cloud and big data woo enterprises<\/a><\/li>\n<li><a href=\"http:\/\/pro.gigaom.com\/2012\/03\/a-near-term-outlook-for-big-data\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=613686+emc-to-hadoop-competition-see-ya-wouldnt-wanna-be-ya&#038;utm_content=dharrisstructure\">A near-term outlook for big data<\/a><\/li>\n<li><a href=\"http:\/\/pro.gigaom.com\/2010\/09\/the-red-hot-data-warehouse-market-whos-buying-next\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=613686+emc-to-hadoop-competition-see-ya-wouldnt-wanna-be-ya&#038;utm_content=dharrisstructure\">The Red-Hot Data Warehouse Market: Who&#8217;s Buying Next?<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>If, like many industry watchers, you\u2019ve been confused about EMC Greenplum\u2019s Hadoop strategy over the past couple years, Scott Yara has a message for you: \u201cWe\u2019re all in on Hadoop, period.\u201d Yara, Greenplum\u2019s co-founder and senior vice president of products, has a not-so-coded message for his big data market competitors, too. 
Put simply, he doesn\u2019t [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-643823","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/643823","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/comments?post=643823"}],"version-history":[{"count":0,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/643823\/revisions"}],"wp:attachment":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/media?parent=643823"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/categories?post=643823"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/tags?post=643823"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}