{"id":653875,"date":"2013-04-23T08:00:32","date_gmt":"2013-04-23T12:00:32","guid":{"rendered":"http:\/\/gigaom.com\/?p=633392"},"modified":"2013-04-23T08:00:32","modified_gmt":"2013-04-23T12:00:32","slug":"hadoop-startup-qubole-raises-7m-for-hive-as-a-service","status":"publish","type":"post","link":"https:\/\/mereja.media\/index\/653875","title":{"rendered":"Hadoop startup Qubole raises $7M for Hive as a Service"},"content":{"rendered":"<p><a href=\"http:\/\/www.qubole.com\/\">Qubole<\/a>, the startup from former Facebook engineers Ashish Thusoo and Joydeep Sen Sarma, just closed a Series A investment round for its service, which lets users run <a href=\"http:\/\/feedproxy.google.com\/~r\/OmMalik\/~3\/fCl6LGjOjN0\/hive.apache.org\">Hive<\/a> jobs in the Amazon Web Services cloud. Hive is the data warehouse system and SQL-like language for Hadoop that Thusoo and Sen Sarma <a href=\"http:\/\/infolab.stanford.edu\/~ragho\/hive-icde2010.pdf\">helped create while at the social-networking company<\/a>. Charles River Ventures and Lightspeed Ventures led the round, which brings the company&#8217;s total venture capital investment to $7 million, including its seed round in late 2012.<\/p>\n<p>Qubole <a href=\"http:\/\/gigaom.com\/2012\/06\/06\/exclusive-the-brains-behind-hive-launch-on-demand-hadoop-service\/\">launched in June 2012<\/a> and opened its platform for public consumption in December, Thusoo told me, and has processed about half a petabyte of customer data since then. Thus far, the platform&#8217;s biggest users have been in the advertising technology, e-commerce and application-development spaces. 
A common use case (and one <a href=\"http:\/\/www.qubole.com\/blog\/mediamath-qubole-customer-use-case-study-marketing\">detailed in a blog post by Qubole customer MediaMath<\/a>) is to create pipelines that use Hive to process unstructured data before pushing it into relational databases such as MySQL, Vertica or Infobright for more-traditional business-intelligence applications.<\/p>\n<div id=\"attachment_626654\" class=\"wp-caption alignright\" style=\"width: 310px\"><a href=\"http:\/\/gigaom2.files.wordpress.com\/2013\/04\/gigaom_structure_data_2224.jpg\"><img loading=\"lazy\" decoding=\"async\" alt=\"Structure Data 2013 Ashish Thusoo Qubole\" src=\"http:\/\/gigaom2.files.wordpress.com\/2013\/04\/gigaom_structure_data_2224.jpg?w=300&#038;h=200\" width=\"300\" height=\"200\" class=\"size-medium wp-image-626654\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Ashish Thusoo at Structure: Data 2013, (c) Albert Chau, itsmebert.com<\/p>\n<\/div>\n<p>However, Thusoo added, Qubole also has connectors for getting data out of certain other data stores, such as MongoDB, and is working on letting customers import data via API from services such as Omniture and Google Analytics.<\/p>\n<p>Being in the cloud &#8212; especially Amazon&#8217;s cloud &#8212; could actually pay big dividends, too, and not just because it lets Qubole scale clusters automatically and lets users avoid the operational headaches of maintaining a Hadoop cluster. Companies are already using Amazon S3 to store a lot of data &#8212; <a href=\"http:\/\/gigaom.com\/2013\/04\/18\/amazon-s3-goes-exponential-now-stores-2-trillion-objects\/\">more than 2 trillion objects<\/a> at this point &#8212; and that&#8217;s Qubole&#8217;s choice for a storage system, as well. 
As companies move more of their big data workloads to the cloud, S3 serves as a cheap, easy and generic storage platform to which they can connect various services and applications.<\/p>\n<p>In January, for example, Netflix <a href=\"http:\/\/gigaom.com\/2013\/01\/10\/netflix-shows-off-its-hadoop-architecture\/\">detailed its cloud-based Hadoop platform<\/a> that consists of numerous services but relies on Amazon S3 as the source-of-truth data store.<\/p>\n<div id=\"attachment_601005\" class=\"wp-caption aligncenter\" style=\"width: 410px\"><a href=\"http:\/\/gigaom2.files.wordpress.com\/2013\/01\/hadoop-nflx.jpg\"><img decoding=\"async\" alt=\"Netflix's Hadoop architecture.\" src=\"http:\/\/gigaom2.files.wordpress.com\/2013\/01\/hadoop-nflx.jpg?w=708\" class=\"size-full wp-image-601005\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Netflix&#8217;s Hadoop architecture.<\/p>\n<\/div>\n<p>If there&#8217;s one big question about Qubole, though, it has to be <a href=\"http:\/\/gigaom.com\/2013\/02\/21\/sql-is-whats-next-for-hadoop-heres-whos-doing-it\/\">the emergence of a rather large SQL-on-Hadoop market<\/a> since the company launched. Although Hive has been an important part of the Hadoop stack over the past few years, its MapReduce foundation is beginning to show its age in terms of query speed, and the new breed of database startups pushing SQL analytics atop Hadoop <a href=\"http:\/\/drawntoscale.com\/is-there-a-database-in-big-data-heaven-understanding-the-world-of-sql-on-hadoop\/\">are quick to point this out<\/a>.<\/p>\n<p>Thusoo has certainly noticed this activity, but he still sees Qubole as being in a good position. 
For starters, he said, the company is looking at interactive analytics projects such as <a href=\"http:\/\/gigaom.com\/2012\/10\/24\/cloudera-makes-sql-a-first-class-citizen-in-hadoop\/\">Impala<\/a> and <a href=\"http:\/\/gigaom.com\/2013\/04\/17\/welcome-to-berkeley-where-hadoop-isnt-nearly-fast-enough\/\">Shark<\/a> to see how they might integrate with the Qubole platform, and Hadoop startup Hortonworks is <a href=\"http:\/\/hortonworks.com\/blog\/100x-faster-hive\/\">leading the Stinger project<\/a> to drastically boost the speed of Hive itself.<\/p>\n<p>Further, there&#8217;s the fact that Qubole itself has already <a href=\"http:\/\/www.qubole.com\/blog\/index.php\/optimizing-hadoop-for-s3-part-1\">optimized its platform<\/a> to run, on average, about five times faster than Hive would normally run on Amazon Elastic MapReduce alone.<\/p>\n<p>&#8220;We&#8217;re also keeping a close tab on other projects in our space,&#8221; Thusoo said. &#8220;We have a lot of options &#8230; to play with.&#8221;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Qubole, the startup from former Facebook engineers Ashish Thusoo and Joydeep Sen Sarma, \u00a0just closed a Series A investment round for its service, which lets users run Hive jobs in the Amazon Web Services cloud. Hive is the data warehouse system and SQL-like language for Hadoop that Thusoo and Sen Sarma helped create while at [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-653875","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/653875","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/comments?post=653875"}],"version-history":[{"count":0,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/653875\/revisions"}],"wp:attachment":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/media?parent=653875"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/categories?post=653875"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/tags?post=653875"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}