{"id":646877,"date":"2013-03-14T18:31:12","date_gmt":"2013-03-14T22:31:12","guid":{"rendered":"http:\/\/gigaom.com\/?p=620545"},"modified":"2013-03-14T18:31:12","modified_gmt":"2013-03-14T22:31:12","slug":"google-bigquery-is-now-even-bigger","status":"publish","type":"post","link":"https:\/\/mereja.media\/index\/646877","title":{"rendered":"Google BigQuery is now even bigger"},"content":{"rendered":"<p>Google <a href=\"\">might be upsetting a lot of people<\/a> with some of its recent \u201cspring cleaning,\u201d but its latest batch of updates to <a href=\"https:\/\/cloud.google.com\/products\/big-query\">BigQuery<\/a> should make data analysts happy, at least.<\/p>\n<p>With the latest updates \u2014 <a href=\"http:\/\/googleenterprise.blogspot.com\/2013\/03\/bringing-simplicity-to-large-data.html\">announced in a blog post<\/a> by BigQuery Product Manager Ju-kay Kwek on Thursday \u2014 users can now join large tables, import and query timestamped data, and aggregate large collections of distinct values. It\u2019s hardly the equivalent of Google launching Compute Engine last summer, but as (arguably) the inspiration for the <a href=\"http:\/\/gigaom.com\/2013\/02\/21\/sql-is-whats-next-for-hadoop-heres-whos-doing-it\/\">SQL-on-Hadoop trend that\u2019s sweeping the big data world<\/a> right now, every improvement to BigQuery is notable.<\/p>\n<p>BigQuery is a cloud service that lets users analyze terabyte-sized data sets using SQL-like queries. It\u2019s based on <a href=\"http:\/\/research.google.com\/pubs\/pub36632.html\">Google\u2019s Dremel querying system<\/a>, which can analyze data where it\u2019s located (i.e., in the Google File System or BigTable) and which Google uses internally to analyze a variety of different data sets. 
Google <a href=\"http:\/\/gigaom.com\/2012\/03\/21\/google-structure-data-2012\/\">claims queries in BigQuery run at interactive speeds<\/a>, which is something that MapReduce \u2014 the previous-generation tool for dealing with such large data sets \u2014 simply couldn\u2019t handle within a reasonable time frame or level of complexity. Of course, if you want to schedule batch jobs, <a href=\"http:\/\/gigaom.com\/2012\/08\/29\/google-brings-bigquery-down-to-earth-with-excel-connector\/\">BigQuery lets you do that, too, for a lower price<\/a>.<\/p>\n<p>This constraint \u2014 and therefore the potential benefits of something like Dremel and <a href=\"http:\/\/gigaom.com\/2012\/05\/01\/google-opens-up-its-biq-query-data-analytics-service-to-all\/\">its commercial incarnation, BigQuery<\/a> \u2014 wasn\u2019t lost on the Hadoop community, which itself had been largely reliant on MapReduce processing for years. In the past year, we\u2019ve seen numerous startups <a href=\"http:\/\/gigaom.com\/2013\/02\/25\/emc-to-hadoop-competition-see-ya-wouldnt-wanna-be-ya\/\">and large vendors<\/a> pushing their own Dremel-like (or MPP-like) technologies for data sitting in the Hadoop Distributed File System. If you happen to be in New York next week, you can hear some of the pioneers in this space talk about it at our <a href=\"http:\/\/event.gigaom.com\/structuredata\/?utm_source=data&#38;utm_medium=editorial&#038;%2338;utm_campaign=intext&#038;%2338;utm_term=620545+google-bigquery-is-now-even-bigger&#038;%2338;utm_content=dharrisstructure\">Structure: Data conference<\/a>.<\/p>\n<p>Background aside, the ability to join large data sets in BigQuery is probably the most important of the three new functions. Joins are an essential aspect of data analysis in most environments because pieces of data that are relevant to each other don\u2019t always reside within the same table or even within the same cluster. 
And joining tables of the size BigQuery is designed for can take a long time without the right query engine in place.<\/p>\n<div id=\"attachment_620754\" class=\"wp-caption aligncenter\" style=\"width: 718px\"><a href=\"http:\/\/gigaom2.files.wordpress.com\/2013\/03\/join.jpg\"><img loading=\"lazy\" decoding=\"async\" alt=\"How to do a join in BigQuery\" src=\"http:\/\/gigaom2.files.wordpress.com\/2013\/03\/join.jpg?w=708&#038;h=105\" width=\"708\" height=\"105\" class=\"size-large wp-image-620754\"><\/a><\/p>\n<p class=\"wp-caption-text\">How to do a join in BigQuery<\/p>\n<\/div>\n<p>Kwek offers an anecdote from Google that shows why joins, and the new aggregation function, are important:<\/p>\n<blockquote id=\"quote-when-our-app-engine-\">\n<p>[W]hen our App Engine team needed to reconcile app billing and usage information, Big JOIN allowed the team to merge 2TB of usage data with 10GB of configuration data in 60 seconds. Big Group Aggregations enabled them to immediately segment those results by customer.\u00a0Using the integrated Tableau client the team was able to quickly visualize and detect some unexpected trends.<\/p>\n<\/blockquote>\n<p><strong>Related research and analysis from GigaOM Pro:<\/strong><br \/>Subscriber content. 
<a href=\"http:\/\/pro.gigaom.com\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=620545+google-bigquery-is-now-even-bigger&#038;utm_content=dharrisstructure\">Sign up for a free trial<\/a>.<\/p>\n<ul>\n<li><a href=\"http:\/\/pro.gigaom.com\/2012\/05\/the-importance-of-putting-the-u-and-i-in-visualization\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=620545+google-bigquery-is-now-even-bigger&#038;utm_content=dharrisstructure\">The importance of putting the U and I in visualization<\/a><\/li>\n<li><a href=\"http:\/\/pro.gigaom.com\/2012\/04\/infrastructure-q1-cloud-and-big-data-woo-the-enterprise\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=620545+google-bigquery-is-now-even-bigger&#038;utm_content=dharrisstructure\">Infrastructure Q1: Cloud and big data woo enterprises<\/a><\/li>\n<li><a href=\"http:\/\/pro.gigaom.com\/2013\/01\/cloud-and-data-fourth-quarter-2012-analysis\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=620545+google-bigquery-is-now-even-bigger&#038;utm_content=dharrisstructure\">The fourth quarter of 2012 in cloud<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Google might be upsetting a lot of people with some of its recent \u201cspring cleaning,\u201d but its latest batch of updates to BigQuery should make data analysts happy, at least. 
With the latest updates \u2014 announced in a blog post by BigQuery Product Manager Ju-kay Kwek on Thursday \u2014 users can now join large tables, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-646877","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/646877","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/comments?post=646877"}],"version-history":[{"count":0,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/646877\/revisions"}],"wp:attachment":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/media?parent=646877"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/categories?post=646877"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/tags?post=646877"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}