{"id":647865,"date":"2013-03-20T15:42:50","date_gmt":"2013-03-20T19:42:50","guid":{"rendered":"http:\/\/gigaom.com\/?p=622491"},"modified":"2013-03-20T15:42:50","modified_gmt":"2013-03-20T19:42:50","slug":"from-amazons-top-data-geek-data-has-got-to-be-big-and-reproducible","status":"publish","type":"post","link":"https:\/\/mereja.media\/index\/647865","title":{"rendered":"From Amazon\u2019s top data geek: data has got to be big \u2014 and reproducible"},"content":{"rendered":"<p>Much has been done to bring big data closer to the people who need it. The advent of public cloud infrastructure has decimated the cost of collecting, maintaining and processing vast amounts of data. The next frontier is making that data reproducible, said Matt Wood, principal data scientist for Amazon Web Services, at <a href=\"http:\/\/event.gigaom.com\/structuredata\/?utm_source=data&#38;utm_medium=editorial&#038;%2338;utm_campaign=intext&#038;%2338;utm_term=622491+from-amazons-top-data-geek-data-has-got-to-be-big-and-reproducible&#038;%2338;utm_content=gigabarb\">GigaOM\u2019s Structure:Data 2013<\/a> event Wednesday.<\/p>\n<p>In short, it\u2019s great to get a result from your number crunching, but if the result is different next time out, there\u2019s a problem. No self-respecting scientist would think of submitting the findings of a trial or experiment unless she is able to show that the result will be the same after multiple runs.<\/p>\n<p>\u201cMuch of today\u2019s statistical modeling and predictive analytics is beautiful but unique. It\u2019s impossible to repeat, it\u2019s snowflake data science,\u201d Wood told attendees in New York. 
\u201cReproducibility becomes a key arrow in the quiver of the data scientist.\u201d<\/p>\n<p>The next frontier is making sure that people can reproduce, reuse and remix their data, which provides a \u201ctremendous amount of value,\u201d Wood noted.<\/p>\n<p>For more on Wood, check out <a href=\"http:\/\/gigaom.com\/2012\/11\/30\/why-amazon-thinks-big-data-was-made-for-the-cloud\/\">this Derrick Harris post<\/a>.<\/p>\n<p>And check out <a href=\"http:\/\/gigaom.com\/2013\/03\/20\/structuredata-2013-live-coverage\/\">the rest of our Structure:Data 2013 coverage here<\/a>. A video embed of the session follows below:<\/p>\n<p> <iframe loading=\"lazy\" src=\"http:\/\/new.livestream.com\/accounts\/74987\/events\/1927733\/videos\/14314410\/player?autoPlay=false&#38;height=360&#038;%2338;mute=false&#038;%2338;width=640\" width=\"640\" height=\"360\" frameborder=\"0\" scrolling=\"no\"><\/iframe> <\/p>\n<p><strong>Related research and analysis from GigaOM Pro:<\/strong><br \/>Subscriber content. 
<a href=\"http:\/\/pro.gigaom.com\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=622491+from-amazons-top-data-geek-data-has-got-to-be-big-and-reproducible&#038;utm_content=gigabarb\">Sign up for a free trial<\/a>.<\/p>\n<ul>\n<li><a href=\"http:\/\/pro.gigaom.com\/2012\/04\/aws-storage-gateway-jolts-cloud-storage-ecosystem\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=622491+from-amazons-top-data-geek-data-has-got-to-be-big-and-reproducible&#038;utm_content=gigabarb\">AWS Storage Gateway jolts cloud-storage ecosystem<\/a><\/li>\n<li><a href=\"http:\/\/pro.gigaom.com\/2012\/03\/its-time-for-cloud-security-and-big-data-to-come-together\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=622491+from-amazons-top-data-geek-data-has-got-to-be-big-and-reproducible&#038;utm_content=gigabarb\">It&#8217;s time for cloud security and big data to come together<\/a><\/li>\n<li><a href=\"http:\/\/pro.gigaom.com\/2010\/12\/9-companies-that-pushed-the-infrastructure-discussion-in-2010\/?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=auto3&#038;utm_term=622491+from-amazons-top-data-geek-data-has-got-to-be-big-and-reproducible&#038;utm_content=gigabarb\">9 Companies that Pushed the Infrastructure Discussion in 2010<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Much has been done to bring big data closer to the people who need it. The advent of public cloud infrastructure has decimated the cost of collecting, maintaining and processing vast amounts of data. 
The next frontier is making that data reproducible, said Matt Wood, principal data scientist for Amazon Web Services, at GigaOM\u2019s Structure:Data [&hellip;]<\/p>\n","protected":false},"author":7419,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-647865","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/647865","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/users\/7419"}],"replies":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/comments?post=647865"}],"version-history":[{"count":0,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/647865\/revisions"}],"wp:attachment":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/media?parent=647865"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/categories?post=647865"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/tags?post=647865"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}