{"id":644631,"date":"2013-02-28T17:45:37","date_gmt":"2013-02-28T22:45:37","guid":{"rendered":"http:\/\/gigaom.com\/?p=615397"},"modified":"2013-02-28T17:45:37","modified_gmt":"2013-02-28T22:45:37","slug":"can-evil-data-scientists-fool-us-all-with-the-worlds-best-spam","status":"publish","type":"post","link":"https:\/\/mereja.media\/index\/644631","title":{"rendered":"Can evil data scientists fool us all with the world\u2019s best spam?"},"content":{"rendered":"<p>While most of the concern over web security has to do with criminal activity such as cyberterrorism, state secrets and hacktivism, there\u2019s a far more annoying threat lurking beneath the surface. It\u2019s a new generation of spam that does away with brute force email barrages in favor of fake online personas so real that people \u2014 and, more importantly, email and web-service spam filters \u2014 can\u2019t tell they\u2019re fake. Done right, these fake identities could influence everything from app downloads to e-commerce to elections.<\/p>\n<p>It\u2019s called influence manipulation. 
And, as data scientist <a href=\"http:\/\/metaoptimize.com\/blog\/about-joseph-turian\/\">Joseph Turian<\/a> said during a presentation at the O\u2019Reilly Strata conference on Wednesday, \u201cIt\u2019s a pretty serious issue and it\u2019s also pretty hard to catch.\u201d (Turian will also be moderating a panel on next-generation databases at our <a href=\"http:\/\/event.gigaom.com\/structuredata?utm_source=data&#038;utm_medium=editorial&#038;utm_campaign=intext&#038;utm_term=615397+can-evil-data-scientists-fool-us-all-with-the-worlds-best-spam&#038;utm_content=dharrisstructure\">Structure: Data conference<\/a> in New York next month, but I\u2019m sure he\u2019ll gladly talk black-hat data science if you catch him in the hall.)<\/p>\n<div id=\"attachment_581177\" class=\"wp-caption alignright\" style=\"width: 310px\"><img loading=\"lazy\" decoding=\"async\" alt=\"RoadMap 2012 Joseph Turian MetaOptimize\" src=\"http:\/\/gigaom2.files.wordpress.com\/2012\/11\/8d6k2349.jpg?w=300&#038;h=200\" width=\"300\" height=\"200\" class=\"size-medium wp-image-581177\"><\/p>\n<p class=\"wp-caption-text\">Joseph Turian at GigaOM RoadMap 2012 (c) 2012 Pinar Ozger pinar@pinarozger.com<\/p>\n<\/div>\n<p>It\u2019s hard to catch because influence manipulation, which Turian also calls black-hat data science, is really just white-hat (or good) data science techniques inverted and pointed toward a nefarious purpose. So, whereas white-hat data scientists try to uncover unnatural networks of links created to game Google\u2019s PageRank algorithm, Turian explained, black hats will try to build artificial networks so good they look real. 
If someone wants to send lots and lots of undetectable spam, it\u2019s just a matter of analyzing enough language to create messages that look less like a machine wrote them and more like a stupid human wrote them \u2014 because most spam filters try not to penalize users who just don\u2019t write well.<\/p>\n<p>During a one-on-one conversation later in the day, Turian told me he did a lot of work on language modeling as part of his Ph.D. work, and that the same techniques used for language evaluation \u2014 something like sentiment analysis, for example \u2014 can also be used for language generation. Marketing startups such as <a href=\"http:\/\/gigaom.com\/2012\/05\/15\/your-data-has-a-secret-but-you-yes-you-can-make-it-talk\/\">DataPop<\/a> and <a href=\"http:\/\/gigaom.com\/2012\/02\/22\/bloomreach-wants-to-save-your-site-with-big-data\/\">BloomReach<\/a> are already using some presumably similar techniques to create personalized online ads and web pages on the fly.<\/p>\n<h2 id=\"does-evil-lurk-among-our-data-\">Does evil lurk among our data scientists?<\/h2>\n<div id=\"attachment_615501\" class=\"wp-caption alignleft\" style=\"width: 109px\"><img loading=\"lazy\" decoding=\"async\" alt=\"Hilary Mason Source: hilarymason.com\" src=\"http:\/\/gigaom2.files.wordpress.com\/2013\/02\/hilary_electronics-199x300.jpg?w=99&#038;h=150\" width=\"99\" height=\"150\" class=\"size-thumbnail wp-image-615501\"><\/p>\n<p class=\"wp-caption-text\">Not evil. Source: hilarymason.com<\/p>\n<\/div>\n<p>But are there actually so-called black-hat data scientists among us, using their mastery of statistics to influence our opinions or make us buy Cialis? 
Turian quoted Bit.ly data scientist <a href=\"http:\/\/www.hilarymason.com\/\">Hilary Mason<\/a>, who he said asks of all her work, \u201cWhat\u2019s the most evil thing that can be done with this?\u201d We can assume she\u2019s just trying to avoid a mini-<a href=\"http:\/\/en.wikipedia.org\/wiki\/Winchester_Mystery_House\">Sarah Winchester situation<\/a>, but others might not be so ethical. (Turian already classifies as \u201cgray hat\u201d certain well-known companies that play fast and loose with user data.)<\/p>\n<p>After all, Turian noted in his presentation, Greylock\u2019s D.J. Patil has <a href=\"http:\/\/hbr.org\/2012\/10\/data-scientist-the-sexiest-job-of-the-21st-century\/ar\/1\">called being a data scientist the sexiest job of the 21st century<\/a>, comparing it with Wall Street quants in the 1980s. And where there\u2019s opportunity, there will always be people trying to cash in on it by any means necessary. Real-life Gordon Gekkos came to make quants almost universally reviled, and a few bad apples could certainly find their way into the data science bunch.<\/p>\n<p>Turian assured me he isn\u2019t one of them. \u201c[I]f I did [this] I\u2019d be riding around in a Rolls Royce,\u201d he joked during our hallway conversation.<\/p>\n<h2 id=\"define-good-enough\">Define \u201cgood enough\u201d<\/h2>\n<p>Maybe, maybe not. If all you\u2019re trying to do is improve search rankings, mediocre bots might work in the same way that \u201clegit\u201d content-generation services like <a href=\"http:\/\/chirpsy.com\/\">Chirpsy<\/a> and <a href=\"http:\/\/www.fastcompany.com\/1773610\/column-was-crowdsourced-servio\">Servio<\/a> work, he noted. 
Marketers don\u2019t necessarily care how good a tweet or article is as long as it\u2019s positive and says their company\u2019s name a lot.<\/p>\n<p>But in order to be successful in the world of online influence manipulation, fake personas and their messages have to be <em>really good.<\/em> Lutz Finger, co-founder of <a href=\"http:\/\/fisheyeanalytics.com\/\">Fisheye Analytics<\/a>, laid out <a href=\"http:\/\/strata.oreilly.com\/2013\/02\/who-do-you-trust-you-are-surrounded-by-bots.html\">some interesting statistics<\/a> during another conference talk that highlight how difficult it is to really influence someone. According to the studies he cited, 7 percent of people\u2019s Twitter followers are actually spambots; 30 percent of social media users are deceived by spambots and chatbots; and 20 percent of social media users accept friend requests from unknown people, 51 percent of which are not human.<\/p>\n<p>Presently, though, the charlatans are not very good. Finger said that when it comes to \u201castroturfing\u201d \u2014 the practice of creating fake grassroots movements to influence opinions \u2014 the <a href=\"http:\/\/news.bbc.co.uk\/2\/hi\/technology\/7719281.stm\">hit ratio on email spam is about 12.5 million to 1<\/a>. In order to create an astroturf movement on the scale of the anti-SOPA movement in 2011, every person on earth would have to receive the same spam message 8 times. The number might be even higher on an already-noisy platform like Twitter.<\/p>\n<p>That, he noted, makes the 10,000 pre-election tweets from the now-defunct spambot @peace_karen25 seem pretty inconsequential.<\/p>\n<p>However, he explained, spammers are getting smarter and are working on some of the black-hat data science techniques that Turian warns about. 
Next-generation bots will be better at gaining trust (attractive females with familiar names are most likely to have their fake friend requests accepted), and they\u2019ll act more real by mixing improved <a href=\"http:\/\/en.wikipedia.org\/wiki\/Chatterbot\">chatbot technologies<\/a> and analytics to figure out how people speak and what to say in what circumstances. Once they have your trust, these bots can make introductions to more bots and people will be more likely to accept those requests, too.<\/p>\n<p>Even if it\u2019s difficult to change someone\u2019s mind on issues like global warming or politics, Finger said well-timed messages could affect individual decisions. At the time someone is ready to buy something on Amazon.com, for example, he\u2019s open to messages about that product, perhaps in the form of product reviews. Maybe someone waiting in line at the polling place and still sitting on the fence is open to suggestions, too.<\/p>\n<p>And it\u2019s possible the bar to convincing people \u2014 especially teens \u2014 to act really isn\u2019t that high at all. In his talk, Turian highlighted teenage social media maven <a href=\"https:\/\/twitter.com\/KshaClark\">Acacia Brinley Clark<\/a> and her single tweet that led to an app called Pheed becoming one of the most-downloaded apps in Apple\u2019s App Store last week. 
After reading the rest of her Twitter feed, he said, (only half-jokingly, I think) it took quite a bit of research to convince him she\u2019s a real person.<\/p>\n<p><img decoding=\"async\" alt=\"brinley\" src=\"http:\/\/gigaom2.files.wordpress.com\/2013\/02\/brinley.jpg?w=708\" class=\"aligncenter size-full wp-image-615507\"><\/p>\n<p>Her 120,000-plus followers don\u2019t seem to share the skepticism, but they certainly seem willing to follow her lead.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>While most of the concern over web security has to do with criminal activity such as cyberterrorism, state secrets and hacktivism, there\u2019s a far more annoying threat lurking beneath the surface. 
It\u2019s a new generation of spam that does away with brute force email barrages in favor of fake online personas so real that people [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-644631","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/644631","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/comments?post=644631"}],"version-history":[{"count":0,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/posts\/644631\/revisions"}],"wp:attachment":[{"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/media?parent=644631"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/categories?post=644631"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mereja.media\/index\/wp-json\/wp\/v2\/tags?post=644631"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}