The State Of Data, December 2020

Data is eating the world and there are many indicators of its ubiquitous presence in our lives. From fueling the recent success of “artificial intelligence” (AI) and the rise of “digital transformation” to its accelerated growth due to Covid-19 to new approaches to its “monetization” (finding money in mining data, a growing practice since at least the 1970s) to how it makes businesses and consumers both anxious and animated, data dominates our deeds, debates, and dreams.

Here’s the data on the state-of-data at the end of 2020.

The Covid effect…

2,165 billion      the number of total streaming minutes from February 24 to June 7, 2020, up from 1,165 billion in the same period in 2019 [Recode]

37.1                    the percentage increase in third quarter 2020 e-commerce from the third quarter of 2019 (while total retail sales increased 6.9% in the same period) [U.S. Census Bureau]

51                       the percentage of CIOs and CTOs that report accelerating the adoption of machine learning and AI due to Covid-19 [IEEE]

32                       the percentage of business executives at companies that adopted AI in sales and marketing who reported the failure, in that function, of machine learning models that relied on data collected before the pandemic and did not reflect changing consumer behavior [McKinsey]

“Buying one-way airline tickets was a good predictor of fraud [in automated detection models]. And then with the Covid-19 lockdowns, suddenly lots of innocent people were doing it,” Svetlana Sicular, Gartner

7.6                     the percentage of data scientists and analytics professionals that reported in July 2020 that their team actually increased hiring due to the Covid-19 crisis [Burtch Works]

84                       the percentage of employers that are set to rapidly digitalize working processes, including a significant expansion of remote working [World Economic Forum]

52                       the percentage of employed Americans that were working from home in late October at least one day a week, 71% of them as a result of the Covid-19 pandemic [Fast Company]

46                       the percentage of work-from-home employees who say they spend up to 2 hours a day interacting with digital colleagues or software robots [ABBYY]

57                       the percentage of work-from-home employees that have binge-watched a TV show during work hours since the start of the pandemic [Fast Company]

52                       the percentage of consumers that say digital capabilities are a primary requirement when searching for a small service provider, a 30% increase during the pandemic [Moxtra]

27                       the percentage of Gen Z/Millennials that confess to online shopping while in a meeting [Harris]

25                       the percentage of organizations that believe they are at greater cybersecurity risk now than before the pandemic [Netwrix]

19                       the percentage of UK workers that report having seen an increase in cyber attacks while working from home [Iomart]

$44.94 billion    the estimated digital restaurant marketplace sales (e.g., DoorDash, Uber Eats, Grubhub) in 2020, up from $20.08 billion in 2019 [eMarketer]

24.95                  the percentage annual growth of the worldwide tablet market in the third quarter of 2020 with shipments totaling 47.6 million units; “The demand for affordable access to basic computing and larger screens to facilitate remote work, learning, and leisure due to COVID-19 restrictions drove the market” [IDC]

35.1                    the percentage annual growth of the global wearables market during the third quarter of 2020 with total shipments reaching 125 million units, partly due to the Covid-19 pandemic [IDC]

208,333              the number of participants in Zoom meetings every minute [Domo]

29                       the percentage increase during the pandemic in the overall time playing video games [angle.co]

42                       the percentage of Americans who had dated online in the past that say they’re doing so more now compared to before the pandemic [Harris]

“The dating game in COVID is all screen and no scene,” John Gerzema, CEO, The Harris Poll

The data effect…

21                       the percentage rise in the blood pressure level of 1,100 test participants due to stress caused by slow loading web pages [Cyber-Duck]

84                       the percentage of Americans that are worried that data collection for Covid-19 containment will sacrifice too much of their privacy [Okta]

31                       the percentage of Americans that say the federal government can be trusted to collect and protect personal information and data for contact tracing; 32% trust private sector companies and 38% trust state and local governments [IBM]

“The data is moving slower than the disease,” Dr. Umair Shah, Washington state secretary of health

24                       the percentage of Americans that are willing to share their data with law enforcement agencies [Okta]

0.5 billion          the number of devices, such as computers, routers and fitness trackers, whose real-world location and identity Washington DC-based SignalFrame can determine by tapping up to 5 million cellphones [WSJ]

55                       the percentage of American adults that are worried about government agencies tracking them through location data generated from their cellphones and other digital devices [Harris Poll/WSJ]

47                       the number of minutes it takes to read social media privacy policies, on average [addictivetips.com]

77                       the percentage of Americans that believe data collected on social platforms shouldn’t be used for political targeting [Pew Research]

41                       the percentage of journalists believing that social media algorithms will change the way they work the most, up from 38% in 2019 [Cision]

76                       the percentage of Millennials and Gen Z that agree that “I spend too much time on the internet (surfing the web, social media, apps)” [Harris]

175 million        the number of people sending a message to a WhatsApp Business account every day [WhatsApp]

41,666,667        the number of messages shared on WhatsApp every minute [Domo]

5.83 million             transactions/second on the Alibaba cloud in the 2020 “Singles Day,” the world’s largest online shopping day, up from 5.44 million transactions/second in 2019 [ChinAI]

500-600 billion the number of web pages included in Google’s index, up from 1 billion in 2000 [New York Times]

16.1                    the percentage share of e-commerce in total U.S. retail sales in the second quarter of 2020, up from 0.6% in the fourth quarter of 1999 [U.S. Census Bureau]

59 zettabytes    the amount of data created, captured, copied, and consumed in the world in 2020, up from 33 zettabytes (33 trillion gigabytes) in 2018 [IDC]

Data monetization…

$38 billion         the amount spent on advertising on social networks worldwide, up from $16 billion in 2016 [Merkle]

$13 billion         the combined market value of IHS and Markit when they merged in 2016; in November 2020, S&P Global agreed to acquire IHS Markit for about $44 billion [WSJ]

$475,944.49      the amount of money the U.S. Customs and Border Protection (CBP) agency paid in August 2020 to the data broker Venntel, a company that sells a product based on location data harvested from ordinary apps installed on people’s cell phones [Vice]

221                     the number of North American AI faculty departures between 2004 and 2018 (131 AI professors left universities completely to pursue an industry career or to establish their own startups, and 90 AI professors worked for a private company or opened their own startups while still retaining affiliation with the university), constituting an “unprecedented AI brain drain to the industry” [Gofman and Zin]

97                       the number of funding “mega-rounds” ($100M+) by Fintech startups in 2020, up from 66 in 2018 [CBInsights]

31                       the percentage increase in the value of Palantir’s stock on September 30, 2020, its first day of trading, valuing the big data analytics company at $21 billion [WSJ]

“If I get a first name and a cellphone number, you’d be shocked how much information Palantir can provide,” Charles Rettig, IRS Commissioner

$3 billion           the amount of money self-driving car developer Waymo, a Google subsidiary, raised in 2020 [Crunchbase]

$300,000           the top base salary for data science managers in the U.S. in 2020 [Burtch Works]

$200 billion       the size of the market for external data in 2019, according to IDC [Forbes]

$1,000                the price charged by Generated.Photos to create 1,000 AI-produced images of fake people; just a couple of fake people cost nothing at ThisPersonDoesNotExist.com [The New York Times]

“…such capabilities [as ‘deepfake’ transformation of the human face] were called image processing 15 years ago, but are routinely termed AI today. The reason is, in part, marketing,” Glen Weyl and Jaron Lanier

$10.7 trillion     the combined economic gains from AI of China (26% boost to GDP in 2030) and North America (14.5% boost), accounting for almost 70% of the global economic impact by 2030 [PWC]

The AI effect…

22                       the percentage of business executives worldwide reporting that more than 5% of their organizations’ enterprise-wide earnings before interest and taxes in 2019 was attributable to their use of AI [McKinsey]

10                       the percentage of managers reporting significant financial benefits from their AI investment so far [BCG]

49                       the percentage of “AI high performers” (20% or more of enterprise-wide EBIT in 2019 was attributable to their AI use) that generate synthetic data to train AI models when there are insufficient natural data sets, as opposed to 16% of all other businesses surveyed [McKinsey]

43                       the percentage of executives worldwide that believe AI will transform their organization in the next three years [Deloitte]

36                       the percentage of executives worldwide that believe AI will transform their industry in the next three years [Deloitte]

227                     the number of annual reports mentioning Amazon as a risk factor, up from 14 in 2010 and 85 in 2015 [Recode]

49                       the percentage of customers that find AI interactions to be trustworthy, up from 30% in 2018 [Capgemini]

82                       the percentage of employees worldwide that think AI can support their mental health better than humans [Calcalist]

6.1 million         the number of miles represented by the data Waymo released in October 2020 on 21 months of automated driving in the Phoenix, Arizona, metropolitan area, detailing “every collision and minor contact experienced… [including] 18 actual and 29 simulated contact events, none of which would be expected to result in severe or life-threatening injuries” [Waymo]

16                       the percentage share of artificial intelligence (AI) patent applications among all patent applications received in 2018 by the United States Patent and Trademark Office (USPTO), up from 9% in 2002 [USPTO]

25                       the percentage of individual inventor-patentees active in AI in 2018, up from 1% in 1976 [USPTO]

108                     the number of languages supported by Google’s automated translation as of June 2020 [Google]

40.2                    the percentage annual increase in the number of papers submitted to top AI conference NeurIPS 2020 [Sergei Ivanov]

22,245                the number of papers in biology involving AI methods published in 2020, up from 14,295 in 2019 and 956 in 2010 [PubMed]

“I haven’t seen anything in which A.I. has helped us yet, clinically,” Dr. Eric Topol, Scripps Research Translational Institute

132                     the number of known individual bears that were captured by 4,674 images used to train BearID, an open-source facial recognition application for identifying individual bears (and other species that lack unique and consistent body markings), achieving 83.9% accuracy in identifying individual bears [Ecology and Evolution]

15                       the percentage of journalists seeing AI/machine learning as the most important technology to affect their industry, down from 19% in 2019 [Cision]

175 billion         the number of parameters in GPT-3, the new version of OpenAI’s NLP program, “10x more than any previous non-sparse language model” [Arxiv]

“Extrapolating the spectacular performance of GPT3 into the future suggests that the answer to life, the universe and everything is just 4.398 trillion parameters,” Geoffrey Hinton, Turing Award winner

$10 million        the likely budget for developing and training GPT-3 [The State of AI Report]

“We’re doing incredibly better with NLP than we were five years ago, yet we’re still incredibly worse than humans,” Yoshua Bengio, Turing Award winner

$100 trillion      the estimated cost of reducing the ImageNet error rate from 11.5% to 1% without new major research breakthroughs [The State of AI Report]

“I would declare victory if in my professional lifetime we could make machines that are as intelligent as a rat,” Yann LeCun, Turing Award winner

87                       the median score of DeepMind’s AI program AlphaFold 2 in the 2020 Critical Assessment of Structure Prediction, or CASP, a competition for algorithms that can be used to predict protein structure; this score was about 26 points better than its nearest competitor and is close to being as good as X-ray crystallography, which, until now, has been the primary way to obtain a high-resolution model of a protein’s structure [Fortune]

AI, data, and digital challenges…

Less than 1        the percentage of US labs using digital pathology for primary diagnosis [Clinical Lab Manager]

56                       the percentage of IT and business executives worldwide that agree that their organization is slowing adoption of AI technologies because of emerging risks such as cybersecurity, AI failures, misuse of personal data, and regulatory uncertainty [Deloitte]

93                       the percentage of chief analytics officers and chief data officers that say ethical considerations must be dealt with to drive AI adoption within their organizations [FICO]

65                       the percentage of executives that are aware of the issue of discriminatory bias with AI systems, up from 35% in 2019 [Capgemini]

91.7                    the percentage of UK business executives saying the key barrier to AI adoption is a shortage of talent with AI skills [Digitalisation World]

22                       the percentage of executives that report they have faced a customer backlash as a result of their AI systems operations [Capgemini]

46                       the percentage of life sciences companies that said it takes longer than expected to receive payback from investments in AI initiatives [Deloitte]

$18 billion         the total reported losses of the 100 largest cyber incidents of the last five years, with 10 billion compromised records [Cyentia]

71                       the percentage of small and medium-size businesses that do not have a formal plan/protocols in place to deal with any potential cyber-attacks [specopssoft]

28                       the percentage of UK businesses that do not provide essential cyber security training to help workers identify potential data breaches [Iomart]

74                       the percentage of employees saying that their organization takes appropriate measures to store and secure sensitive data [Snow Software]

6,219,819,956   the number of data breach cases recorded in the United States since 2013, making it the worst-affected country in the world [Uswitch]

3                          the number of countries (Belgium, Luxembourg, Morocco) that have partial bans on facial recognition technology that only allow it in specific cases; in all other countries the technology is in active use or being considered [The State of AI Report]

When analog meets digital…

11.1 million       the number of vinyl records sold in 2020 through Discogs, a catalog of more than 13.3 million releases and 7.2 million artists, making it the most extensive physical music database in the world with 8 million users [Discogs]

17 billion           the number of pieces of commercial mail Pitney Bowes pre-sorts for the U.S. Postal Service every year, providing “a rich source of data on everything from processing speeds, to shipping rates and delivery routes”; the company’s Q3 2020 revenues from managing deliveries of online sales were up nearly 50% from the year-earlier period [WSJ]

3x                        the factor by which the number of physical records in the National Archives has increased since 1991; the number of electronic records has increased by a factor of 1,654 [National Archives]

2.5 billion          the number of years it would take a supercomputer to perform the Gaussian boson sampling calculation that took only 200 seconds on a quantum computer at the University of Science and Technology of China [Science]
