Data scientist is the job of the future, but Canada will need to create a lot more of them
By now, everyone knows that big data is big business. Companies are investing billions of dollars in analytics (research firm IDC says the market is growing six times faster than the overall IT sector), and more and more number crunching is being done every day.
As excited as people are about big data's possibilities, there's one issue that could slow the industry's growth: a lack of data scientists.
According to a 2011 McKinsey report, the U.S. alone will face a shortage of between 140,000 and 190,000 people with analytical expertise, and 1.5 million managers and analysts who can make decisions based on the data that's mined. While there are no exact corresponding numbers for Canada, there's also a dearth of data scientists here, says Greg Mori, an associate professor in Simon Fraser University's School of Computing Science. "The demand for well-trained computer scientists is huge and unemployment rates are tiny," he says.
Big data's boom
The analytics industry has blown up in large part because storing data has become exponentially cheaper. In the last decade, it cost $50,000 to $130,000 to store a terabyte of data. It was too pricey to keep everything, let alone mine it; companies could only analyze a small percentage of the data they collected. Storage now costs between $1,000 and $2,000 per terabyte and that allows companies to hang on to and analyze copious amounts of information. "You could never go down to the fine grain that companies wanted," says Sham Chotai, chief technology officer at GE Transportation. "Now the world has changed, so we can look at every small detail."
Companies have started to realize just how profitable those small details can be. For instance, America's rail sector has been able to figure out that if a train can increase its average velocity by 1 per cent, the industry will make $2.5-billion more. If a train sits idle for an hour less, the sector will save $2.2-billion. Other companies are mining social media and tracking customer behaviours to gain business insights, while industrial operations are analyzing machine performance to determine when a crucial piece of technology is going to break down. Naturally, as companies collect more data, they need more data scientists to analyze it all.
Diverse skill sets
One of the main reasons the supply of data scientists is so tight is that most people don't have the right combination of skills, a set that involves statistics, science, business and computer programming. People also have to be curious and creative, and they have to love data, says Lorne Rothman, principal data scientist at SAS Canada, a firm that develops data analysis tools. "Businesses think they can fill the data science position with one individual, and those people are really hard to find," he says. He thinks companies are better off hiring a team of scientists, with each person being responsible for one aspect of the analytics process. "If you can split out the function and have people who are on the programming side and the analytics side then you might be able to fill the job roles more easily," he says.
It's likely that demand for data scientists will continue to outstrip supply for some time, but universities and companies are trying to fix that imbalance. In September, Simon Fraser launched a professional master's program in big data. While it admitted only 25 of the roughly 100 people who applied in its first year, that intake could double in the future, says Prof. Mori. Other schools are jumping in: Queen's offers a master's degree in management analytics, Carleton just announced a new master's degree in data science and it's only a matter of time until more institutions follow suit.
At SFU, the students are trained to program, extract information from large data sets, use tools and machines for data mining, figure out what kinds of data to mine and more. Companies like GE and SAS Canada have also developed their own programs to train internal staff and employees at other companies to become proficient data scientists. For example, GE has a Global Software Center of Excellence located in San Ramon, California, where it trains people on everything big data. The company has spent $1-billion on the centre and Chotai says it will "double down" on its investment. The centre was created, in part, to address the supply-and-demand imbalance. By training its own staff, GE is less at the mercy of job market forces. "We're not waiting for the supply to catch up," says Chotai. "We're growing our own."
Eventually, supply will catch up. Wages are attractive (according to the U.S. Bureau of Labor Statistics, the median pay for data scientists in 2012 was $102,000 a year) and the opportunities are expanding. "There is such huge growth in this sector that we will see more people wanting to work in this space," says Chotai. "I think you'll start to see a rebalancing in the next few years."
For more innovation insights, visit www.gereports.ca
This content was produced by The Globe and Mail's advertising department, in consultation with GE. The Globe's editorial department was not involved in its creation.