Job: Data engineer
The role: To take advantage of new technologies such as machine learning, predictive analytics and artificial intelligence, organizations must first sort and organize data that are often stored on multiple, incompatible platforms. The role of a data engineer is to make data from those different sources available for data scientists to use.
“You have all this disparate data that’s not mutually compatible with each other, and it’s very hard for a data scientist to work with,” explained Colin Fraser, a data scientist for Vancouver-based online charitable giving platform CHIMP. “A data engineer is someone who is fluent with all of those database types and languages, and can access the data out of all of them and integrate it into something that is usable for a data scientist.”
Mr. Fraser explains that companies often mistakenly hire data scientists to build new capabilities using the data they’ve gathered, only to discover that they first need to organize that data into a usable format, something most data scientists aren’t trained to do.
“Companies will jump to hiring a data scientist before they have a data engineer, and then the data scientist ends up sitting around for four, five, six months not really doing anything, because they're not skilled in data engineering,” he said.
Salary: “I would think in an entry-level position in data engineering, you would be looking at about $50,000 to $60,000 per year, and then mid-level would be $60,000 to $80,000 [annually] and upper would be $80,000 to over $100,000 [per year],” said Mr. Fraser, adding that the more platforms data engineers are proficient in, the more they stand to earn.
Education: While there are no educational or licensing requirements, the vast majority of data engineers have at least an undergraduate degree in computer science, engineering, statistics or a related field.
The most important qualification for many employers, however, is certification or proven proficiency with popular cloud data platforms, such as Amazon Web Services, Microsoft Azure and Google Cloud.
“If you’re someone that already knows three or four systems, you’re a lot more valuable than someone who has to learn it on the job,” Mr. Fraser said.
Those seeking certification can find training programs and exam preparation materials through the major platform providers, for credentials including Microsoft Certified Professional, Google Cloud Certified Professional Data Engineer and Amazon Web Services Certified Big Data Specialty. Exams are offered in major cities across Canada multiple times each year, and typically cost between $200 and $500 to write.
Job prospects: The demand for data scientists in Canada is skyrocketing, and organizations are gradually realizing the need to pair them with data engineers. As a result, demand for data engineers is growing, but not at the same rate as demand for the more commonly understood data-scientist role.
“You might not find as many postings, but I think the need is just about the same,” Mr. Fraser said. “Big financial firms, banks, insurance companies, companies that have a long history of knowing that they need to protect and cherish their data will have bigger data-engineering teams, but it’s something that’s becoming more and more apparent and important for smaller companies, as well.”
Mr. Fraser adds that most employers are based in major cities, but data engineers are typically able to work remotely.
Challenges: Mr. Fraser says the biggest challenge most data engineers contend with is simply keeping up with the technology in a quickly evolving field.
“The way that you manage data today is very different even from how it was three years ago,” he said. “That means that whatever you learned in school five years ago is not used any more, so you constantly have to read about new technologies, try new technologies, go to conferences and see what other companies are doing.”
Why they do it: While keeping on top of advances in technology is the greatest challenge, Mr. Fraser says that, for many, it’s also the greatest perk.
“You’re really an architect when you’re a data engineer, in the sense that you’re building some kind of system with lots of moving parts that all have to talk to each other,” he said. “For the right personality, that’s a really exciting challenge.”
Misconceptions: The biggest misconception about data engineers, according to Mr. Fraser, is that companies don’t need to hire one if they already have a data scientist on staff. “The reality is that your data is probably very messy, your data is in lots of different places, and it takes a specialized, dedicated role to really process it and get it into a state where the data scientist can work with it,” he said.