This article originally appeared in the Volume 22, Number 4 issue of TDWI's Business Intelligence Journal.
Q: The shortage of people with deep analytical skills is well documented. A 2011 study by the McKinsey Global Institute predicted that by 2018 the U.S. alone will face a shortage of 140,000 to 190,000 people with deep analytics skills, as well as 1.5 million managers and analysts to analyze big data and make decisions—a trend confirmed by a follow-up study in 2016. This shortage is significant because inadequate staffing and skills are the leading barriers to a company’s use of analytics.
- What recommendations do you have for obtaining people with advanced analytical skills?
- One frequently suggested option is to upgrade the skills of people already on board. How do you identify good candidates? How do you upgrade their skills? Are they likely to want to make a career change?
- Gartner has introduced the idea of the “citizen data scientist.” This is someone with an analytical mind who is provided with the right tools to do analytical work. What do you think of this idea? What are the downsides? How do you prepare and help them become productive? What kinds of work should they take on?
A: [Nancy Couture] - First and foremost, be sure to spend time defining the roles you really need. Many organizations are feeling the pressure to hire data scientists and perform deep analytics on big data. There’s no doubt that finding and developing key data correlations can result in substantial gain for an organization. However, doing this takes investment.
The data scientists in an organization need to go through many steps, including
- Exploring large volumes of data
- Looking for data correlations
- Developing hypotheses
- Proving them out with large volumes of data
- Determining if this discovery has business value
- Developing a repeatable algorithm or model
They spend time exploring, and their insights are only sometimes useful. Many companies can only afford this in a limited way. One organization I consulted for had a data science team of about five individuals. In one and a half years, they made one sizable discovery related to healthcare. This discovery provided the organization with some recognition, but the results still haven’t been used.
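As a rough illustration of the exploration steps listed above (look for correlations, form a hypothesis, then prove it out on more data), here is a minimal Python sketch. The function names, the 0.5 threshold, the half-and-half split, and the toy columns are all assumptions made for illustration; none of them come from the organizations described here, and a real project would use far larger data sets and more careful statistics.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def take(values, rows):
    """Select the given row indices from a column."""
    return [values[i] for i in rows]


def explore_and_confirm(columns, target, threshold=0.5, seed=0):
    """Look for correlations on one half of the rows, then re-check them on the other half."""
    rows = list(range(len(target)))
    random.Random(seed).shuffle(rows)
    half = len(rows) // 2
    explore_rows, confirm_rows = rows[:half], rows[half:]

    results = {}
    for name, values in columns.items():
        r_explore = pearson(take(values, explore_rows), take(target, explore_rows))
        if abs(r_explore) < threshold:
            continue  # nothing here worth turning into a hypothesis
        # Hypothesis: this column tracks the target. Test it on rows not used for exploring.
        r_confirm = pearson(take(values, confirm_rows), take(target, confirm_rows))
        confirmed = abs(r_confirm) >= threshold and r_explore * r_confirm > 0
        results[name] = (r_explore, r_confirm, confirmed)
    return results


# Toy data (hypothetical): one column that follows the target and one that does not.
target = [float(i) for i in range(40)]
columns = {
    "signal": [2 * t + 1 for t in target],
    "alternating": [(-1.0) ** i for i in range(40)],
}
results = explore_and_confirm(columns, target)
for name, (r_explore, r_confirm, confirmed) in results.items():
    print(name, round(r_explore, 3), round(r_confirm, 3), confirmed)
```

The sketch covers only the mechanical part of the list. Deciding whether a confirmed correlation has business value, and turning it into a repeatable model, remains a judgment call for the team.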
As a business, define what you really need. If you are determined to make an investment in data science, there are several approaches you can take, despite the publicized shortage of these skill sets.
Define the roles and skills needed, and fill them accordingly. Initially, data scientists were assumed to be Ph.D.s. However, most master’s degree holders likely have the needed skills.
Split the data scientist role into two: one advanced data scientist and a separate team that can manage the data and perform analytics and visualizations. In this way, the data scientist can focus on new ideas and discovery and you can make the most use of his or her specific skill sets.
For example, in a prior organization, we had a decision science team made up of strong data analysts who could leverage analytics tools and develop models. This team had only one data scientist and the rest of the team performed the bulk of the analytics work. This organizational model worked well. They developed several new algorithms that we incorporated into our business processes and business decisions.
An obvious approach is to recruit from college campuses. This will most likely offer limited success but is still an avenue to explore. Because data science is still relatively new, data science programs are just starting to spread.
According to a 2016 U.S. News & World Report article, fewer than a third of the top 100 global universities offered degrees in data science. Of the 29 that did, only six made them available to undergraduates. In addition, most of these programs focus on the technical aspects rather than how to define business value.
However, this is changing. Data science programs, both Ph.D. and master’s, are becoming more prevalent.
Support internal team members who are interested in becoming data scientists and are willing to invest in this career. For example, in one of my prior jobs, an individual on my team who graduated with a major in statistics was very analytical and had a talent for learning. The person took courses (including self-learning) and is now working with big data technologies and developing models.
As another example, a data architect consultant I worked with decided to study for data science certifications. He’s about two months from completing these certifications, and is already looking forward to making use of these skills.
These two individuals decided they wanted to take this path and were self-starters in data science and analytics.
Train internal candidates who are already familiar with the business and have strong mathematical and analytical skills as well as a desire and propensity to learn. Develop data analytics training programs, individually or in partnership with a university. If you make it known to your employees that your company needs and values data science skills—and that you will pay for some of the training and education—you will probably develop some junior-level certified data scientists within a year.
Create a career path that will hook the individual for a prolonged period, starting as an apprentice data scientist, moving to an associate, and eventually to a full-fledged data scientist. The career path is associated with a combination of training, data science mentorship, and experience.
One organization I knew worked with two universities to create distance learning and certification programs. They have since trained and certified several data scientists who are now based in a variety of functions and business units.
Keep an eye on the software market—new data and analytics software platforms that automate model building and predictive and prescriptive analytics will enable the data scientists to focus on a subset of more complex activities. Automate more of the data procurement and management tasks to free analysts and data scientists to perform more true analytics. Ensure your data scientists are focused on developing new ideas and discovering new correlations rather than spending precious time developing an algorithm they could have bought.
This shortage of data scientists and analysts will most likely continue in the near future, so be creative in your approaches for developing these capabilities in your organization.
A: [Ben Daniel] - “A good analyst is hard to find” is a fitting adage for analytics managers today. Firms such as the McKinsey Global Institute have thoroughly documented the struggle of business managers around the world to find good people for analytics positions. Higher education institutions are already modifying their degree programs to include more data analytics, but it will take time for them to yield good data analysts and scientists, hence the need to discover and enhance analytics personnel in the short term.
To start, an organization should assess its current level of analytics capabilities, understand its shortcomings, and have a clear vision of what it wants to achieve with its data assets. This will determine the type of analytics talent to recruit as well as where to pursue such talent.
Most computer science, data warehousing, and BI professionals are active on professional networks such as LinkedIn and online recruiting resources. More specialized data scientists may also use social networks, but recruiters for these positions can also benefit by working directly with university graduate programs, as well as by tapping into the alumni networks of their alma maters—especially if they have reputations for good analytics talent.
Personal networking cannot be overstated. Managers need to introduce themselves to people in the analytics community. For example, a search for “data science” on LinkedIn returns more than 1,000 groups. Find out who the leaders of these online groups are and reach out to them directly.
Another website, Meetup.com, not only serves as a social network but also coordinates real-world events for people interested in a common topic. Attending such meetings allows recruiters to discover talent while identifying who might be a good cultural fit for their organization.
These approaches work for identifying external candidates. The best way to identify internal staff who might be willing to take a step into data science is to look at their past work and attitudes. Do they show intellectual depth and/or systematic thinking? Are they independent problem solvers? If so, they may be good candidates for either analytics positions (if they are already qualified) or continuing education.
Several options have expanded over the past five years that make online education affordable, effective, and efficient. Encouraging someone who is a good, disciplined self-learner to engage in online learning provides immediate benefits because they can apply their skills as they proceed through an educational program.
Employers can leverage training resources such as Coursera, Udemy, and TDWI’s Online Learning, but they should insist that employees go through recognized certifications and training programs.
Candidates for analytics positions can also participate in Kaggle competitions, which let employers see their work directly (Kaggle can also serve as a social network for recruiting).
There is also the concept of the citizen data scientist—someone with an analytical mind who is provided with the right tools to do analytical work. Citizen data scientists may be a good short-term answer, especially if they can perform querying and visualization. Although this is light work, good managers should challenge their employees to expand their skill sets and take on more ambitious projects.
Applications such as IBM Watson and other cognitive analytics tools can automate programming tasks much the way that mathematical algorithms have been “black boxed” by APIs and other software packages. Indeed, this may expand an organization’s capabilities in the short run.
Nevertheless, a firm will not gain a true advantage until it has people who can evaluate the output of analytics tools with a critical eye and get others to adopt their solutions. Understanding what is happening in a model allows the user to correctly interpret and communicate its results; otherwise, people within the organization will not adopt or use the information.
Solving the shortage of talented analytical people will take time, but there has never been a time when the ability and resources for learning were so plentiful. Getting more people into analytics will require encouragement and support from employers as well as the creation of economic incentives for employees to gain new skills. Employers should create parallel career paths to management for analytics personnel that allow them to enjoy the prestige and benefits of the value they create.
Yes, a good analyst is hard to find, but by searching in the right places, encouraging employees to self-train, and creating the right incentives for analytical careers, good analysts will find you.
A: [Brian Valeyko] - For several years, teams have struggled, without great success, to fill the data scientist role. In reaction to the market, there are (not shockingly) quite a few people who are now calling themselves “data scientists.”
Success in finding someone who works well in your organization may come down to setting appropriate expectations. Trying to find one person (especially a recent graduate) with the breadth of technical, business, and statistical knowledge, as well as presentation abilities, is a tall order. The problem may be a consequence of believing that the data scientist is a single person.
From my experience, the most successful analytics organizations rely upon the skills of several people working toward a common goal and using their own specialized capabilities to enhance the whole. Though there are people who possess all of these skills, those people are very rare because the business needs for this role are quite varied.
The typical data science “project” requires someone who can understand the business context of a problem and creatively solve that problem. That person then needs to build a hypothesis for how to use data to make that happen. After deciding what data is necessary, the data scientist must wrangle, organize, and cleanse that data, which requires multiple technical disciplines. After the data is ready, it must be interrogated appropriately to ensure a statistically accurate representation of facts and correlations.
With that all done, the data scientist must create a presentation in a format that is understandable, believable, and provable to a layperson. This is no small feat and often requires graphical and sociological presentation skills. Finally, he or she needs to present and evangelize the results (and process) to build demand for ongoing work.
Rather than trying to find the very rare individual who can succeed at all of these disciplines, I promote data science as a team effort. A group of three to five specialists with a common goal has a much better chance of hitting the mark when resolving a complex problem and meeting the accompanying goal of building the use of data into business practice. Eventually, each member of the team may pick up some of the other disciplines through participation in the process. That’s the ideal cycle of growth to me.
In our business, we have created a rotational program that allows us to seed people who have some of the technical skills into business units on a temporary basis to help jumpstart these cross-functional teams. Some of these participants may eventually build the full range of skills to be a solo data scientist, but in the meantime, we can meet the needs of our business using the team approach.
A side benefit of this rotation is that it also grows the technical capabilities of the business-side participants so they can eventually be the “citizen data scientists” of the future. The team approach allows people to consult and work on limited-duration efforts where they see the full life cycle of an analytics project. This helps to inform their own route to advancing their own analytics skills.
In summary, I’ve found that building collaborative cross-functional teams is an effective alternative to searching for a very exotic single employee. Over time, I believe this approach will be more successful and produce a better result for the organization.
Nancy Couture is Senior Director, Delivery Enablement at Datasource Consulting.
Ben Daniel is CRM Analytics Manager for The Home Depot.
Brian Valeyko is Director of Data Warehousing, Business Intelligence and Big Data Analytics for NCR.