The ability to understand and manipulate data has always been an important part of a scientific education. However, today's volume and speed of data accumulation are almost overwhelming. Past generations often faced the challenge of a single, limited data set: how to derive valid patterns and inferences from a short supply of data. Today the usual challenge is to pick out any patterns at all amid the volume of multiple data sets (and the corresponding noise).
Fortunately, the ability to process this large amount of data is at least close to keeping pace with the ability to generate and collect it. Unfortunately, machines are still limited in their ability to interpret it. (Maybe you would say fortunately, depending on your comfort level with technology.)
Who can interpret these mountains of data and provide pattern recognition and context to managers in a timely fashion? Enter the Big Data Scientist. According to SmartData Collective, it's "the sexiest job of the 21st century." That may be a bit much, but those skilled in the field are in high demand. It's been estimated that there are only enough qualified people to fill one-third of the job positions available, and demand is likely to increase along with the data stockpile.
What does a big data scientist do? The real question is: what do you want them to do? There is no single job description for a data scientist, because the definition will vary by the field they are in. However, there are some common skills that are required.
They need a strong background in statistics, programming, and predictive modeling, but most importantly, they need critical thinking skills, a big-picture perspective, and a grasp of the interface between abstract data and the real world. This allows a big data scientist to quickly grasp how their skills can be applied to almost any field. They can use variations of the same skill set for anything from designing robust power transmission grids to predicting the success of an advertising campaign to deciding whether paying a 38-year-old shortstop $20 million per year is a good deal.
Other important qualities for a big data scientist are self-sufficiency, confidence, and excellent communication skills. A good data scientist should be able to attack a problem with minimal direction, assess data correctly and with confidence (avoiding influence from outside factors or their own biases), and present the findings in terms that everyone can understand.
Finally, a good data scientist (as with most scientists) needs an innate sense of curiosity. The successful big data scientist sees the world as a gigantic data set, full of wonderful and useful patterns to extract and put to good use.
As the world grows in complexity and we are swamped with even more data, it's more vital than ever that we develop the next generation of big data scientists to help us navigate through the data swamp, whether it's the 21st century's sexiest job or not.