What Generative AI Means for Data Tasks
The narrative that technology will steal your job has been around for at least 200 years. Some roles disappear, but most simply evolve, adapt and merge.
The recurring question in any IT job is how long it will exist. Whether they say it openly or not, companies investing in automation, robotics and AI intend to replace human labor hours with inexpensive machine labor. That doesn’t mean they won’t create other jobs, but it does mean that some roles could eventually cease to exist, including those for data professionals.
I don’t agree with the doomsayers who believe we will all live on universal basic income, but I do think AI will change the roles and nature of data work. Far from reducing the demand for data talent, AI will expand it to the point where data literacy becomes comparable to typing, a skill that everyone in an office is expected to possess.
The three classic data roles – data analyst, data scientist and data engineer – are already fading. They have always been poorly defined, resting largely on who used which tools. Analysts used SQL, data warehouses and lakes, and applications like Excel and Tableau. Data scientists, close relatives of computer scientists, were the ones who coded in Python. Data engineers knew how to clean data, configure ETL (extract, transform, load) pipelines, and build machine learning pipelines.
Separating these functions is no longer an efficient way of working. Anyone dealing with data needs some SQL, Python, and engineering skills. However, thanks to generative AI, they don’t necessarily have to learn these skills from scratch.
Think of it this way: in 2003, after author Michael Lewis published Moneyball, the story of the Oakland Athletics’ data-driven recruiting approach, other professional baseball teams copied it. The industry hired statisticians and quantitative analysts (aka “quants”) who didn’t necessarily know much about baseball but knew how to identify undervalued talent using data.
Industries that compared themselves to competitive sports teams (commodity and stock market companies, for example) attempted to “Moneyball” their industries too. There weren’t enough quants to go around. The shortage of data professionals was at the heart of the old “war for talent” scenario touted by business magazines and consulting firms.
Some 20 years later, the Moneyball skill set is widely available. The National Center for Education Statistics reports that the number of universities offering master’s degrees in data disciplines increased from six in 2010 to 185 in 2022. Encoura, a marketing and enrollment platform for universities, finds that degrees earned in data analytics and science rose from 5,604 in 2012 to 42,408 in 2021, a more than sevenfold increase. At Carnegie Mellon University, where I teach, data science courses are among the most popular on campus for students of all disciplines.
Today, an MLB team can easily hire a former college baseball player who knows the mechanics of the game, the questions to ask, and ways to apply data insights – and who is therefore well placed to blend data trends with business insight. New generative AI tools will make the technology side even easier to use and fill even more gaps in such a hire’s technical skills. Whatever label this role takes, it will involve a mix of analytics, science and engineering work.
To be clear, the most complex data problems will always require a data scientist. In the same way that Squarespace made everyone pretty good (but not great) at web design, generative AI will make people pretty good at working with data. For something special and custom, whether it’s a website or a data analysis, organizations will always need a real pro. Sometimes AI platforms that promise to “democratize” skills instead “mediocritize” them. That’s not necessarily a bad thing, given that ten years ago a shortage of data skills was hindering progress.
Data scientists and deep data engineers are not replaceable. That said, many people without the word “data” in their title already regularly perform analytics and data science work to answer questions and inform decisions that can’t wait for a professional. Speed is important because the value of data can diminish over time. A prediction about events 20 hours from now is worth $0 after 24 hours.
If 80% of data work is accessible to the generalist and only 20% requires a data specialist, this is roughly what the proportion of data talent in a company will look like. Titles like “machine learning engineer” or “AI engineer” could emerge to differentiate five-star data scientists from the data-savvy generalist. The risk is that companies, overconfident in generative AI, overcorrect and shrink their data teams to the point of benefiting their competitors.
The ubiquity of data skills, both human and machine, poses no threat to data scientists. On the contrary, it will make specialists more influential and more respected. Today’s data scientists sometimes struggle to find colleagues who speak their language and understand their methods. Wider data literacy in business will lead to more analysis and forecasting and, perhaps, less reliance on opinions and instincts.
The narrative that technology will steal your job has been around for at least 200 years. Some roles disappear, but most simply evolve, adapt and merge. The only guarantee against an obsolete job is to continually acquire new skills. This is true across all fields and roles.
About the Author
Dr Jignesh Patel is a professor in the Department of Computer Science at Carnegie Mellon University and co-founder of DataChat, the no-code generative AI platform for instant analytics. Patel’s research interests include analytics, AI, and scalable data platforms. He has supervised more than 20 doctoral students, and his research papers have been selected as top papers at several leading database venues, including SIGMOD and VLDB. He is a member of the AAAS, ACM and IEEE. Additionally, he received teaching awards at the University of Wisconsin and the University of Michigan, where he was previously a professor. He is keenly interested in technology transfer from university research and has created four startups from his research group. You can contact the author by email.