Data analytics has no shortage of AI tools, which makes it difficult to determine what is available and what might be useful when it comes to analyzing huge data sets. Let’s try to clear up at least some of the confusion by looking at some AI tools that work within existing data analysis workflows.
Chatbots and virtual assistants
Chatbots are ubiquitous in any type of programming work these days, especially the newer generative AI chatbots (such as ChatGPT). These tools can offer assistance with data analysis in several ways. For example, they can offer coding advice: if you write Python code, you can copy it into the chatbot and ask for help.
Chatbots can also be used for education and training purposes, for example to suggest which tools are best suited for a particular job. Here are some general-purpose chatbots popular among data people and coders:
- ChatGPT: Of course, everyone knows this one. While trained on a whole world of knowledge, ChatGPT also excels in coding and data analysis. Newer versions of ChatGPT (starting with 4) can perform data analysis by writing Python code and then running that code.
- Gemini: It’s Google’s answer to ChatGPT, and it’s arguably just as good. You can provide it with data that it will analyze.
- GitHub Copilot: It is a plugin tool that adds chat and AI code suggestions to coding IDEs such as Visual Studio Code. Although based on the technology behind ChatGPT, Copilot has additional training in different coding languages and technologies.
- Amazon Q: Amazon Q is a comprehensive set of products, but at the heart of it is generative AI for assistance with various tasks. Amazon has integrated Q into several of its services, including those related to data. For example, Amazon QuickSight lets you use natural (i.e. human) language to analyze data and even generate dashboards.
Data cleaning tools
One of the biggest problems data analysts face almost daily is having to deal with messy data. As any data analyst can tell you, data can come from multiple sources in multiple formats and is not always reliable.
Using statistical tools, data analysts become experts at “cleaning” data by removing bad data or correcting it. For example, if the data analyst analyzes gasoline prices and finds a gas station that charged $0.03 per gallon, this is likely an error that needs to be corrected or removed; otherwise, it would completely skew all results.
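As a rough illustration of the kind of statistical check involved, here is a minimal sketch that flags suspicious prices using the modified z-score (based on the median and the median absolute deviation). The price list, the function name flag_outliers and the 3.5 threshold are assumptions chosen for this example, not something taken from any of the tools discussed here.

```python
from statistics import median

def flag_outliers(prices, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    med = median(prices)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(p - med) for p in prices)
    if mad == 0:
        return []  # no measurable spread around the median; nothing to flag
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [p for p in prices if abs(0.6745 * (p - med) / mad) > threshold]

# Hypothetical per-gallon prices; 0.03 mimics the data-entry error described above.
gas_prices = [3.41, 3.55, 3.49, 0.03, 3.62, 3.47, 3.58]
suspect = flag_outliers(gas_prices)
cleaned = [p for p in gas_prices if p not in suspect]
print("Suspect values:", suspect)
print("Cleaned data:", cleaned)
```

A median-based score is used here rather than the mean and standard deviation because a single extreme value such as $0.03 would distort the mean itself. Whether a flagged value should be corrected or removed remains a judgment call for the analyst.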
Data cleansing can be a long and tedious process, even though it is an integral part of the job. Fortunately, there are now AI tools that greatly help locate problematic data and process it automatically. Here are some tools you’ll want to learn that use AI to clean data. (Note that these are premium tools, not free.)
- Trifacta: It’s one of the biggest names in the industry and was recently acquired by a company called Alteryx.
- Dataiku: This product uses generative AI to help you clean your data.
Generate sample datasets
When learning data analysis or testing data applications, analysts need data samples to work with, ideally as realistic as possible. There are websites that provide sample data, but these datasets can be filled with redundant data that can disrupt your tests. On the other hand, manually constructing hundreds of thousands of rows of data can be too time consuming.
This is where AI data generation tools can help, including:
- Tonic.ai: This product bills itself as “real fake data”; its website describes the output as synthetic data.
- Mockingbird: This is a free tool that uses AI to generate sample data.
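To make the idea of generated sample data concrete, here is a minimal sketch that builds fake sales records with Python's standard library. It uses plain random sampling, not AI, and it is not how the tools above work internally. The column names, value ranges and function names are all hypothetical.

```python
import csv
import io
import random

def make_sample_rows(n, seed=42):
    """Generate n fake sales records with plausible-looking values."""
    rng = random.Random(seed)  # seeded so the same sample can be regenerated
    regions = ["North", "South", "East", "West"]  # hypothetical categories
    rows = []
    for order_id in range(1, n + 1):
        rows.append({
            "order_id": order_id,
            "region": rng.choice(regions),
            "units": rng.randint(1, 20),
            "unit_price": round(rng.uniform(5.0, 50.0), 2),
        })
    return rows

def to_csv(rows):
    """Serialize the rows to CSV text, header line first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

sample = make_sample_rows(1000)
csv_text = to_csv(sample)
print(csv_text[:200])  # preview the first few lines
```

Seeding the generator keeps test runs repeatable. Dedicated tools go further than this sketch by making the generated values resemble the patterns of real production data.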
Tell stories
After analyzing the data, analysts typically present a story to stakeholders explaining their findings. This storytelling includes written stories, visual elements such as charts and graphs, and recommendations. AI tools exist for all three aspects of storytelling.
Power BI and Tableau, which we mention in the next section, now offer these capabilities. But here are a few others you’ll want to explore:
- Wordsmith: It is a product from Automated Insights that uses AI for natural language generation.
- Data Bot: It is a comprehensive suite of tools that includes a sophisticated system that uses AI for storytelling.
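For a sense of what turning numbers into a written story means at its simplest, here is a template-based sketch that produces one sentence from two figures. It is a deliberately basic stand-in and does not represent how the products above generate text. The function name, the metric and the figures are hypothetical, and the sketch assumes the previous value is not zero.

```python
def summarize(metric, previous, current):
    """Describe the change in a metric as a single plain-English sentence."""
    change = current - previous
    if change == 0:
        return f"{metric} held steady at {current:,.0f}."
    pct = abs(change) / previous * 100  # assumes previous is non-zero
    direction = "rose" if change > 0 else "fell"
    return f"{metric} {direction} {pct:.1f}% to {current:,.0f} from {previous:,.0f}."

# Hypothetical figures for illustration only.
print(summarize("Monthly revenue", 120000, 138000))
```

AI-based tools replace the fixed template with generated language, but the input is the same: computed findings that need to be phrased for stakeholders.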
AI add-ons for existing tools
Data analysts have several tools that they use regularly. Let’s review some of the AI additions to these tools.
Power BI: Microsoft has added its Copilot AI features to Power BI, and it can help you in several different ways. For example, Copilot can help write Data Analysis Expressions (DAX) queries. (Pro tip: You’ll still want to learn as much as you can about DAX, but Copilot can help you work faster here.) Copilot can also perform an analysis of your dataset and provide you with a quick summary, and even provide tables and graphs in the summary.
Tableau: Tableau now includes Tableau Agent, a chatbot that interacts directly with Tableau via a prompt. For example, you can ask it questions about the data and it will answer them, and it can also create tables and graphs. Tableau Agent’s marketing materials say it’s good for new analysts; however, we would say that experienced analysts will want to get familiar with it too, as these plugins are quickly becoming the standard for all levels, not just beginners.
Excel: Microsoft has added its Copilot AI technology to the entire Microsoft Office suite, which includes Excel. In this case, Copilot includes a chatbot that can also interact with and manipulate your sheets. For example, you can ask it to create a bar chart based on data from a particular set of columns on a sheet.
Google Sheets: Google now includes its Gemini AI as an optional premium add-on for Google Workspace, including Gmail, Docs and Sheets. Since we’re looking at data here, let’s just focus on Sheets. As you might expect, you can ask Gemini (via a prompt) for information contained in your sheets, as well as use it to generate sheets. And as with Copilot and Excel, it’s probably more suited to less technical people.
Jupyter notebooks (including Google Colab): Jupyter Notebooks now has an extension called Jupyter AI: the chatbot is called “Jupyternaut” and, like the other tools mentioned here, can understand your prompts and interact with your notebooks. Since Jupyter is like an IDE, you can highlight the code in a notebook and have Jupyternaut analyze it for you. You can ask it questions about your code and then ask it to modify your code with updates or even add comments. You can even ask it to create new notebooks for you that meet a particular purpose.
For example, in the blog linked above there is a demonstration of how you can ask it to create a notebook showing how to use Matplotlib. Things like this can be useful for learning how to use Jupyter and the different Python libraries. You can also choose which LLM it should interact with; this is an advanced and valuable feature, especially if you work for a company that has developed its own LLMs.
Note: The Jupyter developers have gone to great lengths to ensure that this plugin works on different Jupyter implementations, including Google Colab.
MATLAB: This one is a bit strange, as you’ll soon see, but we’re including it here for completeness. MATLAB, from MathWorks, is a computing and mathematics engine first created in the late 1970s. MATLAB has continued to grow over the decades and is popular today in many fields, including data analysis.
Today, MATLAB offers AI tools and capabilities, including those that help you create and manage AI models and integrate those models into your code, while also helping you develop data workflows.
Other tools
While you’re at it, you can explore lesser-known tools to see how they can help you. (And it also helps the industry grow by giving smaller names a chance.) Here are a few we looked at:
- Data Cat: This product is billed as a “no code” analytics platform that uses AI to help you analyze your data. It lets you ask questions about your data and then uses generative AI for analysis and answers. It can also work closely with Google’s BigQuery data warehouse system, as well as HubSpot data.
- Zoho: Zoho has offered an online office suite for many years, including a spreadsheet tool. Recently, they added generative AI features, including an assistant with the clever name Zia.
- Domo: Domo has been around since 2010 and provides cloud-based tools for business intelligence and data visualization. Its new AI tools include a chatbot for questions about your data; it will provide rapid answers and produce summaries of even the most complex data sets.
Conclusion
Today’s data analysts are needed more than ever. Data analysis positions are increasing, and specialists in these roles are tasked with incredibly complex projects. Given the pressure of work, it is essential that data analysts learn and potentially master as many AI tools as possible, as this will allow them to become more productive.