Database development has become more complex over the years due to the introduction of new data and storage formats by big data. This along with modern applications that often depend on multiple databases and APIs, plus newer data analytics and data science needs have raised the bar for database infrastructure skills.”
Fortunately, AI SQL code generators can automate many repetitive tasks and help you simplify database development processes that continually get more complex. AI SQL code generators not only generate SQL, but also explain complex SQL queries and fix errors. They are not perfect in all cases, of course, but they are useful to give help and inspiration, as long as their code can be tested.
In this article, you will learn what AI SQL code generators are in detail and their benefits. In addition, you will get some examples of a SQL generator being used to fix issues and generate SQL queries using AI tools. We will use BlazeSQL because it has built-in domain knowledge curated specifically for database development, because it has been trained specifically on database code. This gives it a deeper understanding of database concepts such as normalization, performance tuning, etc; making it more effective than a general tool like ChatGPT because it specializes in database techniques.
BlazeSQL has a local client (Windows desktop app) and a web client that does not need an installation process. All you need is to connect your database to the BlazeSQL platform. This article use the web client in examples.
Note that you can use BlazeSQL in a few manners, and direct access to the data in your database is optional and how it works varies based on the model that you use. For more information about the tool uses and processes your data, refer to BlazeSQL Privacy Policy. Note that it is imperative that make sure you have the rights and permission to attach to the database in this manner. Many companies have policies that would prohibit allowing access to a database that could have any personal data in them, from any external tool in this manner.
What are SQL Code Generators?
SQL code generators have existed for a long time. Using the metadata of a database, code generators can produce queries that do all sorts of operations for you very quickly. However, before AI was introduced into the process, the programmer basically specified the exact process they wanted, and the tables and columns involved. Very useful, but often quite tedious.
AI based SQL code generators take that to the next level in that they take a natural language text input and the same schema metadata to automate the creation of SQL queries. AI SQL code generators use machine learning models and are implemented on top of large language models (LLM) and generative AI such as GPT 4 and Claude. They are completely powered by these large language models.
Generative AI uses transformer algorithms and neural networks to predict the next SQL query part. The difference between the current generative AI and the traditional AI is that traditional AI analyzes datasets to make predictions while the current AI analyzes the dataset to produce original content related to the dataset.
AI SQL code generators mix algorithms and machine learning techniques to generate the requested output. They follow the steps below:
- Input Interpretation
- Schema Analysis
- Query Synthesis
- Optimization
To make AI SQL code generators better than code generators such as ChatGPT, AI SQL code generators are trained primarily on large amounts of SQL codebases. This fine-tunes and refines them to understand database schemas, relationships, and specific SQL syntax, providing more correct and relevant results compared to a general-purpose model like ChatGPT.
How to Boost Database Development Efficiency Using AI SQL Code Generators
In this article, we will use BlazeSQL to explore AI SQL code generators. It is an AI SQL generator launched in 2023 by Justus Mulli. offers the following features:
- Generate SQL queries from natural language.
- Explain complex queries in plain English.
- Optimize your queries for better performance.
- Connect to your database and run queries. BlazeSQL supports and connects to both SQL and NoSQL databases.
BlazeSQL can find corrupt data in a database or any issue that will cause a query to fail. Issues such as duplicate IDs and incomplete metadata such as a database that does not specify foreign keys. BlazeSQL does not only generate SQL queries it is also able to find errors and explain the root cause of the errors. If there are any missing keys or components in your database BlazeSQL will prompt you to add them before generating a query. You will learn more about this in the next section “Correcting Faulty Code” about how BlazeSQL is able to catch errors.
To connect a database to BlazeSQL navigate to the BlazeSQL app and click on the Database button on the left pane and then click add new. If you don’t have a database demo you can get a demo database for postgres or SQL Server (script for SQL Server) on the SQL server samples GitHub repository.
Next, add the requested details and click Connect when you are done.
The first way of boosting your database development process is by generating queries using AI. BlazeSQL offers a text-to-query feature. This feature allows you to generate a query by providing a prompt in English. For example, you can provide a prompt that asks BlazeSQL to generate a query that extracts the price of a book called “How to Code” from your database. You can also generate a database index using a prompt in English.
BlazeSQL is suitable for teamwork as it allows you to share your work with other teammates and preserve the generated queries. The query editor has a feature that allows you to share the generated output with other developers.
Even though BlazeSQL does not have a large community of users, it provides good user support and thrives to make the UI easy to use.
The screenshot below shows the BlazeSQL chatbot. The chatbot allows you to type in prompts to generate SQL queries and fix issues. On the left pane, there is the database section that shows you the connected database; you will get to see all available records and tables you can analyze in this section.
It is good at introspecting data trends and finding critical points from provided data sets that help you with business intelligence tasks. It helps you make data-driven decisions.
Generating SQL Queries
BlazeSQL scans and analyzes your connected database to generate queries, make sure you have connected your database to BlazeSQL before trying to generate queries. Once you have connected to the database you can start generating queries. For example, I have connected a demo database to BlazeSQL. This database has records about product details, employee details, and distribution regions of a store.
When BlazeSQL is connected to your database, you can retrieve insights from your database such as the most expensive product and which product has the lowest or declining sales. So, if I want to know which product is consistently delivered on time, I will use the following prompt:
Prompt: What is the percentage of orders that are shipped on time?
BlazeSQL gives the following output:
To calculate the percentage of orders that are shipped on time, we need to compare the shipped_date
with the required_date
in the orders table. Here’s the query to get this information:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
WITH on_time_orders AS ( SELECT COUNT(*) AS on_time_count FROM public.orders WHERE shipped_date <= required_date ), total_orders AS ( SELECT COUNT(*) AS total_count FROM public.orders ) SELECT (on_time_orders.on_time_count::decimal / total_orders.total_count) * 100 AS on_time_percentage FROM on_time_orders, total_orders; |
This will return the result and a friendly reminder that it can give you more information:
Next, let’s input a prompt that will ask BlazeSQL to create a query that sorts customers to give the most paying customer.
Second prompt: Which customers spent the most money?
BlazeSQL gives the following output:
To decide which customers spent the most money, we need to calculate the total amount spent by each customer. This can be done by joining the orders and order_details
tables, and then summing up the total cost for each customer. Here’s the query to “get the top 3 customers who spent the most money”:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
SELECT c.customer_id, c.company_name, SUM(od.unit_price * od.quantity * (1 - od.discount)) AS total_spent FROM public.customers c JOIN public.orders o ON c.customer_id = o.customer_id JOIN public.order_details od ON o.order_id = od.order_id GROUP BY c.customer_id, c.company_name ORDER BY total_spent DESC LIMIT 3; |
The output of this will be:
Explaining Code
Besides generating code, BlazeSQL can take in your query that you need more details and generate comprehensive details about how the query works and its components. Below is an example of the output that I got from BlazeSQL when I asked it to explain the code:
The image below is the continuation of the output from the above image.
Correcting Faulty Code
If you get an error that is caused by components that you do not understand, you can ask BlazeSQL to explain them. You can also input an existing query into BlazeSQL to optimize its performance or fix syntax errors in the query and generic query errors. BlazeSQL has been trained to spot faults on your SQL queries such as:
- Incorrect syntax and semantics
- Faulty logic
- Incomplete or missing components
This feature is good for developers trying to learn SQL, especially those learning on their own without a tutor or mentor. Below is a SQL query that has faults and has been sent to BlazeSQL.
1 2 3 4 5 6 7 8 |
SELECT customer_id, first_name, last_name FROM Customers WHERE total_spent > 1000 JOIN Orders ON Customers.customer_id = Orders.customer_id; |
BlazeSQL was able to notice that the query has the following issues:
- The
JOIN
clause should be placed before theWHERE
clause. - The
total_spent
column does not exist in theCustomers
table. You need to calculate it by joining theOrders
table and summing the order amounts. - The
GROUP BY
clause is needed to aggregate the total spent per customer.
After giving you a report of the issues BlazeSQL generates the corrected code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
SELECT c.customer_id, c.first_name, c.last_name FROM Customers c JOIN Orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.first_name, c.last_name HAVING SUM(o.total_amount) > 1000; |
This query joins the Customers
and Orders
tables, groups the results by customer, and filters for customers who have spent more than $1000.
Where Do AI SQL Code Generators Stand at The Moment?
In short, AI SQL code generators don’t have what it takes to replace database developers. They fit in as productivity boosters in a database developer’s workflow. Integrating AI SQL code generators into your database development workflow can lead to significant efficiency improvements.
Code created by code generators such as BlazeSQL has to be checked to ensure that it is relevant and accurate. As large language models improve and get faster, code generators will get better at creating code that is well optimized. The large language model growth has been exponential for the past two years. We should anticipate much better SQL output from SQL output generators.
Challenges and Limits of AI SQL Code Generators
Even though the benefits of the AI SQL code generators are performance boosters they still have their own downside such as:
- Potential errors: While AI systems can be highly accurate, they are not infallible. There’s always a risk of generating incorrect queries, especially for edge cases or unusual requests. This can cause delays as you will have to test the code first.
- Security concerns: AI-generated queries can expose crucial credentials such as passwords when not managed properly. It is important to ensure that the queries generated by AI are checked for any security vulnerabilities such as SQL injection.
- Limited to training data: Any AI model gives a correct solution to an existing and well-documented problem. This means that if you have a unique or new problem that the AI SQL code generator doesn’t know, it won’t give the best query optimization solution. If the training data doesn’t cover certain types of queries or database structures, the AI may struggle with such tasks. The quality of the SQL code generator is highly dependent on the quality of the data used to train the AI SQL code generator.
Conclusion
The future of software development is AI-driven. Currently AI is helping us boost coding performance by 10% or more; with the potential for exponential growth as these technologies evolve. Soon enough we will likely move from 10% to 40% and so forth. In the end, developers are destined to solve hardcore complex problems that have high rewards, in far less time that is required today.
Load comments