VTeam AI

Beginner Tutorials

LangChain for SQL Generation, Fact-Checking & Rewriting

In this LangChain blog post, we explore intriguing use cases: automatic SQL generation, fact-checking, and rewriting. Leveraging the capabilities of Large Language Models (LLMs), LangChain makes such cutting-edge applications possible.

We have already covered a few use cases around LangChain and how it can be used to develop numerous apps with LLMs. Extending this LangChain blog series, this time we will cover three very interesting use cases:

  1. Automatic SQL generation
  2. Fact-Checking, i.e. handling LLM hallucinations
  3. Rewriting

If you have missed the previous posts, we have got you covered here. Before we start, here is a brief recap of what LLMs and LangChain are.

LLMs: A Large Language Model (LLM) is an advanced machine learning model that employs deep learning techniques and massive datasets to comprehend, summarize, generate, and predict content in natural language. LLMs, such as OpenAI's GPT series, are known for their ability to perform various natural language processing tasks, including text generation, classification, question answering, and translation. These models are trained on vast amounts of text data, often sourced from the internet, enabling them to understand language patterns and relationships.

LangChain: LangChain is a robust open-source framework designed for developing applications powered by large language models (LLMs). It distinguishes itself by its data-awareness and agentic nature, enabling connections to various data sources. LangChain facilitates advanced applications such as chatbots, generative question answering, summarization, and more by allowing the chaining of different components.


Tutorial 1: Automatic SQL Generation

In this tutorial, we will learn how to connect a database to LangChain and automatically create queries over its tables. For this, we will use the publicly available Chinook sample database, which comprises the tables below (not jumping into table schemas for now):

  1. Employees: Holds employee details, including ID, names, and reporting structure.
  2. Customers: Stores customer data with IDs, names, and contact info.
  3. Invoices and invoice_items: These tables manage invoice information and line items.
  4. Artists: Contains artist data, including IDs and names.
  5. Albums: Stores album data linked to artists.
  6. Media_types: Holds information on media formats like MPEG and AAC audio.
  7. Genres: Represents music genres like rock, jazz, and metal.
  8. Tracks: Contains data about individual songs, associated with albums.
  9. Playlists and playlist_track: Manages playlists and track associations in a many-to-many relationship.

As usual, we will start with the required libraries

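The original code screenshots are not reproduced here, so the snippets that follow are minimal sketches. This one assumes an OpenAI API key is set in the environment and an older LangChain release where SQLDatabaseChain is importable from the top-level package (in newer releases it has moved to langchain_experimental.sql):

```python
# Required libraries for the SQL-generation tutorial.
# Assumes an older LangChain release; in newer ones, SQLDatabaseChain
# lives in langchain_experimental.sql instead.
from langchain import OpenAI, SQLDatabase, SQLDatabaseChain
```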

Next, we need a connection to the Chinook DB

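A sketch of the connection, assuming the Chinook SQLite file is saved locally as Chinook.db (the file path is an assumption):

```python
# Connect to the Chinook sample database (assumed to be at ./Chinook.db).
db = SQLDatabase.from_uri("sqlite:///Chinook.db")
```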

Loading the LLM object and creating the SQL chain

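A sketch of loading the LLM and wiring it to the database in a SQL chain; temperature=0 keeps the generated SQL deterministic, and verbose=True prints the intermediate SQL:

```python
# Load the LLM (requires OPENAI_API_KEY) and create the SQL chain over the database.
llm = OpenAI(temperature=0)
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
```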

Time to run some examples

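The exact questions from the original screenshots are not reproduced, so these two queries are illustrative; the chain converts each natural-language question into SQL, executes it against Chinook, and returns the answer:

```python
# Ask natural-language questions; the chain writes and runs the SQL for us.
print(db_chain.run("How many employees are there?"))
print(db_chain.run("Which artist has the most albums?"))
```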

You can see the output produced by LangChain, which can be used directly. You can try out more complex queries as well.

How is Automatic SQL Generation useful?

Automatic SQL query generation offers substantial benefits in various contexts. It streamlines complex query creation, saving time and effort for database developers and analysts. This approach enhances the efficiency and accuracy of querying, enabling faster data retrieval and analysis. Additionally, it helps in the systematic testing of database engines by generating syntactically and semantically correct SQL queries. In the field of clinical studies, automatic SQL generation interfaces enhance data analysis transparency and validity, facilitating the extraction of patient cohorts that meet specific clinical criteria. Overall, automated SQL query generation accelerates development tasks, reduces errors, and improves data-driven decision-making processes.

Tutorial 2: Fact Checking

Many folks have recently reported LLMs hallucinating, i.e. giving plausible-sounding but vague or incorrect answers. So how can fact-checking help? You give it a question or a statement of fact, and the fact-checking mechanism can point out problems with it and correct them. This can also be used in a pipeline, where the output of one LLM is fed to another to check for hallucinations.

But why does an LLM hallucinate?

Hallucinations arise because the models prioritize coherence and context over factual accuracy. They occur when LLMs produce information that might sound plausible but is not supported by the actual data. Addressing LLM hallucinations is crucial due to ethical concerns and the potential for misinformation. Mitigation techniques include refining training data and encouraging accurate output while reducing false or misleading information.

Sounds great!

Let's start the demo by importing the libraries

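A minimal sketch of the imports, assuming LangChain's built-in LLMCheckerChain is the fact-checking chain being demonstrated:

```python
# Imports for the fact-checking demo.
from langchain.llms import OpenAI
from langchain.chains import LLMCheckerChain
```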

Load an LLM and create a checker chain

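A sketch of loading the LLM and wrapping it in a checker chain; verbose=True prints the intermediate steps in which the chain lists and verifies its own assertions:

```python
# Load the LLM and create the checker chain around it.
llm = OpenAI(temperature=0.7)
checker_chain = LLMCheckerChain.from_llm(llm, verbose=True)
```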

Time for testing

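The input used in the original screenshots is not reproduced; the question below is an illustrative stand-in (a classic trap, since very few mammals lay eggs at all). The chain drafts an answer, lists the assertions behind it, checks each one, and returns a corrected response:

```python
# Feed a tricky question; the chain verifies its own assertions and revises the answer.
question = "What type of mammal lays the biggest eggs?"
print(checker_chain.run(question))
```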

As you can observe, the fact-checker was able to point out issues with the input text.

How can a Fact-Checker be useful?

A Fact-Checking AI app can be incredibly useful in the battle against misinformation and fake news. It has the ability to quickly and accurately assess the accuracy of claims, news articles, social media posts, and other types of content. By leveraging AI algorithms, these apps can analyze vast amounts of information and cross-reference them with reliable sources to determine the veracity of the claims. This capability empowers individuals, journalists, and organizations to make informed decisions and prevent the spread of false information. Automated fact-checking systems not only save time and resources but also contribute to a more informed and trustworthy information landscape by promoting accuracy and credibility.

Moving a step ahead, in our next tutorial, we will use LangChain to rewrite an incorrect paragraph.

Tutorial 3: Rewriting

In this section, we will feed in an incorrect paragraph and, using LangChain, rewrite it.

Importing libraries

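A minimal sketch of the imports, assuming LangChain's LLMSummarizationCheckerChain is used to check and rewrite the paragraph:

```python
# Imports for the rewriting demo.
from langchain.llms import OpenAI
from langchain.chains import LLMSummarizationCheckerChain
```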

Load the LLM and create a summarization checker

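A sketch of creating the chain; max_checks bounds the number of verify-and-rewrite passes, and verbose=True shows each pass:

```python
# Load the LLM and build the summarization checker chain.
llm = OpenAI(temperature=0)
checker_chain = LLMSummarizationCheckerChain.from_llm(llm, verbose=True, max_checks=2)
```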

Input paragraph

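The paragraph from the original post is not reproduced here; the text below is a hypothetical stand-in containing deliberate factual errors:

```python
# Hypothetical input paragraph with deliberate factual errors
# (a stand-in for the original example, which is not reproduced here).
text = (
    "The Greenland shark is the fastest fish in the ocean and can live for over "
    "40,000 years. It is mostly found on tropical coral reefs, where it feeds "
    "almost exclusively on seaweed."
)
```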

Let's rewrite this incorrectly written paragraph

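Running the chain on the paragraph; it extracts the factual claims, verifies each of them, and returns a rewritten version with the incorrect statements fixed:

```python
# The chain checks every claim in the paragraph and returns a corrected rewrite.
print(checker_chain.run(text))
```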

As you can see, the whole paragraph has been rewritten!

In summary, this exploration of LangChain has been an enlightening journey. The tutorials have shown the potential of LangChain, a versatile framework built around the capabilities of Large Language Models (LLMs). Its flexible, composable nature lets us craft sophisticated language model applications with ease. Grasping fundamentals such as components, chains, prompt templates, and agents enables us to build inventive solutions for chatbots, question answering, summarization, and beyond. As we delve deeper into LangChain, we are equipped to harness the power of LLMs, paving the way for groundbreaking applications in natural language processing. This journey is only the beginning, and with LangChain as our ally, the possibilities are boundless. That's all for today. See you soon with another exciting post on LangChain utilities.

Tweets:

  1. Generating SQL queries has never been easier. Say goodbye to manual coding and hello to efficiency! Check out our latest post ASAP.
  2. Fact-checking made seamless with LangChain! Correcting inaccuracies in articles has never been this effortless. Check out our blog on how to do it.
  3. Unlock the potential of LangChain's checker chains! Managing LLM hallucinations is a breeze with this innovative tool.


Disclaimer: The views and opinions expressed in this blog post are solely those of the authors and do not reflect the official policy or position of any of the mentioned tools. This blog post is not a form of advertising and no remuneration was received for the creation and publication of this post. The intention is to share our findings and experiences using these tools and is intended purely for informational purposes.