Build RAG in Minutes with GroundX

Enabling users to ask language models about their documents.

The Use Case

Lawyers sift through hundreds of pages of verbose legal documentation every day. In this article we'll be building a system which allows lawyers to understand documents faster and more thoroughly. We'll be using GroundX and OpenAI's GPT models to make it happen.

Why Language Models Won’t Work by Themselves

Language models can't communicate about topics they've never seen. If you're a lawyer, and you try to ask ChatGPT to help you understand a recent development in a case, the model will be unable to help. You could train a language model specifically on a case you want to understand, but this is cost prohibitive and incredibly difficult. On top of that, language models have a tendency to “hallucinate”. Sometimes language models will make up information which sounds plausible but has no basis in reality.

"Attorney Steven A. Schwartz first landed himself in hot water through his use of ChatGPT, which resulted in six fake cases being cited in a legal brief." — A famous example of hallucination in action. source

These issues might make our use case seem impossible. With GroundX it's not only possible, but easy. In the next section we’ll discuss a process called “Retrieval Augmented Generation” and GroundX's unique hallucination suppression features, which can help us fix both the issues of unknown information and hallucination.

Retrieval Augmented Generation

In-context learning is the ability of language models to learn new information on the fly as a result of a carefully constructed prompt. We can give a language model some information, called context, and then ask a question based on that information, all within a single prompt.

An example of in-context learning in action. If we provide information about a question, along with the question itself, we can construct better responses.

The idea behind retrieval augmented generation (RAG) is that, when a user asks a question, we can retrieve the relevant information for that question then inject it into the model as part of a prompt.

An example of RAG in action. A form of in-context learning where information from a document is provided as context in the prompt.

Passing a prompt and context to a language model is simple, but knowing what information to provide as context is tricky. In a large document, which contains a significant amount of information which is irrelevant to the prompt, it's important to retrieve only high quality contextually relevant information. That’s where GroundX comes in.

Retrieval with GroundX

The quality of a RAG system is all about the quality of the retrieved data. GroundX does a few things which sets it apart from other retrieval strategies.

  1. Improved Document Parsing: The formatting of documents can have a massive impact on the quality of retrieved information, and thus the quality of the final generated output. GroundX employs a fine tuned vision model, paired with a variety of parsing techniques, to extract quality information from a variety of document types and formats.
  2. Better Chunking: GroundX employs a proprietary dynamic chunker which identifies and preserves natural breaks within a document. This is a critical step which allows the right information to be fed to the model.
  3. Improved Database: GroundX employs a novel storage paradigm which preserves document context, allowing for improved retrieval. This means better retrieval of multiple chunks within multiple documents.
  4. Improved Context: GroundX provides a cocktail of contextual information, allong with the content of the document itself, which improves the output of generated responses.
  5. Better Search: GroundX employs a novel question analysis process which understands the core meaning of a query. This allows GroundX to provide higher quality chunks and outperform vector similarity.

Implementing a Legal Assistant with GroundX

Now that we know the theory, we can put it into practice. We’ll set up an account with GroundX, upload a document, and use it to create a RAG generated response. This tutorial will be done in Python, but GroundX supports a variety of languages which can be explored in the API reference docs.

Setting Up an Account

To follow this tutorial you'll need an API key. Let's set up a free account and get a key.

First, click "Free Trial" to create an account.
Then fill out the registration form.
This is your GroundX dashboard, we'll be using the default project in this tutorial. Navigate to "API Keys"
Copy the API key, we'll be using that to talk with GroundX via Python.

Setting up Credentials

In this example we’ll be using GroundX for document retrieval, and OpenAI for language generation. First we can install both dependencies:

Then define the API key we’ll be using for both services

Uploading a Document to a Bucket

To support projects including multiple documents, GroundX allows multiple documents to be uploaded to a “bucket”. First we can list all buckets on the account, then add our document to a bucket.

After a few moments we can check the status of the upload to see that it has been completed

The uploaded pdf document used in this example can be found here.

Retrieval

The Dobbs transcript we uploaded is from the fairly recent supreme court ruling which overturned Roe v. Wade. We can use GroundX to look through the transcript and retrieve information which might be useful to a prompt. In this case, the prompt is "summarize the decision".

Augmentation and Generation

In the previous section we retrieved a context based on our query. We can use that context, along with the prompt, to generate a response.

Examples

Below are a few examples of output based on a few queries.

Ex1) Summarization

Prompt: summarize the decision

Generated response using GroundX:

The U.S. Supreme Court, in the case of Thomas E. Dobbs, State Health Officer of the Mississippi Department of Health vs. Jackson Women's Health Organization, ruled that the U.S. Constitution does not grant a right to abortion. This shifts the landmark decisions of Roe v. Wade and Planned Parenthood of Southeastern Pennsylvania v. Casey, thereby returning the authority to regulate abortion back to the people and their elected representatives. Consequently, the Supreme Court effectively overruled Roe v Wade in this decision.

Justice Samuel Alito penned the majority opinion, with Justices Clarence Thomas and Brett Kavanaugh filing concurring opinions. Chief Justice Roberts gave an opinion that concurred with the judgment. Justices Stephen Breyer, Sonia Sotomayor, and Elena Kagan filed a dissenting opinion disagreeing with the Court's ruling. Key discussions in this case revolved around the principles of stare decisis, the judicial adherence to legal precedent, and whether a court's prior decision can be overturned if it was seen as egregiously incorrect at the outset. In this case, the principle was applied to the discussion of overturning Roe v. Wade and Planned Parenthood v. Casey.

Ex2) Outlining

Prompt: outline the main points of the decision

Generated response using GroundX:

1. The Supreme Court in the case of Dobbs v. Jackson Women's Health Organization overruled the landmark case of Roe v. Wade, determining that the U.S. Constitution does not safeguard the right to an abortion.

2. The Court determined that cases like Roe v. Wade and Planned Parenthood v. Casey should not stand as they do not have a basis in the Constitution. The text criticized the rulings for leading to the termination of millions of human lives and creating legal and interpretive challenges.

3. The Court sounded a need to relegate the power to regulate abortion back to the people through their elected representatives. The argument here against the Roe and Casey precedents emphasized the significance of the democratic process and the negative influence of these decisions on the same.

4. The decision illustrated that the Court has often overturned previous judgments, citing cases like Brown v. Board of Education (overruling the 'separate but equal' doctrine) and Lawrence v. Texas (ruling that states cannot criminalize homosexual conduct). The implication here is that, much like these cases, Roe and Casey can be reconsidered and overruled.

5. The decision involved a significant discussion on the principle of stare decisis (the practice of following legal precedents), revolving around whether a court decision could be overturned if viewed as egregiously wrong from the outset.

6. There was an acknowledgment of the societal and individual reliance on the right to abortion established by Roe and Casey. The decision recognized this 'reliance interest' but did not consider it a strong enough reason to not overturn the precedents.

7. Concerning the Mississippi law at the heart of the case, which prohibited abortions after 15 weeks except for certain exceptions, the Court's decision allowed for its enforcement.

Ex3) Filtered Outlining

Prompt: outline alito's arguments

Generated response using GroundX:

Justice Alito's arguments as depicted in this document center around the following key points:

1. Questioning Viability Standard: Justice Alito questions the viability standard established in the landmark precedent cases Roe v. Wade and Planned Parenthood v. Casey. He asks hypothetical questions about defending the viability line, challenging the respondents to explain why it should continue to be the focus of this issue if they were considering it for the first time.

2. Historical Basis for Abortion Rights: Justice Alito seeks to understand the historical underpinnings of abortion rights. Specifically, he asks if any state constitutional provision or judicial decision recognized abortion as a right, liberty, or immunity around 1868 when the Fourteenth Amendment was adopted.

3. Fetal Personhood and Viability Line: Justice Alito discusses the philosophical and legal considerations of when a fetus should be considered a person with rights, focusing particularly on the concept of viability. He seeks to understand a secular philosophical argument for viability being the appropriate line for determining personhood.

4. Overturning Precedents: Alito points to instances in the Court's history where it has overturned its own precedents in order to make progress on important issues.

5. Implications of Removing Viability Standard: Alito explores the potential consequences of eliminating the viability line, raising questions about the interests of the woman and the state once this line is crossed.

Justice Alito's arguments reflect a critical examination of the constitutional basis for abortion rights, the viability standard as the current determinant for legal abortions, and the possibility of overturning established precedent to change the status quo on abortions.

Conclusion

And that’s it! GroundX makes it easy to upload documents with complex formatting and query them seamlessly for a RAG application. Sign up now to start exploring GroundX!

Never miss an update.

Subscribe for spam-free updates and articles.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.