Text summarization is a valuable tool for quickly extracting key information from longer documents. This example demonstrates how to use ControlFlow to create a text summarization function that not only produces a concise summary but also extracts key points, all in a single pass.

Code

The following code creates a function that summarizes a given text and extracts key points. It runs a single ControlFlow task with cf.run and uses a Pydantic model to structure the output.

import controlflow as cf
from pydantic import BaseModel

class Summary(BaseModel):
    summary: str
    key_points: list[str]

def summarize_text(text: str, max_words: int = 100) -> Summary:
    return cf.run(
        f"Summarize the given text in no more than {max_words} words and list key points",
        result_type=Summary,
        context={"text": text},
    )

Let’s use this function to summarize a longer text:
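
A minimal sketch of one way to do this is shown here. SAMPLE_TEXT, SummaryLike, and show_summary are names introduced for illustration and are not part of ControlFlow. The helper takes the summarizer as an argument, so it can be tried with a stub before an LLM provider is configured.

```python
from typing import Callable, Protocol


class SummaryLike(Protocol):
    """Anything with the two fields of the Summary model (hypothetical helper type)."""

    summary: str
    key_points: list[str]


# Illustrative input only; any longer passage works.
SAMPLE_TEXT = (
    "Text summarization condenses a long document into a short one. "
    "Readers use summaries to decide whether the full text is worth their time. "
    "A list of key points complements the summary by isolating individual facts."
)


def show_summary(
    summarize: Callable[..., SummaryLike],
    text: str = SAMPLE_TEXT,
    max_words: int = 40,
) -> str:
    """Call the summarizer and format its result as plain text."""
    result = summarize(text, max_words=max_words)
    lines = [result.summary, "", "Key points:"]
    lines.extend(f"- {point}" for point in result.key_points)
    return "\n".join(lines)


# With summarize_text defined as above and an LLM provider configured:
#     print(show_summary(summarize_text))
```

The exact summary and key points depend on the model in use, so no sample output is given.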

Key concepts

This implementation relies on three ControlFlow features that keep the summarizer short:

  1. Structured outputs: We use a Pydantic model (Summary) as the result_type to define the structure of our output. This ensures that the summarization task returns both a summary and a list of key points in a well-defined format.

    class Summary(BaseModel):
        summary: str
        key_points: list[str]
    
    result_type=Summary
    
  2. Context passing: The context parameter passes the input text to the task. The maximum word count is not part of the context; it is written into the instruction itself (see the next item).

    context={"text": text}
    
  3. Dynamic instructions: We include the max_words parameter in the task instruction, allowing for flexible control over the summary length.

    f"Summarize the given text in no more than {max_words} words and list key points"
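
Because the word limit appears only in the instruction text, nothing in the function itself checks that the returned summary respects it. A minimal sketch of such a check follows, using whitespace-separated tokens as a rough approximation of words. word_count and within_limit are hypothetical helpers, not ControlFlow functions.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated tokens, a rough proxy for words."""
    return len(text.split())


def within_limit(summary: str, max_words: int = 100) -> bool:
    """Return True if the summary has at most max_words tokens."""
    return word_count(summary) <= max_words


# Possible use after calling summarize_text (the check is an addition, not
# something ControlFlow performs in this example):
#     result = summarize_text(text, max_words=50)
#     if not within_limit(result.summary, 50):
#         ...  # e.g. retry or truncate; the policy is up to the application
```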
    

With these three features, the whole summarizer fits in a single short function. ControlFlow handles prompting the model and returning the response as a Summary object, so the calling code receives typed fields rather than raw text. This makes it easier for developers to add language processing capabilities to their applications.

The use of a Pydantic model for the output is particularly noteworthy, as it allows us to define a clear structure for our summarization results. This structured output makes it easy to work with the summary and key points separately in downstream tasks or when presenting the information to users.
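
As an illustration of that separation, the sketch below turns a result into a JSON record using only the standard library. to_record and source_id are hypothetical names chosen for this example. The helper reads the two attributes directly, so it works with a Summary instance or with any object exposing the same fields.

```python
import json
from typing import Any


def to_record(result: Any, source_id: str) -> str:
    """Serialize the summary and key points as one JSON object."""
    record = {
        "source_id": source_id,  # hypothetical identifier for the summarized document
        "summary": result.summary,
        "key_points": list(result.key_points),
        "num_key_points": len(result.key_points),
    }
    return json.dumps(record, ensure_ascii=False)
```

Each field can then be stored or displayed independently, for example the summary as a preview and the key points as a bulleted list.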