Runnable Interface
The Runnable interface is foundational for working with LangChain components, and it's implemented across many of them, such as language models, output parsers, retrievers, compiled LangGraph graphs, and more.
This guide covers the main concepts and methods of the Runnable interface, which allows developers to interact with various LangChain components in a consistent and predictable manner.
- The "Runnable" Interface API Reference provides a detailed overview of the Runnable interface and its methods.
- A list of built-in Runnables can be found in the LangChain Core API Reference. Many of these Runnables are useful when composing custom "chains" in LangChain using the LangChain Expression Language (LCEL).
Overview of Runnable Interface
The Runnable interface defines a standard set of methods that allow a Runnable component to be:
- Invoked: A single input is transformed into an output.
- Batched: Multiple inputs are efficiently transformed into outputs.
- Streamed: Outputs are streamed as they are produced.
- Inspected: Schematic information about a Runnable's input, output, and configuration can be accessed.
- Composed: Multiple Runnables can be composed to work together using the LangChain Expression Language (LCEL) to create complex pipelines.
Please review the LCEL Cheatsheet for some common patterns that involve the Runnable interface and LCEL expressions.
Optimized Parallel Execution (Batch)
LangChain Runnables offer built-in `batch` (and `batch_as_completed`) APIs that allow you to process multiple inputs in parallel.
Using these methods can significantly improve performance when you need to process multiple independent inputs, as the processing happens in parallel instead of sequentially.
The two batching options are:
- `batch`: Process multiple inputs in parallel, returning results in the same order as the inputs.
- `batch_as_completed`: Process multiple inputs in parallel, returning results as they complete. Results may arrive out of order, but each includes the input index for matching.
The default implementations of `batch` and `batch_as_completed` use a thread pool executor to run the `invoke` method in parallel. This allows for efficient parallel execution without requiring users to manage threads, and speeds up code that is I/O-bound (e.g., making API requests, reading files, etc.). It will not be as effective for CPU-bound operations, as Python's GIL (Global Interpreter Lock) prevents true parallel execution.
Some Runnables may provide their own implementations of `batch` and `batch_as_completed` that are optimized for their specific use case (e.g., relying on a batch API provided by a model provider).
The async versions, `abatch` and `abatch_as_completed`, rely on asyncio's `gather` and `as_completed` functions to run the `ainvoke` method in parallel.
When processing a large number of inputs using `batch` or `batch_as_completed`, users may want to control the maximum number of parallel calls. This can be done by setting the `max_concurrency` attribute in the `RunnableConfig` dictionary. See the RunnableConfig documentation for more information.
Chat Models also have a built-in rate limiter that can be used to control the rate at which requests are made.
Asynchronous Support
Runnables expose an asynchronous API, allowing them to be called using the `await` syntax in Python. Asynchronous methods can be identified by the "a" prefix (e.g., `ainvoke`, `abatch`, `astream`, `abatch_as_completed`).
Please refer to the Async Programming with LangChain guide for more details.
Streaming APIs
Streaming is critical in making applications based on LLMs feel responsive to end-users.
Runnables expose the following three streaming APIs:
- Sync `stream` and async `astream`: yields the output of a Runnable as it is generated.
- Async `astream_events`: a more advanced streaming API that allows streaming intermediate steps and final output.
- Async `astream_log`: a legacy streaming API that streams intermediate steps and final output.
Please refer to the Streaming Conceptual Guide for more details on how to stream in LangChain.
Input and Output Types
Every `Runnable` is characterized by an input and output type. These input and output types can be any Python object, and are defined by the Runnable itself.
Runnable methods that result in the execution of the Runnable (e.g., `invoke`, `batch`, `stream`, `astream_events`) work with these input and output types.
- `invoke`: Accepts an input and returns an output.
- `batch`: Accepts a list of inputs and returns a list of outputs.
- `stream`: Accepts an input and returns a generator that yields outputs.
The input type and output type vary by component:
| Component | Input Type | Output Type |
|---|---|---|
| Prompt | dictionary | PromptValue |
| ChatModel | a string, list of chat messages, or a PromptValue | ChatMessage |
| LLM | a string, list of chat messages, or a PromptValue | String |
| OutputParser | the output of an LLM or ChatModel | Depends on the parser |
| Retriever | a string | List of Documents |
| Tool | a string or dictionary, depending on the tool | Depends on the tool |
Please refer to the individual component documentation for more information on the input and output types and how to use them.