Groq | 🦜️🔗 LangChain (2024)

LangChain supports integration with Groq chat models. Groq specializes in fast AI inference.

To get started, you'll first need to install the langchain-groq package:

%pip install -qU langchain-groq

Request an API key and set it as an environment variable:

export GROQ_API_KEY=<YOUR API KEY>

Alternatively, you may configure the API key when you initialize ChatGroq.
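If you prefer to set the key from within Python, a minimal sketch using the standard os and getpass modules looks like this:

import getpass
import os

# Prompt for the key at runtime if it isn't already set in the environment.
if "GROQ_API_KEY" not in os.environ:
    os.environ["GROQ_API_KEY"] = getpass.getpass("Enter your Groq API key: ")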

Here's an example of it in action:

from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq

chat = ChatGroq(
    temperature=0,
    model="llama3-70b-8192",
    # api_key=""  # Optional if not set as an environment variable
)

system = "You are a helpful assistant."
human = "{text}"
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])

chain = prompt | chat
chain.invoke({"text": "Explain the importance of low latency for LLMs."})

API Reference: ChatPromptTemplate | ChatGroq

AIMessage(content="Low latency is crucial for Large Language Models (LLMs) because it directly impacts the user experience, model performance, and overall efficiency. Here are some reasons why low latency is essential for LLMs:\n\n1. **Real-time Interaction**: LLMs are often used in applications that require real-time interaction, such as chatbots, virtual assistants, and language translation. Low latency ensures that the model responds quickly to user input, providing a seamless and engaging experience.\n2. **Conversational Flow**: In conversational AI, latency can disrupt the natural flow of conversation. Low latency helps maintain a smooth conversation, allowing users to respond quickly and naturally, without feeling like they're waiting for the model to catch up.\n3. **Model Performance**: High latency can lead to increased error rates, as the model may struggle to keep up with the input pace. Low latency enables the model to process information more efficiently, resulting in better accuracy and performance.\n4. **Scalability**: As the number of users and requests increases, low latency becomes even more critical. It allows the model to handle a higher volume of requests without sacrificing performance, making it more scalable and efficient.\n5. **Resource Utilization**: Low latency can reduce the computational resources required to process requests. By minimizing latency, you can optimize resource allocation, reduce costs, and improve overall system efficiency.\n6. **User Experience**: High latency can lead to frustration, abandonment, and a poor user experience. Low latency ensures that users receive timely responses, which is essential for building trust and satisfaction.\n7. **Competitive Advantage**: In applications like customer service or language translation, low latency can be a key differentiator. It can provide a competitive advantage by offering a faster and more responsive experience, setting your application apart from others.\n8. **Edge Computing**: With the increasing adoption of edge computing, low latency is critical for processing data closer to the user. This reduces latency even further, enabling real-time processing and analysis of data.\n9. **Real-time Analytics**: Low latency enables real-time analytics and insights, which are essential for applications like sentiment analysis, trend detection, and anomaly detection.\n10. **Future-Proofing**: As LLMs continue to evolve and become more complex, low latency will become even more critical. By prioritizing low latency now, you'll be better prepared to handle the demands of future LLM applications.\n\nIn summary, low latency is vital for LLMs because it ensures a seamless user experience, improves model performance, and enables efficient resource utilization. By prioritizing low latency, you can build more effective, scalable, and efficient LLM applications that meet the demands of real-time interaction and processing.", response_metadata={'token_usage': {'completion_tokens': 541, 'prompt_tokens': 33, 'total_tokens': 574, 'completion_time': 1.499777658, 'prompt_time': 0.008344704, 'queue_time': None, 'total_time': 1.508122362}, 'model_name': 'llama3-70b-8192', 'system_fingerprint': 'fp_87cbfbbc4d', 'finish_reason': 'stop', 'logprobs': None}, id='run-49dad960-ace8-4cd7-90b3-2db99ecbfa44-0')

You can view the list of available models in the Groq documentation.

Groq chat models support tool calling to generate output matching a specific schema. The model may choose to call multiple tools or the same tool multiple times if appropriate.

Here's an example:

from typing import Optional

from langchain_core.tools import tool


@tool
def get_current_weather(location: str, unit: Optional[str]):
    """Get the current weather in a given location."""
    return "Cloudy with a chance of rain."


tool_model = chat.bind_tools([get_current_weather], tool_choice="auto")

res = tool_model.invoke("What is the weather like in San Francisco and Tokyo?")

res.tool_calls

API Reference: tool

[{'name': 'get_current_weather',
  'args': {'location': 'San Francisco', 'unit': 'Celsius'},
  'id': 'call_pydj'},
 {'name': 'get_current_weather',
  'args': {'location': 'Tokyo', 'unit': 'Celsius'},
  'id': 'call_jgq3'}]
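To complete the loop, you can execute each requested tool call and pass the results back to the model as ToolMessage objects. Here is a minimal sketch; the message-passing pattern is standard LangChain usage rather than anything Groq-specific:

from langchain_core.messages import HumanMessage, ToolMessage

messages = [HumanMessage("What is the weather like in San Francisco and Tokyo?")]
ai_msg = tool_model.invoke(messages)
messages.append(ai_msg)

# Run each tool the model requested and report the result back by call id.
for tool_call in ai_msg.tool_calls:
    tool_output = get_current_weather.invoke(tool_call["args"])
    messages.append(ToolMessage(content=tool_output, tool_call_id=tool_call["id"]))

final_response = tool_model.invoke(messages)
print(final_response.content)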

.with_structured_output()

You can also use the convenience .with_structured_output() method to coerce ChatGroq into returning structured output. Here is an example:

from langchain_core.pydantic_v1 import BaseModel, Field


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10")


structured_llm = chat.with_structured_output(Joke)

structured_llm.invoke("Tell me a joke about cats")
Joke(setup='Why did the cat join a band?', punchline='Because it wanted to be the purr-cussionist!', rating=None)

Behind the scenes, this takes advantage of the above tool calling functionality.
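If you also want access to the underlying AIMessage (for example, to inspect token usage or the raw tool call), with_structured_output() accepts include_raw=True. A brief sketch:

structured_llm_raw = chat.with_structured_output(Joke, include_raw=True)
result = structured_llm_raw.invoke("Tell me a joke about cats")
result["parsed"]  # the Joke instance (or None if parsing failed)
result["raw"]     # the raw AIMessage, including response metadata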

Async

chat = ChatGroq(temperature=0, model="llama3-70b-8192")
prompt = ChatPromptTemplate.from_messages([("human", "Write a Limerick about {topic}")])
chain = prompt | chat
await chain.ainvoke({"topic": "The Sun"})
AIMessage(content='Here is a limerick about the sun:\n\nThere once was a sun in the sky,\nWhose warmth and light caught the eye,\nIt shone bright and bold,\nWith a fiery gold,\nAnd brought life to all, as it flew by.', response_metadata={'token_usage': {'completion_tokens': 51, 'prompt_tokens': 18, 'total_tokens': 69, 'completion_time': 0.144614022, 'prompt_time': 0.00585394, 'queue_time': None, 'total_time': 0.150467962}, 'model_name': 'llama3-70b-8192', 'system_fingerprint': 'fp_2f30b0b571', 'finish_reason': 'stop', 'logprobs': None}, id='run-e42340ba-f0ad-4b54-af61-8308d8ec8256-0')
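The other async Runnable methods work the same way; for example, abatch runs several inputs concurrently. A minimal sketch:

# Run multiple prompts concurrently; returns a list of AIMessages.
await chain.abatch([{"topic": "The Sun"}, {"topic": "The Moon"}])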

Streaming

chat = ChatGroq(temperature=0, model="llama3-70b-8192")
prompt = ChatPromptTemplate.from_messages([("human", "Write a haiku about {topic}")])
chain = prompt | chat
for chunk in chain.stream({"topic": "The Moon"}):
    print(chunk.content, end="", flush=True)
Silvery glow bright
Luna's gentle light shines down
Midnight's gentle queen
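Streaming also works asynchronously via astream, which is useful inside async applications. A minimal sketch:

# Asynchronously stream tokens as they are generated.
async for chunk in chain.astream({"topic": "The Moon"}):
    print(chunk.content, end="", flush=True)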

Passing custom parameters

You can pass other Groq-specific parameters using the model_kwargs argument on initialization. Here's an example of enabling JSON mode:

chat = ChatGroq(
    model="llama3-70b-8192", model_kwargs={"response_format": {"type": "json_object"}}
)

system = """
You are a helpful assistant.
Always respond with a JSON object with two string keys: "response" and "followup_question".
"""
human = "{question}"
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])

chain = prompt | chat

chain.invoke({"question": "what bear is best?"})
AIMessage(content='{ "response": "That\'s a tough question! There are eight species of bears found in the world, and each one is unique and amazing in its own way. However, if I had to pick one, I\'d say the giant panda is a popular favorite among many people. Who can resist those adorable black and white markings?", "followup_question": "Would you like to know more about the giant panda\'s habitat and diet?" }', response_metadata={'token_usage': {'completion_tokens': 89, 'prompt_tokens': 50, 'total_tokens': 139, 'completion_time': 0.249032839, 'prompt_time': 0.011134497, 'queue_time': None, 'total_time': 0.260167336}, 'model_name': 'llama3-70b-8192', 'system_fingerprint': 'fp_2f30b0b571', 'finish_reason': 'stop', 'logprobs': None}, id='run-558ce67e-8c63-43fe-a48f-6ecf181bc922-0')