Instructing LLMs To Match Tone

LLMs that generate text are awesome, but what if you want to edit the tone/style they respond with?

We’ve all seen the pirate examples, but wouldn’t it be awesome if we could tune the prompt to match the tone of specific people?

Below is a series of techniques aimed at generating text in the tone and style you want. No single technique will likely be exactly what you need, but I guarantee you can iterate with these tips to get a solid outcome for your project.

But Greg, what about fine tuning? Fine tuning would likely give you a fabulous result, but the barriers to entry are too high for the average developer (as of May ‘23). I would rather get the 87% solution today than not ship something. If you’re doing this in production and your differentiator is your ability to adapt to different styles, you’ll likely want to explore fine tuning.

If you want to see a demo video of this, check out the Twitter post. For a full explanation head over to YouTube.

4 Levels Of Tone Matching Techniques:#

Simple: As a human, try to describe the tone you would like
Intermediate: Include your description + examples
AI-Assisted: Ask the LLM to extract tone, then use its output in your next prompt
Technique Fusion: Combine multiple techniques to mimic tone

Today’s Goal: Generate tweets mimicking the style of online personalities. You could customize this code to generate emails, chat messages, writing, etc.

First let’s import our packages

# Unzip data folder

import zipfile
with zipfile.ZipFile('../../data.zip', 'r') as zip_ref:
    zip_ref.extractall('..')

# LangChain
from langchain.chat_models import ChatOpenAI
from langchain import PromptTemplate

# Environment Variables
import os
from dotenv import load_dotenv

# Twitter
import tweepy

load_dotenv()

True

Set your OpenAI key. You can either put it as an environment variable or in the string below

openai_api_key = os.getenv('OPENAI_API_KEY', 'YourAPIKeyIfNotSet')

We’ll be using gpt-4 today, but you can swap out for gpt-3.5-turbo if you’d like

llm = ChatOpenAI(temperature=0, openai_api_key=openai_api_key, model_name='gpt-4')

Method #1: Simple - Describe the tone you would like#

Our first method is going to be simply describing the tone we would like.

Let’s try a few examples

prompt = """
Please create me a tweet about going to the park and eating a sandwich.
"""

output = llm.predict(prompt)
print (output)

"Sunshine, fresh air, and a scrumptious sandwich in hand 🥪🌳 Just had the perfect afternoon at the park, soaking up nature's beauty while munching on my favorite meal! #ParkPicnic #SandwichLover"

Not bad, but I don’t love the emojis and I want it to use more conversational modern language.

Let’s try again

prompt = """
Please create me a tweet about going to the park and eating a sandwich.

% TONE
 - Don't use any emojis or hashtags.
 - Use simple language a 5 year old would understand
"""

output = llm.predict(prompt)
print (output)

Had a fun day at the park today! I played on the swings and ate a yummy sandwich for lunch. I love spending time outside!

Ok cool! The tone has changed. Not bad but now I want it to sound like a specific person. Let’s try Bill Gates:

prompt = """
Please create me a tweet about going to the park and eating a sandwich.

% TONE
 - Don't use any emojis or hashtags.
 - Respond in the tone of Bill Gates
"""

output = llm.predict(prompt)
print (output)

There's something truly delightful about spending an afternoon at the park, enjoying a well-crafted sandwich, and contemplating the beauty of nature. It's a simple pleasure that reminds us of the importance of taking a break from our busy lives to appreciate the world around us.

It’s ok, I’d give the response a C+ right now.
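
The three prompts above share one shape: a task sentence followed by a `% TONE` block of rules. A minimal sketch of a helper that assembles that shape is below. `build_tone_prompt` is a hypothetical name, not something from LangChain, and it only builds the string you would then pass to `llm.predict`.

```python
def build_tone_prompt(task, tone_rules):
    """Assemble a prompt with a '% TONE' section (hypothetical helper)."""
    lines = ["", task.strip(), ""]
    if tone_rules:
        lines.append("% TONE")
        # Each rule becomes one ' - ' bullet, matching the prompts above
        lines.extend(f" - {rule}" for rule in tone_rules)
    return "\n".join(lines) + "\n"


prompt = build_tone_prompt(
    "Please create me a tweet about going to the park and eating a sandwich.",
    ["Don't use any emojis or hashtags.", "Respond in the tone of Bill Gates"],
)
print(prompt)
```

Keeping the rules in a list makes it easy to add or drop one rule at a time while you iterate on the tone.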

Let’s give some example tweets so the model can better match tone/style.

⭐ Important Tip: When you're giving examples, make sure they are in the same format as the desired output. Ex: Tweets > Tweets, Email > Email. Don’t do Tweets > Email.

Method #2: Intermediate - Specify your tone description + examples#

Examples speak a thousand words. Let’s pass a few along with our instructions to see how it goes

Get a user's Tweets#

Next let’s grab a user's tweets. We’ll do this in a function so it’s easy to pull them later

Since we are pulling live Tweets, you’ll need to gather some Twitter API keys. You can get these on the Twitter Developer Portal. The free tier is fine, but watch out for rate limits.

# Replace these values with your own Twitter API credentials
TWITTER_API_KEY = os.getenv('TWITTER_API_KEY', 'YourAPIKeyIfNotSet')
TWITTER_API_KEY_SECRET = os.getenv('TWITTER_API_KEY_SECRET', 'YourAPIKeyIfNotSet')
TWITTER_ACCESS_TOKEN = os.getenv('TWITTER_ACCESS_TOKEN', 'YourAPIKeyIfNotSet')
TWITTER_ACCESS_TOKEN_SECRET = os.getenv('TWITTER_ACCESS_TOKEN_SECRET', 'YourAPIKeyIfNotSet')

# We'll query 70 tweets because we end up filtering out a bunch, but we'll only return the top 12.
# We will also only use a subset of the top tweets later
def get_original_tweets(screen_name, tweets_to_pull=70, tweets_to_return=12):
    
    # Tweepy set up
    auth = tweepy.OAuthHandler(TWITTER_API_KEY, TWITTER_API_KEY_SECRET)
    auth.set_access_token(TWITTER_ACCESS_TOKEN, TWITTER_ACCESS_TOKEN_SECRET)
    api = tweepy.API(auth)

    tweets = []
    
    tweepy_results = tweepy.Cursor(api.user_timeline,
                                   screen_name=screen_name,
                                   tweet_mode='extended',
                                   exclude_replies=True).items(tweets_to_pull)
    
    # Run through tweets and remove retweets and quote tweets so we can only look at a user's raw emotions
    for status in tweepy_results:
        if not hasattr(status, 'retweeted_status') and not hasattr(status, 'quoted_status'):
            tweets.append({'full_text': status.full_text, 'likes': status.favorite_count})

    
    # Sort the tweets by number of likes. This will help us short_list the top ones later
    sorted_tweets = sorted(tweets, key=lambda x: x['likes'], reverse=True)

    # Get the text and drop the like count from the dictionary
    full_text = [x['full_text'] for x in sorted_tweets][:tweets_to_return]
    
    # Convert the list of tweets into a string of tweets we can use in the prompt later
    example_tweets = "\n\n".join(full_text)
            
    return example_tweets
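
To check the filtering and ranking logic without calling the Twitter API, the same steps can be run on plain dictionaries. This is only a sketch: `rank_original_tweets` and the `is_retweet` / `is_quote` keys are made-up stand-ins for the `retweeted_status` / `quoted_status` attribute checks above, not tweepy fields.

```python
def rank_original_tweets(statuses, tweets_to_return=12):
    """Drop retweets and quote tweets, sort by likes, return a prompt-ready string."""
    originals = [
        s for s in statuses
        if not s.get("is_retweet") and not s.get("is_quote")
    ]
    # Most-liked first, mirroring the sort in get_original_tweets
    ranked = sorted(originals, key=lambda s: s["likes"], reverse=True)
    top_texts = [s["full_text"] for s in ranked[:tweets_to_return]]
    return "\n\n".join(top_texts)


# Fake data standing in for API results
fake_statuses = [
    {"full_text": "original, few likes", "likes": 3},
    {"full_text": "a retweet", "likes": 900, "is_retweet": True},
    {"full_text": "original, many likes", "likes": 50},
    {"full_text": "a quote tweet", "likes": 70, "is_quote": True},
]
print(rank_original_tweets(fake_statuses, tweets_to_return=2))
```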

Let’s grab Bill Gates' tweets and use those as examples

user_screen_name = 'billgates'  # Replace this with the desired user's screen name
users_tweets = get_original_tweets(user_screen_name)

Let’s look at a sample of Bill’s tweets

print(users_tweets)

These numbers prove why India plays such a crucial role in the world’s fight to improve health, reduce poverty, prevent climate change, and more. https://t.co/xMpmcoYQhi

Mann ki Baat has catalyzed community led action on sanitation, health, women’s economic empowerment and other issues linked to the Sustainable Development Goals. Congratulations @narendramodi on the 100th episode. https://t.co/yg1Di2srjE

The development of AI is as fundamental as the creation of the microprocessor, the personal computer, the Internet, and the mobile phone. It will change the way people work, learn, travel, get health care, and communicate with each other. https://t.co/uuaOQyxBTl

I just returned from my visit to India, and I can’t wait to go back again. I love visiting India because every trip is an incredible opportunity to learn. Here are some photos from my trip and some of the stories behind them: https://t.co/We6PtJWDnp https://t.co/QxZW7gfUmI

Superintelligent AIs are in our future. Compared to a computer, our brains operate at a snail’s pace. An electrical signal in the brain moves at ___________ the speed of the signal in a silicon chip. Check your answer here: https://t.co/wqZG1BdoTc

Thinking of President Carter and his family. This is a lovely tribute to one of his biggest accomplishments. https://t.co/g53c4ty0qI

Uganda’s maternal mortality rate is at least double the global average. That's why Eva Nangalo has dedicated her life to making childbirth in the country safer for everyone involved. https://t.co/29AjdJehNY

I am so impressed with Eva Nangalo—it’s hard not to be. She’s spent decades making childbirth safer in Uganda for everyone involved, and she’s become a mentor to countless other midwives in the process. https://t.co/79RHbrCt01

I recently had the chance to test drive—or test ride, I guess—one of @wayve_ai’s autonomous vehicles. It was a pretty wild ride: https://t.co/PrwrxU49dd https://t.co/NtnkVx7sBx

When I transitioned from @Microsoft to working full-time at the @GatesFoundation, I finally had the time to learn more about physics, chemistry, biology, and other sciences. So, I looked around for the best books and read as many of them as I could find. https://t.co/z2D5xGSeMj

As big as the problems facing the world are right now, my visit to India reminded me that our capacity to solve them is even bigger: https://t.co/zp7XfRIpV9 https://t.co/aFHUu987u3

I’m grateful for the Lauder family’s dedication to solving Alzheimer’s. https://t.co/vX0qtjBFxt
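
The comments in `get_original_tweets` mention using only a subset of the top tweets later. Because the function returns one string with tweets separated by blank lines and already sorted by likes, a shortlist can be cut from it afterwards. A minimal sketch, assuming no tweet itself contains a blank line (one that does would be split in two); `shortlist_examples` is a hypothetical helper name.

```python
def shortlist_examples(example_tweets, max_examples=5):
    """Keep only the first `max_examples` tweets from a blank-line-separated string."""
    tweets = [t for t in example_tweets.split("\n\n") if t.strip()]
    return "\n\n".join(tweets[:max_examples])


sample = "first tweet\n\nsecond tweet\n\nthird tweet"
print(shortlist_examples(sample, max_examples=2))
```

Fewer examples means fewer prompt tokens, at the cost of giving the model less material to infer tone from.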

Pass the tweets as examples#

template = """
Please create me a tweet about going to the park and eating a sandwich.

% TONE
 - Don't use any emojis or hashtags.
 - Respond in the tone of Bill Gates

% START OF EXAMPLE TWEETS TO MIMIC
{example_tweets}
% END OF EXAMPLE TWEETS TO MIMIC

YOUR TWEET:
"""

prompt = PromptTemplate(
    input_variables=["example_tweets"],
    template=template,
)

final_prompt = prompt.format(example_tweets=users_tweets)

print (final_prompt)

Please create me a tweet about going to the park and eating a sandwich.

% TONE
 - Don't use any emojis or hashtags.
 - Respond in the tone of Bill Gates

% START OF EXAMPLE TWEETS TO MIMIC
These numbers prove why India plays such a crucial role in the world’s fight to improve health, reduce poverty, prevent climate change, and more. https://t.co/xMpmcoYQhi

Mann ki Baat has catalyzed community led action on sanitation, health, women’s economic empowerment and other issues linked to the Sustainable Development Goals. Congratulations @narendramodi on the 100th episode. https://t.co/yg1Di2srjE

The development of AI is as fundamental as the creation of the microprocessor, the personal computer, the Internet, and the mobile phone. It will change the way people work, learn, travel, get health care, and communicate with each other. https://t.co/uuaOQyxBTl

I just returned from my visit to India, and I can’t wait to go back again. I love visiting India because every trip is an incredible opportunity to learn. Here are some photos from my trip and some of the stories behind them: https://t.co/We6PtJWDnp https://t.co/QxZW7gfUmI

Superintelligent AIs are in our future. Compared to a computer, our brains operate at a snail’s pace. An electrical signal in the brain moves at ___________ the speed of the signal in a silicon chip. Check your answer here: https://t.co/wqZG1BdoTc

Thinking of President Carter and his family. This is a lovely tribute to one of his biggest accomplishments. https://t.co/g53c4ty0qI

Uganda’s maternal mortality rate is at least double the global average. That's why Eva Nangalo has dedicated her life to making childbirth in the country safer for everyone involved. https://t.co/29AjdJehNY

I am so impressed with Eva Nangalo—it’s hard not to be. She’s spent decades making childbirth safer in Uganda for everyone involved, and she’s become a mentor to countless other midwives in the process. https://t.co/79RHbrCt01

I recently had the chance to test drive—or test ride, I guess—one of @wayve_ai’s autonomous vehicles. It was a pretty wild ride: https://t.co/PrwrxU49dd https://t.co/NtnkVx7sBx

When I transitioned from @Microsoft to working full-time at the @GatesFoundation, I finally had the time to learn more about physics, chemistry, biology, and other sciences. So, I looked around for the best books and read as many of them as I could find. https://t.co/z2D5xGSeMj

As big as the problems facing the world are right now, my visit to India reminded me that our capacity to solve them is even bigger: https://t.co/zp7XfRIpV9 https://t.co/aFHUu987u3

I’m grateful for the Lauder family’s dedication to solving Alzheimer’s. https://t.co/vX0qtjBFxt
% END OF EXAMPLE TWEETS TO MIMIC

YOUR TWEET:

output = llm.predict(final_prompt)
print (output)

A simple pleasure like visiting the park and enjoying a sandwich can remind us of the importance of preserving our environment and supporting local food systems. Let's continue to innovate for a sustainable future.

Wow! Ok now that is starting to get somewhere. Not bad at all! Sounds like Bill is in the room with us now.

Let’s see if we can refine it even more.

Method #3: AI-Assisted: Ask the LLM to help with tone descriptions#

Turns out I’m not great at describing tone. Examples are a good way to help, but can we do more? Let’s find out.

I want to have the model tell me what tone it sees, then use that output as an input to the final prompt where I ask it to generate a tweet.

Almost like reverse engineering tone.

Why don’t I do this all in one step? You likely could, but it would be nice to save this “tone” description for future use. Plus, I don’t want the model to take too many logic jumps in a single response.

I first thought, ‘well… what are the qualities of tone I should have it describe?’

Then I thought, Greg, c’mon man, you know better than that, see if the LLM has a good sense of what tone qualities there are. Duh.

Let’s see which qualities of tone we should extract

prompt = """
Can you please generate a list of tone attributes and a description to describe a piece of writing by?

Things like pace, mood, etc.

Respond with nothing else besides the list
"""

how_to_describe_tone = llm.predict(prompt)
print (how_to_describe_tone)

Pace: The speed at which the story unfolds and events occur.
Mood: The overall emotional atmosphere or feeling of the piece.
Tone: The author's attitude towards the subject matter or characters.
Voice: The unique style and personality of the author as it comes through in the writing.
Diction: The choice of words and phrases used by the author.
Syntax: The arrangement of words and phrases to create well-formed sentences.
Imagery: The use of vivid and descriptive language to create mental images for the reader.
Theme: The central idea or message of the piece.
Point of View: The perspective from which the story is told (first person, third person, etc.).
Structure: The organization and arrangement of the piece, including its chapters, sections, or stanzas.
Dialogue: The conversations between characters in the piece.
Characterization: The way the author presents and develops characters in the story.
Setting: The time and place in which the story takes place.
Foreshadowing: The use of hints or clues to suggest future events in the story.
Irony: The use of words or situations to convey a meaning that is opposite of its literal meaning.
Symbolism: The use of objects, characters, or events to represent abstract ideas or concepts.
Allusion: A reference to another work of literature, person, or event within the piece.
Conflict: The struggle between opposing forces or characters in the story.
Suspense: The tension or excitement created by uncertainty about what will happen next in the story.
Climax: The turning point or most intense moment in the story.
Resolution: The conclusion of the story, where conflicts are resolved and loose ends are tied up.

Ok great! Now that we have a solid list of ideas on how to instruct our language model for tone, let’s do some tone extraction!

I found that when I asked the model for a description of the tone, it would be passive and noncommittal, so I included lines in the prompt about being opinionated and using an active voice.

def get_authors_tone_description(how_to_describe_tone, users_tweets):
    template = """
        You are an AI Bot that is very good at generating writing in a similar tone as examples.
        Be opinionated and have an active voice.
        Take a strong stance with your response.

        % HOW TO DESCRIBE TONE
        {how_to_describe_tone}

        % START OF EXAMPLES
        {tweet_examples}
        % END OF EXAMPLES

        List out the tone qualities of the examples above
        """

    prompt = PromptTemplate(
        input_variables=["how_to_describe_tone", "tweet_examples"],
        template=template,
    )

    final_prompt = prompt.format(how_to_describe_tone=how_to_describe_tone, tweet_examples=users_tweets)

    authors_tone_description = llm.predict(final_prompt)

    return authors_tone_description

Let’s combine the tone description and examples to see what tone attributes the model assigned to Bill Gates

authors_tone_description = get_authors_tone_description(how_to_describe_tone, users_tweets)
print (authors_tone_description)

Pace: Moderate, allowing for thoughtful reflection on the topics discussed.
Mood: Optimistic and enthusiastic, highlighting positive aspects and potential solutions.
Tone: Engaging and informative, with a strong emphasis on personal experiences and opinions.
Voice: Confident and authoritative, showcasing expertise and passion for the subjects.
Diction: Clear and concise, using accessible language to convey complex ideas.
Syntax: Straightforward and well-structured sentences, making the content easy to follow.
Imagery: Evocative and descriptive, painting vivid pictures of experiences and situations.
Theme: Focused on innovation, progress, and the potential for positive change.
Point of View: First person, providing a personal perspective on the topics discussed.
Structure: Organized and coherent, with a logical flow of ideas and information.
Dialogue: Limited, but when present, it is engaging and relevant to the topic.
Characterization: Presents individuals in a positive light, emphasizing their dedication and achievements.
Setting: Global, with a focus on specific countries or regions where progress is being made.
Foreshadowing: Hints at future developments and breakthroughs in various fields.
Irony: Minimal, as the focus is on genuine progress and optimism.
Symbolism: Limited, with more emphasis on real-world examples and achievements.
Allusion: Occasional references to other works, events, or individuals to provide context or support.
Conflict: Implicit, as the challenges faced by humanity are the driving force behind the discussed innovations and solutions.
Suspense: Minimal, as the focus is on sharing information and insights rather than creating tension.
Climax: Not applicable, as the content is primarily informative and opinion-based.
Resolution: Concludes with a sense of hope and optimism for the future, as well as a call to action for continued progress.
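
Since one reason for extracting tone in a separate step is to save the description for later, it helps to turn the response into structured data. The sketch below assumes the model keeps answering in `Attribute: description` lines as it did above (it may not always). The helper names and the file name are hypothetical.

```python
import json
import os
import tempfile


def parse_tone_description(description):
    """Turn 'Attribute: text' lines into a dict, skipping lines without that shape."""
    tone = {}
    for line in description.splitlines():
        name, sep, detail = line.partition(":")
        if sep and name.strip() and detail.strip():
            tone[name.strip()] = detail.strip()
    return tone


def save_tone(tone, path):
    """Write the tone dictionary to disk as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(tone, f, indent=2)


def load_tone(path):
    """Read a previously saved tone dictionary."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


sample_description = """Pace: Moderate, allowing for thoughtful reflection on the topics discussed.
Mood: Optimistic and enthusiastic, highlighting positive aspects and potential solutions."""

tone = parse_tone_description(sample_description)
path = os.path.join(tempfile.gettempdir(), "billgates_tone.json")  # hypothetical file name
save_tone(tone, path)
print(load_tone(path)["Mood"])
```

With the attributes in a dictionary you can also drop ones that do not apply to short posts before putting the description back into a prompt.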

Great, now that we have Bill Gates' tone style, let’s put those tone instructions in with the prompt we had before to see if it helps

template = """
% INSTRUCTIONS
 - You are an AI Bot that is very good at mimicking an author writing style.
 - Your goal is to write content with the tone that is described below.
 - Do not go outside the tone instructions below
 - Do not use hashtags or emojis
 - Respond in the tone of Bill Gates

% Description of the authors tone:
{authors_tone_description}

% Authors writing samples
{tweet_examples}

% YOUR TASK
Please create a tweet about going to the park and eating a sandwich.
"""

prompt = PromptTemplate(
    input_variables=["authors_tone_description", "tweet_examples"],
    template=template,
)

final_prompt = prompt.format(authors_tone_description=authors_tone_description, tweet_examples=users_tweets)

llm.predict(final_prompt)

"I recently took a leisurely stroll through the park, enjoying the beauty of nature and savoring a delicious sandwich. It's moments like these that remind us of the simple pleasures in life and inspire us to continue working towards a brighter future for all. https://t.co/9YzF8KJ6rP"

Hmm, better! Not wonderful.
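
One thing to notice in the response above: it ends with a t.co link, which the model most likely made up by copying the pattern in the example tweets. A small post-processing step can catch this. A minimal sketch, assuming a simple regular expression is good enough for your outputs: the hypothetical helper `clean_generated_text` strips URLs and reports any hashtags it finds so you can decide whether to regenerate.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
HASHTAG_PATTERN = re.compile(r"#\w+")


def clean_generated_text(text):
    """Remove URLs and wrapping quotes; return (cleaned_text, hashtags_found)."""
    without_urls = URL_PATTERN.sub("", text)
    # Collapse whitespace (including newlines) left behind, then trim quote marks
    cleaned = " ".join(without_urls.split()).strip('"').strip()
    hashtags = HASHTAG_PATTERN.findall(cleaned)
    return cleaned, hashtags


raw = '"It inspires us to keep working towards a brighter future for all. https://t.co/9YzF8KJ6rP"'
cleaned, hashtags = clean_generated_text(raw)
print(cleaned)
print(hashtags)
```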

Let’s try out the final approach

Method #4: Technique Fusion - Combine multiple techniques to mimic tone#

After a lot of experimentation I’ve found the tips below to be helpful

Don’t reference the word ‘tweet’ in your prompt - The model has an extremely strong bias towards what a ‘tweet’ is and will overload you with hashtags and emojis. Instead, call it “a short public statement around 300 characters”.
Ask the LLM for similar sounding authors - Whereas the model’s bias around the word ‘tweet’ (point #1) isn’t great, we can use that same kind of bias in our favor here. Ask the LLM which authors the writing style sounds like, then ask the LLM to respond like that author. It’s not great that the model is basing the tone off another person but it’s a great 89% solution. I learned of this technique from Scott Mitchell.
Examples should be in the output format you want - Everyone has a different voice for each medium: Twitter voice, email voice, etc. Make sure that the examples you feed to the prompt are in the same voice as the output you want. Ex: Don’t expect a book to be written from Twitter examples.
Use the Language Model to extract tone - If you are at a loss for words on how to describe the tone you want, have the language model describe it for you. I found I needed to tell the model to be opinionated; its descriptions were too grey-area before.
Topics matter - Have the model propose topics first, then give you a tweet. Not only is it better to have things the author would actually talk about, but outlining the topics first also helps keep the model on track when it responds.

Let’s first identify authors the model thinks the example tweets sound like, then we’ll reference those later. Keep in mind this isn’t a true classification exercise and the point isn’t to be 100% correct on similar people. It’s to get a reference to who the model thinks is similar so we can use that intuition for instructions later.

def get_similar_public_figures(tweet_examples):
    template = """
    You are an AI Bot that is very good at identifying authors, public figures, or writers whose style matches a piece of text
    Your goal is to identify which authors, public figures, or writers sound most similar to the text below

    % START EXAMPLES
    {tweet_examples}
    % END EXAMPLES

    Which authors (list up to 4 if necessary) most closely resemble the examples above? Only respond with the names separated by commas
    """

    prompt = PromptTemplate(
        input_variables=["tweet_examples"],
        template=template,
    )

    # Pass in a short list of examples (ideally the top tweets) to save on tokens
    final_prompt = prompt.format(tweet_examples=tweet_examples)

    authors = llm.predict(final_prompt)
    return authors

authors = get_similar_public_figures(users_tweets)
print (authors)

Bill Gates

Ok, that’s not that exciting because we used Bill Gates’ example tweets. Trust me that it’s better with lesser-known people. We’ll try this more later.
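
The prompt asks for names separated by commas, so the response can be split into a list if you want to inspect or filter the names before reusing them. A minimal sketch, assuming the model follows the comma-separated instruction; `parse_author_names` is a hypothetical helper and the second input below is placeholder text, not real model output.

```python
def parse_author_names(authors_response, max_names=4):
    """Split a comma-separated response into a clean list of at most `max_names` names."""
    names = [name.strip() for name in authors_response.split(",")]
    return [name for name in names if name][:max_names]


print(parse_author_names("Bill Gates"))
print(parse_author_names("Author A, Author B ,, Author C"))
```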

At last, the final output. Let’s bring it all together in a single prompt. Notice the 2-step process in the “your task” section below

template = """
% INSTRUCTIONS
 - You are an AI Bot that is very good at mimicking an author writing style.
 - Your goal is to write content with the tone that is described below.
 - Do not go outside the tone instructions below

% Mimic These Authors:
{authors}

% Description of the authors tone:
{tone}

% Authors writing samples
{example_text}
% End of authors writing samples

% YOUR TASK
1st - Write out topics that this author may talk about
2nd - Write a concise passage (under 300 characters) as if you were the author described above
"""

method_4_prompt_template = PromptTemplate(
    input_variables=["authors", "tone", "example_text"],
    template=template,
)

# Using the short list of examples to save on tokens (hopefully these are the top tweets)
final_prompt = method_4_prompt_template.format(authors=authors,
                                               tone=authors_tone_description,
                                               example_text=users_tweets)

# print(final_prompt) # Print this out if you want to see the full final prompt. It's long so I'll omit it for now

output = llm.predict(final_prompt)
print (output)

1. Topics that this author may talk about:
- Global health and healthcare innovations
- Education and its impact on society
- Climate change and sustainable development
- Technological advancements, such as artificial intelligence and autonomous vehicles
- Poverty reduction and economic empowerment
- Personal experiences and learnings from travels
- Inspirational stories of individuals making a difference

2. Concise passage as the author:
I recently visited a remarkable school in Kenya, where students are using solar-powered tablets to access quality education. It's inspiring to see how technology can transform lives and create a brighter future for these children.
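
Because the prompt asks for topics first and the passage second, the response contains both. If you only want the passage, you need to pull it out. The sketch below assumes the response has a line starting with `2.` (or `2nd`) that introduces the passage, as in the output above, which the model is not guaranteed to repeat. `extract_passage` is a hypothetical helper and returns `None` when that line is missing.

```python
import re


def extract_passage(response, max_chars=300):
    """Return (passage, within_limit) from a two-part response, or None if not found."""
    # Find the line that opens the second section, e.g. "2. Concise passage as the author:"
    match = re.search(r"^2(?:nd)?\b[^\n]*\n", response, flags=re.MULTILINE)
    if match is None:
        return None
    passage = response[match.end():].strip()
    return passage, len(passage) <= max_chars


sample_response = """1. Topics that this author may talk about:
- Global health and healthcare innovations
- Education and its impact on society

2. Concise passage as the author:
I recently visited a remarkable school in Kenya, where students are using solar-powered tablets to access quality education."""

result = extract_passage(sample_response)
print(result)
```

The length check is worth keeping because the 300-character limit in the prompt is an instruction, not a guarantee.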

After a ton of iteration, I’m actually happy with that. But let’s see this thing spread its wings on multiple people.

Extra Credit: Loop this process through many Twitter accounts#

Let’s see what different Twitter accounts sound like. Note, this will burn tokens so use at your own risk!

results = {} # To store the results

# # Or if you just wanna see the results of the loop below you can open up this json
# import json
# with open("../data/matching_tone_samples.json", "r") as f:
#     tone_samples = json.load(f)
# print (tone_samples)

accounts_to_mimic = ['jaltma', 'lindayacc', 'ShaanVP', 'dharmesh', 'sweatystartup', 'levelsio', 'Suhail', \
                     'hwchase17', 'elonmusk', 'packyM', 'benedictevans', 'paulg', 'AlexHormozi', 'DavidDeutschOxf', \
                     'stephsmithio', 'sophiaamoruso']
                     

for user_screen_name in accounts_to_mimic:
    
    # Checking to see if we already have done the user. If so, move to the next one
    if user_screen_name in results:
        continue
    
    results[user_screen_name] = ""
    
    user_screenname_string = f"User: {user_screen_name}"
    print (user_screenname_string)
    results[user_screen_name] += user_screenname_string
    
    # Get their top tweets
    users_tweets = get_original_tweets(user_screen_name)
    
    # Get their similar authors
    authors = get_similar_public_figures(users_tweets)
    authors_string = f"Similar Authors: {authors}"
    print (authors_string)
    results[user_screen_name] += "\n" + authors_string
    
    # Get their tone description
    authors_tone_description = get_authors_tone_description(how_to_describe_tone, users_tweets)
    
    # Only printing the first four attributes to save space
    sample_description = authors_tone_description.split('\n')[:4]
    sample_description_string = f"Tone Description: {sample_description}"
    print(sample_description_string)
    results[user_screen_name] += "\n" + sample_description_string + "\n"
    
    
    # Bring it all together in a single prompt
    prompt = method_4_prompt_template.format(authors=authors,
                                             tone=authors_tone_description,
                                             example_text=users_tweets)
    
    output = llm.predict(prompt)
    results[user_screen_name] += "\n" + output
    
    print ("\n")
    print (output)
    print ("\n\n")
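
The commented-out lines before the loop load finished samples from a JSON file. To produce such a file from your own run, the `results` dictionary can be written out once the loop finishes. A minimal sketch using only the standard library; the file name and the placeholder `demo_results` content are hypothetical.

```python
import json
import os
import tempfile


def save_results(results, path):
    """Write the {screen_name: text} results dictionary to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2, ensure_ascii=False)


def load_results(path):
    """Load previously saved results, or return an empty dict if none exist."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Placeholder content standing in for real loop output
demo_results = {"example_user": "User: example_user\nSimilar Authors: ..."}
path = os.path.join(tempfile.gettempdir(), "my_tone_samples.json")
save_results(demo_results, path)
print(load_results(path) == demo_results)
```

Loading saved results into `results` before the loop lets the existing `if user_screen_name in results` check skip accounts you have already spent tokens on.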