Introduction to General LLM Use
Session Overview
- Large Language Model (LLM) Landscape
- Accessing and using LLMs
- Learning with and from LLMs using Prompt Engineering
What are LLMs?
- Architecture: Most LLMs, including GPT (Generative Pre-trained Transformer) and BERT (Bidirectional Encoder Representations from Transformers), are based on the transformer architecture. Its self-attention mechanism processes each word in relation to all other words in a sentence, rather than one at a time, giving the model a rich representation of context and nuance in language (see the sketch below).
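
A minimal sketch of the scaled dot-product self-attention step at the heart of the transformer, for illustration only. Real LLMs add learned query/key/value projections, multiple attention heads, positional encodings, and many stacked layers; the array sizes here are arbitrary toy values.

```python
import numpy as np

def self_attention(x):
    """x: (seq_len, d_model) matrix of token embeddings."""
    d = x.shape[-1]
    # Here each token embedding acts as its own query, key, and value;
    # real models use separate learned projection matrices W_q, W_k, W_v.
    scores = x @ x.T / np.sqrt(d)                      # similarity of every token to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over each row
    return weights @ x                                 # each output mixes information from all tokens

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 8))        # 5 tokens, 8-dimensional embeddings (toy example)
print(self_attention(tokens).shape)     # (5, 8): every position attends to every position
```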
What are LLMs?
- Tokenization Process: LLMs convert text into tokens, words or subword pieces mapped to integer IDs, so the model can process and generate text efficiently. Tokenization is crucial for handling diverse vocabularies, including scientific terminology, because complex or rare words are broken down into smaller, known pieces (see the example below).
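
A quick look at tokenization, assuming the open-source tiktoken library is installed (pip install tiktoken). The exact splits and IDs vary by model and encoding; the point is that a scientific term is usually broken into several subword pieces rather than kept whole.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")    # encoding used by several OpenAI models
text = "Photosynthesis in Arabidopsis thaliana"
ids = enc.encode(text)                        # text -> list of integer token IDs
pieces = [enc.decode([i]) for i in ids]       # each ID decoded back to its text piece

print(ids)      # a list of integers, one per token
print(pieces)   # scientific terms appear split into smaller subword pieces
```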
What are LLMs?
- Training and Data: LLMs are trained on vast datasets drawn from a wide range of text sources, from books and articles to websites and more. This extensive training enables them to learn language patterns, grammar, and knowledge across many domains, including science. The quality and diversity of the training data strongly influence a model's performance and its biases. OpenAI has not disclosed GPT-4's training details; unofficial estimates put its training data at roughly 1 PB and its size at around 1.7 trillion parameters.
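
A back-of-the-envelope calculation to convey that scale, using the unofficial ~1.7 trillion parameter estimate quoted above (not confirmed by OpenAI):

```python
# Memory needed just to store the weights, assuming 16-bit floats.
params = 1.7e12                 # rumored parameter count (unconfirmed)
bytes_per_param_fp16 = 2        # 2 bytes per parameter in fp16
memory_tb = params * bytes_per_param_fp16 / 1e12
print(f"~{memory_tb:.1f} TB of weights in fp16")   # ~3.4 TB, far beyond a single GPU
```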
What are LLMs?
- Ethical Considerations and Limitations: While LLMs are powerful tools, it is important to acknowledge their limitations and ethical concerns, including biases inherited from training data, the potential to generate plausible but misleading information, and the need for human oversight when interpreting and validating their outputs in scientific contexts. Whether LLMs genuinely "understand" language is debated, but they do exhibit emergent capabilities as their network size scales.
Prompt Engineering Resources