Prompt Engineering
Prompt engineering is the practice of adjusting the text you send (your prompt) to a Large Language Model (LLM) to give the AI the best chance of producing output that is useful to you. This is a fundamental skill, independent of the specific LLM you are interacting with (Gemini, ChatGPT, Claude, etc.). The goal of this page is to provide some basic guidelines to consider when crafting your prompts.
Remember: Expect any interaction with an LLM to be an iterative back-and-forth. It is normal not to get the right answer at first! A fundamental step in prompt engineering is simply to "try again". If an LLM's output isn't useful to you, that is a good indicator to take a critical look at your own prompt.
Getting the Most from AI Assistants
Be Specific, Include as Much Context as Possible
❌ "This doesn't work"
✅ "I'm getting a 'NameError: name 'dataset' is not defined' when trying to run ij.py.show(dataset). Here's my code: [paste code]"
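One way to make a specific prompt a habit is to assemble it from the failure itself. The sketch below is only an illustration: `build_error_prompt`, the `goal` text, and the failing snippet are hypothetical names invented for this example, and the prompt wording is just one reasonable layout.

```python
import traceback


def build_error_prompt(goal: str, code: str, exc: BaseException) -> str:
    """Combine the goal, the exact error, and the code into one prompt."""
    # Last line of the formatted exception, e.g. "NameError: name ... is not defined"
    summary = traceback.format_exception_only(type(exc), exc)[-1].strip()
    # Full traceback text, so the assistant can see where the failure happened
    trace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    return (
        f"I'm trying to {goal}.\n"
        f"I'm getting this error: {summary}\n\n"
        f"Full traceback:\n{trace}\n"
        f"Here's my code:\n{code}"
    )


# Hypothetical failing snippet: 'dataset' is never defined before use.
snippet = "print(dataset)"
try:
    exec(snippet, {})
except NameError as err:
    prompt = build_error_prompt("display an image", snippet, err)
    print(prompt)
```

Pasting the resulting text gives the assistant the exact error message, its location, and the code that triggered it, rather than "This doesn't work".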
Avoid Bias Leading Towards a Particular Answer
❌ "Why is Colab better than other Jupyter notebook environments?"
✅ "Give me an unbiased assessment of the current options for running Jupyter notebooks, in a table with pros and cons"
Ask for Explanations, Not Just Code
❌ "Write code to filter an image"
✅ "Explain the difference between Gaussian blur and median filtering, and explain the options for both in PyImageJ with their pros and cons"
Request Step-by-Step Breakdowns
❌ "How do I segment cells?"
✅ "What steps are needed for segmenting the cells in a DAPI-stained image?"
✅ "Explain this process of cell segmentation, creating well-documented cells using PyImageJ for each step: preprocessing → thresholding → watershed"
Ask Gemini to Defend Its Choices
❌ "Thanks for this notebook, now I'm going to publish it!"
✅ "Why did you convert these images to numpy for processing? Could it be done in PyImageJ directly?"
Understanding Hallucinations
A hallucination is an LLM output that is false or fabricated but presented as if it were correct. Hallucinations underscore a core principle of all LLMs:
⚠️ LLMs are statistical constructs that DO NOT understand meaning ⚠️
It is important to understand that LLMs:
Work by generating the "most likely" responses, based on their context and training data
Are tuned to be sensitive to your words: they can pick up on subtle bias and reflect that back to you
Tend to be overly positive, optimistic, and self-confident
Common areas you may run into hallucinations include:
Specific code usage: methods or packages that don't exist
Recent facts: an LLM's knowledge is frozen at the time of its training, so it relies on internal web searches for the latest information
Obscure facts: particulars from publications, legal documents, etc.
References themselves: made-up names, URLs, and paper titles
In Google Colab we have also noticed hallucinations when Gemini interprets the output of code cells that run without error but produce incorrect results. For example, imagine a cell that segments an image and reports the number of objects found: Gemini may hallucinate a particular number of cells ("You found 67 cells. Segmentation works!") when in reality segmentation failed and only one or zero cells were found. This may be a sign that the Gemini chatbot has access to all cell outputs, but doesn't necessarily understand all formats of output.
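One possible defense against this kind of silently wrong output is to make the cell fail loudly when the result is implausible, so there is nothing ambiguous to misread. The sketch below is a minimal illustration in plain Python, not part of PyImageJ or Colab: `count_objects`, `require_min_objects`, the nested-list label images, and the threshold of 3 are all assumptions made up for this example.

```python
def count_objects(label_image):
    """Count distinct non-zero labels in a 2D label image (0 = background)."""
    labels = {value for row in label_image for value in row if value != 0}
    return len(labels)


def require_min_objects(label_image, min_expected):
    """Raise instead of silently reporting an implausible object count."""
    n = count_objects(label_image)
    if n < min_expected:
        raise ValueError(
            f"Segmentation found {n} object(s); expected at least {min_expected}."
        )
    return n


# Hypothetical label images: one plausible result, one failed segmentation
# in which the whole image collapsed into a single object.
good = [[0, 1, 1, 0], [2, 0, 0, 3], [2, 0, 4, 4]]
failed = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]

print(require_min_objects(good, min_expected=3))
try:
    require_min_objects(failed, min_expected=3)
except ValueError as err:
    print(err)
```

With a check like this, a failed segmentation produces an explicit error in the cell output instead of a number that looks like a success.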
General approaches for handling hallucinations include:
Be skeptical of verbatim LLM output. Ask for sources and verify they exist (e.g. using your own searching, or other LLMs).
Rely on your prompt engineering skills. Try again, be specific, and tell the LLM what it got wrong and what you expected.
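For hallucinated methods or packages, a quick programmatic check can complement asking for sources. The sketch below uses only the Python standard library; `attribute_exists` is a hypothetical helper written for this page, and `fast_sqrt` and `not_a_real_package_xyz` are deliberately made-up names standing in for something an LLM might invent.

```python
import importlib


def attribute_exists(module_name: str, dotted_attr: str) -> bool:
    """Return True if module_name imports and dotted_attr resolves on it."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False  # the package itself is missing or made up
    # Walk the dotted path one attribute at a time, e.g. "path.join"
    for part in dotted_attr.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


print(attribute_exists("math", "sqrt"))
print(attribute_exists("math", "fast_sqrt"))
print(attribute_exists("os", "path.join"))
print(attribute_exists("not_a_real_package_xyz", "anything"))
```

This only confirms that a name exists in the installed version of a package. It says nothing about whether the function does what the LLM claimed, so checking the official documentation is still worthwhile.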
More Conversation Starters
Gemini is your tutor in the AI tutorial notebook: the hope is to have a natural conversation about your goals, so that Gemini can guide you to success.
Here are some ideas for more useful interactions:
Real-world Use:
I work with [microscopy type] images of [sample type] - what image processing skills would be most useful to me?
I currently use [other software] for [task] - how would I do this in PyImageJ?
Focus on Specific Areas:
What coding fundamentals should I practice for scientific computing?
What Colab features can help with my research?
Objective Navigation:
Show me a learning pathway for cell segmentation
What should I try next based on my current skill level?
Context Switching:
I want to focus on real research applications
What can I practice that will help with reproducible workflows?
Best Practices:
Do you have any thoughts on my approach to [previous activity]?
What common mistakes should I watch out for?
Theory & Understanding:
Explain the theory behind [concept I just practiced]
Break down [complex topic] into smaller steps
Environment-Specific:
What should I know about running this locally instead of in Colab?
How would this work in a headless environment?