AI Literacy Program

The AI Literacy program is a practical, beginner-friendly learning series for staff, focused on building confidence with today’s AI tools and using them thoughtfully in day-to-day work. It is a way to explore the AI tools available to staff, build shared vocabulary, and develop habits that lead to better outcomes.

Staff currently have access to CU-Chat, Gemini, and NotebookLM. This program explores what these tools can do, how to get started, and how to use them effectively. We begin with the essentials (access, setup, and prompting) and progress toward more advanced practices, including building prompt libraries, creating reusable workflows (such as Gems), exploring tool settings and parameters, and experimenting with basic coding support using AI.

The goal is to help staff develop AI literacy and identify responsible, practical ways to make workflows more efficient. This page serves as the public home for our program materials. Everyone is welcome to follow along, learn at their own pace, and return to specific topics as needed.

If you have further questions, please contact [email protected]

AI Course Materials

This guide is designed to make AI feel practical and familiar. It covers what today's tools are, how they differ from search, what they're good at, where they fail, and how to use them responsibly in a university context. One theme will come up again and again: AI can speed up work, but it doesn't replace judgment.

By the end of this module, you should be able to explain how generative AI works at a high level, choose the right tool for common tasks, write clearer prompts, and verify outputs—especially when accuracy, policy, or people are involved.

Learning Objectives

  • Define core terms (AI, LLM, generative AI, prompt, hallucination, RAG)
  • Use a simple mental model for how chat-based AI produces answers
  • Identify "low-risk, high-value" use cases: summarization, rewriting, structuring messy text, planning support
  • Recognize common failure modes (hallucinations, outdated knowledge, prompt sensitivity)
  • Explain predictive-only vs RAG (retrieval-augmented generation) and when to use each
  • Apply basic safety practices: data classification awareness, privacy/IP caution, and verification habits (Provost policy + responsible use expectations)

Glossary

  • AI (Artificial Intelligence): Computer systems that perform tasks that often require human intelligence (language understanding, pattern recognition, problem-solving, learning).
  • LLM (Large Language Model): A type of AI trained on large amounts of text to predict and generate language.
  • Generative AI: AI that produces new content (text, images, audio, code) rather than only labeling or classifying existing data.
  • Prompt: The instructions or input you give the model.
  • Hallucination: When a model generates information that sounds plausible but is incorrect or invented. (This is a known limitation, not a rare edge case.)
  • RAG (Retrieval-Augmented Generation): A system that retrieves documents first, then generates an answer grounded in those documents (often with citations).

1) How generative AI works: the "super-autocomplete" mental model

A simple way to understand chat-based tools is: super-autocomplete.

Like autocomplete, the model predicts what comes next, one small piece of text at a time. Unlike the autocomplete on a phone, it can sustain likely continuations across paragraphs, explanations, and even code, based on patterns learned during training. This is why it can be remarkably fluent and useful. It's also why it can be wrong.

Key implication: The model is not "checking truth." It is generating what looks like a plausible answer. That's why verification matters and why confidence ≠ correctness.
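
To make the "super-autocomplete" idea concrete, here is a toy sketch in Python. It is an illustration only, not how CU-Chat, Gemini, or NotebookLM work internally: real models use neural networks trained on vast amounts of text, while this sketch simply counts which word follows which in a tiny, made-up training text. The names `training_text`, `build_model`, and `continue_text` are hypothetical.

```python
from collections import Counter, defaultdict

# Tiny, made-up "training data" for illustration only.
training_text = (
    "the server is down for maintenance . "
    "the server is back online . "
    "the server is down for upgrades ."
)

def build_model(text):
    """Count, for each word, which words followed it in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def continue_text(model, start, max_words=5):
    """Repeatedly append the most frequent next word (greedy autocomplete)."""
    words = [start]
    for _ in range(max_words):
        options = model.get(words[-1])
        if not options:
            break  # nothing ever followed this word in training
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

model = build_model(training_text)
print(continue_text(model, "server"))
```

The continuation reads smoothly because it follows patterns in the training text. Nothing in the code checks whether the server is actually down, which is the point of the mental model.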

2) Strengths

For many CUIT workflows, the most reliable early wins are language-heavy tasks:

  • Drafting and rewriting messages and documentation
  • Summarizing long text you provide
  • Extracting structure from messy text (turning threads/notes into action items, owners, next steps)
  • Reformatting content (bullets → table, long → short, formal → friendly)

3) Limits and failure modes

This guide emphasizes that limitations are expected and manageable—if you know what they look like.

Hallucinations

The model may invent details (names, steps, citations, causes) in a confident tone.

What to do:

  • Ask for sources/citations when appropriate, and confirm they exist (citations can be invented too)
  • Require the model to label uncertainty
  • Verify against trusted documents

Outdated knowledge

Even strong models can be out of date relative to current systems, policies, and changes.

What to do:

  • Use grounded tools (RAG / NotebookLM) for document-based questions
  • Provide current context, and still verify

Prompt sensitivity

Vague prompts tend to produce vague (or oddly specific) answers.

What to do: Give clear context, audience, constraints, and a desired output format.

4) Myths

  • "It knows everything."
    Reality: It estimates based on patterns; always verify.
  • "I can paste anything into it."
    Reality: Follow data classification and policy; some things must never go in.
  • "If it sounds confident, it's correct."
    Reality: Confidence ≠ correctness.
  • "AI will replace my job."
    Reality: It will change tasks; humans still own judgment and accountability.

5) Prompting fundamentals (Week 1 level)

You'll learn more prompting later in the program, but here is a baseline that immediately improves results.

A simple prompt formula

Role + Task + Context + Constraints + Output format

Example:

  • "You are an IT communications assistant. Draft an email to staff about planned downtime. Audience is non-technical. Keep it under 150 words. Include a subject line and a 3-bullet summary."
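
For staff who keep prompts in scripts or a shared library, the formula can be treated as a fill-in template. The sketch below is one possible way to do that in Python; the function name `build_prompt` and its parameters are hypothetical and simply mirror the five parts of the formula.

```python
def build_prompt(role, task, context, constraints, output_format):
    """Assemble a prompt from the five parts of the formula, in order."""
    parts = [
        f"You are {role}.",
        task,
        context,
        constraints,
        output_format,
    ]
    # Skip any part left empty so the prompt has no blank sentences.
    return " ".join(part.strip() for part in parts if part and part.strip())

prompt = build_prompt(
    role="an IT communications assistant",
    task="Draft an email to staff about planned downtime.",
    context="Audience is non-technical.",
    constraints="Keep it under 150 words.",
    output_format="Include a subject line and a 3-bullet summary.",
)
print(prompt)
```

Keeping the parts separate makes it easy to change one (say, the audience) without rewriting the whole prompt.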

Before/after examples

Example A: Summarization

  • Weak: "Summarize this."
  • Better: "Summarize this for a non-technical manager. Keep dates and numbers intact. Output: 5 bullets + 3 action items."

Example B: Rewriting

  • Weak: "Rewrite this nicer."
  • Better: "Rewrite in a friendly, professional tone and cut by 30%. Keep the deadline and required steps unchanged."

Example C: When you want it to not guess

  • Instruct the model to admit uncertainty:
    "If you don't know or can't verify the answer, say 'I don't know' rather than guessing."

6) The AI tool landscape available to CUIT staff

AI tools fall into several categories:

  • Text models (chat tools)
  • Image/video generation tools
  • Voice tools (speech ↔ text)

For CUIT staff in this pilot, the focus is on three CUIT-supported tools:

  • CHAT (CU-Chat) — Columbia's AI chat platform designed for flexible, customizable AI interactions.
  • Gemini — A Google AI tool that can work across modalities and integrates with Google's ecosystem.
  • NotebookLM — Designed for grounded work: it analyzes your sources and helps synthesize them, reducing hallucinations by tying outputs to what you uploaded or linked.

Choosing the right tool (General Guide):

  • Use CHAT/CU-Chat for drafting, rewriting, summarizing text you provide, outlining, and planning.
  • Use Gemini when you're working closely in Google tools and want AI embedded into that workflow.
  • Use NotebookLM when you need answers grounded in specific documents (policies, SOPs, guidance) and want to check outputs against sources.

7) Predictive-only vs RAG

Predictive-only ("answer from memory")

The model answers based on patterns from training and whatever you provide in the prompt. Great for writing help and summarizing what you paste.

RAG ("go read, then answer")

RAG adds a retrieval step: the system fetches relevant documents (web pages, KB articles, your uploaded sources), then generates an answer grounded in those documents. This approach is strongest when you care about recency, policy accuracy, and traceable sources.

Scenario: "What policy applies here?"

  • Predictive-only chat may produce a plausible answer but risks missing updates or inventing details.
  • A grounded workflow (e.g., NotebookLM with the Provost policy or relevant CUIT SOPs) can produce answers that you can check against citations/snippets.
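
The "go read, then answer" flow can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the mini knowledge base is made up, and retrieval here is plain word overlap, whereas production RAG systems typically use more sophisticated search. The names `documents`, `retrieve`, and `grounded_prompt` are hypothetical.

```python
import re

# Hypothetical mini knowledge base; names and text are made up.
documents = {
    "password-reset": "To reset a password, open the account portal and choose Reset.",
    "vpn-setup": "Install the VPN client, then sign in with your account and approve the prompt.",
    "printing": "Select the shared printer queue and release the job at any printer.",
}

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, docs, top_k=1):
    """Rank documents by how many words they share with the question."""
    q_words = tokenize(question)
    scored = sorted(
        docs.items(),
        key=lambda item: len(q_words & tokenize(item[1])),
        reverse=True,
    )
    return scored[:top_k]

def grounded_prompt(question, docs, top_k=1):
    """Retrieval step first, then a prompt built around the retrieved text."""
    sources = retrieve(question, docs, top_k)
    source_block = "\n".join(f"[{name}] {text}" for name, text in sources)
    return (
        "Answer using only the sources below and cite the source name.\n"
        f"Sources:\n{source_block}\n"
        f"Question: {question}"
    )

print(grounded_prompt("How do I reset my password?", documents))
```

The model only ever sees the retrieved text plus the question, which is what makes the answer checkable against a named source.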

8) Guided exercises

Exercise 1: Summarize with citations

The goal is to produce a useful summary and make it easy to verify against original sources.

Try:

  1. Pick a dense article or policy excerpt.
  2. Prompt: "Summarize in 6 bullets. Include a short quote or citation marker for each bullet so I can verify."
  3. Check: Can you trace each claim back to the original?
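
Step 3 can be partly automated when the model returns short quotes. The Python sketch below checks whether each quote appears verbatim in the original text; the sample source and quotes are made up, and `check_quotes` is a hypothetical helper. It catches invented quotes but cannot tell you whether a real quote actually supports the claim, so a human read is still needed.

```python
def normalize(text):
    """Collapse whitespace and case so line breaks don't cause false misses."""
    return " ".join(text.lower().split())

def check_quotes(source, quotes):
    """Return a dict mapping each quote to True if it appears in the source."""
    haystack = normalize(source)
    return {quote: normalize(quote) in haystack for quote in quotes}

source = """Maintenance is scheduled for Saturday from 6 to 8 a.m.
Email and calendar will be unavailable during the window."""

quotes = [
    "scheduled for Saturday from 6 to 8 a.m.",
    "Email and calendar will be unavailable",
    "all services will be unavailable",  # not in the source
]

for quote, found in check_quotes(source, quotes).items():
    print("FOUND  " if found else "MISSING", quote)
```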

Exercise 2: Build your reusable "prompt profile"

Create a starter prompt you can reuse across tools:

  • Your role
  • 3–5 tasks you actually do
  • Tone preference
  • Constraints

This becomes your "default context" for better results.

Exercise 3: NotebookLM "policy brain" (step-by-step lab)

  1. Open NotebookLM and create a new notebook.
  2. Add at least two short help docs as sources (links or uploads).
  3. Ask doc-grounded questions like: "According to these sources, what are the exact steps to…?"
  4. Check whether answers match the docs and whether citations/snippets are present.

  5. Generate a succinct guide/FAQ (aim fo…

Challenge question: Did it stay within the source? Can you tell?

AI tools can be powerful and practical—but the quality of what you get depends heavily on how you set the task up. This module introduces two related skills:

  • Prompt engineering: Writing clear instructions so the model knows what you want.
  • Context engineering: Designing everything the model "sees" so it has the right information, constraints, and examples to work from.

The core idea: better outcomes come from reducing ambiguity. If the model has to guess your audience, goal, format, or constraints, it will.

Learning Objectives

  • Turn vague requests into clear, reusable prompts
  • Use a consistent structure to communicate role, objective, constraints, and outputs
  • Add guardrails that prevent guessing, overconfidence, and policy drift
  • Provide context in ways that improve accuracy without overwhelming the model
  • Break complex tasks into steps and choose the right "mode" (fast vs deep reasoning) for the job

1) Prompt engineering: move from "asking" to "specifying"

A common reason AI outputs disappoint is that the prompt doesn't include the details a human would naturally ask for: "Who is this for? What tone? How long? What can't I change?"

A simple prompt formula

A reliable baseline is:

Role + Task + Audience + Constraints + Output format

Example:

"You are an IT communications assistant. Draft an email informing staff of planned downtime. Audience is non-technical. Keep it under 150 words. Include a subject line and 3 bullet points with next steps."

Weak vs strong prompts

Weak: "Rewrite this email."

Strong: "Rewrite this email in a friendly, professional tone, cut by ~30%, and keep the deadline and required steps unchanged. Output a subject line plus the final email."

Strong prompts work because they reduce guessing.

Practice drill: Prompt upgrade

Goal: Turn a vague prompt into a prompt that's clear, constrained, and easy to evaluate.

  1. Pick a task you do often (rewrite, summarize, plan, respond).
  2. Write a "weak" version of the prompt in one sentence.
  3. Rewrite it using the formula Role + Task + Audience + Constraints + Output format.

Copy/paste starter:

You are a [role]. Create [deliverable] for [audience]. Constraints: [length/tone/must-keep/must-avoid]. Output format: [bullets/table/email/checklist].

Success check: Could someone else read your prompt and produce the same output you expect?

2) A reusable template: ROSES

When you want prompts you can reuse (and share with a team), it helps to use a structured template.

ROSES:

  • Role: Who the model should act as (and what values/constraints it should follow)
  • Objective: What you want produced, and for whom
  • Scenario: Context, constraints, and what not to assume
  • Expected solution: The format, must-haves, and exclusions
  • Steps: How to proceed (draft → check → revise)

ROSES example (general)

  • Role: "You are a service desk analyst writing for non-technical staff."
  • Objective: "Explain X in plain language and reduce confusion."
  • Scenario: "This may be forwarded; avoid making assumptions about someone's access or role."
  • Expected solution: "120–150 words, neutral tone, include one example and one 'what to do next' bullet."
  • Steps: "Draft, then check for assumptions, then revise for clarity."

Why this helps: ROSES prompts tend to produce outputs that are more consistent, safer, and easier to review.

Practice drill: ROSES rewrite

Goal: Create a reusable prompt that can be saved in a prompt library.

  1. Choose one task you do repeatedly (summarizing email threads, drafting announcements, writing KB blurbs, turning notes into action items).
  2. Fill in ROSES.
  3. Run the prompt, then revise one section (usually Scenario or Expected solution) to tighten results.

Copy/paste ROSES template:

Role: You are a [your role] writing/working for [audience].
Objective: Produce [deliverable] that helps [goal].
Scenario: Context: [2–3 lines]. Constraints: [what must stay true / what to avoid].
Expected solution: Output format: [format]. Length: [limit]. Tone: [tone]. Must include: [items]. Must not include: [items].
Steps: Draft → list assumptions → revise for clarity and risk → final output.

Success check: Does this prompt reduce back-and-forth and editing time?
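
If you store ROSES prompts in a shared library, a small structure can keep the five sections consistent and flag ones that still contain a bracketed placeholder. The following Python sketch is one possible approach; the class name `RosesPrompt` and its methods are hypothetical, and the example values are taken from the general ROSES example in this section.

```python
from dataclasses import dataclass, fields

@dataclass
class RosesPrompt:
    role: str
    objective: str
    scenario: str
    expected_solution: str
    steps: str

    def missing(self):
        """Names of sections left blank or still holding a [placeholder]."""
        return [
            f.name for f in fields(self)
            if not getattr(self, f.name).strip() or "[" in getattr(self, f.name)
        ]

    def render(self):
        """Render the five sections, one labeled line each."""
        labels = {
            "role": "Role",
            "objective": "Objective",
            "scenario": "Scenario",
            "expected_solution": "Expected solution",
            "steps": "Steps",
        }
        return "\n".join(
            f"{labels[f.name]}: {getattr(self, f.name)}" for f in fields(self)
        )

draft = RosesPrompt(
    role="You are a service desk analyst writing for non-technical staff.",
    objective="Explain X in plain language and reduce confusion.",
    scenario="This may be forwarded; avoid assumptions about access or role.",
    expected_solution="[format, length, tone]",  # not filled in yet
    steps="Draft, then check for assumptions, then revise for clarity.",
)
print(draft.missing())
print(draft.render())
```

Running the `missing()` check before saving a prompt helps keep half-finished templates out of a team library.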

3) Common prompting pitfalls

Pitfall: Missing context

  • If the model doesn't know your environment, constraints, or audience, it will fill gaps with generic assumptions.
  • Fix: Provide the minimum context needed: who, what, for whom, and what matters.

Pitfall: No constraints

  • Without boundaries, outputs may become too long, too informal, too risky, or too detailed.
  • Fix: Set constraints: word count, tone, reading level, what must stay unchanged, and what to avoid.

Pitfall: "Do everything in one prompt"

  • Complex tasks often fail when you ask for the final deliverable immediately (especially when multiple inputs are involved).
  • Fix: Break the work into steps: summarize → extract key points → propose outline → draft → refine.

Pitfall: Treating outputs as final

  • Even strong prompts don't guarantee accuracy.
  • Fix: Make verification part of the prompt: "List assumptions," "Flag uncertainty," "Ask questions," or "Cite sources when applicable."

4) Guardrails that improve reliability

Add plain-language "behavior rules" to reduce risk. Guardrails are especially useful for policy-adjacent work, communications, and anything that may be shared broadly.

  • Always flag uncertainty and assumptions
  • Never invent citations or specific facts
  • Always ask clarifying questions if needed
  • Never provide steps that conflict with policy or privacy expectations
  • Always keep sensitive data out of outputs unless explicitly provided and approved

Practice drill: Summarize with constraints

Goal: Practice specificity and accuracy.

Paste a non-sensitive long email thread or meeting notes.

Copy/paste prompt:

Summarize the content for a non-technical manager. Constraints: keep dates, names, and numbers exactly as written. If a detail isn't in the text, write "Not provided." Output: (1) 6 bullets, (2) 5 action items with "Owner / Next step / Due date (if stated)", (3) 3 open questions.

Success check: If any detail isn't in the text, the model should write "Not provided."
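
The "keep dates and numbers exactly as written" constraint is easy to spot-check mechanically. The Python sketch below lists number-like tokens that appear in a summary but not in the source; the sample texts are made up and `unsupported_numbers` is a hypothetical helper. It only covers digits (dates written in words and names still need a human check).

```python
import re

def numbers_in(text):
    """Extract number-like tokens such as 14, 3.5, 2:30, or 10/15."""
    return set(re.findall(r"\d+(?:[.:/]\d+)*", text))

def unsupported_numbers(source, summary):
    """Numbers that appear in the summary but nowhere in the source."""
    return sorted(numbers_in(summary) - numbers_in(source))

source = "The upgrade moves to 10/15 at 2:30 PM and affects 14 servers."
summary = "Upgrade rescheduled to 10/16 at 2:30 PM; 14 servers affected."

print(unsupported_numbers(source, summary))
```

Here the summary changed the date, so the altered value is reported for review.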

5) Context engineering

Prompt engineering is about the instruction. Context engineering is about the whole environment:

  • Instructions
  • Examples
  • Source text / evidence
  • The conversation history
  • Attached documents
  • Tools (search, citations, code execution, templates)

The "right amount" of context

More context is not always better. Too much content can bury what matters, or cause the model to focus on irrelevant sections.

General advice:

  • Provide what the model needs to succeed
  • Highlight the most important parts
  • Format content so boundaries are clear

Delimit evidence from instructions

When you want the model to use only certain information, separate it clearly.

Example structure:

  • Instructions
  • Evidence
  • Task

And include a rule like:

"Answer only using the Evidence section. If the evidence doesn't contain the answer, say you don't know."

Break down complex work

For complex deliverables, ask the model to proceed in stages:

  1. Confirm understanding
  2. List assumptions + open questions
  3. Propose an outline
  4. Draft section by section
  5. Run a final check (accuracy, tone, constraints)

This reduces errors and makes review easier.
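
When this staged approach is scripted rather than typed by hand, each stage becomes its own request, with earlier replies carried forward. The Python sketch below shows the shape of such a loop. The `ask` parameter is a hypothetical stand-in for whatever chat tool you use, and `fake_ask` exists only so the example runs without any AI service.

```python
STAGES = [
    "Confirm your understanding of the task.",
    "List assumptions and open questions.",
    "Propose an outline.",
    "Draft section by section.",
    "Run a final check for accuracy, tone, and constraints.",
]

def run_stages(task, ask):
    """Send one stage at a time, carrying earlier replies forward as context."""
    history = []
    for number, stage in enumerate(STAGES, start=1):
        prior = "\n".join(history) or "(none yet)"
        prompt = f"Task: {task}\nPrevious steps:\n{prior}\nStep {number}: {stage}"
        reply = ask(prompt)
        history.append(f"Step {number} reply: {reply}")
    return history

def fake_ask(prompt):
    """Stand-in for a real chat tool: echoes the last line it was given."""
    return "ok -> " + prompt.splitlines()[-1]

for line in run_stages("Write an onboarding guide", fake_ask):
    print(line)
```

Because each stage produces a separate reply, a reviewer can stop after the assumptions or the outline before any drafting happens.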

Practice drill: Answer only from evidence

Goal: Practice grounded outputs using delimitation.

Copy/paste prompt:

Instructions: Answer using only the Evidence. If the Evidence doesn't contain the answer, say "I don't know based on the provided evidence."
Evidence: [paste your excerpt here]
Task:
1. Provide a 7-bullet summary.
2. Quote the exact line/phrase from the Evidence that supports each bullet (short excerpt).
3. List 3 missing pieces of information needed to act safely.

Success check: Every bullet should map to evidence. If it doesn't, tighten the evidence pack or clarify headings.
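
The Instructions / Evidence / Task structure used in this section can also be assembled programmatically so the boundaries are always explicit. The Python sketch below is one way to do it; the `<<<LABEL>>>` marker style and the `delimited_prompt` function are arbitrary, hypothetical choices, and any clear, consistent delimiter should serve the same purpose.

```python
def delimited_prompt(instructions, evidence, task):
    """Wrap each section in labeled markers so boundaries are unambiguous."""
    sections = [
        ("INSTRUCTIONS", instructions),
        ("EVIDENCE", evidence),
        ("TASK", task),
    ]
    blocks = [
        f"<<<{label}>>>\n{body.strip()}\n<<<END {label}>>>"
        for label, body in sections
    ]
    return "\n\n".join(blocks)

prompt = delimited_prompt(
    instructions=(
        "Answer only using the Evidence section. "
        "If the evidence doesn't contain the answer, say you don't know."
    ),
    evidence="Password resets require approval from the account owner.",
    task="Who must approve a password reset?",
)
print(prompt)
```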

6) Choosing the right mode

Not every task needs the same level of reasoning.

  • Use fast modes for rewriting, summarizing, formatting, and straightforward drafting.
  • Use deep reasoning for planning, multi-step problems, careful explanations, and structured decision support.

A simple heuristic: if the task has lots of dependencies, trade-offs, or risk, slow down and use the more deliberate mode—and require the model to show assumptions and checks.

AI tools can be powerful and practical—but outcomes depend heavily on where you do the work. Different tools are optimized for different goals: drafting and thinking, building polished deliverables, working from trusted sources, or producing research-style summaries. The core idea: choose the tool based on your output and your source of truth, then prompt.

Learning Objectives

  • Match common tasks to the right workflow type (drafting/thinking, deliverable creation, source-grounded work, research synthesis)
  • Choose the right tool/platform based on where the "source of truth" lives
  • Select an appropriate surface (workspace vs research vs learning vs visual) based on the output type
  • Choose fast vs deep reasoning mode based on task complexity and risk
  • Recognize common tool mismatches and correct them quickly

1) Start with two questions: output + source of truth

Most tool confusion disappears if you answer these first:

  • What output do I need? (draft, polished artifact, FAQ, brief, comparison, checklist)
  • Where does truth live? (documents, a workspace deliverable, your reasoning, external sources)

If the tool has to guess what you want or what it should be faithful to, you'll get inconsistent results.

Quick mental model

  • Drafting/thinking → chat-first
  • Polished artifact → workspace/canvas
  • Must follow documents → source-grounded
  • Compare/synthesize → research workflow
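
The mental model above can be written down as a small decision function. This Python sketch is only one reading of it: the function name `choose_workflow`, the yes/no questions, and especially the order in which they are checked (documents first, then research, then polish) are assumptions made for illustration.

```python
def choose_workflow(must_follow_documents, compares_sources, polished_artifact):
    """Map answers about output and source of truth to a workflow type."""
    if must_follow_documents:
        return "source-grounded"
    if compares_sources:
        return "research workflow"
    if polished_artifact:
        return "workspace/canvas"
    return "chat-first"

# Drafting a quick email: no documents to follow, nothing to compare.
print(choose_workflow(False, False, False))
# Creating an FAQ strictly from policy documents.
print(choose_workflow(True, False, True))
```

Checking the document question first reflects the idea that faithfulness to a source of truth should outweigh convenience.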

Practice drill: Output + source of truth

Goal: Build the habit of deciding before prompting.

Pick 5 tasks you do regularly and write:

  1. Output type
  2. Source of truth

Example starters:

  • "Draft an email…" → output: draft message; source: your intent + constraints
  • "Create an FAQ from these docs…" → output: FAQ; source: documents
  • "Compare 3 tools…" → output: brief; source: external sources + your requirements

Success check: If you can't name the source of truth, the tool won't be able to either.

2) Four workflow types (what each is best for)

Drafting and thinking (chat-first)

Best for:

  • Drafting and rewriting
  • Outlining and planning
  • Brainstorming options and next steps
  • Explaining concepts for different audiences
  • Iterating quickly

Building deliverables (workspace/canvas)

Best for:

  • Slide decks and training guides
  • Reusable templates and polished artifacts
  • Structured documents that need revisions and formatting

Source-grounded work (documents-first)

Best for:

  • Answers that must align to specific policies, SOPs, runbooks, or guidance
  • Summaries that need to stay faithful to provided sources
  • Creating structured outputs (FAQ, outline, checklist) strictly from documents

What "source-grounded" means in practice:

  • You provide the documents (uploads/links)
  • The output should be based on those documents
  • If the answer isn't in the sources, the tool should say so rather than guessing

Research synthesis (research-first)

Best for:

  • Structured comparisons and briefs
  • Synthesizing a landscape (options, pros/cons, trends)
  • Outputs where you want separation between findings and recommendations

Verification note:

  • Keep "what sources say" separate from "what you recommend"
  • Confirm high-impact claims (requirements, dates, costs) against sources before sharing

Practice drill: Route the task

Goal: Choose the right workflow type quickly.

Label each task as: Drafting/thinking / Deliverable / Source-grounded / Research synthesis

  1. Draft a message announcing planned downtime
  2. Turn meeting notes into action items
  3. Create an FAQ strictly from these policy documents
  4. Build a training deck outline for staff
  5. Compare three platforms and write a brief for leadership

Success check: Your chosen workflow should match the source of truth.

3) Surfaces: pick the right workspace for the job

Some platforms offer multiple "surfaces." The same tool can behave very differently depending on the surface you're using.

General guide:

  • Use a workspace/canvas when you're building a deliverable (doc, deck, template)
  • Use a learning surface when your goal is understanding and practice
  • Use a research surface when your goal is structured synthesis
  • Use a visual surface when the deliverable is an image/graphic asset

Key point: if the final product is shareable and reusable, it should live in a structured surface—not as a one-off chat response.

Practice drill: Surface selection

Goal: Choose a surface that matches the shape of the output.

Pick one deliverable you need soon:

  • Onboarding guide
  • Policy FAQ
  • Research brief
  • Slide deck
  • Reusable template

Decide:

  • Which workflow type fits (chat/deliverable/grounded/research)?
  • Which surface fits (workspace/learning/research/visual)?

Success check: If you're building a real artifact, choose a workspace/canvas surface.

4) Choosing the right mode: fast vs deep reasoning

Not every task needs the same level of reasoning.

Use fast mode for:

  • Rewriting, summarizing, formatting
  • Quick drafts
  • Straightforward transformations

Use deep reasoning for:

  • Multi-step planning
  • Workflows with constraints and dependencies
  • Edge cases and risk-aware thinking
  • Structured decision support

A simple heuristic:

  • If the task has many dependencies, trade-offs, or risk → slow down and use deep reasoning
  • If it's mainly a transformation task → fast is usually enough
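
As a rough illustration, the heuristic can be expressed as a short Python function. The name `choose_mode`, its inputs, and the cutoff of three for "many" are all assumptions made for this sketch; the guide itself does not define a numeric threshold.

```python
def choose_mode(dependencies, trade_offs, high_risk, many=3):
    """Return "deep" for risky tasks or ones with many dependencies/trade-offs."""
    if high_risk or dependencies >= many or trade_offs >= many:
        return "deep"
    return "fast"

# Rewriting a paragraph: a simple transformation.
print(choose_mode(dependencies=0, trade_offs=0, high_risk=False))
# Drafting a rollout plan with stakeholders and constraints.
print(choose_mode(dependencies=5, trade_offs=2, high_risk=True))
```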

Practice drill: Mode selection

Goal: Match mode to task complexity.

Choose fast or deep reasoning for each:

  1. Rewrite a paragraph to be clearer and shorter
  2. Generate 10 subject lines
  3. Draft a rollout plan with constraints and stakeholders
  4. Identify workflow edge cases and mitigations
  5. Convert a messy thread into an action-item table

Success check: If you expect the output to affect decisions or processes, choose deep reasoning.

5) Common tool mismatches (and fixes)

Mismatch: Asking a chat tool to behave like a verified knowledge base
Fix: Use source-grounded workflow when the answer must match documents

Mismatch: Trying to build a polished guide/deck in a chat thread
Fix: Move to a workspace/canvas surface

Mismatch: Asking for document-faithful answers without providing documents
Fix: Supply sources and instruct "answer only from sources"

Mismatch: Trying to do research synthesis without a structured research workflow
Fix: Use a research surface and require separation of findings vs recommendations

Mismatch: Using fast mode for complex, risk-aware planning
Fix: Use deep reasoning and ask for assumptions + checks

AI Literacy Training Program

CUIT is launching a free AI Literacy Training Program for faculty, researchers, and administrative leaders, running May through August. Choose from a Basic or Advanced track; each consists of two one-hour Zoom sessions, offered on Tuesdays and Thursdays throughout the summer. Because the content is the same across offerings, you only need to register for one session per track.

Participants will get hands-on practice with AI tools, prompt engineering, security considerations, and workflow integration. Office hours and cross-campus "Lunch & Learn" sessions will also be available for deeper engagement and collaboration.

Instructional faculty can also connect with the Center for Teaching and Learning (CTL) for hands-on clinics, course design workshops, and AI resources. Register at ctl.columbia.edu/events.
