AI: Community of Practice

Emerging Technologies' AI: Community of Practice (AICoP) is a multidisciplinary congregation of curious minds, eager to delve into the realms of artificial intelligence (AI) and machine learning (ML). The community is a platform for learning, discussion, and application of AI principles across various fields of study at Columbia University. We aim to demystify AI, spur innovation, and approach challenges with a fresh, AI-centric perspective through regular meetings, workshops, and collaborative projects. All while fostering a culture of inclusivity, respect, and collective growth.

We encourage Columbia Researchers, Faculty, and Administrators interested in joining our AI Community of Practice mailing list for future events and updates.

Discussion Highlights

AICoP: CAiSEY an AI-powered Course Tool

The AI Community of Practice closed out the spring semester with Dan Wang, the Lambert Family Professor of Social Enterprise at Columbia Business School and faculty co-director of the Tamer Institute for Social Enterprise and Climate Change, for a session on CAiSEY — a voice-based AI tool he founded that helps students learn through real-time conversation. Dan was joined by CAiSEY co-founder Jill Cohen (CBS '20) and curriculum and operations specialist Joanna Chouraqui. The session walked through the problem CAiSEY was built to solve, evidence of its classroom impact, a look under the hood at how it works, and how instructors can adopt it.

What's New

  • CAiSEY, a voice-first discussion partner: CAiSEY immerses students in voice-first, real-time discussion to deepen critical thinking about course material. Rather than a text chatbot, it acts as a "knowledgeable peer" — adversarial enough to push back, friendly enough to keep students engaged — that students talk through a case or prompt with the night before class. A typical engagement runs about 10 minutes.
  • Built on a voice-native model: CAiSEY runs on OpenAI's Realtime API, a voice-native model trained on audio rather than the more common speech-to-text-to-speech pipeline. Dan described an architecture layer beneath CAiSEY that allows switching to other voice-native models. The payoff is near-zero response lag and the ability to pick up tone, passion, and other audio cues that text would miss.
  • Instructor-controlled setup: Instructors supply a discussion topic, teaching materials that serve as CAiSEY's knowledge base (e.g., the case itself, teaching notes), and optional custom instructions — for example, having the discussion unfold as if at the time of the case. Instructors get a dashboard showing enrolled students, submitted assignments, full transcripts, discussion summaries, and conversation data such as length and the student's chosen position.
  • A four-step adoption path: Instructors fill out an interest form at casey.me, work with the CAiSEY team to build their course, preview it, then issue a registration link that auto-enrolls students. For Columbia community members, CAiSEY is offered essentially at cost through a chart string, with API costs absorbed by the CAiSEY team.

Why It Matters

  • It addresses a real post-ChatGPT problem: Dan framed CAiSEY against the case-method crisis many instructors have felt since late 2022 — students arriving unprepared, or substituting AI-generated summaries and ready-made arguments for their own thinking. CAiSEY is positioned not as a shortcut but as a way to introduce productive "learning frictions" that slow students down and help them arrive prepared.
  • The evidence goes beyond satisfaction surveys: In a spring 2025 pilot with 275 MBA students generating roughly 1,300 conversations, Dan reported that about 90% of free-response feedback was positive, with over 70% citing improved critical thinking and close to 70% saying they felt better prepared for discussion. Rather than rely on self-report, the team then ran a randomized controlled trial.
  • A controlled trial showed a measurable effect: In a fall 2025 field experiment with 760 students in the required Strategy Formulation course, students were randomly assigned to use CAiSEY in two of twelve sessions. Dan reported that students who used it twice before the midpoint were 18% more likely to report feeling more comfortable participating in class — a meaningful shift from roughly 20 minutes of total engagement.
  • It supports inclusive participation: Students described CAiSEY as an inclusive tool, particularly valuable for the many Columbia students who speak English as a second language or have learning differences such as dyslexia and dysgraphia. Because it is clearly practice with an AI, students reported feeling psychological safety to try out arguments they might self-censor in front of a professor or TA.

Session Highlights

  • A live demo: Parixit role-played a CBS student preparing for a case on Netflix's content strategy, debating CAiSEY in voice mode over about five minutes. He noted afterward that the experience forced him to slow down and actually think through the question — the opposite of how AI is often used to skim.
  • Rapid, broad adoption: CAiSEY grew from Dan's single class to more than 4,000 students across 40 courses at 23 institutions worldwide as of spring 2026, spanning leadership, ethics, accounting, economics, creative writing, and pilots in engineering and medical education.
  • Instructor benefits: Dan summarized feedback from a professor at HEC Paris around four points — saved prep time, higher-quality discussion from better-prepared students, expanded grading insight into student learning, and improved teaching ratings.
  • Discussion summaries as study notes: Parixit highlighted CAiSEY's post-conversation summary, which lays out opening arguments, supporting points, CAiSEY's counterarguments, and areas of agreement and disagreement. Dan noted students often turn these into PDFs or printouts to bring to class as their notes.

Key Discussion Points

  • Data privacy is a core principle: Asked about research use of the growing student population, Dan was emphatic that data belongs to the adopting institutions and instructors. CAiSEY does not mine student submissions or instructor guidelines to train anything, which he called the most important principle for keeping students and instructors safe using the tool.
  • Ongoing research: Three studies are underway or in peer review — the RCT on motivation and participation; a comparison of voice versus text interaction, which found voice produces more creative reasoning (running counter to research suggesting AI constrains creativity); and a study of how often students are persuaded to change their minds, which happens roughly 10–15% of the time.
  • Hallucination and a "knowledgeable peer": Responding to chat questions about hallucination and repetition, Jill re-framed these around the knowledgeable-peer design pitched at the grad-student level — noting that even a peer in a real meeting may misremember facts or repeat themselves, and that this is part of what makes the practice feel realistic, while the team works to reduce repetition as models improve.
  • Security and guardrails: Dan noted CAiSEY has been red-teamed as part of security reviews, with system-prompt instructions tuned over time to keep it on topic and prevent misuse, and natural conversation-ending triggers so sessions don't run too long.
  • Instructor time at scale: Dan acknowledged he can't read 200 transcripts before class and described a feature on the horizon (targeted for the summer) that surfaces major student points and supplies discussion "seeds" for the instructor to use in class.

Resources

The CAiSEY team will share the session recording and slide deck as a follow-up. Anyone can try the Netflix case demo at caisey.me/preview, and instructors can request a portal to build their own assignment. For Columbia adoption, on-boarding, or questions, community members can reach out through the CAiSEY interest form at caisey.me.

Takeaway

CAiSEY offers a counter-narrative to the common worry that AI in higher education is mainly a shortcut machine. By making students talk through their reasoning out loud before class, it turns a fear about AI eroding the case method into a tool for strengthening it — adding friction rather than removing it. With early experimental evidence, fast cross-disciplinary adoption, and a clear emphasis on data privacy and instructor control, CAiSEY is a notable example of AI built to deepen learning rather than bypass it.

You can find out more detail in the News Section here.

AICoP: How SPS Is Embedding AI School-Wide

The AI Community of Practice welcomed Mark Ritzmann, Mandeep Brahmbhatt, and Michael Fleming from Columbia's School of Professional Studies for a session on how SPS has embedded AI across its academic, administrative, and faculty-development work. The discussion covered the school's overall strategy, faculty-facing resources, administrative training, and the newly launched SPS AI Lab — a five-part initiative funded by senior leadership to consolidate the school's AI efforts into a single home.

What's New

  • The SPS AI Lab: SPS has kicked off a lab (the name was deliberately chosen over "hub" or "center of excellence") with five pillars: thought leadership white papers (5–7 pages, topic-specific — AI in finance, healthcare, legal, etc.), applied research focused on AI job skills and workplace applications, corporate partnerships tied into the school's 19 master's capstones, a software sandbox giving students, faculty, and alumni access to industry tools via academic licenses, and a centralized asset library. The site will live at ai.sps.columbia.edu and is targeted to go live in the next few weeks.
  • Six-Part Admin Training Series: Mandeep and Mark designed a lunch-and-learn series that has become the best-attended programming in SPS history. Four sessions have been delivered (Intro to AI and data classifications, additional tools like Zoom AI Companion, advanced use cases, and an upcoming show-and-tell), with data analytics planned next. The series will continue to evolve based on staff feedback.
  • Faculty AI Teaching & Learning Resource Guide: An asynchronous resource built around a three-part structure — an AI literacy pathway for instructors (adapted from Stanford's model, covering functional, ethical, rhetorical, and pedagogical literacy), core principles for AI integration from Quality Matters, and instructor access to AI tools.
  • Practical AI Tools Training: A new hands-on faculty training launched this spring, now part of the standard ed-tech program run every semester. It covers CU-CHAT, Gemini, and Notebook LM — the three centrally available tools — with ChatGPT and Claude training planned as access expands.
  • Claude Soft Launch Announced: Parixit confirmed during the session that Columbia has signed its contract with Anthropic and is doing a soft launch of Claude for web and API. Pricing is $300 per user per year (same as ChatGPT Education), with API usage as a pass-through at Anthropic's token rates and a hard budget cap that shuts off API access when reached. Access is available to faculty, researchers, and administrators (not students), and requests go to [email protected] with a chart string. Claude is approved for sensitive and confidential data; the BAA is still in progress, so HIPAA data should not be uploaded yet.

Why It Matters

  • Culture is the foundation: Mark repeatedly returned to the point that SPS's AI progress is only possible because of top-down support from Dean Troy Eggers and senior associate deans. When a school-wide admin meeting turned into an impromptu AI show-and-tell — including a facilities team using AI to generate renderings and purchasable product links for office space planning — it demonstrated how deeply embedded the technology has become.
  • Structured faculty pathway: Michael's three-level model — AI literacy → tool proficiency → pedagogical integration — gives faculty a clear progression rather than dropping them into tools cold. The framing ("this makes you a better version of you," credited to Karen McFadden) has helped bring along cautious and critical faculty by shifting the conversation from "should we use AI" to "how do we apply it well."
  • Experiential over theoretical: Mandeep emphasized that training can only go so far before people need to "put their fingers on a keyboard." Low-stakes group exercises — including a memorable scenario about drafting a press release, signage, and parent communications for "violent aggressive squirrels attacking students on campus" — have been especially effective at getting hesitant adopters to experiment.
  • Governance as an extension, not a separate track: SPS treats AI data governance as an extension of existing Columbia data governance, privacy, and security policies rather than a parallel regime. Work is underway on masking and de-identifying sensitive data before it enters AI tools, then re-identifying results for action — particularly relevant for the finance, planning, and analytics groups SPS supports.

Session Highlights

  • Faculty AI Summits: Two in-person summits (fall and spring) led by Karen McFadden and Eric Nelson brought full-time and part-time faculty together. A standout session featured an AI rubric generator Gem paired with a live instructional designer interrogating everything that could go wrong — a live demonstration of human-in-the-loop practice. Collaborative work time at the end let faculty apply what they'd just learned.
  • Gemini Gems, Framed Tool-Agnostically: In the practical tools training, the team deliberately re-frames Gems around transferable skills ("anatomy of a Gem": system prompt, knowledge grounding, tools) rather than Google-specific terminology, preparing faculty and students for skills that will transfer across platforms.
  • Administrative Experiments: Course planning, faculty on-boarding, and program communications have emerged as the top three areas where SPS program administrators are experimenting with AI-driven efficiency gains, with an emphasis on methodical documentation so successful patterns can be scaled.
  • Business Intelligence Direction: SPS produces BI reports for university and senior leadership, and is leaning into AI-driven dashboards — with significant NotebookLM use for segmented, shareable data sets.

Key Discussion Points

  • Vendor Feature Releases Without Warning: Victoria Mulaney Brown (Director of Academic Integrity) raised concerns about vendors like Google silently pushing new features — she cited the sudden appearance of Google Studio inside Workspace with no advance communication. Parixit acknowledged the pain: the team actively monitors admin panels and vendor blogs, but some features still appear with no notice. The team is exploring whether contract language can require advance notification.
  • BAAs Are Not a Blanket Protection: Zahid (Associate Director, IT Risk Management at CUIT) offered a public service announcement: a BAA alone does not make a system HIPAA-eligible — CUIMC security must also certify usage. Additionally, contracts can still permit training on customer inputs, so the contract's actual terms matter as much as the BAA's existence.
  • Third-Party Integration Risk: Nimish noted that AI features accessed through partners like Workday, Salesforce, or cloud LLM integrations may not be covered under Columbia's direct BAAs with OpenAI or Anthropic. Any connector request goes through a security and contractual review before it's turned on. Users should confirm they're using enterprise account credentials rather than signing up directly with third parties.
  • Personal AI Usage and Data Leakage: A reminder from the team that free-tier consumer AI tools often train on user inputs, meaning uploaded files or questions could surface in another user's output. The "walled garden" enterprise configuration is how Columbia protects institutional IP and data.
  • Decentralization vs. Centralization: Victoria and Michael both raised the long-standing Columbia challenge of school-by-school silos. Parixit confirmed CUIT is working on a centralized AI literacy program that will draw directly from SPS's work rather than duplicate it.
  • Sustainability Concerns: Environmental impact — water and power consumption — comes up in nearly every SPS session. Steve Cohen and the Earth Institute are involved in studies on offsets and alternative energy approaches.

Resources

SPS will share the full session recording and slide deck as a follow-up. The AI Lab website (ai.sps.columbia.edu) is launching in the coming weeks. For Claude access requests, soft-launch on-boarding, or individual workflow consultations, community members can reach out to [email protected] (include a chart string for Claude access).

Takeaway

SPS offers a model for what coordinated, school-wide AI adoption looks like when senior leadership, academic technology, faculty development, and administrative support all pull in the same direction. The three-layer faculty pathway, the six-part admin series, and the new AI Lab aren't separate programs — they're designed as connected infrastructure, with an emphasis on experiential learning, responsible use, and building on Columbia's existing governance frameworks rather than inventing parallel ones. As more schools lean into AI and CUIT moves toward centralized literacy resources, the SPS approach is a strong reference point for how to turn enthusiasm into durable practice.

You can find out more detail in the News Section here.

AICoP: OpenAI Codex: Data, Development, and Decision-Making

The AI Community of Practice welcomed Fabio Mori and Keelan Schule from OpenAI for a session on Codex—OpenAI's agentic coding and automation platform. The event featured a presentation on how Codex has evolved beyond a developer-only tool, followed by live demos showcasing how anyone on campus—technical or not—can use Codex to turn ideas into working applications, automate workflows, and build interactive dashboards.

What's New

  • Agentic Delegation (Phase 3): Codex has evolved from code completion (Phase 1) to pair programming (Phase 2) to its current phase: agentic delegation. Users can now assign multi-step, long-running tasks to Codex—including overnight batch work with subagents—and return to completed results. The newer models maintain context over extended tasks without drifting from the original instructions.
  • Codex Standalone App: In addition to the CLI, VS Code extension, and Cursor extension, Codex is now available as a standalone desktop app. The app interacts directly with your local file system—creating folders, manipulating files, opening applications, and running terminal commands on your behalf—making it accessible to non-developers.
  • Powered by GPT-5.4: Codex now runs on GPT-5.4, which wraps in the dedicated Codex model. A faster "Spark" model is also available for quick co-development tasks, while 5.4 is suited for longer-running, reasoning-heavy work with plan mode.
  • Plugins and Skills: Codex supports plugin integrations (GitHub, Gmail, Google Calendar, Vercel, Slack) for reading from and pushing to external apps. Skills—repeatable, exportable workflow templates—allow users to automate multi-step processes like code review, versioning, and deployment.
  • Plan Mode: Before executing code, Codex can run a structured planning phase—analyzing data, outlining steps, surfacing assumptions, and proposing a test plan. Users can review and steer the plan before any code is written, ensuring alignment with their requirements and coding standards.

Why It Matters

  • Lowers the Technical Barrier: Only about 16% of building a useful tool is actual code development. Codex handles the full lifecycle—design, coding, testing, documentation, and deployment—so faculty, staff, and students without deep technical expertise can turn ideas into working applications.
  • Campus-Wide Use Cases: Codex is being used across institutions not just for software engineering, but for organizing files via OCR, creating interactive dashboards from CSV data, automating email follow-ups, generating one-pager summaries from code repositories, modernizing legacy systems, and onboarding workflows.
  • Accessible via ChatGPT Education Accounts: Columbia community members can sign in to the Codex app using their ChatGPT Education account. Education accounts also have enhanced Codex limits through the end of May. Students can additionally claim $100 in Codex credits with their student email, and advanced users can apply for the Codex Ambassador program.
  • Built-in Safety Practices: Codex includes security-aware behaviors—flagging hardcoded passwords, warning against committing secrets to repositories, and suggesting key rotation—helping users avoid common security pitfalls without needing specialized knowledge.

Live Demos

  • Desktop File Organization with OCR: Codex scanned 16 screenshots on a cluttered desktop, performed OCR on images without a text layer, categorized them by content (admin, development, travel, presentations), created organized folder structures, and renamed files—all in about two minutes. The task is fully reversible through Codex's built-in audit and revert functionality.
  • Campus Analytics Dashboard from CSV: A synthetic dataset of ~1,000 rows covering course enrollment and grading data was transformed into a fully interactive dashboard with dropdown filters, charts, and visualizations. Codex selected appropriate Python libraries, generated the frontend, and produced a companion talk track explaining how to use and share the dashboard. The entire process took approximately four minutes.
  • Flight Dashboard from Calendar Data: Using plugin connectors, Codex pulled flight information from calendar and email data, then autonomously built an interactive map dashboard with plotted flight paths, city locations, flight numbers, and estimated mileage—inferring the need for geographic visualization without being explicitly told.
  • Code Repository One-Pager: Codex analyzed a Discord chatbot codebase and generated a human-readable PDF one-pager summarizing the application's purpose, audience, architecture, file structure, and setup instructions—using a pre-built PDF skill for consistent formatting.

Key Discussion Points

  • Plugins and Institutional Policy: OpenAI presenters noted that plugins (Gmail, Calendar, Slack, etc.) are available in Codex, but Columbia has third-party plugins disabled due to privacy and security settings. Users can still import data manually (e.g., exporting calendars as PDF or Excel) to work around this limitation.
  • Thread and Agent Management: Each Codex conversation functions as an independent agent. For complex projects, users should create separate threads for distinct tasks (e.g., front-end vs. back-end) rather than switching models or reasoning levels mid-conversation. Subagents within threads can run concurrently, and if two agents modify the same code, Codex can merge their changes automatically.
  • Reasoning Effort Settings: Medium is the recommended default. Low works for single-step, explicit tasks. High or extra-high is appropriate for multi-step tasks involving tool calls, plugins, or concurrent subagents. Extra-high is rarely needed outside of complex, multi-agent orchestration scenarios.
  • Code Quality and Evaluation: Codex follows industry best practices by default but can be steered with custom unit tests, coding style guides, and specific guardrails from existing repositories. Plan mode allows users to review and adjust the model's approach before any code execution begins.

Resources

OpenAI shared several resources for getting started with Codex, including the Codex documentation and cookbooks (QR codes provided during the session). Student credit claims, Ambassador program applications, and the Columbia ChatGPT Education login page links will be included in the session follow-up. For individual workflow consultations, community members can reach out to [email protected].

Takeaway

Codex represents a shift from AI-assisted coding to AI-delegated project execution. By handling the full build lifecycle—planning, coding, testing, documenting, and deploying—Codex makes it possible for anyone on campus to move from an idea to a working tool without needing to write or understand code. With the standalone desktop app, plugin ecosystem, and plan mode, the barrier to entry for building useful campus applications has dropped significantly. As with all AI tools in the Columbia ecosystem, users should be mindful of institutional security policies, particularly around third-party plugin access and sensitive data handling.

You can find out more detail in the News Section here.

AI Community of Practice Hosts HP: Local AI for Higher Education

The AI Community of Practice welcomed Curtis Burkhalter, Technical Marketing Manager at HP, for a session on running large language models locally using air-gapped hardware. The event featured a presentation on HP's ZGX Nano device and live demos showcasing how local AI can support research, teaching, and administrative work across the university—without sending data to the cloud.

What's New

  • Air-Gapped Local AI Hardware: HP's ZGX Nano is a palm-sized device (6" x 6" x 2") powered by NVIDIA's Blackwell GB10 GPU with unified memory architecture. It can fine-tune models up to 70 billion parameters and run inference on models up to 200 billion parameters—all without an internet connection.
  • Unified Memory Architecture: Unlike traditional systems that separate CPU and GPU memory, the ZGX Nano shares memory across both processors, eliminating data transfer bottlenecks and enabling significantly larger models to run on a desktop device.
  • Scalable Design: Two ZGX Nano units can be linked together to double capacity—supporting fine-tuning of 140B parameter models and inference on 400B parameter models.
  • ZGX Toolkit: A free VS Code extension that auto-discovers the device on your local network, sets up SSH authentication, and provides one-click installation of the open-source AI stack (Ollama, JupyterLab, Open WebUI, and more).

Why It Matters

  • Data Privacy and Compliance: Researchers working with HIPAA, FERPA, or other regulated data can use AI without sending sensitive information to the cloud. HP even offers a version with de-soldered Wi-Fi and Bluetooth antennas for complete air-gap security.
  • Simplified IRB Reviews: By keeping all data processing on-premises, local AI can shorten IRB approval timelines by removing cloud-related compliance hurdles.
  • Predictable Costs: Unlike cloud-based AI with unpredictable per-token billing, local hardware is a one-time line item with zero marginal cost per query—making it compatible with grant budgeting.
  • Reproducibility: Open-source models pulled from Hugging Face are version-locked by default, ensuring consistent results without the risk of cloud providers updating or deprecating endpoints mid-project.

Key Case Studies Presented

  • Texas A&M Foundation: Automated review of 2,500+ scholarship applications and transcript compliance checks that previously consumed entire holiday breaks. Each transcript now processes in 17 seconds. Also using local AI for secure handling of sensitive donor information.
  • Winona University: Professor Patrick Paulson built RAG-based chatbots for each course using open-source textbooks, plans to deploy ZGX Nanos as shared student resources in an immersive tech lab, and reduced his semester curriculum update process from six weeks to two days using AI.
  • APITA (France): Researchers achieved 16x faster development iterations for 3D brain tumor segmentation from MRI scans compared to CPU-only processing, with all patient data remaining on-premises. The project code is publicly available on GitHub.

Live Demos

  • Polypharmacy Clinical Decision Support: A tool fine-tuned on FDA drug labels, the Stanford drug interaction database (48,000+ interactions), and PubMed studies. It analyzes complex patient medication profiles and generates clinical risk reports in approximately 11 seconds—all running locally on the ZGX Nano.
  • Maritime Surveillance AI: A compound AI system combining a vision language model with an LLM to analyze aerial imagery, identify vessels, assess threat levels, and generate intelligence reports—demonstrating how multiple models can run together on a single local device.

Key Discussion Points

  • Model Selection and Bias: Participants raised important questions about bias in open-source models, particularly those developed by foreign governments. An example was shared of DeepSeek censoring responses about Tiananmen Square, highlighting the need for informed model evaluation in humanities and social science contexts.
  • Security Considerations: Discussion emphasized that even with local hardware, researchers handling PII or PHI should consult with CUIT for secure data management practices. Downloading models only from recognized, security-reviewed providers on Hugging Face was recommended.
  • OS and Support: The ZGX Nano runs a customized Linux-based operating system (DGX OS), which adds a new platform to support in the institutional ecosystem. Participants noted that standard Mac or Windows devices with sufficient RAM may offer similar benefits for less technical users.

What's Coming

HP has provided Columbia with a ZGX Nano unit on-site for testing and experimentation. Researchers interested in utilizing the device can reach out to learn about availability. All of Curtis's demo code and fine-tuned models are open-source and available on his GitHub for anyone to explore.

Takeaway

Local AI hardware has reached a tipping point where cutting-edge models can run on a palm-sized device, offering a compelling alternative for research workflows that require strict data privacy, predictable costs, and full reproducibility. As the AI landscape continues to evolve, having both cloud-based and local AI options gives the Columbia community the flexibility to match the right tool to the right use case.

You can find out more detail in the News Section here.

AI Community of Practice Hosts Anthropic 

We opened 2026 with a session introducing Anthropic, the AI safety company behind Claude. The event featured a presentation and live demos highlighting how Claude can support teaching, research, and administrative work across the university.

What’s New

  • Safety-First AI Development: Founded by former OpenAI researchers, Anthropic prioritizes safety research alongside model capabilities. Its Constitutional AI framework trains Claude to be helpful, harmless, and honest, with methods that are publicly documented for broader adoption.
     
  • Interpretability Research: Anthropic invests in understanding how models operate internally, enabling earlier detection and mitigation of problematic behaviors as part of a “race to the top” in responsible AI.
     
  • Responsible Scaling Policy: New models and features are released only when appropriate safety measures are in place.

     

Why It Matters

  • Privacy Protection: Claude for Education does not train on user data or chats, providing a secure environment for faculty, students, and staff to work with sensitive materials.
     
  • Personalized Learning at Scale: Demos showed how students can use Claude as a study companion to create custom study guides, practice tests, and interactive visualizations tailored to individual learning needs.
     
  • Time Savings for Faculty and Staff: Research indicates that content creation and course development are top educator use cases, with additional efficiency gains for staff in marketing, communications, HR, and finance.

     

Key Capabilities Demonstrated

  • Projects: Shared workspaces for syllabi, lecture notes, and course materials, allowing students and instructors to work from the same curated sources with transparency.
     
  • Artifacts: Interactive outputs created directly in Claude, such as animated historical maps and procurement dashboards, without requiring coding.
     
  • Research Support: Tools for literature review, grant writing scaffolding, and data analysis already in use for large-scale research workflows.
     
  • Learning Mode: A Socratic option that prompts students with questions rather than direct answers, designed to support deeper learning.

     

What’s Coming

Claude for Education will be made available to the Columbia community, including the chat interface and Claude Code for advanced development work. Pricing and technical details will be shared in follow-up communications.

Takeaway

Anthropic’s approach combines rapidly advancing AI capabilities with a strong emphasis on safety, transparency, and privacy. As Claude becomes available at Columbia, the community will gain access to an AI assistant designed to support teaching, research, and administrative work within a secure framework.

For questions contact: [email protected] 

 

You can find out more detail in the News section here

December 2025

Columbia AI Community of Practice Explores Responsible AI - Safe & Ethical Use

The Columbia AI Community of Practice hosted its final session of 2025, shifting the focus from the technical capabilities of AI to the critical frameworks needed to use it safely. The session was introduced by Senior Director Parixit Davé and featured a presentation by Spencer Ames, Associate AI Analyst from CUIT’s Emerging Technologies team.

What’s New

  • A Shift in Focus: While 2025 was a year of exploring "agentic" systems and reasoning models, the adoption of these tools has outpaced our thinking about their responsible use.
  • Defining Responsible AI: The presentation defined Responsible AI not as being "anti-AI," but as critically choosing and designing systems that align with human values and academic goals while being attentive to risks.
  • Six Core Principles: A new framework was introduced to guide the community, adapted from Microsoft and NIST standards: Human Agency, Fairness, Privacy, Transparency, Safety, and Ongoing Monitoring.

Why It Matters

  • Protecting Institutional Trust: Columbia’s work relies on credibility in teaching and research. Responsible AI helps protect that legitimacy while still allowing for innovation.
  • New Failure Modes: AI introduces unique risks, such as being "confidently wrong" or amplifying patterns of bias at a speed and scale that human-only workflows do not .
  • General-Purpose Capabilities: AI is no longer a single tool but a capability embedded across writing, coding, and analysis workflows, requiring a broader safety strategy.

Principles of Responsible AI

  • Human Agency & Accountability: Humans must remain responsible for decisions, especially in high-stakes contexts. The "Human-in-the-Loop" concept is vital—responsibility does not transfer to the vendor just because they provided the tool.
  • Fairness & Inclusion: Small biases in data can compound into unequal outcomes for the diverse populations Columbia serves. For example, AI is not ideologically neutral. Models often reflect a Western, Educated, Industrialized, Rich, Democratic cultural bias.
  • Privacy & Data Respect: Columbia University stewards highly sensitive data, including student records, staff records, and health data. Refer to our Data Classification Policy and AI Tool Data Classification Chart for best practices. Personal non-UNI LLMs (e.g., standard ChatGPT or Gemini) are classified as Public and are not suitable for Internal, Confidential, or Sensitive data.
  • Transparency & Explainability: Trust depends on clarity. Stakeholders deserve to understand when AI is involved in a decision or output. This includes utilizing "Model Cards" provided by developers that detail limitations and training data. We can adopt similar transparency and explainability measures for using AI in our systems and work to foster a culture of mutual understanding, trust, and accountability.
  • Safety, Security, & Reliability: Reliability is not the same as accuracy. The session reviewed safety benchmarks (measuring overconfidence and deception), noting that while newer models like GPT-5.1 and Claude 3 Opus show improved safety scores compared to older models like GPT-4o, vigilance is still required.
  • Ongoing Monitoring Compliance isn't a "one-and-done" checklist. Because AI behavior can drift and capabilities change, we must treat every output as a draft and continuously verify facts.

Community Insights from the Q&A

  • Student Education: During the Q&A, Dr. Victoria Malaney-Brown (Director of Academic Integrity) emphasized the urgent need for student training on AI ethics and values, potentially during the upcoming Integrity Week.
  • AI Integration: Robert Cartolano (AVP, Technology and Preservation) noted that by 2026, AI will be embedded in most standard tools (like JSTOR and Adobe), making responsible use a necessity across all disciplines, not just for "AI tools."
  • Upcoming Events: Barnard College announced a symposium on "Responsible AI in the Liberal Arts" scheduled for February 6th.

Takeaway

Responsible AI is an ongoing practice of mapping, measuring, and managing risk. By adhering to core principles like human accountability and data privacy, the Columbia community can explore the benefits of these emerging technologies without compromising the university's values or integrity.

For questions or collaboration, contact: [email protected]

You can access a recording of the AICoP in the News section here.

November 2025

Columbia AI Community of Practice Explores Prompt Engineering and Context Engineering

The Columbia AI Community of Practice hosted a webinar focused on the evolution from traditional prompt engineering to the broader, more powerful method known as context engineering. The session was led by John P. Martin and featured an in-depth presentation and demos by Spencer Ames from CUIT’s Emerging Technologies team.

What’s New

  • Context engineering re-frames how users work with LLMs by designing everything the model sees, not just the prompt.
  • Prompt engineering is now understood as a subset of context engineering rather than a fading skill.
  • New techniques emphasize clarity, specificity, structured instructions, and curated sources.

Why It Matters

  • Helps Columbia users get more accurate, reliable outputs while reducing hallucinations.
  • Supports teaching, research, and administrative workflows that depend on trustworthy AI-generated content.
  • Gives the community strategies for privacy-safe AI use inside Columbia-approved platforms like ChatGPT (Enterprise), Gemini, and NotebookLM.

Key Concepts

  • Prompt Engineering:
    • Clear instructions written directly to the model.
    • Still valuable but limited without added context.
  • Context Engineering:
    • Designs the entire context window: instructions, examples, sources, data, and task history.
    • Treats attention as a limited resource and focuses on giving the model only what it needs.
    • Uses structured templates such as ROSES (Role, Objective, Scenario, Expected Output, Steps).
  • Demos showed:
    • How to rewrite weak prompts for higher accuracy.
    • How to use delimiters (XML, brackets, markdown) to reduce ambiguity.
    • How to start new conversations with “Give me a summary of this chat and a master prompt” to avoid memory bloat.

Use Cases

  • Faculty can craft stronger course content, rubrics, and research prompts.
  • Admins can create accurate reports, communications, and policy-aligned outputs.
  • Researchers can increase reliability when summarizing, comparing, or validating literature.

Security and Governance

  • Columbia-approved AI platforms keep data private and out of commercial training pipelines.
  • Consumer AI tools like public ChatGPT, Claude, or Gemini do not offer the same guarantees.
  • Guidance emphasized avoiding sensitive or copyrighted material unless using university-licensed systems.

Takeaway

Context engineering gives the Columbia community a clearer, more reliable way to guide AI systems. By combining structured prompts, curated sources, and strong guardrails, users can drive LLMs to produce higher-quality results across teaching, research, and administrative work.

For questions or collaboration, contact: [email protected]

You can access a recording of the AICoP in the News section here.

October 2025

Columbia AI Community of Practice Showcases Google Gemini and NotebookLM

The Columbia AI Community of Practice hosted a special Halloween session to introduce Google Gemini and NotebookLM, now available to all active faculty, researchers, and administrators through Columbia’s LionMail workspace. The session featured Jillian Yoerges from Google and demonstrations by John P. Martin and Spencer Ames of CUIT’s Emerging Technologies team.

What’s New

  • Gemini and NotebookLM are now part of Columbia’s Google Workspace.
  • Both tools are free to use under existing university licenses.
  • A new Gemini pilot is integrating AI directly into CourseWorks for teaching and learning.

Why It Matters

  • Gives Columbia users secure access to generative AI inside the Google ecosystem.
  • Supports teaching, research, and administrative tasks under strict data privacy and compliance standards.
  • Encourages AI literacy and responsible use through CUIT-led training and office hours.

Key Features

  • Gemini:
    • Two AI models: 2.5 Flash (general use) and 2.5 Pro (advanced coding, math, and reasoning).
    • Tools include Guided Learning, Deep Research, Canvas for real-time editing, and Image Generation.
    • Can write scripts, organize data, debug code, and generate creative or technical content.
  • NotebookLM:
    • Uploads PDFs, Google Docs, Sheets, and even YouTube videos.
    • Summarizes, cites, and analyzes content from multiple sources.
    • Generates reports, flashcards, quizzes, and interactive audio or video explainers.
    • New feature: now supports Google Sheets for data analysis and forecasting.

Use Cases

  • Faculty can generate lecture notes, quizzes, and summaries.
  • Admins can analyze budgets, applications, or reports using uploaded data.
  • Researchers can compare papers, transcribe media, and create study guides.

Security and Governance

  • All tools are covered under Google Workspace for Education terms.
  • No ads and no data used for model training.
  • Fully encrypted and compliant with major privacy standards (SOC 1, SOC 2).

Takeaway

Gemini and NotebookLM expand Columbia’s secure AI ecosystem, giving the community new ways to teach, research, and work smarter—while keeping privacy and responsible use at the core of every tool.

For questions or collaboration, contact: [email protected]

You can access a recording of the AICoP in the News section here.


September 2025

Columbia AI Community of Practice Introduces CHAT (CUIT Hosted AI Toolkit)

Columbia University’s AI Community of Practice wrapped up its September session with a deep dive into CHAT—(CUIT Hosted AI Toolkit). This platform is designed to bring advanced conversational AI tools to the university community, offering flexibility, productivity, and security under Columbia’s framework. The session was led by John P. Martin, Emerging Technologist at Columbia University Information Technology (CUIT).

What is CHAT?

  • CHAT replaces the earlier CU-GPT service.
  • Built on LibreChat, it supports multiple AI models and will soon expand beyond GPT to include Gemini, Claude, and select open-source models.
  • It’s a pay-as-you-go service, starting October 1, with a default daily spend limit of 50 cents per user.

Why It Matters

  • Provides access to multiple models in one platform.
  • Boosts productivity for faculty, staff, and researchers.
  • Offers Columbia users a secure environment, vetted through risk and security reviews.

Key Features

  • Model switching: Move between GPT-4, GPT-5, Gemini, and others mid-conversation.
  • OCR and image generation: Extract text from images and create visuals.
  • File support: Upload PDFs, Word docs, and CSVs for summarization, analysis, and comparison.
  • Custom agents: Build AI-powered assistants similar to custom GPTs, tailored to specific tasks.
  • Prompt library: Save and reuse advanced prompts for writing, coding, or research.

Testing and Use Cases

  • Demonstrations included generating professional emails, summarizing syllabi, analyzing data files, and extracting details from flyers.
  • Users can create agents for tasks like web search or image generation.
  • The system supports prompt engineering for repeatable workflows, such as coding help or email templates.

Cost and Oversight

  • Daily spend limits are customizable per department.
  • Most users spend under 20 cents a day, well below the 50-cent cap.
  • Usage is fully transparent with spend history and dashboards for administrators.

Security and Governance

  • Only models that pass Columbia’s data privacy and risk reviews are added.
  • CHAT is not yet HIPAA-compliant, so it’s currently positioned for administrative and research use rather than sensitive teaching or clinical contexts.
  • Broader strategy discussions are underway to ensure responsible and ethical use across teaching, learning, and research.

Takeaway

CHAT is not just another AI tool—it’s Columbia’s effort to provide a secure, flexible, and practical AI platform for its community. With features like multi-model access, custom agents, and strong oversight, it’s designed to support productivity while respecting security and governance needs.

For questions or collaboration, contact: [email protected]

You can access a recording of the AICoP in the News section here.

May 2025

Columbia AI Community of Practice Showcases Legal Aid Chatbot for Housing Justice

Columbia University’s AI Community of Practice hosted its final session of the season, featuring a real-world application of AI for social impact. Led by Professor Conrad Johnson and Basem Aly from Columbia Law School, the session spotlighted a custom-built chatbot designed to support Legal Aid Society paralegals in assisting tenants facing eviction.

AI for Legal Aid: Project Overview

  • Developed by the Lawyering in the Digital Age Clinic, this AI tool supports Legal Aid’s Housing Justice Helpline.
  • The chatbot, powered by GPT-4 and Retrieval-Augmented Generation (RAG), delivers fast, vetted legal information to paralegals fielding high-pressure calls from tenants.
  • It was designed using Legal Aid’s internal wiki as a closed knowledge base to ensure accuracy and avoid hallucinations.

Why It Matters

  • The Legal Aid Society faces overwhelming demand—over 127,000 eviction cases in NYC last year.
  • The chatbot cuts call handling time from 45 minutes to a fraction, allowing paralegals to serve more clients efficiently.
  • It augments—not replaces—human labor, especially under extreme time constraints.

Key Features

  • Rapid, plain-language answers to legal queries.
  • Context-aware follow-ups for more nuanced housing scenarios (e.g., pets in NYCHA housing).
  • One-click conversion to email or Spanish translation for client follow-up.
  • Escalation logic: flags complex or sensitive issues (e.g., domestic violence) for supervisor review.

Testing and Evaluation

  • Students shadowed paralegals and tested the chatbot across a range of questions.
  • A 10-question rubric helped evaluate accuracy—from simple lookups to complex edge cases.
  • Supervisors review chatbot interactions weekly, monthly, and quarterly to maintain quality control.

Results and Recognition

  • The project won $1.1 million from the Robin Hood Foundation’s AI Poverty Challenge.
  • Legal Aid plans to replicate the model for other service lines and possibly migrate to Microsoft Copilot.
  • The effort helped shift Legal Aid’s generative AI policy from blanket prohibition to responsible experimentation.

Lessons Learned

  • Closed, well-structured knowledge bases are key to performance.
  • Concise prompts outperform complex instruction sets.
  • Human-in-the-loop design ensures legal accountability and builds trust.
  • Ongoing collaboration between technologists, lawyers, and end users drives success.

Takeaway
This wasn’t about replacing lawyers—it was about giving paralegals the tools to help more people, faster. The project stands as a model for practical, ethical AI deployment in public service.

For collaboration or follow-ups, email: [email protected]

You can access a recording of the AICoP in the News section here.

April 2025

Columbia University’s AI Community of Practice Explores Agentic AI and Practical Applications

Columbia University’s AI Community of Practice, led by the Emerging Technologies team, hosted its latest session focused on understanding agentic AI and its role in modern AI development. The featured presentation, led by ML Engineer Marc Chen, explored how agentic systems enhance large language models (LLMs) with structured decision-making and action-taking abilities.

Agentic AI: Key Insights

What Is Agentic AI?

Agentic AI extends traditional LLMs by:

  • Combining non-deterministic AI outputs with deterministic code logic (e.g., rules and workflows).
  • Allowing models to make controlled decisions and take actions based on user input and system-defined rules.
  • Reducing unpredictability while leveraging the creativity and flexibility of LLMs.

Marc emphasized that many systems we interact with today, such as advanced customer service bots are early forms of agentic AI, blending freeform AI responses with strict rule-based frameworks.

Practical Demonstration: Agentic Retrieval-Augmented Generation (RAG)

The session included a live demo of a simple agentic RAG system designed to:

  • Classify users (staff or student) based on input.
  • Retrieve information from the appropriate database (staff benefits or student benefits).
  • Reject irrelevant queries when outside the supported domain.

The system used a router LLM to decide the appropriate action and demonstrated how introducing basic decision-making dramatically improves reliability and reduces hallucinations.

Key demo highlights included:

  • Prompting for user clarification when identity was unclear.
  • Filtering answers based on user type.
  • Rejecting queries not related to benefits.
  • Logging agent decisions to enhance transparency.

Challenges and Considerations

The session also explored technical challenges:

  • Similarity Scores: Semantic matching does not guarantee perfect accuracy, and tuning thresholds for document retrieval remains complex.
  • Evaluation Complexity: Agentic systems require careful evaluation at every step—routing, retrieval, and response generation.
  • Security Risks: Prompt injection, remote code execution, and privilege escalation risks must be managed through sanitization, vendor vetting, and secure API practices.

Participants discussed the trade-offs between simplicity, transparency, and system openness, especially when connecting to external sources like the internet.

Future Developments in Agentic AI

The group reflected on the difference between simple agentic workflows and autonomous agents that can:

  • Plan multiple steps independently.
  • Reflect on intermediate results.
  • Adjust their own workflows dynamically.

While fully autonomous agents remain difficult to deploy successfully, simple agentic systems are already proving useful in:

  • Research assistance (e.g., triaging patient notes).
  • Administrative automation (e.g., moderating forums, verifying user actions).
  • Knowledge management tasks (e.g., personalized FAQ bots).

Marc encouraged the community to start exploring focused, low-risk applications of agentic AI today, emphasizing that transparency, simplicity, and human oversight are key for success.

For collaboration or follow-ups, email: [email protected]

You can access a recording of the AICoP in the News section here.

March 2025

Columbia University’s AI Community of Practice, led by the Emerging Technologies team, hosted its March session exploring the rise of reasoning models in AI and how they differ from traditional LLMs. The featured presentation by AI Analyst Majd Shammout focused on DeepSeek R1, a new reasoning model that emphasizes step-by-step logical analysis and internal “thinking” before generating answers.

Reasoning Models: Key Insights

What Are Reasoning Models?

Reasoning models extend traditional large language models by:

  • Breaking down prompts into logical sub-parts
  • Performing internal analysis before producing final answers
  • Prioritizing accuracy through deliberate token-by-token reasoning

Maj demonstrated how DeepSeek R1 and other reasoning models differ from standard LLMs like GPT-4 by producing slower, more detailed answers—often mimicking how humans think through complex problems.

DeepSeek R1: Reinforcement Learning and Structure

The session focused heavily on DeepSeek’s R1 and R1.0 models:

  • R1.0 introduced reinforcement learning into base LLMs, using accuracy and format as reward signals
  • Responses followed a structured template with a clear “think” step and final summary
  • The model included “aha moments” and displayed reasoning in human-like ways

Despite these advances, R1.0 had issues:

  • Poor readability
  • Language mixing (e.g., switching between English and Chinese)
  • Slow response times
     

DeepSeek R1: Improved Performance with Human Touch

To improve R1.0, the DeepSeek team:

  • Collected cold-start data using prompting, R1.0 outputs, and human annotations
  • Fine-tuned the base model with both reasoning and non-reasoning data
  • Introduced additional reward signals for language consistency and response quality

The result: a model that is both more readable and better at aligning with user intent.

Distillation and Open Research

  • DeepSeek showed that smaller models trained through distillation from R1 can outperform base models trained from scratch
  • Hugging Face is building an open-source version of R1 to replicate and expand on these findings
  • DeepSeek’s open publication of architecture and methods sets it apart from proprietary models like GPT-4 and Claude

Ethical Reflections and Future Outlook

The session concluded with a community discussion on:

  • The parallels between AI model training and human learning
  • Whether pedagogy techniques used in education could help guide AI development
  • The future of reasoning models in relation to AGI and whether algorithmic advances will matter more than increased data or compute

Attendees were encouraged to explore R1 further and continue the discussion in upcoming sessions.

For collaboration or follow-ups, email: [email protected]

You can access a recording of the AICoP in the News section here.

February 2025

Columbia University’s AI Community of Practice, led by the Emerging Technologies team, held its February session with a focus on Google’s latest AI advancements and their impact on education, research, and enterprise applications. The session featured a presentation by Charles Elliott, a Google AI expert, who provided insights into Notebook LM, Agentic AI, Multimodal models, and AI-driven research tools.

Google’s AI Ecosystem: Key Takeaways

Notebook LM: AI-Enhanced Research and Study Tool

Notebook LM was highlighted as a document-grounded AI assistant designed to help users:

  • Analyze and summarize large documents while maintaining accuracy.
  • Generate study guides and FAQs based on uploaded content.
  • Convert text into AI-generated audio summaries or podcasts for on-the-go learning.

By limiting responses to uploaded files, Notebook LM reduces AI hallucination risks, ensuring trusted and verifiable outputs for academic use.

The Rise of Agentic AI

Charles discussed Agentic AI, which enables AI models to autonomously perform complex tasks by reasoning, planning, and executing decisions. Key applications include:

  • Personalized Learning Assistants: AI tutors that adapt to students’ learning styles and coursework.
  • Administrative AI Agents: Automating document workflows, transcript processing, and data management.
  • Research Assistants: AI-driven hypothesis generation, ranking, and validation to support scientific discovery.

Google’s Co-Scientist initiative was introduced as an emerging research tool that can assist scholars by ranking hypotheses and generating novel research ideas.

Multimodal AI and Visual Intelligence

Google’s latest Gemini AI models go beyond text-based interactions, incorporating image, video, and spatial reasoning capabilities. Live demonstrations showcased:

  • AI-powered image understanding and editing, such as modifying objects in photos.
  • Google Maps AI integration for real-time geospatial insights and historical analysis.
  • AI-generated videos and graphics using Google’s V02 model.

These advancements expand AI’s potential in education, digital content creation, and data analysis.

Challenges and Ethical Considerations

The discussion also addressed key concerns in AI adoption:

  • Data Privacy & Compliance: Google reaffirmed its commitment to not using customer data for training in enterprise and educational applications.
  • AI Transparency & Accuracy: Strategies for grounding AI responses in verified sources to minimize misinformation.
  • Institutional AI Integration: Best practices for deploying AI within universities while ensuring compliance with security protocols.

Future Outlook and Next Steps

  • AI-powered career guidance tools are being developed to help students align coursework with real-world job opportunities.
  • AI-driven administrative workflows will streamline admissions, student services, and research processes.
  • Google is continuing to refine Agentic AI models for broader applications in education, science, and enterprise solutions.

The session concluded with an invitation for Columbia faculty, staff, and students to explore these AI tools further and participate in upcoming AI Community of Practice meetings. Attendees were encouraged to reach out via [email protected] for inquiries and collaboration opportunities.

You can access a recording of the AICoP in the News section here.

January 2025

The AI Community of Practice at Columbia University, led by the Emerging Technologies team, held its first session of 2025, focusing on OpenAI tools and their evolving role in education. This session featured an engaging presentation by Joe Casson, OpenAI’s Education Lead for Solutions Engineering, who provided insights into higher education use cases and demonstrated OpenAI's latest advancements.

Highlights of the Meeting

  1. Welcome and Context:
    • Parixit Dave, Senior Director for Emerging Technologies, introduced the session.
    • The AI Community of Practice, started in January 2024, continues to foster collaboration and exploration of AI and machine learning across Columbia.
       
  2. OpenAI's Contributions to Education:
    • Joe Casson emphasized OpenAI’s mission to support universities with AI tools like ChatGPT and the newly introduced O1 models.
    • He discussed their application in personalized learning, operational efficiency, curriculum development, and research enhancement.
       
  3. Live Demonstrations:
    • Joe showcased OpenAI’s tools in action, including:
      • Custom GPTs for tailored use cases, such as generating curriculum content and analyzing university mission statements.
      • The new Canvas feature for collaborative content iteration.
      • File analysis capabilities for document-based workflows.
    • The session also touched on Operator, a new tool enabling task automation, such as conducting research and summarizing data.
       
  4. Key Topics Discussed:
    • Strategies for promoting ethical AI use among students.
    • Addressing academic integrity concerns.
    • Enhancing transparency through citations and data attribution.
       
  5. Future Outlook:
    • OpenAI teased upcoming features for educators, emphasizing collaboration with university leaders and researchers to align AI development with academic goals.

The session concluded with an invitation to participate in February’s meeting and explore OpenAI’s resources further. Columbia remains committed to integrating AI responsibly into teaching, research, and operations.

You can access a recording of the AICoP in the News section here.

October 2024

Columbia University’s AI Community of Practice, led by the Emerging Technologies team, convened its October session with a focus on fine-tuning large language models (LLMs). This session, presented by Mark Chen, Machine Learning Engineer at CUIT, explored the growing importance of fine-tuning for a variety of specialized use cases, from localized data processing to enhanced AI behavior customization.

Key Features and Advantages of Fine-Tuning for Specialized Use Cases

  • Data Security: Fine-tuning smaller models locally ensures that sensitive data remains on-device, reducing latency and bypassing the risks associated with sending data to external servers. This is especially relevant for organizations dealing with sensitive or proprietary information.
  • Efficiency: Fine-tuned, smaller models can outperform large, generalized models like GPT-4 for specific tasks. These models require less computational power, making them faster and cheaper for narrow, task-specific operations.
  • Customization: Fine-tuning allows users to adapt models to their unique needs. Mark emphasized how models could be trained for various applications, such as summarizing complex documents or adopting specialized behaviors like customer service interactions or technical troubleshooting.

Technical Approach

Mark provided insights into the technical methods used to fine-tune models, such as low-rank adaptation, which involves updating only a fraction of a model’s parameters to optimize its performance for a specific task. This method reduces the computational demand, making it feasible to fine-tune models on consumer-grade hardware, and contrasts with the more resource-intensive full-weight updates.

The session demonstrated how fine-tuning can make open-source models more efficient for tasks like summarizing technical or research documents, highlighting the practical applications of this technique in a range of fields.

Challenges and Solutions

  • Overfitting: Mark discussed the issue of overfitting, where models trained for too long on limited data begin to memorize the training set rather than generalizing effectively. Monitoring evaluation metrics and adjusting training steps were suggested as key strategies to prevent overfitting.
  • Data Requirements: Fine-tuning requires high-quality datasets, with at least 1,000 examples formatted in question-and-answer style. Mark recommended generating training data using larger models like GPT-4 and verifying outputs to ensure accuracy.

Future Developments and Use Cases

The session explored potential use cases for fine-tuned models, including:

  • Chatbots: Fine-tuning can help create more specialized chatbots tailored to different domains, whether it's technical support, customer service, or instructional assistants.
  • Text-to-SQL Models: Fine-tuning could enable LLMs to translate natural language queries into structured SQL code, streamlining database interactions.
  • Multilingual Systems: Fine-tuned models can be trained to provide seamless responses in multiple languages, broadening their applicability for diverse user bases.
  • Agentic Systems: Small fine-tuned models could serve as decision-makers in more complex AI ecosystems, determining which tasks or larger models to delegate to based on user input.

Conclusion and Next Steps

The session concluded with an invitation for attendees to explore fine-tuning in their own projects. Mark encouraged participants to collaborate and share their experiences. Columbia’s AI Community of Practice will continue to host these monthly sessions, offering a platform for exploring innovative AI applications across a wide range of fields.

You can access a recording of the AICoP in the News section here.

November 2024

The AI Community of Practice at Columbia University, led by the Emerging Technologies team, held its final session of the year with a spotlight on two transformative technologies: LibreChat and AI Avatars. This November session provided attendees with demonstrations and discussions on how these innovations are reshaping productivity, training, and ethical considerations in artificial intelligence.


LibreChat: A Cost-Effective and Versatile AI Solution

The session began with a presentation by Parixit Dave, Senior Director for Emerging Technologies at CUIT, who introduced LibreChat, the upcoming replacement for CU GPT. CU GPT, a Columbia University innovation, offered a cost-efficient alternative to individual ChatGPT licenses by using OpenAI APIs. LibreChat, however, takes this further, integrating a wider range of models and functionalities to support diverse academic and professional needs.

Key Features of LibreChat

  1. Model Diversity:
    LibreChat supports multiple models, including OpenAI’s GPT series, Anthropic’s Claude, and Google’s Gemini, as well as custom GPTs, enabling users to compare and select the best tool for their tasks.
  2. Real-Time Processing:
    With rapid real-time streaming of responses, LibreChat minimizes waiting times for outputs.
  3. Document Analysis:
    Users can upload documents such as PDFs, Word files, and Excel spreadsheets for analysis and summary generation.
  4. Internet Search:
    The platform integrates real-time internet search capabilities, ensuring users can access the latest data beyond pre-trained model cutoffs.
  5. Image Creation and Processing:
    LibreChat enables users to generate and analyze images, opening possibilities for use cases in medical imaging, education, and creative projects.

Productivity Enhancements

Parikshit showcased examples of LibreChat's capabilities, from drafting formal emails to generating complex project statements of work (SOWs) and conducting financial analysis. These demonstrations highlighted the tool’s potential to dramatically enhance productivity while maintaining cost efficiency.

The session concluded with an invitation to explore LibreChat further upon its planned launch in early 2025, positioning it as a central tool for advancing Columbia’s AI capabilities.


AI Avatars: Transforming Training and Communication

The second half of the session was led by John P. Martin, Emerging Technologist at CUIT, who presented on the use of AI avatars for creating realistic, automated video content. John shared insights from a recent project with the Maryland School of Public Health, where over 400 videos were produced in under three weeks using AI-driven tools like Wondershare Virbo for avatars and ElevenLabs for voice generation.

Applications and Advantages

  1. Efficient Content Production:
    AI avatars were employed to deliver training videos with realistic lip-syncing and audio, bypassing the need for on-camera human actors and extensive video editing.
  2. Customization and Cloning:
    ElevenLabs allowed for the cloning of voices to maintain a consistent tone across videos while adapting to different scripts.
  3. Ethical Considerations:
    John emphasized transparency in AI-generated content, suggesting that creators attribute videos to AI to ensure audiences are informed.

Technical Overview

John explained the underlying technology, Stable Diffusion, which powers both AI-generated images and avatars. Stable Diffusion processes noise into refined outputs through iterative denoising, leveraging large datasets of visual and audio data. This mechanism ensures high-quality, realistic results in both static and dynamic applications.

Expanding Use Cases

From training modules to research communication, the applications for AI avatars and LibreChat are vast:

  • Training and Education: Quickly create instructional videos for courses or corporate training.
  • Research: Enhance study designs by anonymizing and preserving audio data for longitudinal analysis.
  • Creative Content: Leverage avatars for storytelling, marketing, or virtual events.

Looking Ahead

The session ended with an invitation to join upcoming discussions in 2025 as Columbia University continues to lead in exploring the intersection of AI, ethics, and practical applications. With tools like LibreChat and AI Avatars at the forefront, the university is fostering an environment where innovation and responsibility go hand in hand.

The AI Community of Practice remains a vital platform for sharing insights, challenges, and advancements, ensuring Columbia stays at the cutting edge of AI technologies

You can access a recording of the AICoP in the News section here.

September 2024

Columbia University’s AI Community of Practice, led by the Emerging Technologies team, kicked off its fall semester with a focus on the collaboration between academia and industry. This community provides a forum for faculty, staff, and industry partners to discuss AI and machine learning (ML) applications across teaching, research, and administrative areas. In this session, Anthropic was the featured guest, with a deep dive into their AI platform, Claude.

Key Features and Advantages of Claude for Enterprise

  • Data Security: Anthropic emphasized the importance of AI safety, urging participants to avoid inputting confidential or protected health information (PHI) in the free version of Claude, as data may be used for training purposes. Claude’s enterprise version offers stronger safeguards, ensuring that data used within organizations remains secure.
  • Custom Knowledge Base: Claude allows users to upload documents and other content to create project-specific knowledge bases. This enables tailored AI interactions that reflect the organization's needs, helping users access and process vast amounts of information efficiently.
  • Expanded Context Windows: Claude’s enterprise version offers a large token limit, allowing users to work with complex documents, such as research papers or grant applications, without losing valuable context. This enhances its usefulness in academic settings where large volumes of data need to be processed.
  • Integration with GitHub: While currently limited to GitHub, Anthropic is working on expanding integration capabilities with other platforms like Google Drive and Microsoft OneDrive. This allows developers and faculty to seamlessly access their code bases and other relevant documents without leaving the Claude environment.

Technical Approach

Claude leverages Constitutional AI, a safety-driven framework designed to prevent harmful or biased outputs, making it particularly well-suited for responsible AI usage. Claude's team continuously refines these safeguards to ensure that the model cannot be tricked into producing undesirable responses, a process known as "jail breaking."

Participants also discussed how Claude could assist with various practical use cases in academia, such as helping faculty and researchers apply for grants by analyzing research documents and providing suggestions based on the uploaded data.

Future Plans and Improvements

Anthropic is focused on enhancing Claude’s capabilities, including plans for internet search functionality and additional API integrations. These future developments aim to provide greater flexibility and usability for both research and administrative purposes. The company is also working on achieving HIPAA compliance to better support users from medical institutions.

With its commitment to safety and performance, Anthropic’s Claude is positioned to become a powerful tool in academia, helping institutions like Columbia University harness the potential of AI for a variety of use cases, from research to administrative tasks.

The session concluded with a commitment to ongoing collaboration, and a follow-up session was proposed to explore the API capabilities of Claude in more detail.

You can access a recording of the AICoP in the News section here.

June 2024

Columbia University's AI Community of Practice, spearheaded by Columbia University Information Technology (CUIT), has embarked on a pioneering project to develop CU-GPT, an AI-driven interface designed to serve as a cost-effective and secure alternative to ChatGPT. This initiative aims to enhance the academic and administrative experience by integrating advanced AI capabilities into the university's infrastructure.

Key Advantages of CU-GPT

  • Cost Efficiency: CU-GPT operates on a pay-as-you-go model, significantly reducing costs compared to ChatGPT's enterprise license. Departments can set daily usage limits, and users can monitor their real-time spending, ensuring transparent and manageable costs.
  • Data Security: Running on Columbia's infrastructure, CU-GPT leverages an enterprise contract with OpenAI, ensuring robust data security measures. This setup protects the intellectual property and research data of users, addressing privacy concerns.
  • Customizability and Flexibility: CU-GPT recognizes user roles and responsibilities, providing tailored access to specific GPT instances relevant to different departments and research needs. This ensures that users have access to pertinent AI tools and resources.
  • Scalability and Performance: Utilizing AWS Lambda, CU-GPT offers scalable computing power, efficiently handling multiple user requests. This serverless architecture ensures cost-effectiveness by charging only for the compute time used, without maintaining continuous infrastructure.
  • Enhanced User Experience: The platform maintains user chat history, allows for file uploads, and features an intuitive interface similar to ChatGPT, making it accessible and user-friendly.

Technical Implementation 

  • Architecture: CU-GPT's architecture includes document vectorization, converting plain text into numerical embeddings that capture the semantic meaning of documents. This allows for more accurate retrieval based on the context rather than just keywords. The system uses cosine similarity algorithms to rank documents according to their relevance to user queries. The results are combined with a large language model to produce detailed summaries and recommendations.
  • Scalability Considerations: To ensure high performance and accuracy, the CU-GPT team focuses on robust computing strategies. These include pre-building vector databases and exploring efficient computational methods, such as running large language models (LLMs) in-house to manage and reduce long-term costs. The AWS Lambda architecture supports scalable computing, handling high query volumes efficiently.

Future Developments

The CU-GPT team is continuously working on enhancing the platform’s capabilities, including potential support for image processing and expanded customization options. They are also pursuing HIPAA compliance to securely handle health-related data. As the project progresses, it aims to set new standards for AI integration in academic research and administrative processes, demonstrating Columbia University's commitment to leveraging advanced technology for the benefit of its community.

You can access a recording of the AICoP in the News section here.

May 2024

Columbia Libraries/Information Services, the heart of the university's intellectual life, embarked on a journey to improve discovery processes using AI and large language models (LLMs), particularly focusing on the Columbia CLIO library search system. The system uses retrieval augmented generation (RAG) to enhance search capabilities, connecting an AI system to a database of documents and using vector databases to capture semantic meanings.

Key Advantages of AI-enhanced University Library Search System

  • Specificity: ability to handle complex research topics and provide highly relevant documents by understanding the semantic meaning of the queries.
  • Natural Language: allows for natural language queries, making it easier for users, especially students, to find relevant documents without needing specialized search syntax.
  • Translation: ability to find and translate documents from different languages, expanding access to non-English documents.
  • Feedback Loop: ability to refine search results based on user feedback, continuously improving the relevance of the documents provided.

Technical Implementation 

  • Architecture: RAG begins with creating vector embeddings for all documents in a database so that they can be searched according to their semantic meaning. The user’s query is first used by the AI system/LLM to retrieve the most relevant documents from the databases by using their vector embeddings. The documents are included in the final prompt to the LLM, allowing the LLM to answer queries using its natural language capabilities and the source material from the database. The vector database can provide more contextually relevant search results compared to traditional keyword searches.
  • Scalability considerations: When it comes to scalability, considerations include scaling the AI system to handle large query volumes while managing costs. One strategy is to pre-build vector databases and explore more efficient computational methods, like running LLMs in-house to spread out costs over time.

You can access a recording of the AICoP in the News section here.

April 2024

This session provided a comprehensive understanding of the transformative effects of Attention technology in large language models (LLMs), exploring both its groundbreaking applications and the challenges it presents in terms of computational demands and potential ethical concerns. Attention is central to the functionality of LLMs, such as those used in GPT (Generative Pre-trained Transformer) architectures.

Overview of Attention Mechanism

  • LLMs are based on a transformer architecture, heavily relying on a mechanism known as Attention to process language data.
  • Attention helps the model determine which parts of the input data are relevant, which improves its ability to generate coherent and contextually appropriate responses. It is pivotal for the performance improvements seen in models like GPT.

Practical Applications and Implications

  • Attention in models applies in practical applications from simple text generation to complex tasks like multimodal inputs (integrating text, image, and video).
  • Computational demand of these models and their reliance on large, diverse datasets may not always be of high quality or free from bias.
  • Conversation also touched on potential future advancements in AI and how attention-based models could revolutionize various fields by processing complex, multimodal data.

You can access a recording of the AICoP in the News section here.

March 2024

The highly informative session demonstrated different ways to create custom bots using large language models (LLMs) and how to practically apply these concepts. These methods include no-code options, open-source tools, and more technical solutions such as retrieval-augmented generation (RAG) fine-tuning using custom datasets.

Key highlights included:

  • Creating custom bots using ChatGPT Enterprise offers an accessible, no-code solution to tailor LLMs for specific needs.
  • Running open-source LLMs on proprietary hardware or utilizing cloud computing platforms provides various options based on technical expertise and resource availability.
  • RAG fine-tuning technique offers paths to significantly enhance the performance and accuracy of LLMs using custom datasets.
  • Technical discussions centered around the importance of clean, well-prepared datasets for training and the potential for custom LLMs to be integrated into various applications and services.

The meeting discussed the practical aspects of using LLMs at different technical levels. The focus was on customization, data privacy, and finding a balance between model accuracy and resource investment. It highlighted the fast-paced development of LLM technology and tools and suggested that more user-friendly and effective solutions for fine-tuning and customization are likely to emerge soon.

You can access a recording of the AICoP in the News section here.

February 2024

AICoP convened a session dedicated to the overview of ChatGPT Enterprise for Columbia University. The agenda encompasses an in-depth presentation of ChatGPT Enterprise, highlighting its features, enhanced security and privacy, and benefits of ChatGPT's custom GPTs to the Columbia University community. 

Columbia University ChatGPT Enterprise

  • CUIT has finalized an enterprise-wide license agreement for ChatGPT, marking a significant step forward in incorporating AI tools into the university's toolkit.
  • The process involved extensive reviews, including security, architecture, and legal considerations, to ensure data protection and compliance.

Security and Data Protection

  • A 'walled garden' approach ensures high-level data protection, privacy, and encryption, with compliance with GDPR and HIPAA.
  • User data remains private and is not shared externally or used for model fine-tuning without permission.

Features and Benefits

  • ChatGPT offers advanced AI capabilities, including the most mature AI model with image creation, data analytics, and code interpretation features.
  • Enterprise customers can create internal-only GPTs for specific business needs, departments, or proprietary datasets, without coding.
  • The enterprise license ensures dedicated, reliable, and scalable access, distinguishing it from free or commercial versions.

Integration and Use Cases

  • ChatGPT demonstrates the potential for integration into Columbia's systems, such as CourseWorks, to enhance educational tools and create personalized learning experiences.
  • CUIT demoed various use cases and initiated discussions on API usage for broader application and customization possibilities, including building and sharing bots for specific departmental or research needs.

You can access a recording of the AICoP in the News section here.

January 2024
 

Columbia Technology Ventures (CTV), the technology transfer arm of Columbia University, provided insights into the potential use of ChatGPT to enhance various office functions and automate tasks.

  • Experiments and Projects: CTV showcased various AI-integrated projects, including the automation of mass email campaigns using a Python package, the on-the-fly drafting of legal language for licensing agreements, improving negotiation efficiency, and analyzing data tables to streamline internal business functions using ChatGPT (DAT GPT). One of the most significant findings was ChatGPT's proficiency in high-skill tasks such as drafting legal language and performing data analysis.
  • Automation Potential: The exploration revealed a considerable potential for automating repetitive tasks, which could benefit those unfamiliar with coding. However, exporting non-textual files remains a challenge, albeit one expected to improve as technology advances.
  • Learning and Future Applications: CTV is optimistic about further integrating ChatGPT into their workflows, emphasizing the need for continued experimentation and adaptation to leverage AI tools effectively. The session emphasized the importance of setting up system prompts and scope prompts to guide ChatGPT's interactions, enhancing its efficiency and relevance to specific tasks.
  • Community Feedback and Interest: The exploration generated significant interest among the participants, with discussions on how to set up effective prompts for ChatGPT and the potential for its application in various projects.

You can access a recording of the AICoP in the News section here.