Columbia University Information Technology Develops CU-GPT: An Alternative to ChatGPT

June 28, 2024

Columbia University's AI Community of Practice recently hosted an engaging session led by Parixit Dave (Sr. Director Emerging Technologies) and Maneesha Aggarwal (AVP, Academic Services) from Columbia University Information Technology (CUIT), with a technical presentation by Osman Kabir (Senior Application Systems Administrator). The session introduced an innovative project, CU-GPT, designed to serve as an alternative to ChatGPT. This initiative aims to provide a cost-effective and secure AI-driven interface for the university community, addressing the specific needs of academic and administrative users.

The Vision Behind CU-GPT

CU-GPT was developed to mitigate the high costs associated with ChatGPT's enterprise license while maintaining robust data security and privacy standards essential for a research institution. With an enterprise license costing $720 per person per year, the need for a more affordable solution became apparent. CU-GPT offers a similar interface and functionality to ChatGPT, allowing users to interact with the latest AI models, including GPT-4, for text generation, coding, statistical analysis, and more. However, image processing remains outside its current capabilities.

Key Features and Advantages

  1. Cost Efficiency: CU-GPT uses a pay-as-you-go model, making it more affordable for users. Departments can set daily usage limits, and users can see their spending in real time, which supports cost management and transparency.
  2. Data Security: Data protection was paramount. CU-GPT runs on Columbia's infrastructure and leverages an enterprise contract with OpenAI that includes comprehensive data security measures. This setup protects users' intellectual property and research data.
  3. Customizability and Flexibility: The system is designed to recognize user roles and responsibilities, allowing tailored access to specific GPT instances relevant to different departments and research needs. This customization ensures that users have access to pertinent AI tools and resources.
  4. Scalability and Performance: Built on AWS Lambda, CU-GPT offers scalable computing power and handles multiple user requests efficiently. This serverless architecture keeps costs down by charging only for the compute time used, with no continuously running infrastructure to maintain.
  5. Enhanced User Experience: The platform maintains user chat history, allows for file uploads, and features an intuitive interface similar to ChatGPT, making it accessible and user-friendly.
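
To illustrate how a daily usage limit with real-time spending might work under a pay-as-you-go model, here is a minimal sketch. It is not CU-GPT's implementation: the `UsageTracker` class, its field names, and the per-token price are hypothetical, chosen only to show the idea of checking each request against a departmental daily cap.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class UsageTracker:
    """Hypothetical per-user tracker; names and prices are illustrative only."""
    daily_limit_usd: float            # cap set by the department (assumed)
    price_per_1k_tokens_usd: float    # assumed flat price, not a real rate
    spent_by_day: dict = field(default_factory=dict)

    def spent_today(self, today: date) -> float:
        # Spending shown to the user in real time.
        return self.spent_by_day.get(today, 0.0)

    def record(self, tokens: int, today: date) -> bool:
        """Charge a request if it fits under today's limit; return whether it was allowed."""
        cost = tokens / 1000 * self.price_per_1k_tokens_usd
        if self.spent_today(today) + cost > self.daily_limit_usd:
            return False  # request would exceed the daily cap
        self.spent_by_day[today] = self.spent_today(today) + cost
        return True
```

Because spending is keyed by date, the limit resets naturally each day without a separate cleanup step.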

Technical Architecture

The technical framework of CU-GPT includes:

  1. Document Vectorization: Converts plain text into numerical embeddings, capturing the semantic meaning of documents for more accurate retrieval.
  2. Vector-Based Search: Uses cosine similarity algorithms to rank documents based on their relevance to user queries.
  3. AI-Assisted Retrieval: Combines search results with a large language model to produce detailed summaries and recommendations. 
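
The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CU-GPT code: the `embed` function below is a toy word-count stand-in for a real embedding model, and `build_prompt` is a hypothetical helper that only shows how ranked results might be handed to a large language model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts (assumption)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query: str, documents: list[str]) -> list[tuple[float, str]]:
    """Score every document against the query and sort by descending similarity."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def build_prompt(query: str, ranked: list[tuple[float, str]], top_k: int = 2) -> str:
    """Hypothetical helper: place the top-ranked documents in the model's context."""
    context = "\n".join(doc for _, doc in ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

For example, `rank_documents("reset password", docs)` places a document about resetting a password ahead of unrelated ones, and `build_prompt` would then pass the top results to the language model for summarization. A production system would replace `embed` with a learned embedding model and typically store the vectors in a dedicated index.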

The development process emphasizes security, with data encrypted both at rest and in transit. CU-GPT also integrates identity and access management controls to ensure that only authorized users access specific data.

Future Developments

Looking ahead, the CU-GPT team plans to enhance the platform's capabilities, including potential support for image processing and expanded customization options. They are also working towards HIPAA compliance to enable the handling of health-related data securely.

As the project progresses, it promises to set new standards for AI integration in academic research and administrative processes, demonstrating Columbia University's commitment to leveraging advanced technology to benefit its community.

The session concluded with a call for participation in a larger pilot planned for mid-to-late July, targeting up to 1,000 users. This pilot will provide valuable insight into user needs and help refine the system before its full launch in the fall semester.

Get Involved

Faculty, researchers, and administrative members are encouraged to join the pilot program and provide feedback. This collaboration will be crucial in shaping the future of CU-GPT, ensuring it meets the diverse needs of Columbia University's community.

Emerging Technologies' mission is to empower faculty, researchers, and administrative members to utilize next-generation technologies to explore, adopt and integrate new applications and strategies. Please visit our AI: Community of Practice for discussion highlights on AI topics and an interest intake form to join the monthly gatherings.