Popular AIs head-to-head: OpenAI beats DeepSeek on sentence-level reasoning
Large language model AIs can ingest long documents and answer questions about them, but a key question is how well they ‘understand’ individual sentences in the documents.
Manas Gaur, Assistant Professor of Computer Science and Electrical Engineering, University of Maryland, Baltimore County
conversation
April 17, 2025 • ~8 min
Getting AIs working toward human goals − study shows how to measure misalignment
Aligning AIs with people’s goals and values is tricky. A new technique quantifies how far off human and machine are from each other.
Aidan Kierans, Ph.D. Student in Computer Science and Engineering, University of Connecticut
conversation
April 14, 2025 • ~5 min
What are AI hallucinations? Why AIs sometimes make things up
When AI systems try to bridge gaps in their training data, the results can be wildly off the mark: fabrications and non sequiturs researchers call hallucinations.
Katelyn Mei, Ph.D. Student in Information Science, University of Washington
conversation
March 21, 2025 • ~7 min
Software is increasingly being built by AI – so it’s vital to know if it can be trusted
Handing over the tasks once done by human developers comes with some major risks.
Jordi Cabot, Head of the Software Engineering RDI Unit at LIST. FNR Pearl Chair. Affiliate Professor in CS at University of Luxembourg, Luxembourg Institute of Science and Technology (LIST)
conversation
March 17, 2025 • ~6 min