February '26 Meetup: Munich🥨NLP X TNG


Save the Date!

February 10th, 2026, 18:30 — 21:30 at TNG at Arabellapark (Arabellastraße 4a, 81925 München). Food and drinks provided.

RSVP

About the Event

Explore the latest in Natural Language Processing and Large Language Models at the Munich NLP in-person meetup! Join us on February 10, 2026, at the TNG office in Munich — doors open at 18:00, talks begin at 18:30. We’ll have expert talks on cutting-edge LLM applications, followed by a networking session. All are welcome!

Agenda

  • 18:00 - Open Door
  • 18:30 - Welcome + Intro to MunichNLP and TNG Technology Consulting GmbH
  • 18:40 - AI Research @ TNG: How to process 20 billion tokens per day on OpenRouter - Henrik Klagges & Fabian Klemm & Daniel Klingmann
  • 19:20 - Break (5 min)
  • 19:25 - Benchmarking Memory in LLMs: Retrieval, Long Context, and Multi-Turn Interaction - Ali Modarressi
  • 20:05 - Pizza + Networking
  • 21:30 - Close

Talks

  • AI Research @ TNG: How to process 20 billion tokens per day on OpenRouter

    Daniel Klingmann & Fabian Klemm & Henrik Klagges

    TNG combined efficient use of limited GPU resources with innovative approaches to construct high-performance child LLMs based on DeepSeek parent models. The talk outlines some of the approaches that worked, some that did not, and how the new model variants differ. The practical relevance of the variants has been demonstrated empirically by the over 20 billion tokens processed by these models every day, sometimes reaching 105k requests per hour. The Chimera models, for example, made TNG one of OpenRouter’s Top 10 open-source model creators.

  • Benchmarking Memory in LLMs: Retrieval, Long Context, and Multi-Turn Interaction

    Ali Modarressi

    As LLM systems increasingly rely on retrieval, long contexts, and extended interaction, it becomes essential to benchmark how reliably they access and use information. This talk first presents controlled evidence that dense retrievers can exhibit systematic biases toward heuristic cues—favoring shorter documents, earlier mentions, repeated entities, or literal matches—sometimes ranking these above evidence that actually contains the answer. It then introduces implicit fact retrieval settings in which relevance depends on facts stated only implicitly in documents, requiring temporal, arithmetic, or world-knowledge inference despite superficially simple queries. Next, the talk examines long-context evaluation beyond literal matching, showing substantial performance degradation as context grows once lexical-overlap cues are removed. Finally, it covers dialogue-conditioned benchmarks for extended interactions that quantify drift and trade-offs among persona consistency, instruction following, and safety behavior over long conversations. The talk concludes by highlighting how these benchmarks can guide design decisions for memory-augmented LLM systems.

    Ali Modarressi is a third-year PhD student at the Center for Information and Language Processing (CIS) at LMU Munich, supervised by Prof. Hinrich Schütze. Their current research focuses on memory-augmented large language models and, more broadly, long-context language modeling. They have also worked on interactive language generation and information extraction. Ali began their NLP research during their MSc under the supervision of Mohammad Taher Pilehvar, where they studied explainability methods and the interpretability of pre-trained language models—topics that remain relevant to their current work, particularly in analyzing retrieval models and knowledge probing.

Thank you to TNG Technology Consulting GmbH for sponsoring and supporting the organization of this event.