TuringPost (Turing Post)

Kseniase

posted an update 3 days ago

Post

1584

10 Recent Advancements in Math Reasoning

Over the last few weeks, we have witnessed a surge in AI models' math reasoning capabilities. Top companies like Microsoft, NVIDIA, and Alibaba Qwen have already joined this race to make models "smarter" in mathematics. But why is this shift happening now?

Complex math calculations require advanced multi-step reasoning, making mathematics an ideal domain for demonstrating a model's strong "thinking" capabilities. Additionally, as AI continues to evolve and is applied in math-intensive fields such as machine learning and quantum computing (which is predicted to see significant growth in 2025), it must meet the demands of complex reasoning.
Moreover, AI models can be integrated with external tools like symbolic solvers or computational engines to tackle large-scale math problems, which also needs high-quality math reasoning.

So here’s a list of 10 recent advancements in math reasoning of AI models:

1. NVIDIA: AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling (2412.15084)

2. Qwen, Alibaba: Qwen2.5-Math-PRM The Lessons of Developing Process Reward Models in Mathematical Reasoning (2501.07301) and PROCESSBENCH evaluation ProcessBench: Identifying Process Errors in Mathematical Reasoning (2412.06559)

3. Microsoft Research: rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking (2501.04519)

4. BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning (2501.03226)

5. URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics (2501.04686)

6. U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs (2412.03205)

7. Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs (2501.06430)

8. End-to-End Bangla AI for Solving Math Olympiad Problem Benchmark: Leveraging Large Language Model Using Integrated Approach (2501.04425)

9. Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning (2501.03035)

10. System-2 Mathematical Reasoning via Enriched Instruction Tuning (2412.16964)

Kseniase

posted an update 5 days ago

Post

503

Today, we spoke with Snowflake’s AI Research Team Leads, Yuxiong He and Samyam Rajbhandari ( @samyam ) (he is also one the researchers behind DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference (2401.08671) and other DeepSpeed papers)

Collaborating with their co-authors to reduce inference costs for enterprise-specific tasks, they observed that inputs are often significantly larger than outputs. This is because it’s in the nature of enterprises to analyze enormous amounts of information trying to extract valuable insights, which are much shorter. To address this, they developed SwiftKV SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation (2410.03960), an optimization that reduces LLM inference costs by up to 75% for Meta Llama LLMs, enhancing efficiency and performance in enterprise AI tasks.

Today they are open-sourcing SwiftKV ( Snowflake/Llama-3.1-SwiftKV-8B-Instruct) and ArcticTrainging Platform.
In our new episode "15 minutes with a Researcher" they explain how SwiftKV works, its applicability to other architectures, its limitations, and additional methods to further reduce computation costs in inference.
Watch the full 15 min interview here (https://youtu.be/9x1k7eXe-6Q?si=4_HQOyi1CPHgvlrx)

Kseniase

posted an update 10 days ago

Post

1794

10 AI Systems for Scientific Research

Almost every AI researcher has studied or conducted a large number of AI research papers. So, it's quite logical that researchers are trying to create AI systems to help conduct research. Creating scientific research could be much easier and more varied if we use LLMs and AI assistants tailored for this purpose. Just imagine how interesting it would be to read high-quality research about AI made by an AI agent.

Today, we offer you to explore these 10 AI systems for scientific research:

1. Agent Laboratory framework helps researchers input their ideas by generating a research report and code repository: Agent Laboratory: Using LLM Agents as Research Assistants (2501.04227)

2. AI Scientist performs fully automated scientific discovery including creating ideas: The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (2408.06292)

3. SciMON generates new ideas derived from the scientific literature: Learning to Generate Novel Scientific Directions with Contextualized Literature-based Discovery (2305.14259)

4. ResearchAgent implements LLMs to automate idea generation, methods, and experiment design, and ReviewingAgents' feedback to refine ideas: ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models (2404.07738)

5. Scientific Generative Agent (SGA) discovers novel, coherent solutions in physics and molecular design: LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery (2405.09783)

6. MLRCopilot boosts machine learning research: MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents (2408.14033)

7. SciAgents accelerates material science discovery through combining knowledge graphs, LLMs, and multi-agent systems. SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning (2409.05556)

8. VirSci multi-agent system mimics teamwork among scientists. Two Heads Are Better Than One: A Multi-Agent System Has the Potential to Improve Scientific Idea Generation (2410.09403)

9. Chain-of-Ideas (CoI) agent organizes research into a chain structure. Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents (2410.13185)

10. A system with CycleResearcher and CycleReviewer generates research papers and peer reviews: CycleResearcher: Improving Automated Research via Automated Review (2411.00816)

LLM4SR: A Survey on Large Language Models for Scientific Research (2501.04306) is worth exploring to study and analyze more systems for scientific research

Kseniase

posted an update 24 days ago

Post

2567

10 Free Comprehensive Datasets for Supervised Fine-Tuning

High-quality datasets, their size and relevance directly impact the effectiveness of fine-tuning and the models' real-world applications. Among the numerous datasets for different tasks, it can be challenging to choose the most comprehensive dataset that best suits your purposes.

So today, we invite you to explore top 10 free datasets on natural language processing and maths:

1. fka/awesome-chatgpt-prompts proposes a huge variety of prompts that can be used with ChatGPT. Over 700 models were trained on this dataset.

2. HuggingFaceFW/fineweb from Hugging Face includes 15T tokens of cleaned and deduplicated English web data. It’s suitable for LLM training, benchmarking, model validation.

3. HuggingFaceFW/fineweb-2 is an another version of FineWeb with high-quality pretraining data to over 1000 languages.

4. O1-OPEN/OpenO1-SFT with Chinese and English data can be used for Chain-of-Thought activation.

5. yahma/alpaca-cleaned is a curated version of the original Alpaca Dataset released by Stanford.

6. lmsys/lmsys-chat-1m with 1 million real-world conversations with 25 state-of-the-art LLMs offers diverse use cases, like content moderation, safety benchmarks, and training instruction-following models.

7. allenai/dolma from Allen AI includes 3T tokens from a diverse mix of web content, academic publications, code, books, and encyclopedic materials.

Math datasets:

1. HuggingFaceTB/finemath consists of educational math content and has two versions: 34B tokens and 54B tokens.

2. amphora/QwQ-LongCoT-130K for training O1-like LLMs.

3. openai/gsm8k for training multi-step reasoning.

Kseniase

updated a Space 30 days ago

Configuration error

🚀

README

Kseniase

posted an update 30 days ago

Post

3060

**15 Agentic Systems and Frameworks of 2024**

This year, we started our “AI Agents and Agentic Workflows” series (https://www.turingpost.com/t/AI-Agents) to explore everything about AI agents step by step: all the vocabulary, how they work, and how to build them.
The huge interest in this series and the large number of studies conducted on agents showed that it was one of the most popular and important themes of the year. In 2025, most likely, agents will reach new highs – we will be covering that for you. Now, let’s review the agentic systems that have emerged this year.

Here is a list of 15 agentic systems and frameworks of 2024:

1. GUI Agents: A Survey (2412.13501)

2. Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level (2411.03562)

3. The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (2408.06292)

4. MALT: Improving Reasoning with Multi-Agent LLM Training (2412.01928)

5. Agent S: An Open Agentic Framework that Uses Computers Like a Human (2410.08164)

6. Automated Design of Agentic Systems (2408.08435)

7. AgentInstruct: Toward Generative Teaching with Agentic Flows (2407.03502)

8. AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant (2410.18603)

9. WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents (2410.07484)

10. Generative Agent Simulations of 1,000 People (2411.10109)

11. DynaSaur: Large Language Agents Beyond Predefined Actions (2411.01747)

12. PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking (2410.12375)

13. Generative World Explorer (2411.11844)

14. Bel Esprit: Multi-Agent Framework for Building AI Model Pipelines (2412.14684)

15. AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions (2410.20424)

Thanks for reading Turing Post!
Subscribe to receive new posts straight into your inbox -> https://www.turingpost.com/subscribe

Kseniase

posted an update about 1 month ago

Post

2842

TL;DR: The Story of Attention's Development by @karpathy

Origin: First proposed in 2014 by @Dzmitry Bahdanau, @KyunghyunCho , and Yoshua Bengio in Neural Machine Translation by Jointly Learning to Align and Translate (1409.0473) . Inspired by cognitive processes and later renamed from "RNNSearch."

Key Idea: A data-dependent weighted average for pooling and communication, enabling flexible and powerful neural network connections.

Breakthrough: Bahdanau's "soft search" mechanism (softmax + weighted averaging) solved encoder-decoder bottlenecks in machine translation.
Transformer Revolution: Attention Is All You Need (1706.03762) (2017) by @ashishvaswanigoogle et al. simplified architectures by stacking attention layers, introducing multi-headed attention and positional encodings.
Legacy: Attention replaced RNNs, driving modern AI systems like ChatGPT. It emerged independently but was influenced by contemporaneous work like Alex Graves’s Neural Turing Machines (1410.5401) and Jason Weston’s Memory Networks (1410.3916) .

Attention to history: Jürgen Schmidhuber claims his 1992 Fast Weight Programmers anticipated modern attention mechanisms. While conceptually similar, the term “attention” was absent, and there’s no evidence it influenced Bahdanau, Cho, and Bengio’s 2014 work. Paying attention (!) to history might have brought us to genAI earlier – but credit for the breakthrough still goes to Montreal.

Referenced Papers:
Attention Origin: Neural Machine Translation by Jointly Learning to Align and Translate (1409.0473)
Transformers: Attention Is All You Need (1706.03762)
Alex Graves' Work: Neural Turing Machines (1410.5401), Generating Sequences With Recurrent Neural Networks (1308.0850)
Jason Weston @spermwhale 's Memory Networks (1410.3916)
Sequence to Sequence Learning with Neural Networks (1409.3215) by Ilya Sutskever ( @ilyasut ), Oriol Vinyals, Quoc V. Le

Who else deserves recognition in this groundbreaking narrative of innovation? Let’s ensure every contributor gets the credit they deserve. Leave a comment below 👇🏻🤗

5 replies

·

Turing Post

AI & ML interests

Recent Activity

TuringPost's activity

README

AI & ML interests

Recent Activity

Team members 3

TuringPost's activity

README