Applications and Uses
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation
Paper
•
2506.09790
•
Published
•
53
Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance
Paper
•
2506.06444
•
Published
•
73
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Paper
•
2506.11763
•
Published
•
74
Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research
Paper
•
2502.04644
•
Published
•
4
Deep Research Agents: A Systematic Examination And Roadmap
Paper
•
2506.18096
•
Published
•
3
Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers
Paper
•
2507.02694
•
Published
•
19
Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky
Paper
•
2507.03336
•
Published
•
7
PresentAgent: Multimodal Agent for Presentation Video Generation
Paper
•
2507.04036
•
Published
•
11
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs
Paper
•
2507.09477
•
Published
•
88
AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research
Paper
•
2507.13300
•
Published
•
20
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off
Paper
•
2508.04825
•
Published
•
60
Complex Logical Instruction Generation
Paper
•
2508.09125
•
Published
•
40
Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
Paper
•
2508.18076
•
Published
•
6
UQ: Assessing Language Models on Unsolved Questions
Paper
•
2508.17580
•
Published
•
15
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
Paper
•
2508.21148
•
Published
•
140
AutoIntent: AutoML for Text Classification
Paper
•
2509.21138
•
Published
•
36
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?
Paper
•
2510.02209
•
Published
•
54
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation
Paper
•
2510.09116
•
Published
•
96
Back to Basics: Let Denoising Generative Models Denoise
Paper
•
2511.13720
•
Published
•
69
Rethinking Training Dynamics in Scale-wise Autoregressive Generation
Paper
•
2512.06421
•
Published
•
7
MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics
Paper
•
2601.02075
•
Published
•
8
Digital Twin AI: Opportunities and Challenges from Large Language Models to World Models
Paper
•
2601.01321
•
Published
•
18
Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility
Paper
•
2601.17027
•
Published
•
41
OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution
Paper
•
2601.20380
•
Published
•
8
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models
Paper
•
2601.22060
•
Published
•
145