Cohere Labs Community

community

https://cohere.com/research

Cohere_Labs

Cohere-Labs-Community

Activity Feed Request to join this org

AI & ML interests

None defined yet.

Recent Activity

ljvmiranda921 authored a paper 15 days ago

Multilinguality at the Edge: Developing Language Models for the Global South

muhammadravi251001 authored a paper 29 days ago

Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures

muhammadravi251001 authored a paper 29 days ago

CommonLID: Re-evaluating State-of-the-Art Language Identification Performance on Web Data

View all activity

Cartinoe5930

authored 2 papers 6 days ago

What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

Paper • 2601.06165 • Published Jan 7 • 16

KMMMU: Evaluation of Massive Multi-discipline Multimodal Understanding in Korean Language and Context

Paper • 2604.13058 • Published Mar 18 • 2

Cartinoe5930

authored a paper 7 days ago

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Paper • 2605.09063 • Published 10 days ago • 77

Sri-Vigneshwar-DJ

posted an update 17 days ago

Post

124

![Feather DB LongMemEval Results]( Hawky-ai/longmemeval-results)

We ran Feather DB v0.8.0 on LongMemEval (ICLR 2025) — 500 questions across real multi-session conversations, up to 115K tokens each.

**Score: 0.693** · GPT-4o full-context baseline: 0.640
Full 500-question run with Gemini-Flash: **$2.40**

Per-axis breakdown:
→ Info-extraction: **0.942**
→ Knowledge-update: **0.714**
→ Multi-session: **0.606**
→ Temporal: **0.477** ← the hard one, Phase 9 addresses this

Architecture: Hybrid BM25+dense · adaptive temporal decay · embedded (no server) · p50 = 0.19ms · MIT

pip install feather-db

Raw results + audit JSONs: Hawky-ai/longmemeval-results

alexrs

authored 5 papers 21 days ago

One Tokenizer To Rule Them All: Emergent Language Plasticity via Multilingual Tokenizers

Paper • 2506.10766 • Published Jun 12, 2025 • 1

Tiny Aya: Bridging Scale and Multilingual Depth

Paper • 2603.11510 • Published Mar 12 • 8

kenza-ily

authored a paper about 2 months ago

DISCO: Document Intelligence Suite for COmparative Evaluation

Paper • 2603.23511 • Published Mar 4

peaceAsh

authored a paper 2 months ago

Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures

Paper • 2510.24081 • Published Oct 28, 2025 • 24

kenza-ily

authored 3 papers 2 months ago

Machine Translation Hallucination Detection for Low and High Resource Languages using Large Language Models

Paper • 2407.16470 • Published Jul 23, 2024

Retrieval or Representation? Reassessing Benchmark Gaps in Multilingual and Visually Rich RAG

Paper • 2603.04238 • Published Mar 4

How Can We Diagnose and Treat Bias in Large Language Models for Clinical Decision-Making?

Paper • 2410.16574 • Published Oct 21, 2024

Cartinoe5930

authored a paper 3 months ago

Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math

Paper • 2602.06291 • Published Feb 6 • 24

Sri-Vigneshwar-DJ

posted an update 3 months ago

Post

1460

Just released a new dataset designed for training reasoning models on Meta (Facebook/Instagram) advertising fatigue detection!

What is it? A GRPO (Group Relative Policy Optimization) training dataset with 200+ carefully crafted scenarios covering:

🔍 Fatigue Signal Detection: CTR drops, CPM spikes, frequency analysis
🩺 Performance Diagnosis: Root cause analysis frameworks
📋 Strategy: Creative refresh cadence, testing frameworks
📊 Analysis: ROI calculations, metric interpretation
Why GRPO? GRPO training helps models learn structured reasoning. Each response follows the <thinking> and <answer> format.

Check it out here: Sri-Vigneshwar-DJ/meta-fatigue-grpo-dataset

Sri-Vigneshwar-DJ

posted an update 4 months ago

Post

241

🏙️ Hugging Face Community Post
Title: 🧬 Experimenting with "Dynamic Chaos" in Tamil SLMs

Hi everyone! I just published a new experimental study on Small Language Model (SLM) resilience.

I took the Qwen2.5-0.5B model and put it through a "Chaos Phase" to see how much weight data a tiny model can lose before its understanding of classical Tamil grammar breaks.

Key highlights of the study:

Target Data: Fine-tuned on the Thirukkural (1,330 couplets + modern explanations).
The Chaos Step: Applied 20% random weight pruning but implemented "Layer Protection" for the Token Embeddings and LM Head to keep the characters readable.
Compression: 4-bit (Q4_K_M) quantization for extreme efficiency.
Result: A surrealist classical Tamil model that is ultra-light (~300MB) and ultra-fast!

Check out the model and the experiment logic here: Sri-Vigneshwar-DJ/qwen-tamil-chaos-v1

bryanlimy

authored a paper 4 months ago

V1T: large-scale mouse V1 response prediction using a Vision Transformer

Paper • 2302.03023 • Published Feb 6, 2023

BounharAbdelaziz

authored a paper 4 months ago

YaPO: Learnable Sparse Activation Steering Vectors for Domain Adaptation

Paper • 2601.08441 • Published Jan 13 • 8

BounharAbdelaziz

submitted a paper to Daily Papers 4 months ago

YaPO: Learnable Sparse Activation Steering Vectors for Domain Adaptation

Paper • 2601.08441 • Published Jan 13 • 8

AI & ML interests

Recent Activity

Team members 171

CohereLabsCommunity's activity