MarkTechPost

ReasonFlux: Elevating LLM Reasoning with Hierarchical Template Scaling

Feb 15, 2025

Large language models (LLMs) have demonstrated exceptional problem-solving abili...

Google DeepMind Researchers Propose Matryoshka Quantization: A Technique to Enhance Deep Learning Efficiency by Optimizing Multi-Precision Models without Sacrificing Accuracy

Feb 15, 2025

Quantization is a crucial technique in deep learning for reducing computational ...

TransMLA: Transforming GQA-based Models Into MLA-based Models

Feb 15, 2025

Large Language Models (LLMs) have gained significant importance as productivity ...

Microsoft Research Introduces Data Formulator: An AI Application that Leverages LLMs to Transform Data and Create Rich Visualizations

Feb 15, 2025

Most modern visualization authoring tools like Charticulator, Data Illustrator, ...

This AI Paper from UC Berkeley Introduces a Data-Efficient Approach to Long Chain-of-Thought Reasoning for Large Language Models

Feb 15, 2025

Large language models (LLMs) process extensive datasets to generate coherent ou...

Salesforce AI Research Introduces Reward-Guided Speculative Decoding (RSD): A Novel Framework that Improves the Efficiency of Inference in Large Language Models (LLMs) with Up to 4.4× Fewer FLOPs

Feb 14, 2025

In recent years, the rapid scaling of large language models (LLMs) has led to ex...

Layer Parallelism: Enhancing LLM Inference Efficiency Through Parallel Execution of Transformer Layers

Feb 14, 2025

LLMs have demonstrated exceptional capabilities, but their substantial computati...

ByteDance Introduces UltraMem: A Novel AI Architecture for High-Performance, Resource-Efficient Language Models

Feb 14, 2025

Large Language Models (LLMs) have revolutionized natural language processing (NL...

Step-by-Step Guide on How to Build an AI News Summarizer Using Streamlit, Groq and Tavily

Feb 14, 2025

In this tutorial, we will build an advanced AI-powered news agent t...

Open O1: Revolutionizing Open-Source AI with Cutting-Edge Reasoning and Performance

Feb 14, 2025

The Open O1 project is a groundbreaking initiative aimed at matching the powerfu...

Can Users Fix AI Bias? Exploring User-Driven Value Alignment in AI Companions

Feb 14, 2025

Large language model (LLM)–based AI companions have evolved from simple chatbots...

Google DeepMind Research Introduces WebLI-100B: Scaling Vision-Language Pretraining to 100 Billion Examples for Cultural Diversity and Multilinguality

Feb 14, 2025

Machines learn to connect images and text by training on large datasets, where m...

Meta AI Introduces CoCoMix: A Pretraining Framework Integrating Token Prediction with Continuous Concepts

Feb 13, 2025

The dominant approach to pretraining large language models (LLMs) relies on next...

Anthropic AI Launches the Anthropic Economic Index: A Data-Driven Look at AI’s Economic Role

Feb 13, 2025

Artificial Intelligence is increasingly integrated into various sectors, yet the...

Can a 1B LLM Surpass a 405B LLM? Optimizing Computation for Small LLMs to Outperform Larger Models

Feb 13, 2025

Test-Time Scaling (TTS) is a crucial technique for enhancing the performance of ...

Meet Huginn-3.5B: A New AI Reasoning Model with Scalable Latent Computation

Feb 13, 2025

Artificial intelligence models face a fundamental challenge in efficiently scali...