LLM360 | Open-Source LLMs towards Community-Driven AGI 🚀

TxT360

A Top-Quality LLM Pre-training Dataset Requires the Perfect Blend. The first dataset to globally deduplicate 99 CommonCrawl snapshots and 14 high-quality data sources from diverse domains (e.g., FreeLaw, PG-19, etc.). The large-scale deduplication process and the rich metadata stored with the data enable precise control over data distribution.

K2-65B

A 65B parameter language model trained on 1.4T tokens. It outperforms Llama 2 70B, but uses approximately 35% less compute to train.
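As a rough sanity check on the compute claim, the sketch below applies the common rule of thumb that training compute is about 6 × parameters × tokens. The K2-65B figures come from the description above; the 2T-token figure for Llama 2 70B is an assumption taken from the Llama 2 paper, not from this page, and the rule of thumb ignores hardware and implementation details.

```python
# Rule-of-thumb estimate: training FLOPs ~= 6 * parameters * tokens.
# This is an approximation, not a measurement of either training run.
def training_flops(params: float, tokens: float) -> float:
    return 6.0 * params * tokens

# K2-65B: 65B parameters, 1.4T tokens (figures from the text above).
k2_flops = training_flops(65e9, 1.4e12)

# Llama 2 70B: ASSUMPTION of 2T training tokens, as reported in the Llama 2 paper.
llama2_flops = training_flops(70e9, 2.0e12)

# Fraction of compute saved relative to Llama 2 70B.
savings = 1.0 - k2_flops / llama2_flops

print(f"K2-65B:      {k2_flops:.2e} FLOPs")
print(f"Llama 2 70B: {llama2_flops:.2e} FLOPs")
print(f"Estimated compute savings: {savings:.0%}")
```

Under these assumptions the estimate lands at about 35%, consistent with the figure quoted above.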

Crystal-7B

A 7B parameter language model trained on the SlimPajama and StarCoder datasets. It eclipses the Llama 2 frontier and skillfully balances language and coding. Its instruction-following variant, CrystalChat, stands out as a top-scoring 7B chat model, trained on a carefully selected mix of publicly available language and code datasets.

Amber-7B

A 7B parameter English language model based on the LLaMA architecture, with two fine-tuned instruction-following variants: AmberChat and AmberSafe.

Projects

Analysis360: Open Implementations of LLM Analyses

Analysis360 provides open reference implementations for a variety of downstream analyses that can be done with and for LLM360 models, covering a range of topics including: mechanistic interpretability, visualization, machine unlearning, data memorization, AI safety, assessing toxicity & bias, and a large set of evaluation metrics.

Papers

LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch

In this paper, we present LLM360 K2-65B, the most powerful fully transparent open-source large language model (LLM) released to date. K2 is a 65 billion parameter LLM, which follows best practices for reproducibility from the LLM360 project. Despite numerous efforts to develop and release open-source LLMs, full transparency around the training process still remains limited...

LLM360: Towards Fully Transparent Open-Source LLMs

The recent surge in open-source Large Language Models (LLMs), such as LLaMA, Falcon, and Mistral, provides diverse options for AI practitioners and researchers. However, most LLMs have only released partial artifacts, such as the final model weights or inference code, and technical reports increasingly limit their scope to high-level design choices and surface statistics...

News

SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model

announcement
agent
Maitrix

We introduce SimuRA, a general architecture for optimal goal-oriented agents based on simulation with an LLM-based world model, which reasons and plans across environments in the latent space of natural language. Web browsing experiments show improvements over baselines of up to 124%.

2nd Place, Berkeley LLM Agents Hackathon
(Fundamentals Track, 2 of ~3,000 Participants)

TxT360: A Top-Quality LLM Pre-training Dataset Requires the Perfect Blend

announcement
dataset

We introduce TxT360 (Trillion eXtracted Text), the first dataset to globally deduplicate 99 CommonCrawl snapshots and 14 high-quality data sources from diverse domains (e.g., FreeLaw, PG-19, etc.). The large-scale deduplication process and the rich metadata stored with the data enable precise control over data distribution.

Decentralized Arena via Collective LLM Intelligence

Maitrix
announcement
benchmark

LLM360 and Maitrix.org proudly release Decentralized Arena, which automates and scales “Chatbot Arena” for LLM evaluation across various fine-grained dimensions (e.g., math – algebra, geometry, probability; logical reasoning, social reasoning, biology, chemistry, …). The evaluation is decentralized and democratic: every LLM participates in evaluating the others.

Introducing K2-65B: Charting the Blueprint Towards Open-Source Artificial General Intelligence

announcement
model

LLM360 is excited to announce several new releases that further our mission of enabling community-owned AGI by creating standards and tools that advance the bleeding edge of LLM capability and empower knowledge transfer, research, and development.

Introducing LLM360: Fully Transparent Open-Source LLMs

announcement
model

In recent months, the open-source large language model (LLM) community has seen tremendous model contributions. However, model weight releases and overview technical reports do not contain enough information to cover the complexity of LLM training. This hinders openness and transparency, the mechanisms that have underpinned trustworthy and innovative research and science for decades.

LLM360 enables community-owned AI through open-source large model research and development.

Datasets