Hey there! This blog post is an intro to the project, not a claim that we've reproduced R1 yet. We're building in the open, so as soon as we have evaluation numbers, we'll share them. You can follow our progress on Hugging Face and GitHub.
True, but it looks like there's nothing to be evaluated as of today. I assume the ultimate goal is to train a new reasoning model and then use the same evaluation metrics as o1 and DeepSeek-R1.
Well, there needs to be at least some sanity check and validation to make sure the model was trained properly.
Oh yes, if you are talking about the evaluation numbers of DeepSeek's model, they're coming very soon!
As mentioned in the post, there is no model called Open-R1 to evaluate at all... not yet anyway. This is a blog post explaining that Hugging Face will take the DeepSeek-R1 model, work out how it was built as outlined in the paper and from what DeepSeek released, and then replicate that process.
In truth, this is pretty much how science works... A comes up with a plan, discovery or invention, and it is checked by B, C and D to see if it is reproducible. That's been the foundation of research for a couple of centuries.
This blog is not saying they have already done so... It's a blog post describing an intent to start training a model like R1 and calling it Open-R1.
Also, DeepSeek-R1 was only released recently, and even in their paper they outlined the compute hours required. While those are low compute hours for a SOTA model, that does not mean you can train said model in a week. I'd personally love to be able to train a transformer model in a week, but we may need to wait a while for that level of compute innovation.
So there are no benchmarks for a model that has not been built yet, right? As laid out in the blog, and again in reply to your question. But fear not: there is already a GitHub repo and contributors (hell, I might join myself), some prelim work done, and a master plan. A good starting position.
@edbeeching has already evaluated the released models (src: https://x.com/edwardbeeching/status/1884273209136275742)
R1 just trained on o1 outputs, so jointly... /s. This is what the new AI czars are saying.
Hi! This blog post is an introduction to the project, not a claim that we've reproduced R1 yet. We will definitely share the missing pieces when we have them; you can expect the models and datasets to be uploaded in this Hugging Face org and the code to be in this GitHub repo.
That's great, and it's crucial to understand this incredible hype that lacks technical understanding and explanation. Science is about reproduction, and if they claim to be open, let them fulfill the open part.
Please do release the training cost.
We will!
Hi @bojan2501, thanks! We will indeed be working hard to make sure this training recipe can work for small language models on consumer hardware, since not everybody has a cluster of H100s at home :-) The tool we used for the images was Excalidraw! https://excalidraw.com
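For anyone curious what sits at the heart of that training recipe: the R1 paper's RL stage uses GRPO, which samples a group of completions per prompt, scores each one, and normalizes every reward against its own group. Here is a minimal sketch of that advantage computation (illustrative Python, not Open-R1's actual code; the function name is made up):

```python
import statistics

def grpo_advantages(group_rewards):
    """Group-relative advantages: each completion's reward is normalized
    against the mean and standard deviation of its own sampled group."""
    mu = statistics.mean(group_rewards)
    sigma = statistics.pstdev(group_rewards) or 1.0  # guard against zero-variance groups
    return [(r - mu) / sigma for r in group_rewards]

# Four sampled answers to one math prompt; reward 1.0 = verified correct.
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))  # -> [1.0, -1.0, -1.0, 1.0]
```

The appeal for consumer hardware is that this needs no separate value model: the group itself serves as the baseline.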
Hello Team, I'm Ray Bernard, the author and developer of EQUATOR. My research team will be working on a paper focused on reproducing particular parts of DeepSeek R1. Our goal is to reproduce the cold start and supply your team with a dataset that includes CoT and other techniques to support these efforts. We'd like to contribute our work to help. Please let me know if you find this useful. Best, Ray Bernard
https://www.facebook.com/groups/1186310571520299/
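For readers wondering what such a cold-start chain-of-thought dataset might look like on disk, here is one hypothetical record layout (the field names and helper are my illustration, not a format Ray's team or Open-R1 has specified; the `<think>` wrapping mirrors the output format the R1 paper describes):

```python
import json

# Hypothetical cold-start record: field names are illustrative only.
record = {
    "problem": "What is 17 * 24?",
    "reasoning": "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.",
    "answer": "408",
}

def to_sft_text(rec):
    """Render one record as a single SFT target, wrapping the chain of
    thought in the <think> tags DeepSeek-R1 emits at inference time."""
    return f"<think>\n{rec['reasoning']}\n</think>\n{rec['answer']}"

print(json.dumps(record))
print(to_sft_text(record))
```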
Where are the evaluation numbers? Without them you can't call it a reproduction.
8 replies
That's quite interesting. I was asking myself why the concerns the author raised here are not being asked by others. I think the work they have done is remarkable, but at the same time I wonder why they would not publish these missing pieces if they are supposed to be fully open. And why, even without reproduction and comprehension of the technology, could they affect the market so much in this way?
4 replies
Interesting read, and it is great to see more effort in this direction: more optimization and less brute force.
Also, a question: what tool did the author use for creating the step diagram?
2 replies
Excalidraw
I'm so glad that efforts like this already exist; I'm gonna try to contribute :-)
1 reply
Looking forward to it!
So racist article
2 replies
WTF are you talking about?
Should be a joke.
Awesome to have this open reproduction started!
For Step #1 check out https://github.com/open-thoughts/open-thoughts!
https://x.com/ryanmart3n/status/1884284101265612856
Let's do this thing!
1 reply
It's truly cool to see how the whole open source community comes together!
Does anybody know the real training cost of R1? I can't find it in the paper or the announcement post. Is the 6M cost reported by the media just the number taken from V3's training cost?
2 replies
Oops...
5.5M is the number reported in the DeepSeek-V3 tech report (just the training run, not the experiments afaik); for R1 it's hard to estimate tbh, but much less than 5.5M imo.
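For what it's worth, the arithmetic behind that figure is simple enough to check (numbers as stated in the V3 tech report; the $2/hour H800 rental price is the report's own assumption, not an independent cost estimate):

```python
# Back-of-envelope check of the DeepSeek-V3 training cost figure.
gpu_hours = 2.788e6      # H800 GPU-hours the V3 tech report states for the full run
usd_per_gpu_hour = 2.0   # rental price assumed in the report
cost_millions = gpu_hours * usd_per_gpu_hour / 1e6
print(f"~${cost_millions:.2f}M")  # ~$5.58M, the widely quoted ~$5.5-6M number
```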
Has anyone asked the DeepSeek team to release their training data and code, or at least share them privately with an independent replication project like this one? Have they turned down such a request?
A faithful replication depends on using the same dataset and hyperparameters. Otherwise, any major discrepancies with the published benchmarks would be hard to pin down: are they due to training data differences or to the replication method itself?
1 reply
Historically, they have never released code or datasets of their LLM training, so I would not expect this time to be different. If they did release it, that would be amazing of course!
In the meantime, we have to make best-guess estimates and see if we can get there ourselves.
You've laid out an excellent replication process for DeepSeek's reasoning training. I will try something similar to it.
This is really great info. Can we fine-tune it for a specific use case once the code is released?
1 reply
Yes, of course!
Please consider removing biased, contaminated or unaligned training data, and make an effort to remove copyrighted works from the crawl. This will make the model more usable. If you reuse Anthropic's curation checks, this might also help; removing obviously biased data will likely add a lot of value. We don't want another tainted, unaligned open source model, right? And no business would ever use DeepSeek or a model that reuses it, right?
We appreciate your work for the benefit of humanity, we hope.
Miike C from NJ
1 reply
So basically you're asking to replace existing censorship with another flavour of censorship?
Can't wait! Hopefully the model will be uncensored, but whatever you can do is alright! Love seeing open source building itself up. I'm not smart enough to actually help, but I can contribute moral support lol
Hello guys, I am just trying to find the code for DeepSeek-V2, in order to fully understand multi-head latent attention. You do not seem to have code on Hugging Face even for that. Or am I missing something? I don't see anything in src/transformers/models. MLA is not properly described in their paper, so it would be essential to have code for this.
The code for the models is inside the model repositories, e.g. for V3: https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/modeling_deepseek.py
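Until that code is easy to find, here is a heavily simplified sketch of the core MLA idea, based on my reading of the V2 paper rather than DeepSeek's actual modeling code: keys and values are compressed into one small latent vector per token (which is all the KV cache needs to store) and up-projected per head at attention time. The real implementation adds a decoupled RoPE key path and a compressed query projection, both omitted here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplifiedMLA(nn.Module):
    """Core of multi-head latent attention: cache one small latent per
    token instead of full per-head keys and values."""

    def __init__(self, d_model=1024, n_heads=8, d_head=64, d_latent=128):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_head
        self.W_dkv = nn.Linear(d_model, d_latent, bias=False)          # down-projection; only c_kv is cached
        self.W_uk = nn.Linear(d_latent, n_heads * d_head, bias=False)  # latent -> per-head keys
        self.W_uv = nn.Linear(d_latent, n_heads * d_head, bias=False)  # latent -> per-head values
        self.W_q = nn.Linear(d_model, n_heads * d_head, bias=False)
        self.W_o = nn.Linear(n_heads * d_head, d_model, bias=False)

    def forward(self, x):
        B, T, _ = x.shape
        c_kv = self.W_dkv(x)  # (B, T, d_latent): the entire KV-cache footprint for this layer
        q = self.W_q(x).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        k = self.W_uk(c_kv).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        v = self.W_uv(c_kv).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.W_o(out.transpose(1, 2).reshape(B, T, -1))

x = torch.randn(2, 16, 1024)
print(SimplifiedMLA()(x).shape)  # torch.Size([2, 16, 1024])
```

The memory saving comes from caching `c_kv` (d_latent floats per token) instead of 2 * n_heads * d_head floats per token, at the cost of the up-projections at attention time.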