Cohere's First Model for Developers

(cohere.com)

46 points | by hmokiguess 4 days ago

4 comments

moojacob 31 minutes ago
I was a fan of Cohere's general-purpose LLM (Command A, I think?) before they came out with their reasoning model.
More competition is better.
- SubiculumCode 4 minutes ago
  I always forget the VRAM requirements on these MoE things.
  - sipjca 0 minutes ago
    FWIW, because relatively few params are activated per token, offloading to system RAM is quite feasible. You can see endless people doing this on r/LocalLLaMA with Qwen 3.6 35B-A3B.
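A rough back-of-the-envelope sketch of the point above, assuming the "35B-A3B" name means about 35B total and 3B activated parameters, and assuming 4-bit quantized weights (the quantization level is an illustrative assumption, not something stated in the thread). It counts weights only and ignores KV cache, activations, and runtime overhead:

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30


TOTAL_B = 35.0   # total parameters in billions (assumed from "35B")
ACTIVE_B = 3.0   # parameters activated per token (assumed from "A3B")
BITS = 4.0       # assumed quantization level, purely illustrative

total = weight_gib(TOTAL_B, BITS)    # must be resident somewhere (VRAM + system RAM)
active = weight_gib(ACTIVE_B, BITS)  # roughly what is read for each generated token

print(f"all weights:      {total:.1f} GiB")
print(f"active per token: {active:.1f} GiB")
```

Under these assumptions the full weights come to roughly 16 GiB, while each token touches only about 1.4 GiB of them. That gap is why keeping most of the weights in slower system RAM costs less than it would for a dense model of the same total size.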
tonyrice 1 hour ago
I'm excited to see more OSS models
moralestapia 2 hours ago
>Our plan for becoming profitable is to give away mediocre stuff for free
- rdevilla 1 hour ago
  [dead]
cyanydeez 2 days ago
Looks like it's just Qwen 3.6 Coder.
- lumost 2 hours ago
  It's worse at code compared to Qwen 3.6 Coder.
- SubiculumCode 1 hour ago
  Do you mean it's based on Qwen 3.6 Coder?
  - daemonologist 53 minutes ago
    There is no "coder" version of Qwen 3.6; I think they just mean it's a coding-focused model of similar size and performance (to Qwen 3.6 35B-A3B).
    Regular Qwen 3.6 benchmarks slightly better and has much wider software support though, so this is probably of interest only to organizations which disallow models trained in China.
    - kadoban 29 minutes ago
      I mean, Qwen 3.6 kicks ass. I don't know who these people are, but if their first outing is "not quite as good as Qwen 3.6", that's not a bad start by any means.
      30B vs 35B isn't nothing either.
      If it ends up just being some tweaks to someone else's weights, then meh.