home.social

#santacoder — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #santacoder, aggregated by home.social.

  1. #BigCode is an open scientific collaboration working on responsible training of large language models for coding applications.

    In this organization you can find the artefacts of this collaboration:
    👉 #StarCoder, a state-of-the-art language model for code,
    👉 The #Stack, the largest available pretraining dataset with permissive code, and
    👉 #SantaCoder, a 1.1B parameter model for code.

    #StarCoder is a 15.5B-parameter language model for code, trained on 1T tokens across 80+ programming languages.
    It uses multi-query attention (MQA) for efficient generation, has an 8,192-token context window, and can do fill-in-the-middle.

    Chat with StarCoder here: huggingface.co/chat/?model=big

    huggingface.co/bigcode
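
To make "fill-in-the-middle" concrete: the model is given the code before and after a gap and asked to generate what goes in between. Below is a minimal sketch of how such a prompt might be assembled. The sentinel token strings are an assumption, written as I recall them from the StarCoder model card; other models (SantaCoder included) may spell them differently, so check the tokenizer before relying on them. `build_fim_prompt` is a hypothetical helper, not part of any library.

```python
# Assumed sentinel tokens (as recalled from the StarCoder model card).
# Verify against the tokenizer of the model you actually use.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code before and after the gap so the model generates the middle.

    The layout is: prefix marker, code before the gap, suffix marker,
    code after the gap, then the middle marker where generation starts.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# Example: ask for the body between a function signature and its return.
prompt = build_fim_prompt(
    prefix="def mean(values):\n    total = ",
    suffix="\n    return total / len(values)\n",
)
```

The resulting string would be passed to the model as an ordinary prompt; whatever it generates after the middle marker is the proposed infill.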

  2. There are quite a few code-generating “AI” systems now — GitHub #CoPilot, Amazon #CodeWhisperer, BigCode #SantaCoder, Facebook #Incoder, maybe even more.

    I wonder how hard it would be to get these BS bots to play #TDD ping-pong… I write a test, then they generate code until all tests pass, then I refactor, then we repeat.