home.social

#syntheticdata — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #syntheticdata, aggregated by home.social.

  1. The preachers of the Silicon Valley Church sell the harvesting of the body as “algorithmic inevitability,” promising immortality. A neat fable. Maybe they’ll keep the data lords alive for 150 years—but half of it will be Alzheimer’s. In the end, the thermodynamic hammer still falls.
    #Transhumanizm #DataEngineering #CRISPR #EdgeAI #GenerativeAI #FederatedLearning #MachineLearning #DataScience #AITools #AIAutomation #CloudComputing #SyntheticData #AntiHarari #MLOps #Longevity

  2. I'm creating #syntheticdata for teaching in the social sciences & find that #SDG with LLMs isn't suited to my small-scale use. While there are workflows to combine LLMs & generate more credible output ( link.springer.com/chapter/10.1 ), general-purpose models often create results that are too diverse & reflexive, even when imitating oral communication. Such data reminds me of journalism scandals à la Stephen Glass. High-quality data in my case is messier and duller. Just look at YouTube comment sections.

  3. Oh, look! Another research paper trying to solve the problem of *literally* running out of text by using... *drumroll please*... abstract dynamical systems! Because who needs actual words when you can just invent your own with synthetic data? 😂 It's like trying to teach a dog to speak by showing it modern dance! 💃🕺
    hanseungwook.github.io/blog/nc #researchpaper #abstractdynamicalsystems #syntheticdata #humor #innovation #HackerNews #ngated

  4. 🚀 NVIDIA’s new Cosmos Transfer lets developers stream massive synthetic datasets across the Omniverse, scaling physical AI training for robotics and autonomous systems. OpenUSD‑based pipelines mean faster, reproducible simulations. Dive into how this could reshape research and benchmarks. #NVIDIAOmniverse #SyntheticData #PhysicalAI #OpenUSD

    🔗 aidailypost.com/news/nvidia-co

  5. 📊 AI training data could be fully depleted by 2032, according to recent studies. Without fresh datasets, innovation may slow, and bias risks could rise. 🛑

    💡 Can collaborative repositories or synthetic data solve the crisis? Let's discuss!

    👉 Read more: blueheadline.com/tech-news/ai-

    #AI #Tech #Innovation #AIData #BlueHeadline #FutureTech #ArtificialIntelligence #SyntheticData #DataGovernance

  6. I'm excited to see that recent developments with LLMs may provide effective modeling of our aggregate values (multi-level, fine-grained), increasingly coherent over increasing context of meaning.

    LLMs such as ChatGPT and Gemini already impress us with their ability to produce coherent (but not necessarily correct) abstractions from massive collections of data relevant to our interests.

    Now, what if we run multiple instances of LLMs with diverse priors, collaboratively (at lower levels adversarial, at higher levels cooperative) selecting for increasing coherence over increasing context in the domain of present but evolving values?

    Of the two orthogonal dimensions producing the expanding space of moral (right in principle) agency, namely our values and our methods for their promotion, we already have much attention on our evolving methods (science, technology), and it appears we are on the cusp of rapid development of an effective model of our evolving values!

    Cause for (Promethean) hope!

    Research remains to be performed on theoretical and practical questions involving expected signal-to-noise improvements of "synthetic data" extracted from latent information in our datasets, regression due to synthetic data added to the dataset, biases, gaming of the system, and on and on… Interesting, exciting, and dangerous.

    #LLM #LargeLanguageModels #AI #GenerativeModels #SyntheticData #DecisionMaking #CollectiveDecisionMaking #Ethics #MetaEthics #Morality #ArrowOfMorality