DataCebo Revolutionizes Data Privacy with Enterprise Version of Synthetic Data Library

  • DataCebo's SDV addresses the challenge of generating data for LLMs without using PII.
  • What sets the enterprise version of SDV apart from the open-source version?
DataCebo's Enterprise Synthetic Data: Privacy Revolution

DataCebo has launched an enterprise version of its widely adopted open-source synthetic data library, Synthetic Data Vault (SDV). Co-founded by MIT Data to AI Lab alumni Kalyan Veeramachaneni and Neha Patki in 2016, the company has grown from an academic project into a commercial product and has secured $8.5 million in seed funding.

DataCebo started from the conviction that generative AI could go beyond producing code, text, and images and also produce synthetic data. SDV is aimed at businesses that need high-quality business data for large language models but cannot use Personally Identifiable Information (PII).

CEO Veeramachaneni asserts:

"Our software enables customers to construct a bespoke generative AI model on-premises. Subsequently, they can leverage synthetic data for diverse use cases."

The approach is used in healthcare and financial services, where teams can test software and build models without exposing sensitive information.

Traditionally, companies created synthetic data by hand, a process that scales poorly and is prone to error. DataCebo addresses this with generative AI: users describe the characteristics of the data they need, and the software analyzes the features of the actual dataset and generates a high-quality synthetic set for testing, keeping sensitive information confidential.
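
To make the "analyze the real dataset, then generate a synthetic one" idea concrete, here is a minimal sketch in plain Python. It is not SDV's algorithm or API: the function names `fit_columns` and `sample_rows` and the sample records are hypothetical, invented for illustration. The sketch learns a simple summary of each column and samples new rows from it, treating columns independently, which a real synthetic data tool would not do because it also has to preserve relationships between columns.

```python
import random
import statistics
from collections import Counter


def fit_columns(rows):
    """Learn a simple per-column summary from real rows (a list of dicts)."""
    model = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        if all(isinstance(v, (int, float)) for v in values):
            # Numeric column: remember its mean and spread.
            model[column] = ("numeric", statistics.mean(values), statistics.pstdev(values))
        else:
            # Categorical column: remember how often each value occurs.
            counts = Counter(values)
            model[column] = ("categorical", list(counts), list(counts.values()))
    return model


def sample_rows(model, n, seed=None):
    """Draw n synthetic rows; each column is sampled independently."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n):
        row = {}
        for column, spec in model.items():
            if spec[0] == "numeric":
                _, mean, stdev = spec
                row[column] = round(rng.gauss(mean, stdev), 2)
            else:
                _, categories, weights = spec
                row[column] = rng.choices(categories, weights=weights, k=1)[0]
        synthetic.append(row)
    return synthetic


# Hypothetical "real" records; no actual customer data is involved.
real_rows = [
    {"age": 34, "plan": "basic", "balance": 1200.50},
    {"age": 51, "plan": "premium", "balance": 8300.00},
    {"age": 27, "plan": "basic", "balance": 310.75},
    {"age": 45, "plan": "premium", "balance": 5600.20},
]

model = fit_columns(real_rows)
synthetic_rows = sample_rows(model, 5, seed=0)
for row in synthetic_rows:
    print(row)
```

The synthetic rows share column names and rough value ranges with the originals, but none of them is copied from a real record. Numeric values come back as rounded floats, so a field such as `age` would need extra handling to stay an integer.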

DataCebo began by developing the open-source SDV, which has drawn more than a million downloads and an active community of over 1,000 participants on its Slack channel.

The enterprise edition differs from the open-source version mainly in scalability. It can handle up to 100 tables, whereas the open-source version is designed for just a few, which suits businesses whose models draw on larger datasets, often more than 20 to 30 tables. DataCebo currently has 11 employees and plans to grow to approximately 20 over the next year, depending on business growth.


Tetiana Rafalovych
Professional author in IT Industry